Why does having several records of something that’s a plurality go “against the simplest forms of DB normalization”? If we have more than one cause of death we need more than one record. What are you proposing?
It’s a THEMIS decision. Will be published soon.
@Christian_Reich, what plurality do you have? Do you have plural deaths, or plural death dates, or plural death types? As I understand it, no. You have multiple causes. And it’s, of course, OK to have multiple records for multiple causes. But it’s not OK to duplicate all the other data just because there are multiple causes. All the other data remains the same. Why not duplicate? Because of database normalization: https://en.wikipedia.org/wiki/Database_normalization
What I propose is to have a separate table for death causes with the following columns:
I agree with @pavgra that our DEATH table should be unambiguous in its definition of death. Just as the PERSON table can only have one record per person, because you only live once, so too should the DEATH table only allow up to one record per person, because you can only die once.
If there are data that contain multiple causes of death, then we need an appropriate data model solution to preserve this information and disambiguate it from the death itself. @pavgra’s proposal to have a separate CAUSE_OF_DEATH table is consistent with what the FDA Sentinel CDM has done (just without a standardized vocabulary): https://www.sentinelinitiative.org/sentinel/data/distributed-database-common-data-model/sentinel-common-data-model
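To make the two-table idea concrete, here is a minimal sketch using Python's built-in sqlite3. The column names are my own simplification, not @pavgra's actual column list and not the Sentinel layout, and the concept IDs 1001 and 1002 are placeholders rather than real OMOP concepts. It only illustrates that the death fact can be constrained to one row per person while the causes stay plural.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE death (
    person_id  INTEGER PRIMARY KEY,  -- at most one death row per person
    death_date TEXT NOT NULL
);
-- Hypothetical cause table: one row per (person, cause).
CREATE TABLE cause_of_death (
    person_id          INTEGER NOT NULL REFERENCES death(person_id),
    cause_concept_id   INTEGER NOT NULL,  -- placeholder IDs below
    cause_source_value TEXT,
    PRIMARY KEY (person_id, cause_concept_id)
);
""")

conn.execute("INSERT INTO death VALUES (1, '2018-03-01')")
conn.executemany(
    "INSERT INTO cause_of_death VALUES (?, ?, ?)",
    [(1, 1001, "asphyxia"), (1, 1002, "cerebral edema")],
)

def try_second_death(connection):
    """Return True if a second death row for person 1 is accepted."""
    try:
        connection.execute("INSERT INTO death VALUES (1, '2018-03-02')")
        return True
    except sqlite3.IntegrityError:
        return False

second_death_accepted = try_second_death(conn)
death_rows = conn.execute("SELECT COUNT(*) FROM death").fetchone()[0]
cause_rows = conn.execute(
    "SELECT COUNT(*) FROM cause_of_death WHERE person_id = 1"
).fetchone()[0]
```

With this layout the date is stored once, and adding a cause never touches the death row.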
Perhaps when we develop the Feline Data Model, we can allow for up to nine records in the DEATH table. But as long as we narrow our focus to people, one should suffice:)
Let’s first decide whether we are OK with multiple causes of death. And for this we need to collect use cases of how people want to analyze death.
Currently, the Death domain contains the clinical event for how and when a Person dies.
I.e., if a person committed suicide, we store ‘Suicide’ as the cause. That’s OK from a social and/or clinical perspective. But hardly any pathology report lists ‘suicide’ as the cause of death: it can be asphyxia, cerebral edema, intoxication, etc.
So what I think is that we definitely need to allow multiple causes of death, because we may be interested in different levels of granularity. As for how to store them, I agree with @pavgra, but I would also like to see something similar to condition_status_concept_id from the CONDITION_OCCURRENCE table.
Oh boy. Where were you guys when the issue was listed out for debate?
Here is my pragmatic approach to this:
- You want to allow more than one cause of death, for the reasons @Eldar listed.
- This problem really is not common. On the contrary. We usually badly lack information about death. To create two tables for reasons of ideological modeling purity appears to be complete overkill to me.
So, tell me if there is a use case that would suffer from one table with repetitive death records. Unless there is one, let’s keep it the way the community has decided.
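For anyone weighing that question, one place where repeated rows could matter is a plain death count. The sketch below uses a deliberately trimmed single DEATH table (my own simplification of the columns, with made-up rows) in which a person with two causes has two records. It is not meant to settle the debate, only to show what a query has to account for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified single-table layout: one row per (person, cause).
conn.execute("""
CREATE TABLE death (
    person_id          INTEGER NOT NULL,
    death_date         TEXT NOT NULL,
    cause_source_value TEXT
)""")
conn.executemany(
    "INSERT INTO death VALUES (?, ?, ?)",
    [
        (1, "2018-03-01", "asphyxia"),
        (1, "2018-03-01", "cerebral edema"),  # same death, second cause
        (2, "2019-07-15", "intoxication"),
    ],
)

# Counting rows counts causes, not deaths.
naive_count = conn.execute("SELECT COUNT(*) FROM death").fetchone()[0]

# Counting distinct persons gives the number of people who died.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT person_id) FROM death"
).fetchone()[0]
```

Here the naive count is 3 and the distinct count is 2, so every analysis on such a table has to remember to deduplicate by person.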
Why don’t we modify the current DEATH table so that it stores the causes of death, and put the fact that a person died in the PERSON table? I feel like in two years it will be hard even to remember all the names of the CDM tables.
What to do with NULL Death dates in OMOP?
Fabulous idea, @aostropolets!
We have death data from the state registry and it contains multiple contributing factors. Most of this data sits outside our CDM because of the restrictions. This idea would allow us to add in all the contributing factors while still maintaining the “cause” of death without duplicating the data.
@aostropolets, so what you propose is to have the discussed cause_of_death table, but get rid of the original death table (logically; not focusing here on naming). Although the solution lacks the ideological modeling purity which for some reason @Christian_Reich doesn’t like, not favoring the fundamentals stated by Computer Science, it is better than storing multiple death facts per person in the death table.
That indeed is a wonderful idea. Even purist @pavgra should also carol this, since it abolishes the formerly parallel table with a one-to-one relationship to PERSON. Not sure why this is wrong again.
Can you put it as a proposal into the CDM WG?
@Christian_Reich, I was misled by the death_type_concept_id field, which I assumed to be a characteristic of the death fact and treated as a single record per person. But, thanks to your clarification, the type concept in fact represents the source of the death cause, so clearly it goes into the cause of death table because it has a 1:1 relation with the cause. Now the picture is clear to me and causes no more confusion. The only negative consequence is that we lose the link between the death date and the cause on which the date was based.
I continued thinking about the overall cause of death topic, and the following question came to mind. @Christian_Reich, why does OMOP store three separate fields (cause_concept_id, cause_source_value, cause_source_concept_id), when those basically represent a condition and could be replaced with a new record in the Conditions table plus a single field in the death cause table that references the new condition record? Maybe even death_type_concept_id could be abstracted into the condition’s condition_type_concept_id field? Then, theoretically, we could totally remove the death / cause_of_death table.
That is an even better idea!! We have the death date in PERSON, and the cause, as a Condition, in CONDITION_OCCURRENCE, and the fact that it comes from a death certificate in the Condition Type. Clean.
Can you make this proposal to the CDM WG?
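A rough sketch of how that layout could be queried, assuming trimmed versions of PERSON and CONDITION_OCCURRENCE. The `death_date` column on PERSON is a simplification of the idea in this post, the condition concept IDs (1001 to 1003) and the type ID 9999 are placeholders, and 32815 is the "Death Certificate" type concept quoted elsewhere in this thread.

```python
import sqlite3

DEATH_CERTIFICATE_TYPE = 32815  # "Death Certificate", as quoted in this thread
OTHER_TYPE = 9999               # placeholder for any non-death provenance

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    death_date TEXT                 -- NULL while the person is alive
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id   INTEGER PRIMARY KEY,
    person_id                 INTEGER NOT NULL REFERENCES person(person_id),
    condition_concept_id      INTEGER NOT NULL,
    condition_start_date      TEXT NOT NULL,
    condition_type_concept_id INTEGER NOT NULL
);
""")

conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "2018-03-01"), (2, None)],
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1001, "2018-03-01", DEATH_CERTIFICATE_TYPE),
        (2, 1, 1002, "2018-03-01", DEATH_CERTIFICATE_TYPE),
        (3, 1, 1003, "2017-01-10", OTHER_TYPE),  # ordinary earlier condition
    ],
)

# Causes of death = conditions whose provenance is a death certificate.
causes = conn.execute(
    """
    SELECT p.person_id, p.death_date, c.condition_concept_id
    FROM person p
    JOIN condition_occurrence c ON c.person_id = p.person_id
    WHERE p.death_date IS NOT NULL
      AND c.condition_type_concept_id = ?
    ORDER BY c.condition_concept_id
    """,
    (DEATH_CERTIFICATE_TYPE,),
).fetchall()
```

The death fact is stored once on the person, and the two certificate-sourced conditions come back as the causes without any duplicated death row.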
Given the changes to the OMOP Vocabulary TYPE, what types would you use for death, since the DEATH_TYPE concept class isn’t a thing any more?
Probably need a few more than just these:
32815 Death Certificate
32885 US Social Security Death Master File
Death from medical record? Death at discharge?
See the post here. The death is recorded in the condition status fields and the provenance is recorded in the condition type field.
I’m trying to follow the other thread unsuccessfully . . .
So in CDM v5.3.2, how do I record a death at discharge? This is how I would do it:
- Write a record to the DEATH table, with DEATH_TYPE = 32823 - EHR discharge record
- For the visit that contained the death at discharge, set DISCHARGE_TO to 32218 - “expired”; we could also set SOURCE_VALUE to “expire”
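The two steps above could be sketched roughly as follows. The tables are heavily trimmed, `discharge_to_concept_id` and `discharge_to_source_value` are my reading of the DISCHARGE_TO and SOURCE_VALUE fields mentioned above, and the concept IDs are simply the ones quoted in this post. Whether the visit should carry this information at all is still being debated in the replies.

```python
import sqlite3

EHR_DISCHARGE_RECORD = 32823  # death type concept ID as quoted in this post
EXPIRED = 32218               # discharge-to concept ID as quoted in this post

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE death (
    person_id             INTEGER PRIMARY KEY,
    death_date            TEXT NOT NULL,
    death_type_concept_id INTEGER NOT NULL
);
CREATE TABLE visit_occurrence (
    visit_occurrence_id       INTEGER PRIMARY KEY,
    person_id                 INTEGER NOT NULL,
    visit_end_date            TEXT NOT NULL,
    discharge_to_concept_id   INTEGER,
    discharge_to_source_value TEXT
);
""")

# Step 1: the death itself, with its provenance.
conn.execute(
    "INSERT INTO death VALUES (?, ?, ?)",
    (1, "2020-05-04", EHR_DISCHARGE_RECORD),
)

# Step 2: the visit that ended in death, plus an unrelated earlier visit.
conn.executemany(
    "INSERT INTO visit_occurrence VALUES (?, ?, ?, ?, ?)",
    [
        (10, 1, "2020-05-04", EXPIRED, "expire"),
        (11, 1, "2020-01-15", None, None),
    ],
)

# Find the visit associated with each recorded death.
rows = conn.execute(
    """
    SELECT d.person_id, v.visit_occurrence_id, v.discharge_to_source_value
    FROM death d
    JOIN visit_occurrence v ON v.person_id = d.person_id
    WHERE v.discharge_to_concept_id = ?
      AND v.visit_end_date = d.death_date
    """,
    (EXPIRED,),
).fetchall()
```

The death stays in DEATH; the visit only records how the stay ended, which lets the two be joined when needed.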
See @clairblacketer issue:
Non-standard concept from the UB04 vocabulary?
This is patient centric, not hospital centric. When a patient dies it goes into the DEATH (v5) or PERSON (v6) tables. Not anywhere else. A Visit is defined as a setting of healthcare provision. Dead patients don’t receive healthcare.
I know it’s non-standard, but I requested that it be changed to standard.
Isn’t it common practice for hospitals to discharge patients as ‘expired’ if they died in the hospital? Why couldn’t we record that information on the VISIT_OCCURRENCE record? There would still be a record either in DEATH or a value in DEATH_DATE in the PERSON table but we would just be associating a visit with the death event.