
One-to-many source to concept mappings - creating two records in OMOP for one record in the source?

We’ve been following IMEDS conventions for stuffing OMOP, but I am currently leaning strongly toward departing from the convention of creating multiple rows in the exposure tables when there is a one-to-many map between the source concept and the OMOP concept. Because this step would not be transparent to analysts, they may misinterpret the resulting rows as two distinct events. We would need a “source row id” or something similar to make it clear that a record was split during ETL.

I would much prefer this kind of heavy-handed processing be done when the ERA tables are constructed, because those tables are clearly defined as derived and rules may differ for different analysis cases.

Anyone made a similar departure?


@Daniella_Meeker:

You have been following what? :slight_smile: I feel like a turkey now.

Not sure what you mean. In the claims world, a lot of things that are the same event are recorded through different records. For example, in the thread “Procedure records for drug administration” (post #13), Jennifer explains in detail how hospital-administered drugs are coded as a combination of several codes.

The splitting of a code happens rarely, and usually only for those codes which are deemed truly combinations of several things. Not sure what the analyst would lose. Do you have a use case?


I think there is a larger issue at play here. If we are talking about claims data and reconstructing what really happened to the patient from claims, it makes sense to create multiple rows.

The problem is that there are CDM adopters that do not start with claims data but with EHR data. Their view of the world may be different. EHR is closer to the patient than claims, and expanding one event into multiple rows may not be desired.

The CDM seems to me (in some aspects of the model) very much claims-centric. Also, claims are relatively well behaved compared to the variability of EHR data…

I am also facing a similar question regarding “one-to-many mapping”. I am working with EHR information for a cohort of AIDS patients, and I face situations like the one described in the “data model conventions” (OMOP-CDM documentation):
ICD-9-CM code 070.43 ‘Hepatitis E with hepatic coma’ maps to the SNOMED concept for ‘Acute hepatitis E’ and a second SNOMED concept for ‘Hepatic coma’, in which case multiple CONDITION_OCCURRENCE records will be generated.
That is to say, two records must be entered in the CONDITION_OCCURRENCE table: 197490 acute hepatitis E (SNOMED) and 377604 hepatic coma (SNOMED). Two records that represent a single EHR source event!
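To make the split concrete, here is a minimal Python sketch of what the ETL step does to this one source row. The concept IDs are the ones quoted above; the `source_row_id` field is hypothetical (it is the kind of tag suggested in the first post, not a standard CDM column), and the person, date, and row IDs are made-up illustration values.

```python
from dataclasses import dataclass
from datetime import date
from itertools import count

# One-to-many map for the example in this post (concept IDs as quoted above).
SOURCE_TO_STANDARD = {
    "070.43": [197490, 377604],  # acute hepatitis E, hepatic coma
}


@dataclass
class ConditionOccurrence:
    condition_occurrence_id: int
    person_id: int
    condition_concept_id: int
    condition_start_date: date
    condition_source_value: str
    source_row_id: int  # hypothetical: NOT part of the standard CDM


_next_id = count(1)


def split_source_row(source_row_id, person_id, icd9_code, start_date):
    """Emit one CONDITION_OCCURRENCE row per mapped standard concept."""
    # Unmapped codes fall back to concept 0 in this sketch.
    concept_ids = SOURCE_TO_STANDARD.get(icd9_code, [0])
    return [
        ConditionOccurrence(next(_next_id), person_id, concept_id,
                            start_date, icd9_code, source_row_id)
        for concept_id in concept_ids
    ]


# Made-up source row: id 1001, person 42.
rows = split_source_row(1001, 42, "070.43", date(2015, 3, 1))

# Both rows share the same source value and the same (hypothetical) source_row_id,
# so they can still be collapsed back to one source event.
distinct_source_events = len({r.source_row_id for r in rows})
```

Without something like `source_row_id`, the only trace of the split is that both rows carry the same `condition_source_value`, person, and date.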
I wonder: does this decoupling mechanism reflect the reality that the information in the EHR conveys to us? Is it possible to apply some additional mechanism to relink both facts (at least indirectly)?
Would it be possible (and recommended or necessary) to establish a relationship through the FACT_RELATIONSHIP table, for instance, in this case, by using the relationships:
44818890 Finding associated with (SNOMED)
44818770 Has associated finding (SNOMED)

That is to say, a pair of records in the FACT_RELATIONSHIP table:
Acute hepatitis E Finding associated with Hepatic coma
Hepatic coma Has associated finding Acute hepatitis E
I think that way the CDM records would reflect the information contained in the EHR much better.
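As a rough illustration of the pair of records I have in mind, here is a small Python sketch. The field names follow the FACT_RELATIONSHIP table, and the relationship concept IDs are the two listed above. The condition occurrence IDs (1 and 2) are placeholders, and the domain concept ID of 19 for Condition is my assumption and should be checked against the vocabulary.

```python
from typing import NamedTuple

CONDITION_DOMAIN_CONCEPT_ID = 19      # assumption: verify in your vocabulary
FINDING_ASSOCIATED_WITH = 44818890    # relationship concept quoted above
HAS_ASSOCIATED_FINDING = 44818770     # relationship concept quoted above


class FactRelationship(NamedTuple):
    domain_concept_id_1: int
    fact_id_1: int
    domain_concept_id_2: int
    fact_id_2: int
    relationship_concept_id: int


def link_conditions(hepatitis_occurrence_id, coma_occurrence_id):
    """Build the two mirrored rows linking the split condition records."""
    d = CONDITION_DOMAIN_CONCEPT_ID
    return [
        # Acute hepatitis E -> Finding associated with -> Hepatic coma
        FactRelationship(d, hepatitis_occurrence_id, d, coma_occurrence_id,
                         FINDING_ASSOCIATED_WITH),
        # Hepatic coma -> Has associated finding -> Acute hepatitis E
        FactRelationship(d, coma_occurrence_id, d, hepatitis_occurrence_id,
                         HAS_ASSOCIATED_FINDING),
    ]


# Placeholder condition_occurrence_id values for the two split rows.
links = link_conditions(hepatitis_occurrence_id=1, coma_occurrence_id=2)
```

This only shows the shape of the rows; whether these two relationship concepts are the right ones to use here is exactly what I am asking.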
Is this a correct approach? Is it a good practice?
Is there any other strategy to “relink” these kinds of “decoupled” facts in the CDM?
Many thanks in advance.

Correct approach to do what? What is the use case?

Hepatitis can cause a hepatic coma, but that is not a patient-specific fact. It is generic, and should therefore be part of the reference (Vocabularies) tables rather than FACT_RELATIONSHIP. ICD-9-CM has that as a pre-coordinated concept composed of two distinct conditions (one a disease of the liver, the other of the brain). For whatever reason its authors decided to have such a pre-coordination here, but they certainly don’t have all possible conditions that could be linked because one causes the other.

But again: What are you trying to achieve?
