
How to Determine Event Type Concept when Vocabulary Maps to a Different Domain

The event (condition, procedure, …) type should be determined independently of the target domain. For example, if a procedure code in the facility header record is mapped to an observation, the event type should be concept id 42865905, Facility Header, where the domain is ‘Type Concept’ and the vocabulary id is ‘Procedure Type’, even though the event will reside in the Observation table.

The Themis Group is proposing the convention that the event type is determined by the source of the data without regard to the target domain of the source code, and that any concept in the ‘Type Concept’ domain is a valid event type regardless of the table the event resides in.
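A rough sketch of what this rule could look like in ETL code, in Python. Only concept id 42865905 comes from the example above; the lookup tables, the source-location key, the `build_event` helper, and the target concept id 1234567 are hypothetical names and placeholders made up for illustration.

```python
from dataclasses import dataclass

# Concept id from the example above: 'Facility Header'
# (domain 'Type Concept', vocabulary 'Procedure Type').
FACILITY_HEADER = 42865905

# Hypothetical lookup: where the code sat in the source -> type concept id.
TYPE_BY_SOURCE_LOCATION = {
    "facility_header_procedure": FACILITY_HEADER,
}

# Hypothetical routing: domain of the 'Mapped to' concept -> CDM table.
TABLE_BY_DOMAIN = {
    "Condition": "condition_occurrence",
    "Procedure": "procedure_occurrence",
    "Observation": "observation",
    "Drug": "drug_exposure",
}


@dataclass
class Event:
    table: str
    concept_id: int
    type_concept_id: int


def build_event(source_location: str, target_concept_id: int, target_domain: str) -> Event:
    """The table follows the target domain; the type follows the source location."""
    return Event(
        table=TABLE_BY_DOMAIN[target_domain],
        concept_id=target_concept_id,
        type_concept_id=TYPE_BY_SOURCE_LOCATION[source_location],
    )


# A procedure code from the facility header whose standard concept is an Observation.
# 1234567 is a placeholder, not a real concept id.
event = build_event("facility_header_procedure", 1234567, "Observation")
print(event.table, event.type_concept_id)
```

The point of the sketch is that the two lookups never consult each other: the target domain picks the table, and the source location alone picks the type concept.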

Themis Issue: How to Determine Event Type

I heard from @Christian_Reich that all the type concepts will become domain unspecific or, in other words, you will be able to put any type in any table. Don’t you feel that this solution will be more elegant and less confusing?

And @Christian_Reich says: “What’s your question, @DTorok?”

No question. This is a proposed Themis rule, and I posted it here as part of the public comment period. The reason I am pushing this is that we have customers who insist that the only acceptable type concepts for a table are those whose vocabulary name matches the table, i.e. only concepts that belong to the vocabulary ‘Observation Type’ should be used for observation_type_concept_id.

Just saying that ‘all the type concepts will become domain unspecific or, in other words, you will be able to put any type in any table’ does not provide guidance to ETL developers. The reason for saying the event type should be determined by the source data is that the type then records where the data was in the source, i.e. its provenance. From my example, if a procedure code in the facility header record is mapped to an Observation, the event type should be concept id 42865905, Facility Header, where the domain is ‘Type Concept’ and the vocabulary id is ‘Procedure Type’, even though the event will reside in the Observation table.

You’ll have to decide on your own what concept to use. There will be a single vocabulary_id (“Type” or something) and you’ll pick whatever you want for observation, procedure_occurrence and so on.

Moving all type concepts into a single vocabulary will be counter-productive, because you will lose information about the source of the data. Again using my example, concept 42865905 (Facility Header) provides some information about the origin, but the vocabulary, Procedure Type, lets you know the data came from a field that typically holds proc codes. If you move everything into a single vocabulary, then you would need to create a new concept for ‘Proc Code from Facility Header’ to convey the same information.

The proposal is not to change the current structure for type concepts. Just to make it clear that the event type is not dependent on the domain of the ‘Mapped to’ concept.

Two questions:

  • And what are you going to lose if you don’t know it came from a proc field?
  • How do you now know it came from a proc field? Because the domain defines which table it goes into, and if it goes into a DRUG_EXPOSURE table you cannot use a Procedure Type Concept on it.

  • And what are you going to lose if you don’t know it came from a proc field?
    We will lose provenance. Is provenance important? It was important enough that the ability to track it has been maintained from OMOP v3 until now.

  • How do you now know it came from a proc field?
    I cannot think of an ETL we have done where the column in the source does not indicate whether the expected value should be a procedure, condition, or medication.

  • it goes into a DRUG_EXPOSURE table you cannot use a Procedure Type Concept on it.
    Why not? Because of convention? The proposal is to change the convention.

This is not the first time this topic has been discussed; see “New Observational Types?”, where @Patrick_Ryan says: “If we’re talking about a claims database, then the origin of the data is from an inpatient/outpatient medical claims diagnosis codes and/or procedure codes. In which case, the existing _TYPE values, as they are used for CONDITION and PROCEDURE domain should be used in the OBSERVATION table when the _CONCEPT_ID has that domain.”

I think the Themis proposal is simply moving this from an obscure forum post to ETL guidance.

Well, we use the type concept to track provenance categories, so we can decide if we believe it or not. I don’t think we are tracking field names. The Facility Header and Facility Detail thing will die when we implement this THEMIS issue. Unless somebody objects with a legitimate use case, @DTorok. :slight_smile:

Strange. Because the field name does NOT determine that. The domain_id of the Concept does.

That’s the whole idea. That we no longer have domain-specific type concepts. To solve the problem you seem to be bringing up.

Or did I get that wrong?

The inpatient/outpatient aspect of it does not belong to the Type Concept. That’s Visit stuff. We will kick that out.

Exactly. The plan is to capture the general source of information (e.g. claims, type of EHR - patient reported, physician reported, auto-reported, etc.). Not sure where we put the priority (primary, first, second, etc.)

Any good thoughts?

I’m not sure if we agree or disagree. All we initially wanted is a statement that the event type should be determined without regard to the domain of the ‘Mapped to’ target concept.
As a result:

  • There will be records where the vocabulary id of the type concept is different from the domain associated with the table. E.g., for a condition mapped to an Observation, the record will be in the Observation table and the vocabulary id of the type concept will be ‘Condition Type’.
  • All concepts used as event types (condition_type_concept_id, procedure_type_concept_id) should be standard concepts belonging to the ‘Type Concept’ Domain. The event type vocabulary id provides information about the source of the data. For example, the data came from a field in the source data that holds condition codes.
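The two points above could be checked in an ETL with something like the following Python sketch. The `Concept` tuple mirrors a few columns of the OMOP `concept` table; the `is_valid_event_type` helper is a hypothetical name, and marking concept 42865905 as standard (`'S'`) is an assumption made for the example, not something verified against Athena.

```python
from typing import NamedTuple, Optional


class Concept(NamedTuple):
    concept_id: int
    domain_id: str
    vocabulary_id: str
    standard_concept: Optional[str]  # 'S' marks a standard concept


def is_valid_event_type(type_concept: Concept) -> bool:
    """Valid under the proposal: a standard concept in the 'Type Concept' domain.

    The vocabulary id and the table the event lands in are deliberately ignored.
    """
    return (
        type_concept.domain_id == "Type Concept"
        and type_concept.standard_concept == "S"
    )


# Domain and vocabulary as described in this thread; 'S' is assumed.
facility_header = Concept(42865905, "Type Concept", "Procedure Type", "S")

# Placeholder for a concept outside the 'Type Concept' domain (id is made up).
not_a_type = Concept(-1, "Observation", "SNOMED", "S")

print(is_valid_event_type(facility_header))  # accepted even for an Observation record
print(is_valid_event_type(not_a_type))
```

Note that the check takes no table name at all, which is the whole proposal: a ‘Procedure Type’ concept on an Observation record passes.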

That’s the plan. In fact, we want to abolish the domain-specific Type Concepts, but just have generic Type Concepts. Like in “Patient reported”, and “From order”, “From EHR”, “From claim, primary”, that kind of thing. Not yet sure what it will look like.

There is no such thing. The source fields contain ICD-10-CM or HCPCS codes. Whether they are Conditions or Procedures gets decided by the OMOP vocabularies. More than half of the HCPCS codes are not procedures, but some other justification to get paid.

Many EHR data holders have “historical” data. Historical data is generally defined as data that was recorded in an EHR system prior to the current EHR in use at a hospital. Not all data transfers over, so it isn’t the complete record of data for a person. However, this is important data that is converted to the CDM, but it is of potentially questionable quality because it’s incomplete. We need a generic type_concept_id for this.

Might not be the best forum topic for your request for a new event type as this forum topic is about trying to define a Themis rule for how to determine the event type during ETL.

@MPhilofsky: Are you now opening the same subject several times? Is this a cry for attention? :slight_smile:

I had a fabulous idea for a new type_concept_id while reading the above thread and accidentally hijacked @DTorok’s thread. Sorry!

IMHO, I do think you both have the same idea stated differently. And I agree it is best to have

and

Friends:

Has this proposal been ratified? I haven’t seen any opposition on this thread or on the Github issue posted in Themis.

Colorado has another pressing need for a domain agnostic type_concept_id.

We have some immunization data that we know comes from multiple sources (state registry, administered, patient reported, dispensed, etc.). However, we do not have the data that distinguish where each individual record came from. Per Athena, there isn’t a generic “from EHR record” in the concept table with vocabulary_id = ‘drug type’.

From my reading, there is agreement that the event type should not be bound by where the record ends up (the domain). Where I see the difference is that my proposal is a short-term fix that does not involve changing the current event type concepts; it simply says any concept in the domain ‘Type Concept’ is valid for fields with the suffix ‘_type_concept_id’. @Christian_Reich and @aostropolets, by contrast, are proposing to refactor the Type Concepts by removing the domain-specific information, so that there will be only one concept saying ‘Claim record first position’ instead of multiple type concepts that say ‘Claim record first position’, where one has the vocabulary id ‘Procedure Type’ and another has the vocabulary id ‘Condition Type’.
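To make the refactoring option concrete, here is a small Python sketch of collapsing domain-specific type concepts that share a name into one generic concept. Everything in it is hypothetical: the negative ids are placeholders, ‘Claim record second position’ is an invented name for contrast, and `collapse_by_name` is not an existing OHDSI tool. Nobody has said yet what the refactored concepts will actually look like.

```python
# (concept_id, concept_name, vocabulary_id); all ids are placeholders.
domain_specific = [
    (-1, "Claim record first position", "Procedure Type"),
    (-2, "Claim record first position", "Condition Type"),
    (-3, "Claim record second position", "Condition Type"),
]


def collapse_by_name(concepts):
    """Map each domain-specific id to one generic id per distinct name."""
    generic_by_name = {}
    remap = {}
    for concept_id, name, _vocabulary_id in concepts:
        # Allocate a new placeholder generic id the first time a name is seen.
        generic_id = generic_by_name.setdefault(name, -100 - len(generic_by_name))
        remap[concept_id] = generic_id
    return remap


remap = collapse_by_name(domain_specific)
print(remap)
```

After the collapse, the two ‘first position’ concepts point at the same generic id, so the vocabulary id no longer tells you whether the source field held procedure or condition codes. Under the short-term fix, no remapping is needed at all.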

Not sure if either solution addresses your immediate need since you need a type concept for ‘state registry’ which does not currently exist.

@DTorok

Yours is a good short-term solution. However, allowing exceptions to the “data only lives in its assigned domain” rule would cause confusion among those who aren’t living and breathing the OMOP CDM.

This would require people to re-map their type concept ids, but would definitely cause less confusion.

Actually, I need a generic “from EHR record” type concept id because we are unable to identify which records in our immunization table come from the state registry vs. an administered drug vs. a patient reported immunization, etc.

@Christian_Reich & @aostropolets thoughts?

We will make a proposition shortly.
