**Edited because copy & paste didn’t copy everything from the Word doc this originated from.
I have a microbiology use case, too. Some of this data duplicates more granular data contained in the same source table. However, the less granular data is useful when the more granular data is NULL.
Easy ones first:
- specimen source/specimen_type is represented as a SNOMED code. Examples of string terms for the code = Structure of urinary tract proper, Lower respiratory tract structure, Catheter. This will map to SPECIMEN.specimen_concept_id & associated columns
- origin of specimen code/body or device site is represented as a SNOMED code. Examples of string terms for the code = Urine specimen obtained via indwelling urinary catheter, Sputum specimen obtained by aspiration. This will map to SPECIMEN.anatomic_site_concept_id & associated columns
Now I have multiple attributes for one *_concept_id. How do I represent it? Do I duplicate the *_concept_id to add in the attributes?
- drug susceptibility test/antiobiotic_test is represented as a LOINC code. Examples of string terms for the code = Bacteria identified in Isolate by Culture, or Bacteria identified in Unspecified specimen by Culture. This will map to MEASUREMENT.measurement_concept_id
- type of bacteria is represented as a SNOMED code. Examples of string terms for the code = E. coli, Shigella, no growth. This would “seem” to map logically to MEASUREMENT.value_as_concept_id for the above test, but the domain_id for the concept = Observation. Same issue @Dymshyts pointed out above
How do I represent the attributes of the bacteria? And are these attributes even necessary? Isn’t morphology a defining characteristic of a type of bacteria? Who’s the microbiology expert in OMOP? I want to create a 2nd field mapping for the type of bacteria (SNOMED code above) to OBSERVATION.observation_concept_id, but the following attributes won’t fit into one row of the OBSERVATION table:
- “status” of the bacteria. The 2 values are detected & not detected.
- morphology of the bacteria. Examples of string terms (no associated code) = branching, chains, mucoid. Is this necessary?
- oxidase status – test to help identify bacteria. Only 2 string terms = positive or negative. Is this necessary?
Then I have the susceptibility data for the above bacteria:
- drug susceptibility test/susceptibility represented as a LOINC code. Examples of string terms for the code = Aztreonam [Susceptibility], Ampicillin+Sulbactam [Susceptibility], Vancomycin [Susceptibility]. This will map to the MEASUREMENT.measurement_concept_id
- RxNorm drug code. This duplicates the line above: there are RxNorm codes for the drug being tested for bacterial susceptibility. Since the above column is populated, we don’t need to include these RxNorm codes, but it would be nice to find a way to represent the drug codes NOT as Drug Exposures but in relation to the bacterial susceptibility, for cases where this is the only susceptibility data provided by the source
- susceptibility data. The two values = Susceptible & resistant. This will map to the MEASUREMENT.value_as_concept_id
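To make the susceptibility mapping above concrete, here is a rough Python sketch of how one source record could become one MEASUREMENT-style row. The source field names (`loinc_code`, `result`), the function name, and the lookup dictionaries are all made up for illustration, and the concept IDs in the usage lines are placeholders, not real vocabulary IDs. It assumes the usual convention that concept_id 0 means "no matching concept".

```python
from typing import Optional


def to_measurement_row(source: dict, test_lookup: dict, result_lookup: dict) -> dict:
    """Map one source susceptibility record to MEASUREMENT-style columns.

    test_lookup:   source LOINC code -> standard concept_id (hypothetical lookup)
    result_lookup: lower-cased result string -> standard concept_id (hypothetical lookup)
    """
    result: Optional[str] = source.get("result")
    return {
        # 0 = no matching standard concept found
        "measurement_concept_id": test_lookup.get(source["loinc_code"], 0),
        "measurement_source_value": source["loinc_code"],
        # leave NULL when the source has no result at all
        "value_as_concept_id": result_lookup.get(result.lower(), 0) if result else None,
        "value_source_value": result,
    }


# Usage with placeholder IDs (NOT real OMOP concept_ids)
row = to_measurement_row(
    {"loinc_code": "SRC-VANCO", "result": "Resistant"},
    test_lookup={"SRC-VANCO": 900001},
    result_lookup={"susceptible": 900002, "resistant": 900003},
)
```

Keeping the source values alongside the mapped IDs means unmapped records (concept_id 0) can still be reviewed later.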
Then there are data that I am not sure about. Are they useful? Are they duplicative of other data enumerated above? Do they even fit in the CDM without some ugly use of the Observation table?:
- colony operator for the below fields. Values include <, >, and ~
- colony count. This is a numeric value and is populated when the colony operator = < or >
- colony count high and colony count low. These are numeric values and are populated when the colony operator = ~
- beta-lactamase test susceptibility is represented as a LOINC code
- beta-lactamase test susceptibility result is represented as “positive” or “negative”
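For the colony fields, the rules described above (count is populated for < or >, low/high are populated for ~) could be normalized along these lines before deciding where they go in the CDM. This is only a sketch: the function name and the output keys are my own, and whether these values belong in MEASUREMENT.operator_concept_id / value_as_number or somewhere else is exactly the open question.

```python
from typing import Optional


def normalize_colony_count(
    operator: Optional[str],
    count: Optional[float],
    low: Optional[float],
    high: Optional[float],
) -> dict:
    """Turn the source operator/count/low/high fields into one consistent shape."""
    if operator in ("<", ">"):
        # a single bounded value: count must be present
        if count is None:
            raise ValueError("colony count is required when operator is < or >")
        return {"operator": operator, "value": count, "low": None, "high": None}
    if operator == "~":
        # an approximate range: both bounds must be present
        if low is None or high is None:
            raise ValueError("colony count low and high are required when operator is ~")
        return {"operator": operator, "value": None, "low": low, "high": high}
    raise ValueError(f"unexpected colony operator: {operator!r}")


# Usage
bounded = normalize_colony_count(">", 100000, None, None)
ranged = normalize_colony_count("~", None, 10000, 50000)
```

Raising on unexpected operators is a deliberate choice here so that source values outside <, >, and ~ surface during ETL instead of being silently dropped.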
All the records will need to be linked via the Fact Relationship table using standard relationship_id concepts. However, I need to correctly field map the above before petitioning the Themis & vocabulary team to add more standard concepts
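As a sketch of that linking step, the helper below builds FACT_RELATIONSHIP-style rows for a pair of facts, one row per direction, on the assumption that relationships are stored symmetrically. The column names follow the FACT_RELATIONSHIP table; the function name and every ID in the usage lines are placeholders I made up, since the actual domain and relationship concepts are what still needs to be agreed.

```python
def link_facts(
    domain_1: int,
    fact_id_1: int,
    domain_2: int,
    fact_id_2: int,
    relationship: int,
    reverse_relationship: int,
) -> list:
    """Build two FACT_RELATIONSHIP-style rows linking fact 1 and fact 2, one per direction."""
    return [
        {
            "domain_concept_id_1": domain_1,
            "fact_id_1": fact_id_1,
            "domain_concept_id_2": domain_2,
            "fact_id_2": fact_id_2,
            "relationship_concept_id": relationship,
        },
        {
            # same pair, reversed, with the inverse relationship
            "domain_concept_id_1": domain_2,
            "fact_id_1": fact_id_2,
            "domain_concept_id_2": domain_1,
            "fact_id_2": fact_id_1,
            "relationship_concept_id": reverse_relationship,
        },
    ]


# Usage with placeholder IDs (NOT real OMOP concept_ids):
# link a specimen record (id 10) to the culture measurement taken from it (id 20)
rows = link_facts(
    domain_1=900010, fact_id_1=10,
    domain_2=900020, fact_id_2=20,
    relationship=900030, reverse_relationship=900031,
)
```

The same helper could be called again to link the organism measurement to each of its susceptibility measurements.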
I’ll take a look at @cukarthik’s proposal, too. But this is needed now.
Thoughts? @nzvyagina @Christian_Reich @Dymshyts