This is one of the longest-running topics on the forum, and I think what we lack in order to solve it is a good use case. I’m actually surprised @Christian_Reich didn’t ask for one before.
So maybe someone has a good use case for how they would like to analyze microbiology data? What research questions do you want to ask?
This will help us to shape the data in the OMOP CDM.
It’s pretty much any ID question. What are the outcomes of patients with MRSA? Does one antibiotic work better than another? And the list goes on, including characterization, comparative effectiveness and quality measures.
I think All of Us came up with a model for the microbiology data they use, but I don’t know the details.
Is there an update or decision on best practice regarding how to capture microbiology data?
Microbiology data in general, or the specific question of germ sensitivity to antibiotics? The former never was a problem, the latter needs finishing.
Both.
That means, linking the records of the specimen table and measurement table via the fact relationship table is currently the way to go for samples/specimen and isolates/measurements until an updated version of the CDM includes the possibility for a more direct relationship.
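For anyone who wants to see what that linking looks like in practice, here is a minimal sketch of building the FACT_RELATIONSHIP rows for one specimen and one measurement. The column names follow the CDM; the function name is made up, and all concept IDs are passed in as arguments because the domain and relationship concept IDs have to be looked up in your own vocabulary. It emits the link in both directions, which as far as I understand is the usual convention for this table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FactRelationship:
    """One row of FACT_RELATIONSHIP (column names follow the CDM)."""
    domain_concept_id_1: int
    fact_id_1: int
    domain_concept_id_2: int
    fact_id_2: int
    relationship_concept_id: int


def link_specimen_and_measurement(
    specimen_id: int,
    measurement_id: int,
    specimen_domain_concept_id: int,          # assumption: look up in your vocabulary
    measurement_domain_concept_id: int,       # assumption: look up in your vocabulary
    specimen_to_measurement_concept_id: int,  # assumption: relationship concept you chose
    measurement_to_specimen_concept_id: int,  # assumption: its reverse relationship
) -> List[FactRelationship]:
    """Return the two mirrored rows that connect a specimen and a measurement."""
    return [
        FactRelationship(
            specimen_domain_concept_id, specimen_id,
            measurement_domain_concept_id, measurement_id,
            specimen_to_measurement_concept_id,
        ),
        FactRelationship(
            measurement_domain_concept_id, measurement_id,
            specimen_domain_concept_id, specimen_id,
            measurement_to_specimen_concept_id,
        ),
    ]
```

The same helper could be reused to connect an organism record to a susceptibility record, again with relationship concepts of your choosing.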
Where will the decision for the Antibiogram be published? In the documentation of the CDM?
In CDM v5.4, both the Observation and Measurement tables have had columns added that allow an Observation or Measurement record to link back to a record in another table. For Measurement these are:
meas_event_field_concept_id: Put the CONCEPT_ID that identifies which table and field the MEASUREMENT_EVENT_ID came from.
measurement_event_id: If the Measurement record is related to another record in the database, this field is the primary key of the linked record.
This is the more direct relationship.
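To make the mechanics concrete, here is a small sketch of how an analyst might follow that link from a measurement to its specimen. Rows are plain dicts keyed by CDM column names; `linked_specimen` and `specimens_by_id` are hypothetical names, and the concept ID that stands for the specimen_id field is passed in rather than hard-coded.

```python
from typing import Dict, Optional


def linked_specimen(
    measurement: dict,
    specimens_by_id: Dict[int, dict],
    specimen_field_concept_id: int,  # concept that identifies specimen.specimen_id
) -> Optional[dict]:
    """Return the specimen row a measurement points to, or None if there is none."""
    # The field concept tells us which table measurement_event_id refers to.
    if measurement.get("meas_event_field_concept_id") != specimen_field_concept_id:
        return None  # no link, or the link targets a different table
    return specimens_by_id.get(measurement.get("measurement_event_id"))
```

Checking the field concept first matters because measurement_event_id on its own is just a number and could be the key of any table.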
Hello,
I am relatively new to the OHDSI community and found this thread regarding microbiology susceptibility. Our group is working on a phenotype that will rely on bacterial cultures and antibiotic susceptibility test results (MIC values). Is there a defined convention for this in OMOP v5.3.1 and 5.6?
Thank you for any assistance or updates on this matter. I greatly appreciate it.
Kind regards,
Alyssa
Hello @ARBeck and welcome to OHDSI!
FYI: There isn’t a v5.6 OMOP CDM. Most of the OHDSI community is on version 5.4 of the CDM with some sites still on v5.3.1.
The most recent discussion I see, on the CDM GitHub from October 2021, states that pre-coordinated concepts will be added to the vocabulary. However, up above in this thread, in June 2023, @Christian_Reich states that it still needs finishing.
@Christian_Reich @clairblacketer @cukarthik Where does this stand? If it’s finished, we (CDM WG or Themis WG) should write up some guidance for ETLers and researchers.
I am also relatively new to OHDSI and interested in the current status of microbiology within the CDM. From this thread it appears there are several solutions over time which might be useful for a more standardized implementation.
Hello, is there an update on the domain question regarding how/where to store organisms? I couldn’t find any further information or conclusion, and I have the same issue concerning the Meas value domain of organisms.
My problem is that less than 20% of the organisms present in our data can be mapped to the Meas Value domain. The rest have no match in this domain but do have one in the Observation domain. It would also be possible to map all the organisms that I mapped to the Meas Value domain to the Observation domain instead. In other words, the vocabulary used for the Meas Value domain (LOINC) is far from complete, but the vocabulary used for the Observation domain (SNOMED) is almost complete.
Having the data mapped to two different vocabularies does not seem appropriate when I could just map everything to SNOMED which is in the observation domain.
Now I am wondering what is the preferred way to handle this situation.
- Using the measurement table, mapping everything to the observation domain
- Using the observation table, mapping everything to the observation domain
- Using the measurement table, mapping as much as possible to the Meas Value domain and the rest to the observation domain and having mixed terminologies for the organisms
- Split the data in the measurement and observation tables based on the domain to which the data can be mapped
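If it helps others compare these options on their own data, here is a rough sketch of how the coverage numbers could be computed. It assumes you have already collected, for each source organism, the set of domains in which a standard concept was found; the organism names and domain sets in the usage note are made up for illustration and are not our real data.

```python
from typing import Dict, Set


def domain_coverage(candidates: Dict[str, Set[str]], domain: str) -> float:
    """Fraction of source organisms with at least one standard concept in `domain`."""
    if not candidates:
        return 0.0
    hits = sum(1 for domains in candidates.values() if domain in domains)
    return hits / len(candidates)


def pick_domain(domains: Set[str]) -> str:
    """Third option above: prefer Meas Value, otherwise fall back to Observation."""
    return "Meas Value" if "Meas Value" in domains else "Observation"
```

With a toy input such as `{"Organism A": {"Meas Value", "Observation"}, "Organism B": {"Observation"}}`, `domain_coverage(..., "Meas Value")` gives the share that the Meas Value option can handle, and `pick_domain` shows where each organism would end up under the mixed approach.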
Thank you very much for your help and opinion on this topic! I am looking forward to learning more about the approaches and mindset.
Excellent questions, @HeideNei!
I am pinging @Christian_Reich @clairblacketer @cukarthik to help answer the question for you.
Also, in addition to the above question, @Christian_Reich, should @HeideNei submit the request for domain change to the Vocab team’s community contribution? How should they proceed?
A short answer is what @willgarneau pointed to. There are different approaches (as you can see by the length of this thread) to microbiology data in the community: creating an extension table, mapping organisms to Meas Value organisms, mapping test + organism to pre-coordinated terms (note that it not only requires new concepts but also ETL work to combine fields) and more.
A very pragmatic solution is to implement something you can use to enable your research right here and right now.
A larger scale proper solution (I think, based on what I read so far) is to move forward with the proposal for pre-coordinated susceptibility testing + organism terms as per this GitHub. This requires some resources/time/funding/prioritization to make vocabulary changes happen.
I don’t see a problem in using the measurement table as it is right now, as I only want to store organisms and no additional information like germ sensitivity to antibiotics, for example. The only problem I have is a vocabulary/domain problem.
As the measurement table is currently defined (v5.4), the detected organism would be included in the value_as_concept_id attribute. This attribute has the restriction that the categorical value should be mapped to a standard concept in the ‘Meas Value’ domain. However, only a few organisms can be found in this domain. On the other hand, all the organisms can be found in the Observation domain.
I am wondering what the standard procedure is for this case.
Friends:
Sounds like somebody, or any of you, might want to step up. This keeps coming back every now and then.
A couple points:
- Put the organisms into the Measurement, not the Meas Value. So, “Streptococcus in blood culture” - “detected”/“not detected”, rather than “Blood culture” - “streptococcus” (which I believe is what you have in mind, @HeideNei). Reason: The analyst should find what’s important in one place and not have to search around.
- Same is true for the antibiotic sensitivity: “Sensitivity to Vancomycin” - “Sensitive”/“Not sensitive”
- There is a question of whether you even need negative results, or whether positive blood cultures and sensitivities are sufficient for the use cases. I would always err on the side of simplicity.
- Leaves us with the question of how to combine them, i.e. that the strep is sensitive to vancomycin. One way is to use FACT_RELATIONSHIP to connect the otherwise two independent records. Alternatively, you create all meaningful combinations, like “Streptococcus sensitive to vancomycin”. Shouldn’t be that many. I think @cukarthik might have such a list as a starting point.
Once that is ready, we can add it to the Vocabularies and maybe put some Themis rules in place for how to do that.
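As a starting point for whoever picks this up, the "all meaningful combinations" idea could be sketched roughly like this. The function name and the `is_meaningful` filter are hypothetical; which organism and antibiotic pairs actually make sense is a clinical decision that the code does not try to answer, so by default it keeps every pair.

```python
from itertools import product
from typing import Callable, Iterable, List


def precoordinated_names(
    organisms: Iterable[str],
    antibiotics: Iterable[str],
    is_meaningful: Callable[[str, str], bool] = lambda organism, antibiotic: True,
) -> List[str]:
    """Build candidate names such as 'Streptococcus sensitive to vancomycin'."""
    return [
        f"{organism} sensitive to {antibiotic}"
        for organism, antibiotic in product(organisms, antibiotics)
        if is_meaningful(organism, antibiotic)  # placeholder for expert curation
    ]
```

The size of the list is at most the number of organisms times the number of antibiotics, so a curated filter is what keeps it manageable.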
Yes, what I had in mind was “Microorganism identified in Specimen by Culture”-“Candida catenulata”, for example.
For my use case, the antibiotics are fortunately not important at the moment.
I don’t yet understand why I should put the organism into the measurement.
Firstly, this doesn’t seem to make the mapping easier. From an initial quick search on Athena, I get the impression that there are even fewer organisms as Measurement than as Meas Value. I couldn’t find your “Streptococcus in blood culture” either. Or did you mean that we should create all these terms and add them to the vocabulary?
Secondly, this differs from the microbiology results that a lab would report. So even if the lab uses a standardized terminology, we would always have to do a mapping.
Thirdly, it makes querying the data harder later on, as the blood cultures no longer share a common concept.
Precisely. The current situation is confusing.
I don’t think labs report anything in a standard fashion except through LOINC. And LOINC has the organisms in the measurement, e.g. Streptococcus pneumoniae DNA [Presence] by NAA with non-probe detection in Positive blood culture. The result is either Detected or Not detected.
And that’s how it should be. The Measurement represents the specific tests, the value is either numerical+unit or a categorical yes/no.
Remember: There are all sorts of ways to test for strep: agglutination, antibody presence, PCR, etc. They don’t easily allow the reversal you suggest: PCR?-Strep/Staph/Meningo etc. They are all different tests of their own kind.
So, in EAV whether we say “Blood culture”?-“Strep” or “Strep in blood culture?”-“Yes” makes no substantial difference. We should just stick to the existing convention.
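To illustrate the point about the two EAV shapes, here is a toy sketch that rewrites a positive “Blood culture” - “Strep” row into the “Strep in blood culture” - “Detected” shape and asks the same question of both. The row layout (dicts with `measurement` and `value` keys) and the function names are made up for the example; real data would use concept IDs, not strings.

```python
from typing import Dict, List

Row = Dict[str, str]


def to_organism_specific(rows: List[Row]) -> List[Row]:
    """Turn (test, organism) rows into (organism-specific test, 'Detected') rows."""
    return [
        {
            "measurement": f"{row['value']} in {row['measurement'].lower()}",
            "value": "Detected",
        }
        for row in rows
    ]


def positive_generic(rows: List[Row], test: str, organism: str) -> bool:
    """Was `organism` reported as the value of the generic `test`?"""
    return any(r["measurement"] == test and r["value"] == organism for r in rows)


def positive_specific(rows: List[Row], test: str) -> bool:
    """Was the organism-specific `test` reported as 'Detected'?"""
    return any(r["measurement"] == test and r["value"] == "Detected" for r in rows)
```

Both query functions return the same answer for the same underlying fact; only the place where the organism lives changes.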
Hello @HeideNei ,
We are now facing the exact issue that you have described above. I was wondering if you had adopted an approach for how to proceed, and how that is working?
Thanks so much
At the moment, I am using the following approach. However, this is not a convention that has already been decided on.
measurement_concept_id → microorganism (SNOMED, observation domain, organism class)
value_as_concept_id → Detected/Not detected (SNOMED, Meas Value domain, Qualifier Value class)
measurement_event_id → link to corresponding specimen
meas_event_field_concept_id → 1147049 specimen.specimen_id
What are your findings and thoughts on this?
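In case it is useful, the approach above could be sketched in ETL code roughly as follows. This is only an illustration, not an agreed convention: `organism_measurement` is a made-up helper, the organism and result concept IDs are passed in by the caller, and the row is deliberately partial (dates, type concept and other required MEASUREMENT columns are left out). The only hard-coded value is 1147049, which is the specimen.specimen_id field concept mentioned above.

```python
# Field concept for specimen.specimen_id, as given in the post above.
SPECIMEN_ID_FIELD_CONCEPT_ID = 1147049


def organism_measurement(
    measurement_id: int,
    person_id: int,
    organism_concept_id: int,  # SNOMED organism concept, supplied by the caller
    result_concept_id: int,    # "Detected" / "Not detected" concept, supplied by the caller
    specimen_id: int,
) -> dict:
    """Build a partial MEASUREMENT row for one organism found in one specimen."""
    return {
        "measurement_id": measurement_id,
        "person_id": person_id,
        "measurement_concept_id": organism_concept_id,
        "value_as_concept_id": result_concept_id,
        # Link back to the specimen the organism was isolated from.
        "measurement_event_id": specimen_id,
        "meas_event_field_concept_id": SPECIMEN_ID_FIELD_CONCEPT_ID,
    }
```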
Sorry for the delayed response - thank you so much for sharing your approach!
We’ve had a bit of a pause in approaching this area, but we will definitely take your approach as a good starting guideline, thank you!