
How to map source of infection in body structure

Dear experts,
In the field of microbial infections, I need to map the following information: focus of infection located in a body structure. That is, for example: focus of infection + in the Ear, nose and throat structure (concept_id = 3512886). In my opinion there should be some observation domain concept (observable entity class) to combine it with the body structure as “value_as_concept_id”.
Continuing with this idea, I am only able to locate the concept “source of infection” (concept_id = 4084383), which is in the observation domain BUT has the class “attribute”. Would it be correct to use this concept from a semantic point of view? I’m not sure about it. Could you suggest an alternative?
Thank you very much for your help.

Hello, @mpascualc. Your approach is correct. An alternative is a concept like Infection of ear, which belongs to the Condition domain in the SNOMED vocabulary and combines both “focus of infection” and “in the ear” in one concept.
concept_id 4044878 | concept_code 129127001 | Infection of ear | class: Clinical Finding | Standard | Valid | domain: Condition | vocabulary: SNOMED

Thank you very much for your reply.
Indeed, the proposal you make is the one I have finally adopted. Infection (disease) concepts provide the “Finding site” relationship, which links to the body structure that acts as the focus.
Best regards.


You can do that - technically. But you are not solving your problem of making those facts available for analytics, especially in the network, where the analyst tends not to see the data and can’t “find” those combinations.

Instead, facts in OMOP follow the Closed World assumption, which means they are taken from a list of predefined facts (concepts). A record with one of those concepts means something happened; otherwise it did not. If you want to record an ear infection you need to put ONE such record into the CONDITION table, nothing else. There are concepts for bacterial ear infection, but not for infection by individual bacterial species.

What are you trying to analyze and how?

Thanks for your advice.
I completely agree with you: the model and the terminologies should not be used as a “shoehorn”. There is no point in storing data if it is not going to be properly located later.
The work that I am telling you about revolves around research on infections caused by multi-resistant bacteria. The situation is classic: the doctors involved have proposed a set of variables/values for the dataset. It is intended to persist the data in an OMOP instance. I am reviewing the set of variables/values to get an adequate mapping of concepts and the proposal of consistent concept constructs (events) to be persisted in OMOP and that these events are “exploitable”.
One of the proposed variables/values refers to the location of the infection focus: infection focus (variable) and body locations (values). The first idea was to propose a MAPS_TO/MAPS_TO_VALUE structure based on an observation (“source of infection” 4084383, SNOMED 246083009) with concepts referring to body locations as values. However, although the concept name says what is intended to be recorded and the domain (observation) suits the strategy, an analysis of the hierarchy of the “source of infection” concept shows that it is not semantically adequate. Assuming that semantically interoperable instances (such as OMOP) must be explored through ontologies and their hierarchical and inter-hierarchical relationships, no one exploring this OMOP instance would be able to find this event. Therefore, I have proposed refocusing the variable along the lines that you (and another forum colleague) have suggested: recording the infection condition, including the infection site, as a single event in the condition table. In this way, the source of infection can later be reached as a body structure (the variable/value sought by the study) through the “finding site” relationship or, conversely, the related condition events can be retrieved starting from the body structures that act as sources of infection.
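The two directions of lookup described above can be sketched roughly as follows. This is only a minimal illustration using an in-memory SQLite database: the table and column names follow the OMOP CDM, the relationship name 'Has finding site' is an assumption about how the SNOMED “Finding site” attribute is labelled in the vocabulary tables, and the body-structure concept_id is a made-up placeholder, not a real concept.

```python
import sqlite3

# 4044878 ("Infection of ear") is quoted earlier in this thread.
INFECTION_OF_EAR = 4044878
# Placeholder only, NOT a real OMOP concept_id for an ear structure.
EAR_STRUCTURE = 9000001

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER
);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT
);
""")

# One single condition record per fact, as suggested in the thread.
conn.execute(
    "INSERT INTO condition_occurrence VALUES (1, 101, ?)",
    (INFECTION_OF_EAR,),
)
# Assumed relationship name linking the condition to its body structure.
conn.execute(
    "INSERT INTO concept_relationship VALUES (?, ?, 'Has finding site')",
    (INFECTION_OF_EAR, EAR_STRUCTURE),
)


def sites_for_person(conn, person_id):
    """From a person's condition records, reach the infection sites."""
    rows = conn.execute(
        """
        SELECT DISTINCT cr.concept_id_2
        FROM condition_occurrence co
        JOIN concept_relationship cr
          ON cr.concept_id_1 = co.condition_concept_id
         AND cr.relationship_id = 'Has finding site'
        WHERE co.person_id = ?
        """,
        (person_id,),
    ).fetchall()
    return [r[0] for r in rows]


def persons_with_site(conn, site_concept_id):
    """From a body structure, reach the persons with a related condition."""
    rows = conn.execute(
        """
        SELECT DISTINCT co.person_id
        FROM concept_relationship cr
        JOIN condition_occurrence co
          ON co.condition_concept_id = cr.concept_id_1
        WHERE cr.relationship_id = 'Has finding site'
          AND cr.concept_id_2 = ?
        """,
        (site_concept_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

A real query would probably also need the concept_ancestor table so that more specific conditions and more specific body structures are included; that step is left out here to keep the sketch short.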
It is necessary to make researchers understand that sometimes it is not possible to map their variables/values directly (semantically). If we want to work with semantically interoperable and reusable datasets, we eventually have to refocus these variables/values, and it will be in the exploitation phase (the use cases) that such variables/values have to be reconstructed from the datasets.
Thank you again.

I know where you are coming from. The traditional habit is to create a de-novo database with the “variables” that are needed for research, and then collect them. And then nothing happens, either because the collection turns out not to be feasible during normal care, or because the other clinical data cannot be obtained in the same way. And the OHDSI Vocabularies are way too frightening to even consider. But we could try:

If I understand correctly, you need concepts that combine:

  • The condition of a bacterial infection,
  • Its anatomical site,
  • The exact bacterial agent causing it.

Is that true? The first two typically are captured well, e.g. Bacterial otitis media. If you want the agent pre-coordinated, SNOMED becomes very spotty (Haemophilus influenzae otitis media and Tuberculous otitis media only).

What we could do is an OMOP Extension to SNOMED: permute all bacterial infections of some anatomy available with all the typical pathogenic agents, and make them descendant concepts. You could then purge this list to only those that actually happen in the real world (there is probably not a Yersinia infection of the middle ear).
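As a rough illustration of the permute-then-purge idea, the sketch below crosses a list of site-specific bacterial infections with a list of agents and drops the pairs that a curator has marked as implausible. All names and the exclusion list are hypothetical examples; real inputs would have to come from SNOMED and from clinical review, and the naming pattern for the new concepts is just a guess.

```python
from itertools import product

# Illustrative inputs only; real lists would be drawn from SNOMED.
site_infections = ["Bacterial otitis media", "Bacterial pneumonia"]
agents = ["Haemophilus influenzae", "Yersinia"]

# Hypothetical curation result: pairs judged not to occur in the real world.
implausible = {("Bacterial otitis media", "Yersinia")}


def candidate_extensions(site_infections, agents, implausible):
    """Return (parent, agent, proposed_name) for every plausible pair.

    Each result is a candidate descendant of `parent` in the extension.
    """
    return [
        (parent, agent, f"{parent} caused by {agent}")
        for parent, agent in product(site_infections, agents)
        if (parent, agent) not in implausible
    ]


candidates = candidate_extensions(site_infections, agents, implausible)
for parent, agent, name in candidates:
    print(f"{name}  (child of: {parent})")
```

Keeping the parent in each tuple matters because the new concepts are meant to be descendants of the existing site-specific infection, so that queries on the parent still find them.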

Want to give it a try?

Thanks for your comments.
I have also worked on transformation scenarios to OMOP-CDM of existing databases for secondary uses based on “ad-hoc” data models; in general, these scenarios are more complex since it is often impossible to undo ambiguities.
When creating new databases based on semantically interoperable models and common schemas such as OMOP, it is both possible and essential to make the maximum effort to persist the data properly: mainly by refocusing the variables/values according to the terminologies/ontologies and the persistence schemes, keeping in mind the feasibility of exploitation, and doing all this before the data are collected from the sources. It also requires high doses of pedagogy to manage the powerful inertia that has traditionally guided this work.
The concepts that I need to combine are only two: the bacterial infection and the anatomical site; however, the proposal you are making is very interesting and could significantly increase the possibilities of exploiting the data.
As you point out, in SNOMED there are few concepts in which the three domains (bacterial infection, anatomical site and causative bacteria) are pre-coordinated. However, I thought that an approach based on ontological relations could offer an additional possibility.
I have done a basic analysis, using ECL expressions, of whether “causative agent” relationships exist in the bacterial infection concepts:

  1. Infectious diseases that link a non-generic causative agent (bacteria domain):
    << 87628006 | Bacterial infectious disease (disorder) | :
    246075003 | Causative agent (attribute) | = < 409822003 | Domain Bacteria (organism) |

2056 results

  2. Infectious diseases that link a non-generic causative agent (bacteria domain) and an anatomical site:
    << 87628006 | Bacterial infectious disease (disorder) | :
    246075003 | Causative agent (attribute) | = < 409822003 | Domain Bacteria (organism) | AND
    363698007 | Finding site (attribute) | = << 91723000 | Anatomical structure (body structure) |

1440 results

  3. Infectious diseases that link a non-generic causative agent (bacteria domain) and a specific anatomical site (lung):
    << 87628006 | Bacterial infectious disease (disorder) | :
    246075003 | Causative agent (attribute) | = < 409822003 | Domain Bacteria (organism) | AND
    363698007 | Finding site (attribute) | = << 39607008 | Lung structure (body structure) |

107 results

  4. Infectious diseases that link a non-generic causative agent (bacteria domain) and a specific structure of the lung:
    << 87628006 | Bacterial infectious disease (disorder) | :
    246075003 | Causative agent (attribute) | = < 409822003 | Domain Bacteria (organism) | AND
    363698007 | Finding site (attribute) | = << 113254000 | Structure of interstitial tissue of lung (body structure) |

4 results
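
The expressions above differ only in the finding-site refinement, so they could be generated from one small helper. The function below is a hypothetical convenience, not part of any SNOMED tooling; it only builds the ECL text and does not run it against a terminology server.

```python
def bacterial_infection_ecl(finding_site=None):
    """Build the ECL text used in this post.

    finding_site: optional (sctid, term) tuple; when given, a
    "Finding site" refinement over that concept and its descendants
    is appended with AND.
    """
    ecl = (
        "<< 87628006 |Bacterial infectious disease (disorder)| : "
        "246075003 |Causative agent (attribute)| = "
        "< 409822003 |Domain Bacteria (organism)|"
    )
    if finding_site is not None:
        sctid, term = finding_site
        ecl += (
            " AND 363698007 |Finding site (attribute)| = "
            f"<< {sctid} |{term}|"
        )
    return ecl


# The lung query, rebuilt from the helper.
lung_query = bacterial_infection_ecl(
    ("39607008", "Lung structure (body structure)")
)
print(lung_query)
```

Calling the helper with no argument gives the first expression, and passing another body structure identifier and term gives the narrower variants.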

I have the impression that the scope SNOMED offers for linking bacterial infection, anatomical site and causative agent, although more extensive than pre-coordination, is still too limited to be approached through ontological relationships.

Another observation is that the related organisms (causative agent) refer mostly to families of bacteria. This is probably the most correct way to characterize the causative agents of infection, but it limits the ability to link the infection with a specific bacterium detected in the culture.

In this situation, your proposal of an OMOP extension to SNOMED in this field deserves reflection. I will discuss it with my colleagues and give it a try. It would bring benefits not only in mapping events but also in exploitation.

Thanks again for your comments.