
Limitations of ATHENA for Food Allergy Data Commons

We are building a data commons for food allergy research. The current terminologies (ICD, SNOMED, DO) have significant gaps in this area, and we are finding that the normal diagnosis tree does not fit food allergy well. We probably don’t want to create a hundred more food allergy diagnoses. Rather, we would model food allergy as a single food allergy concept paired with the trigger compound. See the early draft: Food Allergy Tree Lucidchart.pdf (151.4 KB). We also cannot find any terminology that distinguishes foods at a fine enough level of detail. For example, “egg cooked into a matrix” is immunologically different from “incompletely cooked egg”.

We could use some advice on OMOP ontology extensions to handle food allergies. Has anyone had to deal with anything similar in the past and could offer some advice?

@wlodarmt

The usual way to model a food allergy is to record it in the Observation table using concept_id 4188027 (Allergy to food), and then put the actual food into either the value_as_string or the value_as_concept_id field. If a concept_id exists for the specific food, e.g., 4026741 (Boiled egg), put it into the value_as_concept_id column. If no concept_id exists, put the food name, e.g., “egg cooked into a matrix”, into the value_as_string column.
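
As a rough illustration of the mapping described above, here is a minimal Python sketch. The two concept_ids are the ones quoted in this post; the `FOOD_CONCEPTS` lookup, the `ObservationRow` class, and the `food_allergy_observation` function are hypothetical names made up for the example, and a real ETL would look foods up in the vocabulary tables and populate the remaining Observation columns (dates, type concept, and so on).

```python
from dataclasses import dataclass
from typing import Optional

# Concept quoted in the post: "Allergy to food".
ALLERGY_TO_FOOD_CONCEPT_ID = 4188027

# Hypothetical in-memory lookup from food name to concept_id.
# Only the boiled-egg entry comes from the thread; a real pipeline
# would query the vocabulary instead of hard-coding a dict.
FOOD_CONCEPTS = {"boiled egg": 4026741}


@dataclass
class ObservationRow:
    """A simplified subset of the Observation table columns."""
    person_id: int
    observation_concept_id: int
    value_as_concept_id: Optional[int]
    value_as_string: Optional[str]


def food_allergy_observation(person_id: int, food: str) -> ObservationRow:
    """Build an Observation row for one food allergy.

    If the food has a known concept_id, store it in value_as_concept_id;
    otherwise fall back to the free-text name in value_as_string.
    """
    concept_id = FOOD_CONCEPTS.get(food.strip().lower())
    return ObservationRow(
        person_id=person_id,
        observation_concept_id=ALLERGY_TO_FOOD_CONCEPT_ID,
        value_as_concept_id=concept_id,
        value_as_string=None if concept_id is not None else food,
    )


mapped = food_allergy_observation(1, "Boiled egg")
unmapped = food_allergy_observation(1, "egg cooked into a matrix")
```

In this sketch `mapped` carries the concept in value_as_concept_id, while `unmapped` keeps the free-text food in value_as_string, which is where the fine-grained distinctions from the original question would end up until matching concepts exist.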

Not sure if this answers your question.

@wlodarmt and Justin:

Your structure is not bad: you are distinguishing the response from the actual allergens. However, you seem to be mixing together a number of different dimensions:

  • Types of intolerance (hypersensitivity involving the immune system vs. non-immune-mediated intolerance, such as lactose intolerance)
  • Molecular mechanisms of hypersensitivity (IgE-mediated, eosinophilic)
  • Allergic disposition (what people mean when they say they are allergic) and allergic disorder (dermatitis, asthma, etc.)

So, SNOMED is actually not that bad. It lays it all out somewhat cleanly. The actual allergens, if known, can be described as @QI_omop does above.

But the real question is: What is the use case? What questions do you want to answer? That should determine the representation of the data.
