Where to store derived data?

(Robert Miller) #1


We are working on a GIS extension to OHDSI and are trying to figure out where derived variables should go with respect to the CDM.

One of our goals is for users to be able to derive new variables using their own location data and external sources. A simple use case would be “distance to closest park” for each location. Another would be “how many supermarkets are within a 10-mile radius”. In a sense, these seem to be observations about locations. A more complicated scenario would be “what was the travel time from residence to care site for this visit”.
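For illustration, the first two use cases could be sketched roughly as follows. This is a minimal sketch, assuming locations and places are plain (latitude, longitude) pairs in degrees and that straight-line (great-circle) distance is good enough. The function names and the Earth radius constant are hypothetical and not part of any OHDSI tool.

```python
import math

# Mean Earth radius in miles (an approximation; assumption for this sketch).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def distance_to_closest(location, places):
    """Distance from one location to the nearest of several places."""
    return min(haversine_miles(location[0], location[1], p[0], p[1])
               for p in places)


def count_within(location, places, radius_miles):
    """Number of places within radius_miles of the location."""
    return sum(
        1 for p in places
        if haversine_miles(location[0], location[1], p[0], p[1]) <= radius_miles
    )
```

The travel-time scenario would need a routing service or road network, so it is not covered by this sketch.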

Should these exist outside of the CDM? Is there a standard way of incorporating external data?

(Christian Reich) #2


All good questions. Generally, the rule is that any type of data that has not been modeled so far goes into the OBSERVATION table. There, you can put in anything you want. In this case, it would be records where observation_concept_id holds a Concept that stands for “Longitude of home” or “Latitude of home”, with the coordinate in value_as_number. The Concept is either already available (usually in SNOMED, which has a lot of things), or we’d create it.

If you want to model it, you’d add, e.g., two fields, longitude_of_home and latitude_of_home, to the person table. That way you get it fully normalized. All other variables, like travel time, are outside the CDM, since they have nothing to do with the patient’s data.
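As a rough sketch of the OBSERVATION approach, the snippet below stores home coordinates as two observation records in an in-memory SQLite database. The table is heavily simplified (a real OBSERVATION table has more columns), and both concept IDs are hypothetical placeholders; real ones would come from the vocabulary or be created.

```python
import sqlite3

# Hypothetical placeholder concept IDs, not real vocabulary entries.
LATITUDE_OF_HOME_CONCEPT_ID = 2000000001
LONGITUDE_OF_HOME_CONCEPT_ID = 2000000002


def make_observation_table(conn):
    """Create a simplified stand-in for the CDM OBSERVATION table."""
    conn.execute(
        """
        CREATE TABLE observation (
            observation_id          INTEGER PRIMARY KEY,
            person_id               INTEGER NOT NULL,
            observation_concept_id  INTEGER NOT NULL,
            observation_date        TEXT    NOT NULL,
            value_as_number         REAL
        )
        """
    )


def store_home_coordinates(conn, person_id, observation_date, latitude, longitude):
    """Insert one observation record per coordinate."""
    conn.executemany(
        "INSERT INTO observation "
        "(person_id, observation_concept_id, observation_date, value_as_number) "
        "VALUES (?, ?, ?, ?)",
        [
            (person_id, LATITUDE_OF_HOME_CONCEPT_ID, observation_date, latitude),
            (person_id, LONGITUDE_OF_HOME_CONCEPT_ID, observation_date, longitude),
        ],
    )


conn = sqlite3.connect(":memory:")
make_observation_table(conn)
store_home_coordinates(conn, 1, "2017-01-01", 40.7128, -74.006)

rows = conn.execute(
    "SELECT observation_concept_id, value_as_number "
    "FROM observation WHERE person_id = ? ORDER BY observation_id",
    (1,),
).fetchall()
```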

(Robert Miller) #3

I understand that the CDM is person-centric but it seems odd to me that coordinates would belong in the person table. During development we have kept our coordinates in an altered version of the location table (latitude and longitude columns).

Let’s say we wanted to store the concept of “distance to closest park” for all locations, including both residences and care sites. In this case, it seems more intuitive to relate the attribute to the location entity itself.

Are there standard ways of attributing facts to entities other than person? Locations, care sites, providers, etc.

(Christian Reich) #4


If there is a use case, we can bring them in. People have mentioned that before, so I wouldn’t be too shocked by it.


Put them into those tables. Everything else is patient-centric, as you said.

When you are done, do you want to make an official proposal to the WG in the proposal section?

(Vikram) #5

Hi @Christian_Reich

I was about to post a query when the suggestions showed me this question. It seems to address my issue, though I need a clarification.

As Christian mentioned in the post, any unmodeled data goes into the OBSERVATION table. I need to store the cause of an adverse event.
My understanding is I would make an entry in the OBSERVATION table with the “observation_type_concept_id” column being the adverse event concept and the value being stored in the “value_as_concept_id” column.

This is for storing the adverse event related to the study.
Any suggestions are welcome.


(Christian Reich) #6


This might not be the right discussion panel here, but you probably want to go about this slightly differently:

  • If your cause is already a record in the database, like a DRUG_EXPOSURE or DEVICE_EXPOSURE record, you want to store the adverse event in CONDITION_OCCURRENCE and link the two through a FACT_RELATIONSHIP record. So, no Observation.
  • If your cause is not part of the data you still want to keep the adverse event in CONDITION_OCCURRENCE but link it to a record in OBSERVATION where observation_concept_id (not observation_type_concept_id) reflects the cause.
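The first case above could be sketched as follows, using an in-memory SQLite database. The tables are reduced to a few columns, and the domain and relationship concept IDs are hypothetical placeholders rather than real vocabulary values. The second case would follow the same pattern, with an OBSERVATION record in place of the DRUG_EXPOSURE record.

```python
import sqlite3

# Hypothetical placeholder concept IDs; real values would come from the
# vocabulary tables.
CONDITION_DOMAIN_CONCEPT_ID = 101
DRUG_DOMAIN_CONCEPT_ID = 102
CAUSED_BY_RELATIONSHIP_CONCEPT_ID = 201

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE drug_exposure (
        drug_exposure_id INTEGER PRIMARY KEY,
        person_id        INTEGER NOT NULL,
        drug_concept_id  INTEGER NOT NULL
    );
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id               INTEGER NOT NULL,
        condition_concept_id    INTEGER NOT NULL
    );
    CREATE TABLE fact_relationship (
        domain_concept_id_1     INTEGER NOT NULL,
        fact_id_1               INTEGER NOT NULL,
        domain_concept_id_2     INTEGER NOT NULL,
        fact_id_2               INTEGER NOT NULL,
        relationship_concept_id INTEGER NOT NULL
    );
    """
)


def link_facts(conn, domain_1, fact_id_1, domain_2, fact_id_2, relationship):
    """Record that fact 1 is related to fact 2 via the given relationship."""
    conn.execute(
        "INSERT INTO fact_relationship VALUES (?, ?, ?, ?, ?)",
        (domain_1, fact_id_1, domain_2, fact_id_2, relationship),
    )


# The suspected cause already exists as a DRUG_EXPOSURE record.
conn.execute("INSERT INTO drug_exposure VALUES (10, 1, 9001)")
# The adverse event itself goes into CONDITION_OCCURRENCE.
conn.execute("INSERT INTO condition_occurrence VALUES (20, 1, 9002)")
# Link the adverse event to its cause.
link_facts(
    conn,
    CONDITION_DOMAIN_CONCEPT_ID, 20,
    DRUG_DOMAIN_CONCEPT_ID, 10,
    CAUSED_BY_RELATIONSHIP_CONCEPT_ID,
)

# Look up the drug exposures linked to adverse event 20.
causes = conn.execute(
    """
    SELECT d.drug_exposure_id, d.drug_concept_id
    FROM fact_relationship fr
    JOIN drug_exposure d ON d.drug_exposure_id = fr.fact_id_2
    WHERE fr.domain_concept_id_1 = ?
      AND fr.fact_id_1 = ?
      AND fr.domain_concept_id_2 = ?
    """,
    (CONDITION_DOMAIN_CONCEPT_ID, 20, DRUG_DOMAIN_CONCEPT_ID),
).fetchall()
```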