
How to store date of onset and date of diagnosis for a disease?


Hello world!

My example is the following:

Disease: Multiple Sclerosis
I have a documented date of onset and a date of diagnosis, and the two differ. As I understand the OMOP CDM, the first is stored as an Observation and the second as a CONDITION_OCCURRENCE record. So far so good.

When I want to insert data into the OBSERVATION table, I’m not sure whether observation_concept_id is the concept_id for “date of onset” (4181873) OR the concept_id for “Multiple Sclerosis” (374919).

What would the correct mapping be? Right now I’m working with CDM v5.2.2.

Thanks a lot!

(Qi Yang) #2

The concept_id for “Multiple Sclerosis” (374919) belongs to the Condition domain, so it should not be put into the OBSERVATION table. The concept_id for “date of onset” (4181873) belongs to the Observation domain and is therefore more appropriate. When this observation record is created, a link between the diagnosis record in the CONDITION_OCCURRENCE table and the date-of-onset record in the OBSERVATION table should be established via the FACT_RELATIONSHIP table, so that people know which disease the date of onset refers to.
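A minimal sketch of that linkage, using Python's built-in sqlite3 with heavily reduced versions of the three tables. The person, record ids and dates are made up. The domain concept ids 19 (Condition) and 27 (Observation) are my assumption and should be checked against your vocabulary, and the relationship concept is left as a placeholder because the thread does not name one.

```python
import sqlite3

# Assumed values, not taken from the thread: verify them in your vocabulary.
CONDITION_DOMAIN_CONCEPT_ID = 19    # assumed concept_id of the Condition domain
OBSERVATION_DOMAIN_CONCEPT_ID = 27  # assumed concept_id of the Observation domain
RELATIONSHIP_CONCEPT_ID = 0         # placeholder: choose a real relationship concept

conn = sqlite3.connect(":memory:")
# Reduced column sets for illustration only; the real CDM tables have more fields.
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    observation_concept_id INTEGER,
    observation_date TEXT
);
CREATE TABLE fact_relationship (
    domain_concept_id_1 INTEGER,
    fact_id_1 INTEGER,
    domain_concept_id_2 INTEGER,
    fact_id_2 INTEGER,
    relationship_concept_id INTEGER
);
""")

# Diagnosis record: Multiple Sclerosis (374919), hypothetical person and date.
conn.execute("INSERT INTO condition_occurrence VALUES (1, 100, 374919, '2018-06-01')")
# "Date of onset" (4181873) observation, hypothetical onset date.
conn.execute("INSERT INTO observation VALUES (10, 100, 4181873, '2016-02-15')")
# Link the observation (fact 1) to the condition record (fact 2).
conn.execute(
    "INSERT INTO fact_relationship VALUES (?, ?, ?, ?, ?)",
    (OBSERVATION_DOMAIN_CONCEPT_ID, 10,
     CONDITION_DOMAIN_CONCEPT_ID, 1,
     RELATIONSHIP_CONCEPT_ID),
)

# Resolve the link: which condition does each onset observation belong to?
linked = conn.execute("""
    SELECT co.condition_concept_id, o.observation_date, co.condition_start_date
    FROM fact_relationship fr
    JOIN observation o ON o.observation_id = fr.fact_id_1
    JOIN condition_occurrence co ON co.condition_occurrence_id = fr.fact_id_2
    WHERE fr.domain_concept_id_1 = ? AND fr.domain_concept_id_2 = ?
""", (OBSERVATION_DOMAIN_CONCEPT_ID, CONDITION_DOMAIN_CONCEPT_ID)).fetchall()
print(linked)
```

Any query that wants the onset date has to know about this join, which is worth keeping in mind when deciding whether to rely on it.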


Thank you, Qi Yang!
It is as I thought, then. I will have a look at the FACT_RELATIONSHIP table, or more specifically at examples of it. I’m not used to working with this table yet.

Thanks again!

(Qi Yang) #4

You’re welcome, @Qubit

Although I provided a solution to the issue using the FACT_RELATIONSHIP table, I don’t really want to advocate using it. The reasons are these: first, nobody will discover this relationship. Second, Atlas does not know how to use it. Third, one of the important goals of OMOP is to run network studies, that is, to send the same SQL to multiple organizations and get results back. It’s very hard to imagine that multiple parties will design their CDM in exactly the same way, using the FACT_RELATIONSHIP table to tackle this issue. As a result, SQL that relies on the FACT_RELATIONSHIP table probably won’t work at all in other organizations.

So in summary, we should keep usage of the FACT_RELATIONSHIP table to a bare minimum, or treat it as a last resort. For the issue of disease onset, the FACT_RELATIONSHIP table is a solution but certainly not the preferred one. I will call @Christian_Reich for a better approach.

(Qi Yang) #5


Also, in the OBSERVATION table in CDM v6.0, two new columns (observation_event_id and obs_event_field_concept_id) were added to provide a link to other event tables. They can be used in the following way:

observation_event_id : put Condition_Occurrence.Condition_Occurrence_id here
obs_event_field_concept_id : put field concept id for Condition_Occurrence.Condition_Occurrence_id here (1147127)

This is more straightforward than using the FACT_RELATIONSHIP table. However, I still don’t think it is the best solution because, again, we have to assume that all other organizations use the same design. Once again, I would call @Christian_Reich to comment on it.
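A small sketch of the two v6.0 columns in use, again with Python's sqlite3 and cut-down tables. The field concept id 1147127 is the one quoted above; the person, record ids and dates are invented for illustration.

```python
import sqlite3

# Field concept for condition_occurrence.condition_occurrence_id, as quoted above.
CONDITION_OCCURRENCE_ID_FIELD_CONCEPT = 1147127

conn = sqlite3.connect(":memory:")
# Reduced column sets for illustration only; the real CDM tables have more fields.
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    observation_concept_id INTEGER,
    observation_date TEXT,
    observation_event_id INTEGER,
    obs_event_field_concept_id INTEGER
);
""")

# Diagnosis record: Multiple Sclerosis (374919), hypothetical person and date.
conn.execute("INSERT INTO condition_occurrence VALUES (1, 100, 374919, '2018-06-01')")
# "Date of onset" (4181873) observation pointing at condition_occurrence_id = 1.
conn.execute(
    "INSERT INTO observation VALUES (?, ?, ?, ?, ?, ?)",
    (10, 100, 4181873, "2016-02-15", 1, CONDITION_OCCURRENCE_ID_FIELD_CONCEPT),
)

# Follow the event link back to the condition record.
linked = conn.execute("""
    SELECT co.condition_concept_id, o.observation_date, co.condition_start_date
    FROM observation o
    JOIN condition_occurrence co
      ON co.condition_occurrence_id = o.observation_event_id
    WHERE o.obs_event_field_concept_id = ?
""", (CONDITION_OCCURRENCE_ID_FIELD_CONCEPT,)).fetchall()
print(linked)
```

The filter on obs_event_field_concept_id matters because observation_event_id alone does not say which table the id belongs to.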

(Anna Ostropolets) #6

The approach above doesn’t seem right to me. You don’t store a date as a concept; you just use the date of the corresponding condition record.
In this case, you’d put BOTH events under the MS concept_id in the CONDITION_OCCURRENCE table.
Then, if you want to find patients with a newly diagnosed disorder, you’d use the ‘First occurrence’ argument in Atlas; if you want exacerbations, you’d use occurrences of the diagnosis after its first instance.
If you still want to separate those two events at the concept level, you have to find appropriate concepts that represent the disorder itself, not a ‘date’.

Does it sound right?
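To make the suggestion in this post concrete, here is a rough sketch with Python's sqlite3 and a reduced CONDITION_OCCURRENCE table. The persons and dates are invented, and the queries are only a plain-SQL analogue of a "first occurrence" filter, not what Atlas actually generates.

```python
import sqlite3

MS_CONCEPT_ID = 374919  # Multiple Sclerosis, as given in the thread

conn = sqlite3.connect(":memory:")
# Reduced column set for illustration only.
conn.execute("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
)
""")
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (1, 100, MS_CONCEPT_ID, "2016-02-15"),  # hypothetical onset record
        (2, 100, MS_CONCEPT_ID, "2018-06-01"),  # hypothetical diagnosis record
        (3, 200, MS_CONCEPT_ID, "2019-03-10"),  # person with a single record
    ],
)

# First occurrence per person: the earliest MS record.
first_occurrence = dict(conn.execute("""
    SELECT person_id, MIN(condition_start_date)
    FROM condition_occurrence
    WHERE condition_concept_id = ?
    GROUP BY person_id
""", (MS_CONCEPT_ID,)).fetchall())

# Later occurrences: MS records dated after that person's first one.
later_occurrences = conn.execute("""
    SELECT co.person_id, co.condition_start_date
    FROM condition_occurrence co
    WHERE co.condition_concept_id = ?
      AND co.condition_start_date > (
          SELECT MIN(f.condition_start_date)
          FROM condition_occurrence f
          WHERE f.person_id = co.person_id
            AND f.condition_concept_id = co.condition_concept_id
      )
    ORDER BY co.person_id, co.condition_start_date
""", (MS_CONCEPT_ID,)).fetchall()

print(first_occurrence)
print(later_occurrences)
```

Because both records sit in the standard table under the standard concept, the same queries should run unchanged at any site.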

(Michael Gurley) #7

This might be a job for the new EPISODE table, which will hopefully soon be made part of an official CDM release.


Thanks, Qi Yang! Have you had the chance yet to talk to Christian about it?


Thanks for your input, Anna!
I think it’s not quite right, since the date of onset is not the date of diagnosis. Especially in MS, the time between first symptoms and the actual diagnosis can be quite long (I think about 2-3 years on average). That’s why that time span is so interesting and important. But it could still work with two entries in the CONDITION_OCCURRENCE table, I think.
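As a rough illustration of how that time span could be derived from two entries, here is a Python sketch using sqlite3 and a reduced CONDITION_OCCURRENCE table. It assumes, purely for illustration, that the earliest MS record per person is the onset and the next one is the diagnosis; the person and dates are made up.

```python
import sqlite3
from datetime import date

MS_CONCEPT_ID = 374919  # Multiple Sclerosis, as given in the thread

conn = sqlite3.connect(":memory:")
# Reduced column set for illustration only.
conn.execute("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
)
""")
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (1, 100, MS_CONCEPT_ID, "2016-02-15"),  # hypothetical onset
        (2, 100, MS_CONCEPT_ID, "2018-06-01"),  # hypothetical diagnosis
    ],
)

rows = conn.execute("""
    SELECT person_id, condition_start_date
    FROM condition_occurrence
    WHERE condition_concept_id = ?
    ORDER BY person_id, condition_start_date
""", (MS_CONCEPT_ID,)).fetchall()

# Collect each person's MS dates in chronological order.
dates_by_person = {}
for person_id, start in rows:
    dates_by_person.setdefault(person_id, []).append(date.fromisoformat(start))

# Assumption: first record = onset, second record = diagnosis.
delay_days = {
    person_id: (dates[1] - dates[0]).days
    for person_id, dates in dates_by_person.items()
    if len(dates) >= 2
}
print(delay_days)
```

If other MS records can fall between onset and diagnosis, the two would need to be told apart some other way, which this sketch does not attempt.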

(Christian Reich) #10


He is saying he would call Christian, meaning, you do it. :)

As @aostropolets said: the onset of a disease is the first time it is recorded. If the source has an explicit, precise onset date, make sure to put a record with that date in CONDITION_OCCURRENCE, and make sure no records of that condition appear before it. But there is no record of the form “XYZ occurred at date 123” with a time stamp of today. The records capture what happened and when, not when someone reported that it happened.

Makes sense?