

(Christian Reich) #21


Still on vacation, hence the slow answer. Looks like the discussion progressed along the following lines:

  1. Condition vs. Observation: According to the OMOP definition everything that defines the diseased state reported by the physician (diagnosis) or patient (sign, symptom) is a Condition. So, let’s not use Observations in here. That table should really be called GARBAGE_CAN because it contains everything that is not a condition, or a measurement, or a procedure, or a drug, or a device. So, if we are trying to solve the oncology problem we should model this properly where it belongs (the CONDITION table).

  2. Pre vs. post-coordination. I think we all agree that we cannot pre-coordinate the entire definition of a cancerous disease of a patient into a single code. Which means we have to do some post-coordination. Currently, the model does not allow true post-coordination (lining up a random number of codes with some logic between them). And I don’t know what SNOMED has in mind because @Vojtech_Huser’s link is dead. So, the only way I see is to add fields that, together with the condition_concept_id, form the complete cancer record. How about this:

Wrt LOINC: Not sure how this helps us. All they did is to give a LOINC number to each choice of a panel. Makes it a little easier to ETL stuff into the vocabulary, but those panels aren’t exactly user-friendly if you want to find, say, stage IV amelanotic melanomas, unless you really know them by heart.

(Mark Danese) #22

Hi Christian:

I am not sure I understand #1. Condition should be location (breast) + histology (specific code) + behavior (malignant, in situ, benign). Maybe grade as well, though that can be a measurement for prostate cancer (Gleason score), and I would suggest it is not part of the condition. That seems like a pretty concrete definition of a condition, and aligns with ICD-9 and ICD-10 (more or less). Not that such alignment makes it a condition – but it gives us an independent view of how cancers are defined.
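
A minimal sketch of that location + histology + behavior decomposition, assuming the source data carries ICD-O-3 topography and morphology codes. The `CancerCondition` class and `parse_icdo3` function are hypothetical names for illustration, not part of any OMOP tooling; the behavior digits follow the standard ICD-O-3 convention.

```python
from dataclasses import dataclass

# ICD-O-3 behavior digit: the part after the slash in a morphology code.
BEHAVIOR = {
    "0": "benign",
    "1": "uncertain or borderline",
    "2": "in situ",
    "3": "malignant, primary site",
    "6": "malignant, metastatic site",
    "9": "malignant, uncertain whether primary or metastatic",
}


@dataclass(frozen=True)
class CancerCondition:
    """Hypothetical container for the three parts of a cancer condition."""
    topography: str  # location, e.g. "C50.9" (breast, NOS)
    histology: str   # four-digit histology, e.g. "8500"
    behavior: str    # decoded behavior label


def parse_icdo3(topography: str, morphology: str) -> CancerCondition:
    """Split a morphology code such as '8500/3' into histology and behavior."""
    histology, _, behavior_digit = morphology.partition("/")
    if len(histology) != 4 or not histology.isdigit():
        raise ValueError(f"unexpected histology part: {histology!r}")
    if behavior_digit not in BEHAVIOR:
        raise ValueError(f"unexpected behavior digit: {behavior_digit!r}")
    return CancerCondition(topography, histology, BEHAVIOR[behavior_digit])


example = parse_icdo3("C50.9", "8500/3")
```

Stage, TNM and the site-specific findings discussed next are deliberately left out of this record.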

Stage is an observation about the cancer (along with all the other TNM information). In fact most of the information about “extent of disease” is observation. A few may be measures (size, number of nodes, etc.). In lymphoma, there is the issue of spleen involvement and extranodal involvement. There are many site-specific measures which are not quantitative (a specific number) but are qualitative (normal, abnormal, low, high, borderline, missing, etc.). I have a hard time thinking of these as conditions. But maybe that is not what you were suggesting.

One could argue that metastases are a separate condition (“secondary malignant neoplasm” is a separate ICD-9 and ICD-10 code). I can see that as a condition.

On LOINC, I find it tedious to use but seems very relevant for what I am calling “observation” data. If we can map the cancer conditions properly to SNOMED, that seems like a better idea for OMOP. As mentioned above, the existing but old SEER mapping from ICD-O-2/3 to ICD-10 and then your ICD-10 to SNOMED mapping might take care of 80% of the job.


(Michael Gurley) #23

I concur with @Mark_Danese’s line of thinking. For oncology, condition should be left at location + histology + behavior, just as ICDO3 instructs. So hopefully the NCI will finish the map of ICDO3 to ICD10 soon. Opening up the condition table to add columns for what are evidentiary observations and measurements about the cancer diagnosis would open a Pandora’s box. I am sure other disease areas would get in line to pollute the condition table very quickly. Already, as mentioned by @rtmill, there is a notion of clinical versus pathological staging that would not fit into a single column like ‘tumor_stage_concept_id’. Moreover, as @Mark_Danese raises, prostate cancer Gleason grade would likely need special handling. It is a summation of 3 sub-measurements: Gleason primary, Gleason secondary and Gleason tertiary. So it would be very hard to fit into the condition table. The measurement table supports the notion of categorical values via the ‘value_as_concept_id’, so it already supports non-numeric data capture. Ideally, the concept/vocabulary tables would contain all the canonical vocabularies that @Mark_Danese mentions and the OHDSI/OMOP community could converge on best practices for what to use for oncology data.
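
A rough sketch of how numeric and categorical findings could sit side by side as measurement-style rows. The `MeasurementRow` class is a hypothetical stand-in that keeps only a few MEASUREMENT columns, and every concept ID below is an invented placeholder, not a real vocabulary entry.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder concept IDs, invented for illustration only.
GLEASON_PRIMARY = 9000001
GLEASON_SECONDARY = 9000002
SURGICAL_MARGIN = 9000003
MARGIN_POSITIVE = 9000004


@dataclass
class MeasurementRow:
    """Hypothetical subset of the MEASUREMENT table columns."""
    person_id: int
    measurement_concept_id: int
    value_as_number: Optional[float] = None
    value_as_concept_id: Optional[int] = None

    def is_categorical(self) -> bool:
        # Categorical results carry a concept and no number.
        return self.value_as_concept_id is not None and self.value_as_number is None


rows = [
    # Numeric sub-measurements: one row per Gleason component.
    MeasurementRow(1, GLEASON_PRIMARY, value_as_number=4),
    MeasurementRow(1, GLEASON_SECONDARY, value_as_number=3),
    # Categorical finding: the answer is itself a concept.
    MeasurementRow(1, SURGICAL_MARGIN, value_as_concept_id=MARGIN_POSITIVE),
]

categorical = [r for r in rows if r.is_categorical()]
```

Keeping each Gleason component as its own row avoids forcing them into one column of the condition table.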

(Christian Reich) #24


Not sure why you are so adamant about the Observation. We have a definition. Is there any reason to break it?

Whether you have a metastatic disease or not - I agree it could be one or two conditions, but it is a condition without doubt. Having your lymph nodes or other organs affected or not is part of the disease. That’s how cancer works: It grows locally, down the lymphatic draining system, and to remote sites. The degree to which that happened is part of the disease, not some external unrelated observation. Otherwise, you’d have to suppress things like “Severe depression” and put a condition “Depression” into the CONDITION and “Severe” into the OBSERVATION tables. We are not doing such decomposition, and nobody else does.

The question is: How are we going to split up an otherwise over-pre-coordinated definition into the relevant pieces?

(Michael Gurley) #25

Actually, I am more inclined to use the measurement table over the observation table. Alternatively, maybe we could hang a table off of condition_occurrence. Maybe condition_occurrence_attribute? Something structured very similar to the measurement or observation table. That way we respect the definition of condition, but support the richness of specification required by oncology. And we don’t start walking down the road of disease-specific columns within the condition_occurrence table.
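
One way such a child table might look, sketched with an in-memory SQLite database. This is only an illustration of the idea: the condition_occurrence_attribute columns are assumptions modeled loosely on the measurement table, condition_occurrence is reduced to three columns, and all concept IDs are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id               INTEGER NOT NULL,
        condition_concept_id    INTEGER NOT NULL
    );
    -- Hypothetical child table: one row per attribute of a condition record.
    CREATE TABLE condition_occurrence_attribute (
        condition_occurrence_attribute_id INTEGER PRIMARY KEY,
        condition_occurrence_id INTEGER NOT NULL
            REFERENCES condition_occurrence (condition_occurrence_id),
        attribute_concept_id    INTEGER NOT NULL,
        value_as_number         REAL,
        value_as_concept_id     INTEGER
    );
    """
)

# Placeholder IDs: 9000010 = a cancer condition, 9000011 = clinical stage,
# 9000013 = pathological stage, 9000012 and 9000014 = stage values.
conn.execute("INSERT INTO condition_occurrence VALUES (1, 100, 9000010)")
conn.executemany(
    "INSERT INTO condition_occurrence_attribute VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 9000011, None, 9000012),
        (2, 1, 9000013, None, 9000014),
    ],
)

attributes = conn.execute(
    """
    SELECT a.attribute_concept_id, a.value_as_concept_id
    FROM condition_occurrence c
    JOIN condition_occurrence_attribute a USING (condition_occurrence_id)
    WHERE c.condition_occurrence_id = ?
    ORDER BY a.condition_occurrence_attribute_id
    """,
    (1,),
).fetchall()
```

Clinical and pathological staging land in separate rows here, so neither needs its own column on condition_occurrence.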

(Charles Bailey) #26

I have to disagree (courteously! :)) with @mdanese here. I think stage is a fact about the cancer, like histology and grade and molecular phenotype, but it’s an integral part of what gets one to a diagnosis. As a concrete example, stage 4 receptor negative breast carcinoma is a different diagnosis from stage 1 receptor positive breast carcinoma, from a clinical perspective: different biology, different therapy, different prognosis.

However, I don’t want to obscure the point that some use cases will need to get at those facts independently. So I don’t have any particular objection to encoding particular facts as measurements or observations to make that easier. But I do think we need to end up with data in condition_occurrence that integrates the key health service factors.

(Christian Reich) #27

That’s a good idea. However, I actually don’t think that adding “anatomical site” and “histology” is something disease specific. In fact, SNOMED has all diseases with these attributes (if it makes sense).

Should we maybe organize a session and sit down exploring how the various existing solutions have solved that problem, before jumping to the solution space?

(Mark Danese) #28

Clearly I am misunderstanding the distinction between conditions and observations. But I am fine if we dump most/all of it in the condition table. That is actually 100% consistent with how we handle it in our internal data model.

(Michael Gurley) #29


I would be happy to participate in a session. Yes, adding “anatomical site” seems disease agnostic. Coupling that with putting the ICDO3 morphology in the condition_concept_id would cover histology + behavior + location. But clinical staging, pathological staging and then (just for Prostate Cancer) Gleason Primary, Gleason Secondary, Gleason Tertiary, Extra Capsular Extension, Margins, Seminal Vesicle Invasion, Lymph Nodes Invasion, Vascular/Lymphatic Invasion, Perineural Invasion need to go somewhere. If it is inappropriate to put them in the measurement table or the observation table because they refine the nature of the patient’s condition, then they can either be added as columns to condition_occurrence or hang off a new condition_occurrence_attribute table. Sorry to jump to implementation. But I would vote for a new table.

(George Hripcsak) #30

I agree that I would be careful about new columns in the condition tables. If you add anatomical site, we’ll start to see sites post-coordinating currently standard terms. E.g., infections will become infection with a site code.


(Christian Reich) #31


Actually, I was thinking of filling this information in as a matter of course. If you know the anatomical site you put it in. If not, SNOMED provides a default for the conditions:

| Condition concept_id | Condition | Site concept_id | Anatomical site |
|---|---|---|---|
| 4219977 | Traumatic anosmia | 4019908 | Olfactory nerve structure |
| 4096805 | Post-surgical hypoparathyroidism | 4006785 | Parathyroid structure |
| 381440 | Extradural hemorrhage following injury without open intracranial wound AND with concussion | 4028247 | Intracranial structure |

As a result, we’d have a reasonable anatomical structure no matter what, and the post-coordination is not creating contradictions.

We could do a similar thing for pathology.
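
The default-site fallback described above can be sketched as follows. The three concept ID pairs are the ones listed in this post; the `DEFAULT_SITE` mapping and the `resolve_site` function are hypothetical names, and a real implementation would presumably read the defaults from the vocabulary relationships rather than a hard-coded dictionary.

```python
from typing import Optional

# Condition concept_id -> default anatomical site concept_id (pairs from this post).
DEFAULT_SITE = {
    4219977: 4019908,  # Traumatic anosmia -> Olfactory nerve structure
    4096805: 4006785,  # Post-surgical hypoparathyroidism -> Parathyroid structure
    381440: 4028247,   # Extradural hemorrhage ... -> Intracranial structure
}


def resolve_site(condition_concept_id: int,
                 recorded_site_concept_id: Optional[int] = None) -> Optional[int]:
    """Prefer the site recorded in the source data; otherwise use the default."""
    if recorded_site_concept_id is not None:
        return recorded_site_concept_id
    # Returns None when no default is known for this condition.
    return DEFAULT_SITE.get(condition_concept_id)
```

This sketch does not check that a recorded site is consistent with the default, which is the contradiction risk raised earlier in the thread.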


(Andrew Williams) #32

Christian, your approach seems promising to me. I would be very interested in participating in a session on this.

(Andrew Williams) #33

@Christian_Reich Do you have a time-frame in mind for a session? Some key people at our shop will only be available to work on this over the next few weeks. So we are eager to develop a consensus about the proper approach ASAP.

(Christian Reich) #34

I’ll put a doodle in.

(Michael Gurley) #35


Is the session today at 2:00 to 3:00? If so, what time zone? What phone number?
Sorry for the questions; I have not participated in any vocabulary working group sessions before.

(Christian Reich) #36

Sorry for the silence, friends. Still working with individuals to find a date. We need everyone who has use cases, data and experience to be there.

(Michael Gurley) #37

This is a follow-up from the Oncology in OMOP CDM kick-off session.
Here is a link to the CAP Cancer Protocol Templates:

cancer protocols

The templates are broken up by anatomical site/sub-site. The templates cover many of the tumor/site specific data points that OHDSI folks have expressed an interest in placing within the CDM. The templates are in the form of question/answer checklists. Here is a link to the license-required XML version of the CAP Cancer Protocol Templates (named CAP eCC):


From what I have read, the CAP eCC contains mappings to SNOMED concepts/codes. I believe for both questions and answers. I am planning on talking to CAP in relation to my own project for a Prostate Cancer data repository at Northwestern that uses the OHDSI/OMOP CDM.

(Michael Gurley) #38

Here is another possible source of help in organizing oncology data within OHDSI/OMOP:


(Andrew Williams) #39

@Christian_Reich Is there a timeline for moving this forward?
I’m holding off on developing an ad hoc ETL so we conform to the official CDM solution. The timing of my grant may soon force my hand. We need an ETL that puts the NAACCR output of our institution’s cancer registry into the same CDM instance as our EHR data.

(Christian Reich) #40

Yes. I will spawn a Subgroup today. And I won’t be the bottleneck any longer. :frowning: