Family history extension model

Kang_Mira · July 19, 2023, 7:58pm

Introduction

Previous research

Family history (FH) in the All of Us data.
FH of disease and FH of person are converted to observation_concept_id separately.
Disadvantage: multiple FH of disease and person records → difficult to accurately connect each disease with the right person

Aim of this study

Comparison of the conventional mapping method and the new mapping method for converting family history survey data from a single-center medical check-up.

New method: FH of disease → observation_concept_id
‘in person’ (the relative) → qualifier_concept_id
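
To make the difference between the two methods concrete, here is a minimal sketch, not the study's actual ETL. All concept IDs and dictionary names below are hypothetical placeholders rather than real OMOP concept_ids; only the field names observation_concept_id and qualifier_concept_id come from the post above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical placeholder IDs, NOT real OMOP concept_ids.
FH_DISEASE = {"alcoholism": 9000001, "diabetes": 9000002}  # "FH of <disease>"
FH_PERSON = {"sister": 9100001, "father": 9100002}         # "FH in <relative>"
RELATIVE = {"sister": 9200001, "father": 9200002}          # relative used as qualifier


@dataclass
class Observation:
    person_id: int
    observation_concept_id: int
    qualifier_concept_id: Optional[int] = None


def conventional(person_id: int, survey: List[Tuple[str, str]]) -> List[Observation]:
    """Disease and relative become two separate, unlinked observation rows."""
    rows = []
    for disease, relative in survey:
        rows.append(Observation(person_id, FH_DISEASE[disease]))
        rows.append(Observation(person_id, FH_PERSON[relative]))
    return rows


def new_method(person_id: int, survey: List[Tuple[str, str]]) -> List[Observation]:
    """One row per answer: disease in the concept, relative in the qualifier."""
    return [
        Observation(person_id, FH_DISEASE[disease], RELATIVE[relative])
        for disease, relative in survey
    ]


survey = [("alcoholism", "sister"), ("diabetes", "father")]
conv_rows = conventional(1, survey)
new_rows = new_method(1, survey)
```

With two answers, the conventional rows no longer say which relative had which disease, while each row from the new method keeps the pair together.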

Conclusion

The new FH mapping method can minimize ‘in person’ information loss → more accurate

The new method showed 0% ‘in person’ information loss
The conventional method resulted in 100% information loss, and uphill (broad) mapping was possible for only 0.4%

The conventional method is complex and labor-intensive

The conventional method requires multiple search steps to find the proper concept_id

family history extension model_v1.pdf (1.8 MB)
family history reference set_v1.xlsx (142.8 KB)
family_histrory_extension_model_observation sample data_v1.xlsx (13.7 KB)

MPhilofsky · December 15, 2022, 3:12pm

Hello @Kang_Mira ,

I lead the newly-revived Themis work group, which is a sub-work group of the CDM WG. Themis’s role within the CDM WG is to define conventions on how the data are stored in the CDM. Once defined, we will post the conventions. The Themis WG will have a kick off meeting after the new year, date TBD. At that time, we will review the process for submitting, reviewing and ratifying conventions, along with a meeting schedule, and general logistics. Stay tuned!

Currently, we do not have any conventions on how to store family history data in the CDM. This is something Themis would like to define with conventions. I’ll reach out as we move forward with Themis.

Cheers,
Melanie

Andy_Kanter · December 15, 2022, 3:30pm

I have been collecting examples of where post-coordination cannot be reasonably maintained in the CDM. This seems to be another example, where an entry term that includes ‘in person’ information loses it. When looking at IMO’s terms, which are used widely within the English-speaking EHR world, a term like “family history of alcoholism in sister” is mapped to “Family history of alcoholism (situation)” and to “Family history with explicit context pertaining to sister (situation)”. Linking these two together inside the CDM would look different from keeping the person as a person. Have you looked at how this data would come out of the EHRs using IMO (90% of all EHRs in the US)?

MPhilofsky · December 15, 2022, 5:23pm

@Kang_Mira

What is the source of the family history data? Are they from IMO or ICD codes? Or do they originate from a free text or string text field?

Alexdavv · December 16, 2022, 12:30pm

Hi @Kang_Mira!
You’ve done tremendous work, thanks for the proposal.

At the same time, there’s another recent proposal prepared by Marcel de Wilde and @Eduard_Korchmar.

And actually, we had a sort of convention. Not properly ratified, but the vocabularies are built exactly like that at the moment.

The key difference is whether we pre-coordinate, what, and how.

For the personal history, the decision was made and implemented some time ago in the vocabulary releases v20220510 and v20220829_major. But it’s true that we’re still missing the convention article.

For the family history, we can stick with the same model because:

It’s the same model and as Eddy/Marcel pointed out “we prefer generic approaches so we can also write generic algorithms on this”.
We avoid post-coordination of CDM records => no ugly fact_relationship, external keys or other unclear and slow heuristics.
All conditions are allowed to be the values and represent the family history => no need to maintain this part and make arbitrary decisions on what has a genetic component, or not.
The concepts that represent a family history in the actual relatives would be organized in a hierarchy that supports standardized analytics. If the source data is not specific enough, you’d map uphill to the “family history” top dog, which would also be used if you don’t care about the level of relationship in your studies. To make this hierarchy rich and nice, we’d recreate it using SNOMED’s persons representation, which supports the degrees and all the levels/details needed. I’d actually make it simpler, even though organizing it into the hierarchy would resolve its massiveness. Things such as the time context could also be addressed for some generic concepts if we create concepts like “FH of the first degree relative less than 50 years of age”. Even if we added 5-10 permutations with different life-span periods for each concept in the hierarchy (not sure it’s needed), we’d still be within a reasonable number of concepts.

The only concern in this approach is the effort needed to compile a new hierarchy and handle an old one with the respective mappings from the existing source and Standard concepts.

MPhilofsky · December 16, 2022, 1:54pm

It would be nice if IMO became a proprietary, OHDSI-supported vocabulary, considering 90% of those of us with EHR data have these codes.

Currently, we have to use the IMO to SNOMED or the IMO to ICD mappings within our EHR. And these can be one IMO to very many SNOMED/ICD codes.

I haven’t seen any information/research on the topic of granularity loss when going from IMO to SNOMED or IMO to ICD. Do you know of any research?

mgkahn · December 16, 2022, 6:02pm

@MPhilofsky @Andy_Kanter
PEDSnet studied this exact issue
Burrows et al. - Standardizing Clinical Diagnoses Evaluating Alter.pdf (536.4 KB)

Burrows EK, Razzaghi H, Utidjian L, Bailey LC. Standardizing Clinical Diagnoses: Evaluating Alternate Terminology Selection. AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:71-79. PMID: 32477625; PMCID: PMC7233070.

Christian_Reich · December 16, 2022, 7:17pm

Now we know why IMO keeps it private. (Sorry, @Andy_Kanter, couldn’t help the snipe. You tried several times to overcome this.)

Kang_Mira · July 12, 2023, 1:03pm

The SNOMED CT vocabulary is updated monthly and many SNOMED concepts are inactivated or replaced with other concepts.
Although we suggested “sister of subject(person)” for sister in 2022, this concept has recently been inactivated and replaced with “Sister (person)”. Therefore, our team plans to use “Sister (person)” instead of “sister of subject(person)”.
There is no one correct answer in policy. It doesn’t matter whether you use “Sister (person)” or “Family history with explicit context pertaining to sister (situation)”.
The active status of SNOMED concepts can change at any time, and mapping policies depend on the individual institute, so we always have to consider this and keep all possible concepts in mind.

When you make your cohort definition in Atlas and add Qualifier Criteria in observation, searching for “sister” returns results that include the “sister” context. You may include as many “sister” items as you want. Thus you can include, as attributes, the concepts that mean sister: “Family history with explicit context pertaining to sister (situation)”, “Sister (person)”, “FH: Sister”, etc.
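
Outside Atlas, the same idea can be sketched in a few lines: treat every concept that means "sister" as one set and match the qualifier against that set. This is only an illustration under stated assumptions; the concept IDs, the sample rows and the function name `fh_cohort` are hypothetical, and Atlas itself builds cohorts differently.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical placeholder IDs, NOT real OMOP concept_ids.
FH_ALCOHOLISM = 9000001
SISTER_QUALIFIERS: Set[int] = {
    9200001,  # stands in for "Sister (person)"
    9100001,  # stands in for "Family history with explicit context pertaining to sister (situation)"
    9300001,  # stands in for "FH: Sister"
}

# Made-up observation rows from institutes with different mapping policies.
observations: List[Dict[str, int]] = [
    {"person_id": 1, "observation_concept_id": 9000001, "qualifier_concept_id": 9200001},
    {"person_id": 2, "observation_concept_id": 9000001, "qualifier_concept_id": 9100001},
    {"person_id": 3, "observation_concept_id": 9000001, "qualifier_concept_id": 9200002},
    {"person_id": 4, "observation_concept_id": 9000002, "qualifier_concept_id": 9200001},
]


def fh_cohort(rows: Iterable[Dict[str, int]], concept_id: int,
              qualifier_ids: Set[int]) -> List[int]:
    """Return sorted person_ids whose row matches the FH concept and any accepted qualifier."""
    return sorted({
        row["person_id"]
        for row in rows
        if row["observation_concept_id"] == concept_id
        and row["qualifier_concept_id"] in qualifier_ids
    })


sister_cohort = fh_cohort(observations, FH_ALCOHOLISM, SISTER_QUALIFIERS)
```

Persons 1 and 2 are both picked up even though their institutes chose different "sister" concepts, which is the point of including all possible concepts.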

Kang_Mira · July 13, 2023, 2:37am

The IMO vocabulary is private and very popular in the US. In Korea, all medical institutes have to use KCD, a Korean version of ICD, for conditions (diagnoses); however, there is no government guideline for situation or finding concepts.
At Samsung Medical Center, we have our own vocabulary, and we made a mapping for family history between our data and SNOMED CT.

Kang_Mira · July 12, 2023, 10:25am

I think IMO has plenty of concepts and is a very nice vocabulary.
Unfortunately, IMO products are not public; you have to pay for them. In contrast, ICD is public, and SNOMED CT can be free in cases where your country has a national contract with the International Health Terminology Standards Development Organisation. That would be the major reason why the CDM adopted SNOMED/ICD.

MPhilofsky · July 19, 2023, 8:23pm

@Kang_Mira,

The work you have done on modeling family history is great! In order to move this idea to fruition, we need to put it through the Themis WG. Would you be able to create a GitHub issue here? Then the Themis WG will prioritize the issue and invite you to discuss the issue with family history data as it is now, your study results, and your suggestion on how to improve family history data modeling in the CDM.

Tagging @Alexdavv @mdewilde @Eduard_Korchmar @Christian_Reich seems you all have an interest

Andy_Kanter · August 7, 2023, 3:55pm

I think we would like to see how we can solve the post-coordination or interface terminology problem generically. IMO users would be able to use IMO in their implementations and others would be able to use their own level of specificity. The CIEL terminology designed for LMICs is one example that is open. Having data stored at the highest level of specificity but queryable/analyzable at the lowest common level of specificity (using SNOMED, for example) would still work for global network studies.
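
As a rough sketch of "store at the highest specificity, query at a common level": each site keeps its most specific concept, and a network study walks up a hierarchy until it reaches a concept every site can support. The concept names and parent links below are hypothetical and only illustrate the roll-up; they are not an existing OMOP or SNOMED hierarchy.

```python
from typing import Dict, Iterable, Optional

# Hypothetical child -> parent links; names are illustrative only.
PARENT: Dict[str, str] = {
    "Family history in sister": "Family history in first-degree relative",
    "Family history in first-degree relative": "Family history",
}


def roll_up(concept: str, common_level: Iterable[str]) -> Optional[str]:
    """Walk uphill from a stored concept to the first one in the common level."""
    allowed = set(common_level)
    current: Optional[str] = concept
    while current is not None:
        if current in allowed:
            return current
        current = PARENT.get(current)  # None once the top of the hierarchy is passed
    return None


# A site with detailed data and a site with only a generic record
# both resolve to the same concept for a network study.
site_a = roll_up("Family history in sister", {"Family history"})
site_b = roll_up("Family history", {"Family history"})
```

A study that does care about the degree of relationship would pass a narrower common level, and sites without that detail would return None and drop out of that analysis.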