Hi All,
The slides that @rimma and @mgurley presented on 11/15 to the CDM working group are now on github. Please review in preparation for voting at the December 4th meeting.
Clair
OHDSI Home | Forums | Wiki | Github |
Hi All,
The slides that @rimma and @mgurley presented on 11/15 to the CDM working group are now on github. Please review in preparation for voting at the December 4th meeting.
Clair
I want to bring here the discussion of the EPISODE_EVENT table. We have done a lot of thinking and consulting for the optimal design of this table, and still there was an uncertainty. So, I made one more attempt in comparison of the two proposed designs (see below). I wrote use case queries against both of them, and, based on these queries, could not find any compelling arguments for either design. However, Design B is more economical and consistent with our approach to representation of many-to-many relationships to multiple tables in one, and I vote for this design.
Everyone who patiently and passionately participated in this design discussion (@gregk, @Christian_Reich, @mgurley, @Gowtham_Rao, @Andrew, @DTorok, @clairblacketer) and anyone else who cares, please check these out again, raise your concerns, and state your preference. We want to have a vote for the overall proposal on Dec 4th.
The ERD of the proposed relationship between the new EPISODE table and other tables in the CDM is depicted in the figure below (all the other details are in the proposal Oncology/Episode of Care Combined Proposal Ā· Issue #239 Ā· OHDSI/CommonDataModel Ā· GitHub) .
The issue is the most effective design of the EPISODE_EVENT table. Four possible designs have been proposed:
Design A. Use of FACT_RELATIONSHIP, with the event_id/event_table_concept_id foreign key reference method, requiring a case construct each time you build a join.
Design B. A new table similar but where the āoneā arm of the relationship is a proper FK construct, and the other arm is again event_id/event_table_concept_id.
DESIGN C. A new table with the Episode FK and a bunch of proper FK fields, one for Procedure, Drug, Condition, etc. All but one would be set to NULL in this construct.
DESIGN D. A bunch of new tables with Episode FK in all and a dedicated Drug FK, Condition FK. etc.
Design A was dismissed because we leaned towards a specialized table that is expected to be heavily utilized. Design D was rejected because it would result in a large number of new tables.
The primary argument against the use of Design B (or FACT_RELATIONSHIP for that matter) was a necessity of using the slow CASE WHEN clause for building joins. The choice of Design C was to avoid using CASE WHEN and directly reference each respective field and table. I am suggesting that for EPISODE_EVENT, we donāt need to use it. Instead, we should be using UNION ALL and constraining each participating statement on its respective table. Please see and compare below.
QUERY retrieving episode and its related events against Design B:
SELECT episode., condition_occurrence. condition_occurrence_start_datetime, condition_occurrence. condition_occurrence_concept_id
FROM episode
JOIN episode_event ON episode. episode_id = episode_event.epsiode_id
JOIN condition_occurrence ON episode_event.event_id = condition_occurrence. condition_occurrence_id
WHERE event_table_concept_id = ācondition_occurrenceā
AND relationship_concept_id = āDisease-Eventā
UNION ALL
SELECT episode., procedure_occurrence. procedure_occurrence_start_datetime, procedure_occurrence. procedure_occurrence_concept_id
FROM episode
JOIN episode_event ON episode. episode_id = episode_event.epsiode_id
JOIN procedure_occurrence ON episode_event.procedure_occurrence_id = procedure_occurrence. procedure_occurrence_id
WHERE event_table_concept_id = āprocedure_occurrenceā
AND relationship_concept_id = āDisease-Eventā
QUERY retrieving episode and its related events against Design C:
SELECT episode., condition_occurrence. condition_occurrence_start_datetime, condition_occurrence. condition_occurrence_concept_id
FROM episode
JOIN episode_event ON episode. episode_id = episode_event.epsiode_id
JOIN condition_occurrence ON episode_event.condition_occurrence_id = condition_occurrence. condition_occurrence_id
UNION ALL
SELECT episode., procedure_occurrence. procedure_occurrence_start_datetime, procedure_occurrence. procedure_occurrence_concept_id
FROM episode
JOIN episode_event ON episode. episode_id = episode_event.epsiode_id
JOIN procedure_occurrence ON episode_event.procedure_occurrence_id = procedure_occurrence. procedure_occurrence_id
If you agree with this querying approach, then, clearly, design B is more economical, elegant, and consistent with our design of other similar tables (COST, FACT_RELATIONSHIP). Please shout out if you disagree.
In any case, I want to thank @clairblacketer and @DTorok for their vigilance.
Thanks @rimma for pushing this forward. Design B is preferable, but Design D with dedicated tables to each domain might have a better performance. If there is no interest in adding a bunch of new tables to the CDM, I would vote for Design B.
I agree with B also.
B gets my vote too.
For simplicity I would go for option B. @rimma, sorry Iāve missed most of these discussions, but it looks like this model would also capture pregnancy āepisodesā. Am I correct? Thanks again for all your work on this.
@cukarthik, the EPISODE table should be able to house any types of episodes. Oncology WG has focused on the oncology use case for obvious reasons @Gowtham_Rao has the episode of care use case. For each new use case, new concepts, specific derivation methods, and ETL instructions will have to be created. I think that while the ownership and development of each specific use case would belong to individual contributors/groups, management of the resulting products (concepts, derivation algorithms, ETL conventions) should be centralized and done in a uniform way. I suggest that we should start thinking about this rather sooner than later.
I prefer B to C as well as others.
Thank you
Where can we find the updated proposal?
@DTorok this is the most recent I have on the gitub: https://github.com/OHDSI/CommonDataModel/issues/239 though I donāt think it includes design B for the EPISODE_EVENT table that we all have seemed to agreed on.
@DTorok, @clairblacketer,
I will adjust the proposal to reflect the change to Design B and include queries tomorrow.
@rimma What about adding episode_id to all the tables including visit_occurence,condiction_occurenceā¦?
@yabin.kang The idea is that a clinical event may possibly participate in many episodes. An entry, for example, in the DRUG_EXPOSURE table might contribute to a Treatment Episode and/or a Disease Episode and/or an Episode of Care Episode.
Hi all,
thanks a lot for your effort. Looking forward to see how this proposal is implemented in OMOP.
I have a question about the mappings between ICD-03 and SNOMED proposal. There should be mappings in the concept_relationship table with relationship_id = āHas associated morphologyā. I could not find any value with that relationship_id. Nevertheless I saw that (in my loaded version of the vocabularies) there are around 60k of relations with relationship_id = āHas asso morphā. The same number of relationships were found with relationship_id = āAsso morphā. Not sure if this is a redundancy in the concept_relationship table but I wanted to highlight it to the community.
Furthermore, is there any interest in creating standard concepts for tumor stages that are not covered by SNOMED? For instance, the BCLC (Barcelona clinic liver cancer stage) and the International Neuroblastoma Risk Group Staging System but the International Neuroblastoma Staging System are currently not described by SNOMED or any standard vocabulary in OMOP.
Ah, and by the way. I noticed that the workgroup is not listed in the OHDSI wiki (http://www.ohdsi.org/web/wiki/doku.php?id=projects:overview). Perhaps adding it there would increase the communityās interest on the wg
Thatās the same thing, Alberto. Our relationship_ids have to be shorter than 20 characters. So, we shorten them. But itās the same thing, and yes, they are also built to ICD-O-3-derived concepts if they donāt have an equivalent in SNOMED.
We actually are planning to work on this with SNOMED and the Nebraska group. We will go through cancer by cancer. Do you want to come in and help? Would be fantastic.
Thanks for pointing it out. Will fix.