Hannah Blau here from the Critical Path Institute (Tucson, AZ). I’m developing software to manage C-Path’s custom concepts and their concept relationships. I want to make sure the C-Path database schema is aligned with current OHDSI thinking about custom concept relationship metadata. In the spreadsheet for community contributions (template_4_adding_vocabulary.xlsx - Google Sheets) there is a column named predicate_id that can have the values exactMatch, broadMatch, and narrowMatch. This corresponds to a column named relationship_predicate_id in the concept_relationship_metadata.csv that I downloaded from the February 2025 vocabulary release. Which column name should I adopt for the concept relationship metadata we are recording at C-Path?
My second metadata question concerns the mapping_justification field. This column is included in the community contribution spreadsheet but not in February’s concept_relationship_metadata.csv. Are you planning to keep this column in the concept relationship metadata?
The allowable values for mapping_justification as indicated in the community contribution spreadsheet are: ManualMappingCuration, LexicalMatching, and DataDriven. The first two match two of the possible 13 values for mapping_justification in the SSSOM specification (mapping_justification - A Simple Standard for Sharing Ontology Mappings (SSSOM)). DataDriven is not on the SSSOM list. We recently found ourselves in a situation where the most accurate mapping_justification would be MappingChaining in the SSSOM scheme of things, whereas DataDriven does not really capture that information. Where should I be looking to find a definition or examples of what DataDriven is supposed to cover? Or might you be expanding the acceptable values for this field in the future?
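For illustration, here is a minimal sketch of how a check on these fields could look in Python while the format is still unsettled. It uses only the column names and values quoted in this post. The function name `check_row` and the row-as-dict layout are assumptions made up for the example, not part of any OHDSI, SSSOM, or C-Path schema, and the value sets are not a complete or official list.

```python
# Values quoted in this thread only; NOT an official or complete OHDSI/SSSOM list.
TEMPLATE_PREDICATES = {"exactMatch", "broadMatch", "narrowMatch"}
TEMPLATE_JUSTIFICATIONS = {"ManualMappingCuration", "LexicalMatching", "DataDriven"}
# SSSOM value mentioned above that the contribution template does not list.
EXTRA_SSSOM_JUSTIFICATIONS = {"MappingChaining"}

# The two column names seen for the same field
# (contribution spreadsheet vs. the February 2025 CSV).
PREDICATE_COLUMNS = ("predicate_id", "relationship_predicate_id")


def check_row(row: dict) -> list[str]:
    """Return human-readable problems for one metadata row (hypothetical helper)."""
    problems = []

    # Accept the predicate under either column name.
    predicate = next((row[c] for c in PREDICATE_COLUMNS if row.get(c)), None)
    if predicate is None:
        problems.append("no predicate column found")
    elif predicate not in TEMPLATE_PREDICATES:
        problems.append(f"unexpected predicate: {predicate}")

    justification = row.get("mapping_justification")
    if justification is None:
        pass  # column absent, as in the February 2025 CSV
    elif justification in EXTRA_SSSOM_JUSTIFICATIONS:
        problems.append(f"SSSOM-only justification: {justification}")
    elif justification not in TEMPLATE_JUSTIFICATIONS:
        problems.append(f"unexpected justification: {justification}")

    return problems
```

Flagging MappingChaining separately, rather than rejecting it outright, keeps such rows easy to find if the list of accepted values is expanded later.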
Stepping back from these specific questions, could you please also give me some guidance regarding what I should consider the authoritative source of information about OHDSI expectations for concept and concept relationship metadata? That would be very helpful.
Sorry in advance, I don’t have answers to your questions regarding community contributions!
As you are working on developing software to manage C-Path’s custom concepts, I just wanted to make you aware of @Jared’s work at Tufts CTSI in support of the BRIDGE2AI CHORUS project:
Jared and @Polina_Talapova have used this to effectively manage large custom vocabularies for the CHORUS project (a consortium of 12 hospitals mapping ICU terminology, including waveform and imaging) as well as for the OHDSI Workgroups for Geographical Information Systems and Psychiatry.
Thanks @kzollove!
Even more projects are popping up, e.g., FinOMOP (@Javier and the team) leverages the USAGI extended format to organize mapping exchange across the national network.
@hannahblau Good question! There’s currently no ratified OHDSI standard for implementing metadata. At this stage, we have focused on implementing the most straightforward aspects of SSSOM and begun applying them to the tip of the iceberg, which led to the release of two test metadata vocabulary versions. The DDL and the latest data dictionary are here, but we can’t give you more right now.
We’ll definitely need to collaborate more with our friends @matentzn and @mellybelly and gain more experience with SSSOM and its best practices before we can release a full-fledged version in OHDSI. Then, we’ll apply it to the rest of the content, which is quite an extensive task.
Do you want to join the Vocabulary working group and brainstorm this?
Thank you @Alexdavv for the pointer to FinOMOP, I was unaware of this project. I have already studied the available OHDSI web pages and downloads related to metadata; the discrepancies among those resources prompted my inquiry in this forum. I now realize my question comes a little too early in the process of establishing the OHDSI metadata formats. I appreciate the invitation to participate in the Vocabulary working group. My team lead and I agree that would be out of scope for my current job responsibilities. Thank you for your help.