Methodology for Converting Source-To-Concept-Map to Concept/Concept_Relationship (2 billionaires)

One of the OKRs for the Healthcare System Interest Group (HSIG) is to create a methodology for moving the data in the SOURCE_TO_CONCEPT_MAP (STCM) table into the CONCEPT and CONCEPT_RELATIONSHIP (C/CR) tables, using custom concept_ids in the "2 billionaire" range (above 2,000,000,000).

The STCM was originally created for OMOP v4 to allow local codes to be mapped to OMOP Standard Concepts. With the advent of OMOP v5, that functionality was transferred to the C/CR tables, but for backward compatibility the STCM was not deprecated. The idea was (presumably) that folks would gradually migrate to the newer method.

Unfortunately, the STCM still remains in wide use for several reasons:

  1. The Book of OHDSI recommends its use.
  2. It is relatively simple to implement.
  3. USAGI (a standard tool that helps with mapping) works with the STCM format.
  4. Local mapping can be maintained by a separate team unfamiliar with OMOP.
  5. There is no standard or recommended way to maintain the C/CR method.
  6. Moving from the STCM to C/CR can involve a lot of ETL code modification.

Also unfortunately, there are several reasons to stop using STCM:

  1. Codes mapped in STCM are not visible in ATLAS (OHDSI’s cohort creation tool) and other standard tools.
  2. STCM is not flexible enough to map the more subtle relationships available with C/CR, like “Maps to value”.
  3. Hierarchies are not supported using STCM.
  4. As an organization’s mapping becomes more robust, the STCM will remain limited.

Additional questions:

  1. What are the costs, benefits, and ROI (return on investment) with this move?
  2. What is the recommended course for organizations new to OMOP to implement local codes? STCM or C/CR?
  3. How can organizations using STCM best convert their process to C/CR?
  4. Is a hybridized method possible?
  5. Should the STCM table be deprecated in future major OMOP releases (>6.0)?
  6. Whatever happened to the Wide Mapping Table?

The HSIG would like feedback, stories, problems, solutions, opinions, etc. around this issue.

@MPhilofsky @Eduard_Korchmar @Daniel_Smith @Yacob_Tsegay_Gebrete @roger.carlson

Pertinent links:

Cross posting this issue, pertinent to the adoption of a C/CR process in light of coordination with custom concepts across several institutions:

Creating a registry of custom concept_ids (2-Billionaire Club) to avoid collisions across networks - Vocabulary Users - OHDSI Forums

Melanie presented a poster back in 2020 on this exact topic and I feel it answers all of your questions: https://www.ohdsi.org/wp-content/uploads/2020/10/Melanie-Philofsky-Philofsky-Mapping-Source-Codes-Poster.pdf

TLDR: if you can do C/CR, do it (you’ll need to learn some vocab rules so that your instance doesn’t violate basic conformance checks, such as ‘Maps to’ pointing only to standard concepts).
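
As a rough illustration of that conformance rule, the sketch below flags ‘Maps to’ rows whose target is not a standard concept. It assumes the CONCEPT and CONCEPT_RELATIONSHIP rows are already loaded as Python dicts keyed by their CDM column names; the function name `find_nonstandard_maps_to` and the concept_ids in the example are made up for illustration and do not come from any OHDSI tool or vocabulary.

```python
def find_nonstandard_maps_to(concepts, relationships):
    """Return 'Maps to' rows whose target (concept_id_2) is not standard."""
    # Standard concepts carry standard_concept = 'S' in the CONCEPT table.
    standard_ids = {
        c["concept_id"] for c in concepts if c.get("standard_concept") == "S"
    }
    return [
        r for r in relationships
        if r["relationship_id"] == "Maps to"
        and r["concept_id_2"] not in standard_ids
    ]


# Placeholder rows: 2000000001 is a hypothetical local concept,
# 111 and 222 are stand-ins for vocabulary concepts (not real ids).
concepts = [
    {"concept_id": 2000000001, "standard_concept": None},
    {"concept_id": 111, "standard_concept": "S"},
    {"concept_id": 222, "standard_concept": None},
]
relationships = [
    {"concept_id_1": 2000000001, "concept_id_2": 111, "relationship_id": "Maps to"},
    {"concept_id_1": 2000000001, "concept_id_2": 222, "relationship_id": "Maps to"},
    {"concept_id_1": 111, "concept_id_2": 2000000001, "relationship_id": "Mapped from"},
]

for row in find_nonstandard_maps_to(concepts, relationships):
    print("Non-standard 'Maps to' target:", row["concept_id_2"])
```

Only the mapping to 222 is reported here, since 222 is not flagged as standard. A real check would likely also look at validity dates and invalid_reason, which this sketch ignores.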

While Melanie’s paper is a useful basis for creating an STCM to C/CR conversion model, the actual process has never been formally described. And the QA SQL script, although also very useful, is not a good foundation for the logic:

  1. Neither the design decisions nor the implementation is explicitly documented.
  2. It uses the concept_*_stage data model, which is not part of the OMOP CDM and is only used by the Vocabulary Team as an implementation detail for authoring.
  3. It does not cover modern cases, like domain consistency between “Maps to” and “Maps to value” targets (“Meas Value” for “Measurement”).
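
The domain-consistency case in the last item could be sketched as a small check like the one below. It encodes only the single pairing named above (“Measurement” targets expect “Meas Value” value targets); the function name, the `expected` table, and the placeholder concept_ids are assumptions for illustration, not an established OHDSI rule set.

```python
def find_value_domain_mismatches(relationships, domain_of, expected=None):
    """Return (source_id, maps_to_target, value_target) triples whose
    'Maps to value' target domain does not match the expected domain."""
    # Assumption: only the pairing mentioned in the post is checked by default.
    expected = expected or {"Measurement": "Meas Value"}

    maps_to, maps_to_value = {}, {}
    for r in relationships:
        if r["relationship_id"] == "Maps to":
            maps_to.setdefault(r["concept_id_1"], []).append(r["concept_id_2"])
        elif r["relationship_id"] == "Maps to value":
            maps_to_value.setdefault(r["concept_id_1"], []).append(r["concept_id_2"])

    mismatches = []
    for source_id, value_targets in maps_to_value.items():
        for target in maps_to.get(source_id, []):
            wanted = expected.get(domain_of.get(target))
            if wanted is None:
                continue  # no expectation recorded for this target domain
            for value_target in value_targets:
                if domain_of.get(value_target) != wanted:
                    mismatches.append((source_id, target, value_target))
    return mismatches


# Placeholder ids: 111/333/444 stand in for vocabulary concepts.
domain_of = {111: "Measurement", 333: "Meas Value", 444: "Observation"}
relationships = [
    {"concept_id_1": 2000000001, "concept_id_2": 111, "relationship_id": "Maps to"},
    {"concept_id_1": 2000000001, "concept_id_2": 333, "relationship_id": "Maps to value"},
    {"concept_id_1": 2000000002, "concept_id_2": 111, "relationship_id": "Maps to"},
    {"concept_id_1": 2000000002, "concept_id_2": 444, "relationship_id": "Maps to value"},
]
print(find_value_domain_mismatches(relationships, domain_of))
```

In this example only the second local concept is reported, because its value target sits in the “Observation” domain.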

What was discussed on the HSIG call is a proposal to formally model the conversion process, in documentation and/or as a conversion script. There is institutional inertia that keeps people using STCM, and if we want people to switch to C/CR, we need to answer some very crucial questions, not the least of which is:

The conversion is not a trivial task. I am familiar with OMOP, so I can imagine what a process would look like, but every OMOP instance has its own established solutions and conversion scripts, often built around the STCM to store and update mappings. Until we have a clear model in place, we cannot tell people how easy or difficult it is for any OMOPized dataset to make the jump to C/CR.
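
To make the discussion concrete, here is a minimal sketch of what the core of such a conversion might look like. It is not a reference implementation: the function name `stcm_to_ccr`, the `domain_of` lookup, the choice to borrow the domain of the first mapped target, the "Observation" fallback, and the "Undefined" concept class are all assumptions made for illustration. Rows are plain dicts keyed by the CDM column names.

```python
from itertools import count

CUSTOM_ID_FLOOR = 2_000_000_000  # local concept_ids are allocated above this


def stcm_to_ccr(stcm_rows, domain_of, default_domain="Observation"):
    """Convert STCM rows into (concept_rows, concept_relationship_rows)."""
    next_id = count(CUSTOM_ID_FLOOR + 1)
    ids = {}  # (source_vocabulary_id, source_code) -> custom concept_id
    concepts, relationships = [], []

    for row in stcm_rows:
        key = (row["source_vocabulary_id"], row["source_code"])
        target = row["target_concept_id"]
        validity = {
            "valid_start_date": row["valid_start_date"],
            "valid_end_date": row["valid_end_date"],
            "invalid_reason": row.get("invalid_reason"),
        }
        if key not in ids:  # one custom concept per distinct source code
            ids[key] = next(next_id)
            concepts.append({
                "concept_id": ids[key],
                "concept_name": row["source_code_description"],
                # Assumption: borrow the domain of the first mapped target.
                "domain_id": domain_of.get(target, default_domain),
                "vocabulary_id": row["source_vocabulary_id"],
                "concept_class_id": "Undefined",  # assumption: placeholder
                "standard_concept": None,  # local source concepts are non-standard
                "concept_code": row["source_code"],
                **validity,
            })
        if target:  # 0 or None means unmapped: keep the concept, add no link
            for c1, c2, rel in ((ids[key], target, "Maps to"),
                                (target, ids[key], "Mapped from")):
                relationships.append({"concept_id_1": c1, "concept_id_2": c2,
                                      "relationship_id": rel, **validity})
    return concepts, relationships


# Placeholder data: "LocalLab" and target ids 111/222 are made up.
base = {"source_vocabulary_id": "LocalLab", "valid_start_date": "1970-01-01",
        "valid_end_date": "2099-12-31", "invalid_reason": None}
stcm = [
    {**base, "source_code": "GLU", "source_code_description": "Glucose",
     "target_concept_id": 111},
    {**base, "source_code": "GLU", "source_code_description": "Glucose",
     "target_concept_id": 222},
    {**base, "source_code": "XYZ", "source_code_description": "Unmapped code",
     "target_concept_id": 0},
]
concepts, relationships = stcm_to_ccr(stcm, {111: "Measurement", 222: "Measurement"})
print(len(concepts), "concepts,", len(relationships), "relationship rows")
```

Even this toy version surfaces the open design questions: how to pick a domain and concept class, how to register the local vocabulary_id, and how to allocate ids without colliding with ones already in use locally or at partner sites. Those are the decisions a formal model would need to document.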

Absolutely agree, we do need a formal description/convention. @Eduard_Korchmar, given your expertise with the vocabularies and OMOP CDM, would you like to initiate and lead such an endeavor? That would be much appreciated. Or, if somebody already volunteered, it would be good to know their name to send help and questions their way.

I do intend to do that. As part of a planned OHDSI Python package, I want to include a module providing a framework for converting data between STCM and the 2-billion concept space; this will by necessity include writing design documentation for the reference implementation. Once this exists, both the documentation and the library can be iterated upon based on the results of pilot projects. I plan to present a development proposal for this on the upcoming Open-Source Workgroup call on April 19th.

I do not have a proper announcement to link until then, but there is a long form forum message: