Confused by: select * from concept_synonym where concept_id = 40355052

concept_id 40355052 in the concepts table is Keratosis follicularis (aka Darier’s disease). Why are there rows for Milroy’s disease, Mongolian spots and various other things in the concept_synonym table that also have a concept_id value of 40355052? What am I misunderstanding?

Tim C
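
To make the question easy to reproduce, here is a minimal sketch in Python using an in-memory SQLite database. The table layout is a simplification of the OMOP tables (only the columns needed here), and the toy rows are assumptions based on the names mentioned in the post, not an extract from a real vocabulary release.

```python
import sqlite3

# Toy, simplified stand-ins for the OMOP concept and concept_synonym tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT);
CREATE TABLE concept_synonym (concept_id INTEGER, concept_synonym_name TEXT);
INSERT INTO concept VALUES (40355052, 'Keratosis follicularis');
-- Illustrative rows only; the exact synonym strings are assumed.
INSERT INTO concept_synonym VALUES
  (40355052, 'Darier''s disease'),
  (40355052, 'Milroy''s disease'),
  (40355052, 'Mongolian spots');
""")

# Show the concept name next to every synonym row that shares its concept_id.
rows = conn.execute("""
SELECT c.concept_name, s.concept_synonym_name
FROM concept AS c
JOIN concept_synonym AS s ON s.concept_id = c.concept_id
WHERE c.concept_id = ?
ORDER BY s.concept_synonym_name
""", (40355052,)).fetchall()

for concept_name, synonym in rows:
    print(f"{concept_name} <- {synonym}")
```

Putting the concept name and the synonyms side by side makes the mismatch obvious at a glance.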

This is a bug in the extraction of concept_name that will be fixed in the next SNOMED release in OMOP.

SNOMED identified this concept as “Integument anomalies: [ichthyosis congenita] or [Darier’s] or [keratosis follicularis] or [Meige’s] or [Milroy’s] or [Mongolian spots] or [pseudoxanthoma elasticum] or [congenital NOS]” but now it’s deprecated by SNOMED due to its ambiguity.

Another problem is that we extracted the wrong mapping from the 7 possible ones:

It should be mapped to the generic Disorder of integument or Congenital anomaly of integument (if every condition mentioned is congenital), but SNOMED doesn’t do that.

We need to be careful about using “possibly_equivalent” links as “Maps to”.
@Christian_Reich @Eduard_Korchmar @Dymshyts

Hello! This is a known issue with older SNOMED concepts. They had awkward naming patterns that were not correctly processed in the current version. The issue will not occur in the new release (expected in February), since we have revamped the logic for name extraction.

The correct concept_name for this concept would be “Integument anomalies: [ichthyosis congenita] or [Darier’s] or [keratosis follicularis] or [Meige’s] or [Milroy’s] or [Mongolian spots] or [pseudoxanthoma elasticum] or [congenital NOS]”. The link leads to the source term in SNOMED’s official browser.

All the synonyms for this concept are possible meanings included under the parent umbrella term. Modern SNOMED concepts do not allow inclusion of “possibly equivalent” terms as synonyms, but deprecated concepts may still have them.

As for the processing of poss_eq links in OMOP, we already have changes scheduled for the next release this autumn. We need to check carefully whether changing poss_eq links would cause problems, or rather significant problems, as they are currently used as the source for creating ‘Maps to’ relations for deprecated concepts.
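
One way to see how many concepts are affected is to count, per source concept, how many possibly-equivalent links it has. The sketch below runs on toy data in an in-memory SQLite database. The relationship_id strings ('Concept poss_eq to', 'Maps to') and the target ids 1001 to 1003 are assumptions for illustration; check the relationship table of your own vocabulary release for the exact labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT
);
-- Hypothetical target ids (1001..1003), for illustration only.
INSERT INTO concept_relationship VALUES
  (40355052, 1001, 'Concept poss_eq to'),
  (40355052, 1002, 'Concept poss_eq to'),
  (40355052, 1003, 'Concept poss_eq to'),
  (40355052, 1001, 'Maps to');
""")

# Source concepts with more than one possibly-equivalent target are ambiguous:
# any single 'Maps to' derived from them is a choice among several meanings.
ambiguous = conn.execute("""
SELECT concept_id_1,
       SUM(CASE WHEN relationship_id = 'Concept poss_eq to' THEN 1 ELSE 0 END) AS n_poss_eq,
       SUM(CASE WHEN relationship_id = 'Maps to' THEN 1 ELSE 0 END) AS n_maps_to
FROM concept_relationship
GROUP BY concept_id_1
HAVING SUM(CASE WHEN relationship_id = 'Concept poss_eq to' THEN 1 ELSE 0 END) > 1
""").fetchall()

for concept_id, n_poss_eq, n_maps_to in ambiguous:
    print(concept_id, n_poss_eq, n_maps_to)
```

A list like this could serve as the starting point for reviewing which ‘Maps to’ relations were derived from ambiguous sources.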

Likewise:

select * from concept_synonym where concept_id = 40322471

Ah, OK, so you’re saying that the correct mapping is the first of those POSSIBLY EQUIVALENT TO elements (Darier’s Disease). The rest of them are not in any way synonyms for keratosis follicularis. But that means a curated decision needs to be made for all such cases of incorrect synonyms in the CONCEPT_SYNONYM table (and there appear to be quite a lot of them). Is that correct? I may still be confused. I note that the other six POSSIBLY EQUIVALENT TO elements of the set you show above are related to keratosis follicularis by way of being either dermatological conditions or hereditary conditions, but none of them are synonyms for keratosis follicularis.

Now I am really confused… afaik, the only synonym for keratosis follicularis in the list above is Darier’s disease. My dermatology is very rusty, but I am very sure that Mongolian blue spots are a completely different condition to keratosis follicularis, and obviously “[congenital NOS]” can’t be a synonym for keratosis follicularis. Now, “[congenital NOS]” could arguably be a synonym for a high-level subsumption of keratosis follicularis, but surely the intent of the CONCEPT_SYNONYM table is not to contain alternative names for (all possible, or some) subsumptions of an OMOP concept, but rather just direct synonyms. The latter has obvious use-cases for concept search, but I can’t conceive of how the former would be helpful. In any case, Mongolian spots are definitely not a synonym of a subsumption of keratosis follicularis. Hopefully you can see why I am so confused!

Just delving into the CONCEPT_SYNONYM table a bit further to determine whether we can use it as part of the source-to-OMOP concept mapping system in our ETL processing…

So there are a lot of concept_ids which have large numbers of rows in the CONCEPT_SYNONYM table. The question is how many of these are currently erroneous?

To answer that, I did a series of spot checks on concept_ids with 50 or more synonym rows in the CONCEPT_SYNONYM table. My conclusion is that the vast majority of those are fine and all the rows are true synonyms, albeit with a lot of redundancy, but that’s OK. None of the affected concepts are standard concepts, afaics. A few seem to include synonyms of descendants, which is not strictly desirable. But very few are outright incorrect, as with the examples given in the OP.
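
For anyone wanting to repeat the spot check, the candidate list can be produced with a query along these lines. This is only a sketch on toy data in an in-memory SQLite database; the table layout is simplified and the rows are made up, so only the shape of the query carries over to a real vocabulary.

```python
import sqlite3

THRESHOLD = 50  # minimum number of synonym rows, as in the spot check above

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    standard_concept TEXT  -- 'S' for standard concepts, NULL otherwise
);
CREATE TABLE concept_synonym (concept_id INTEGER, concept_synonym_name TEXT);
""")

# Made-up rows: concept 1 is non-standard with 60 synonyms, concept 2 has one.
conn.execute("INSERT INTO concept VALUES (1, 'Toy concept A', NULL)")
conn.execute("INSERT INTO concept VALUES (2, 'Toy concept B', 'S')")
conn.executemany(
    "INSERT INTO concept_synonym VALUES (?, ?)",
    [(1, f"synonym {i}") for i in range(60)] + [(2, "only synonym")],
)

# Concepts with at least THRESHOLD synonym rows, with their standard flag.
candidates = conn.execute("""
SELECT c.concept_id, c.concept_name, c.standard_concept, COUNT(*) AS n_synonyms
FROM concept_synonym AS s
JOIN concept AS c ON c.concept_id = s.concept_id
GROUP BY c.concept_id, c.concept_name, c.standard_concept
HAVING COUNT(*) >= ?
ORDER BY n_synonyms DESC
""", (THRESHOLD,)).fetchall()

for row in candidates:
    print(row)
```

Including standard_concept in the output makes it easy to confirm whether any of the heavily populated concepts are standard ones.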

Given all this, we’ll proceed with using the CONCEPT_SYNONYM table in our ETL mapping code, but will check the next vocabulary release to see if erroneous synonyms have been removed.

Hi @Tim_Churches, I understand your confusion, indeed. Let us look into it and fix the synonyms that got there by mistake, as you correctly found.
Would you mind creating a github issue here?
Thanks - Mik (for the vocabulary team)

Curious, how and why would you use this as part of your ETL?

Training NER (named entity recognition) models for information extraction from free text columns and clinical documents, mainly.

Done.