
Concept_synonym usage in concept searching

In 2015, a rule was introduced that every concept should have at least one synonym. For concepts without real synonyms, we simply use the concept_name as a placeholder.

Currently, only 1463 concepts have no synonyms at all, and they’re mostly in the Metadata/Visit/Payer/Cost domains.

However, the approach has not been applied consistently. The current picture is:

  • 7,693,299 concepts have only one synonym that is an imputed concept_name;
  • 1,073,940 concepts have >1 synonym, where one of them is imputed or matches the concept_name;
  • 239,314 concepts have one or more synonyms, but none of them is an imputed concept_name.
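To make the three groups above concrete, here is a minimal sketch of how a concept could be bucketed by its synonyms. The function names and category labels are hypothetical, and it assumes "imputed" means an exact string match with the concept_name; the real counts may have been produced with different matching logic.

```python
from collections import Counter


def classify(concept_name, synonyms):
    """Bucket one concept by how its synonyms relate to its concept_name.

    Assumption: a synonym counts as imputed when it is exactly equal
    to the concept_name (no case folding or trimming).
    """
    if not synonyms:
        return "no_synonyms"
    if not any(s == concept_name for s in synonyms):
        return "real_only"  # synonyms exist, none repeats the name
    if len(synonyms) == 1:
        return "imputed_only"  # the single synonym is the name itself
    return "imputed_plus_real"  # the name plus at least one other synonym


def summarize(concepts):
    """concepts: iterable of (concept_name, list_of_synonyms) pairs."""
    return Counter(classify(name, syns) for name, syns in concepts)
```

Calling summarize on a list of (concept_name, synonyms) pairs returns one count per bucket, which is the shape of the breakdown listed above.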

Because of that, the results of the search algorithms used in Athena or other tools may be affected: an additional match within the synonyms increases the matching score even though the synonym is merely imputed.

Here is the proposal to be implemented:

  • Do not impute synonyms. Leave concepts without synonyms if the source doesn’t provide any.
  • Do not allow synonyms that match the concept_name, unless they’re in a language other than English (language_concept_id <> 4180186, where 4180186 is the English language concept).
  • Drop the existing synonyms that violate these rules.
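The rules above could be sketched roughly as follows. This is only an illustration, not the actual vocabulary release script: the function names and the row layout are hypothetical, the comparison is assumed to be an exact string match, and 999999 is a made-up placeholder for a non-English language concept.

```python
ENGLISH_LANGUAGE_CONCEPT_ID = 4180186  # English, as cited in the proposal


def keep_synonym(concept_name, synonym_name, language_concept_id):
    """Return True if a synonym row survives the proposed rules.

    Assumption: "matches the concept_name" means exact string equality.
    """
    repeats_name = synonym_name == concept_name
    is_english = language_concept_id == ENGLISH_LANGUAGE_CONCEPT_ID
    # Only English synonyms that merely repeat the concept_name are dropped.
    return not (repeats_name and is_english)


def clean_synonyms(concept_names, synonym_rows):
    """Filter synonym rows.

    concept_names: dict of concept_id -> concept_name
    synonym_rows: list of (concept_id, synonym_name, language_concept_id)
    """
    return [
        row for row in synonym_rows
        if keep_synonym(concept_names[row[0]], row[1], row[2])
    ]


# Hypothetical demo data; 999999 stands in for a non-English language concept.
names = {1: "Aspirin", 2: "Insulin"}
rows = [
    (1, "Aspirin", ENGLISH_LANGUAGE_CONCEPT_ID),               # dropped
    (1, "Acetylsalicylic acid", ENGLISH_LANGUAGE_CONCEPT_ID),  # kept
    (2, "Insulin", 999999),                                    # kept
]
cleaned = clean_synonyms(names, rows)
```

Note that nothing is inserted for concepts that end up with no rows: under the proposal they simply have no entry in concept_synonym.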

Once this is implemented, the only reliable way to string-search concepts will be to search both tables (concept + concept_synonym), whereas currently there may be false confidence that concept_synonym alone is enough.
We’d like to hear from the community how this could affect the string search you use, especially within OHDSI tools (Athena, Atlas, USAGI).
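For anyone checking their own scripts, a search over both tables might look roughly like the sketch below. It uses an in-memory SQLite database with a cut-down, assumed table layout (only the columns needed here) and hypothetical demo rows; it is not how Athena, Atlas or USAGI actually search.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for concept and concept_synonym."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            concept_name TEXT);
        CREATE TABLE concept_synonym (
            concept_id INTEGER,
            concept_synonym_name TEXT,
            language_concept_id INTEGER);
        -- Concept 2 has no synonym rows, as allowed by the proposal.
        INSERT INTO concept VALUES (1, 'Aspirin'), (2, 'Outpatient Visit');
        INSERT INTO concept_synonym VALUES (1, 'Acetylsalicylic acid', 4180186);
    """)
    return conn


def search_concepts(conn, term):
    """Return concept_ids whose name OR any synonym contains the term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT concept_id FROM concept
        WHERE concept_name LIKE ?
        UNION
        SELECT concept_id FROM concept_synonym
        WHERE concept_synonym_name LIKE ?
        ORDER BY concept_id
        """,
        (pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(search_concepts(conn, "visit"))            # found via concept_name only
print(search_concepts(conn, "acetylsalicylic"))  # found via synonym only
```

A script that queries concept_synonym alone would miss 'Outpatient Visit' in this demo, which is exactly the case to look for in existing mapping scripts.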

Tagging @Chris_Knoll @anthonysena @Yaroslav @schuemie @MaximMoinat @acumarav @Christian_Reich @Dymshyts

The thing that bothers me the most is that all our automated mapping scripts depend on the existence of both tables (concept + concept_synonym). It means that we will need to rewrite them all :exploding_head:.

In 2015 the rule was introduced that any concept should have at least one synonym. And we simply use concept_name as a placeholder for concepts missing the real synonyms

Is there any way not to store single concept names without synonyms in the concept_synonym table?

  • Do not impute the synonyms. Leave the concepts without the synonyms if the source doesn’t provide such.
  • Do not allow the synonyms that match the concept_name, except they’re in the national language (language_concept_id <> ‘4180186’ English language).
  • Drop the existing synonyms according to these rules.

I like all these ideas but I do not understand why we cannot keep the concept_synonym table alive.

No :blush: We keep both tables alive. Concept_synonym just becomes cleaner.

Sure, the primary names live in concept_name. The secondary names and translations live in concept_synonym_name.


Agreed. I don’t think any script or tool (Athena, Atlas) should change. But we should ask. @Chris_Knoll, @Konstantin_Yaroshove?

Do we have a Github issue for this?

As far as I can see, ATLAS/WebAPI does not use the concept_synonym table. At the same time, I do not see any issues for the ATHENA logic. But the table is widely used across many OHDSI components where my knowledge is limited:

I would involve more people in this discussion.

Not yet.

tagging @acumarav @Yaroslav

What I was thinking is this (correct me if I am wrong):

We have a search technology and a UI for searching in Athena. It runs off the ProdV5 tables. Athena as such is fine (well, it has issues, but nothing to do with this).

Atlas uses SQL and another UI. It is slower, and it is confusing because it looks similar to Athena but behaves slightly differently. So, I am thinking of transplanting the fast Athena search to Atlas for the Vocabulary search functionality. Of course, it will run on the local vocabulary tables.

However, we may find that Atlas does things better/more correctly/more intuitively than Athena. If that is the case we should also consider improving Athena.

Bottom line: create one optimal tech stack/functionality/UI for vocab searching and browsing, and deploy to both tools. Don’t have two parallel and almost identical solutions.

The plan is to implement this vocabulary fix in one of the next releases.

@MaximMoinat @schuemie @anthonysena @Chris_Knoll @wivern
Please let us know if this can affect the search logic in Atlas or USAGI.

Hi Alex. I don’t expect it to affect the search logic of usagi. It only adds synonyms to the index if they are not equal to the concept name.

I like the proposal by the way!


This will be implemented within the next vocabulary release.
