
How does USAGI assign match scores, and why does the top target concept not have the highest score?

I am using USAGI to map local procedure codes to OMOP. However, I notice that the #1 target concept USAGI suggests does not always have the highest match score. In fact, other suggestions in the results box have higher scores.
For example,
Source: Bypass, coronary arteries endoscopic approach with robotic telemanipulation of tools using autograft [e.g. saphenous]
USAGI top suggestion: Aortacoronary artery bypass of three coronary arteries with saphenous vein graft (score = 0.41)
Other suggestions in the results box:
Robot assisted laparoscopic coronary artery bypass (score = 0.43)
Coronary artery bypass graft with saphenous vein graft (score = 0.42)

Even in this case, which would be the appropriate match? Many of my local procedure codes are so granular that no single OMOP standard concept can represent them. Is there a standard methodology for handling this?

I can answer the first part of the question, as I recently had to dive into the Usagi code. Basically, USAGI depends on an older version of the Apache Lucene library, which implements TF-IDF scoring over the standardized vocabulary content. For that purpose, all of the following are treated as valid “documents”:

  1. concept_name from concept
  2. concept_synonym_name from the join with concept_synonym
  3. concept_name from concepts that have a valid Maps to relation to the original concept
  4. concept_synonym_name from such concepts

Lucene then computes the similarity of each document to the search query (the source term) and orders the results by it, with the best score taken per target concept_id.
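To make the idea concrete, here is a minimal sketch of TF-IDF scoring with the best score kept per concept. It is not Usagi's or Lucene's actual code: the toy index, the helper names, and the smoothed IDF formula are all assumptions chosen for illustration, and Lucene's real scoring differs in its details.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy index of (concept_id, text) pairs. One concept can
# contribute several "documents" (its name, its synonyms, and so on).
DOCUMENTS = [
    (1, "coronary artery bypass graft with saphenous vein graft"),
    (1, "CABG using saphenous vein"),
    (2, "robot assisted laparoscopic coronary artery bypass"),
    (3, "appendectomy"),
]

def tokenize(text):
    return text.lower().split()

def build_idf(documents):
    """Smoothed inverse document frequency (one common variant, assumed here)."""
    n = len(documents)
    df = Counter()
    for _, text in documents:
        df.update(set(tokenize(text)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def vectorize(text, idf):
    """TF-IDF weights for one text; terms unknown to the index are dropped."""
    tf = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}

def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_score_per_concept(query, documents):
    """Score every document, keep the best score per concept_id, sort descending."""
    idf = build_idf(documents)
    query_vector = vectorize(query, idf)
    best = defaultdict(float)
    for concept_id, text in documents:
        score = cosine(query_vector, vectorize(text, idf))
        best[concept_id] = max(best[concept_id], score)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

ranked = best_score_per_concept(
    "coronary artery bypass with saphenous vein graft", DOCUMENTS
)
```

In this toy index, concept 1 ranks first because one of its two documents shares every query term; its weaker synonym document does not drag the concept down, since only the maximum is kept.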

As for why the scores may be inconsistent with the display order: it might be because the displayed scores are custom normalized TF-IDF scores, while the Lucene scores are used for ranking. There might be differences in methodology beyond normalization, but I have not researched this enough to confirm the suspicion.
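As a sketch of how this could happen in principle (again, an illustration of the suspicion, not confirmed Usagi behaviour), the snippet below scores two candidate names with a raw term-overlap score and with a length-normalized cosine score. The function names and the two example strings are made up for this illustration.

```python
import math
from collections import Counter

def term_counts(text):
    return Counter(text.lower().split())

def raw_score(query, doc):
    """Unnormalized overlap: dot product of term counts."""
    q, d = term_counts(query), term_counts(doc)
    return sum(q[term] * d[term] for term in q)

def normalized_score(query, doc):
    """The same dot product divided by both vector lengths (cosine)."""
    q, d = term_counts(query), term_counts(doc)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return raw_score(query, doc) / norm if norm else 0.0

query = "coronary artery bypass saphenous"
long_name = "coronary artery bypass of three coronary arteries with saphenous vein graft"
short_name = "coronary artery bypass"

# The long name wins on raw overlap, the short name wins after normalization.
raw_order = sorted([long_name, short_name], key=lambda d: raw_score(query, d), reverse=True)
shown_order = sorted([long_name, short_name], key=lambda d: normalized_score(query, d), reverse=True)
```

If a list is sorted by one of these scores while the other one is displayed, the top row will not necessarily show the highest number.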

In any case, in my experience a match score below 0.8 is not reliable enough to base any logic on; if the best match score is 0.4, the mappings warrant a manual review.
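A rule of thumb like this is easy to apply to exported suggestions. The sketch below assumes the suggestions are available as (concept name, score) pairs; the function name and the threshold constant are hypothetical and not part of Usagi.

```python
REVIEW_THRESHOLD = 0.8  # rule of thumb from this post, not a Usagi setting

def needs_manual_review(suggestions, threshold=REVIEW_THRESHOLD):
    """Return True when no suggestion reaches the threshold (or there are none).

    suggestions: list of (concept_name, score) pairs for one source term.
    """
    if not suggestions:
        return True
    best_score = max(score for _, score in suggestions)
    return best_score < threshold

# Scores taken from the example in the question above.
example = [
    ("Aortacoronary artery bypass of three coronary arteries with saphenous vein graft", 0.41),
    ("Robot assisted laparoscopic coronary artery bypass", 0.43),
    ("Coronary artery bypass graft with saphenous vein graft", 0.42),
]
flagged = needs_manual_review(example)
```

With the scores from the question, the source term is flagged for review because its best suggestion only reaches 0.43.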
