
Usagi

I’m using the same Vocab version. I’m going to try to reload the index.

  • Query: when selecting the “Query” radio button / text box it would be nice for the selected source term to auto populate in the query text box rather than the previous query text remaining.
  • When selecting a match from the results pane and clicking the “Add Concept” button for an entry that mapped to concept “0”, it would be great for the “0” concept to be automatically removed from the Target Concepts list.
  • Multiple mapped concepts: it’s possible to map multiple concepts to a single source term in the Target Concepts window. Shouldn’t this be limited to a single concept mapping?

Bill

  • I agree keeping the text from the previous search is probably not what you want, but I’m not sure if we should start with the source term. Let me think about it.

  • I recommend you use the ‘Replace concept’ button instead, which does do exactly what you want.

  • We did this deliberately, since sometimes there’s no avoiding mapping to multiple concepts. We see, for example, source codes like ‘Disease A and B’, whereas the target vocabulary only has concepts for A and B separately. That being said, this feature should be used only as an absolute last resort. I think I will add a warning popup if somebody approves a code that maps to more than one concept, and will make sure to mention this in the Wiki as well.

Another issue:

it appears that there may be a character set issue. The Results pane is having issues displaying some characters, but only in the “Concept name” column:

  • meniere’s disease: Vestibular active M�ni�re’s disease
  • SJOGRENS SYNDROME: Sj�gren’s syndrome

Bill

The Vocab files are using ISO-8859-1, but I was expecting UTF-8. In the next release of Usagi I’ll use ISO-8859-1.
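For anyone wondering where the “�” characters come from, here is a minimal Python sketch (not Usagi’s code, which is Java) of what happens when ISO-8859-1 bytes are read as UTF-8. The sample string is just the concept name from the example above; the variable names are made up for illustration.

```python
# Illustration only: a concept name stored as ISO-8859-1 (Latin-1) bytes,
# which is how the vocabulary files are reported to be encoded in this thread.
raw = "Ménière's disease".encode("iso-8859-1")

# Reading those bytes as UTF-8 fails on the accented letters;
# errors="replace" substitutes U+FFFD (the "�" seen in the Results pane).
wrong = raw.decode("utf-8", errors="replace")

# Reading them with the encoding they were written in restores the name.
right = raw.decode("iso-8859-1")

print(wrong)
print(right)
```

The plain ASCII letters survive either way, which would explain why only names with accented characters look broken.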

Thanks!

Does OHDSI have a character set standard? I also assumed UTF-8 as the default because we always use that.

Bill

Opened a discussion on that topic here.

Martijn,

I’m working through a mapping from ICD9 to SNOMED conditions. I’m seeing quite a few incorrect matches between terms containing “with” and terms containing “without”. I’ve hit this about 50 times on this run through 1600 ICD9 terms.

input: “Open wound of face, unspecified site, without mention of complication”
match (0.82 score): “Open wound of face with complication”
should be (0.75 score): “Open wound of face without complication”

Thoughts?

Bill

Hi Bill,

I suspect you are already aware of this, but in case not: You can get complete ICD-9 to SNOMED CT mappings from:

Brandon

Brandon,

I have the NLM mapping pulled up while running through this process with Usagi.

I’m not familiar with the IHTSDO mapping and haven’t found it yet (that site loves their PDFs…)

Bill

Friends:

Can I ask for a favor: Let’s not have different mappings in parallel. That will derail us, because everybody will have a different reality when querying data. Instead, let’s find these systematic problems and fix them.

We have looked at the “with” and “without” concepts extensively. We don’t want to do a plain mapping to these in SNOMED, because they are often pretty “dyssocial”, which means they have no hierarchical relationships, or not the right ones. Instead, we map to the actual conditions. If something isn’t mentioned, then we won’t mention it either.

I also would advise against the equivalence map. It is only a partial map, we used it as input to our map, and it only does equivalents. Many ICD9 codes are complex, and the equivalent concepts actually have problems: they sound the same, but they have very different children.

Altogether: I think USAGI is a great tool for efficient mapping of local codes. For mapping of hierarchies to each other it is too simple.

Bill: Can you give me examples of with/without mappings you don’t like, so we can decipher them?

C

You can find the raw IHTSDO file in the international release; in the USA you can get this from: http://download.nlm.nih.gov/umls/kss/IHTSDO20140731/SnomedCT_Release_INT_20140731.zip (You’ll need a free UMLS license if you don’t have one.)

Alternatively, you can browse the same map in Snow Owl (under Mappings / ICD-9-CM equivalence complex map reference set), which you can get here: http://b2i.sg/download/

I’m not sure what the difference is between the international and US mappings; we have only looked at the former which is an equivalence map.

Brandon

Hi Bill,

that specific example has the following explanation:

The first match has a synonym “Open wound of face, unspecified site, complicated”, where the “unspecified site” matches part of your search string. Usagi (or actually the Lucene search engine) does not understand semantics, it just attributes a higher weight to the matching of “unspecified site” than it does to matching “without” (which is a very frequent word and therefore gets a low weight).

I’m not sure what to do about this. This is just the way the algorithm works. We could hack in rules, but that would make the behavior very unpredictable.

May be off topic, but Lucene indices without negation annotations are dicey. Negex can be used to create these annotations and lower the weight on that basis. But like everything, there’s more work involved. My 2 cents: For term search, as long as a human is picking something from a list, it’s okay to have the occasional wrong choice appear up top so that you don’t accidentally exclude stuff.

Several years of participating in TREC (Text REtrieval Conference) have given me immense respect for pure TF * IDF with cosine matching (basically what Usagi does): any modifications may solve some specific issues, but will bite you in other situations and will often reduce overall performance.
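To make the with/without example from earlier in the thread concrete, here is a toy Python sketch of TF * IDF weighting with cosine similarity. It is not Lucene’s actual scoring formula and not Usagi’s code; the six-entry corpus, the smoothed IDF `log(1 + N/df)`, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase, drop commas, split on whitespace (deliberately simplistic).
    return text.lower().replace(",", " ").split()

def build_idf(docs):
    # Smoothed inverse document frequency: frequent terms get low weights.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log(1 + n / count) for term, count in df.items()}

def vectorize(text, idf):
    # TF * IDF vector; terms unseen in the corpus are ignored.
    tf = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical mini-corpus: "without" is common, "unspecified site" is rare.
corpus = [
    "Open wound of face, unspecified site, complicated",  # synonym of the "with complication" concept
    "Open wound of face without complication",
    "Open wound of hand without complication",
    "Open wound of foot without complication",
    "Fracture of rib without complication",
    "Burn of face without complication",
]
idf = build_idf(corpus)
query = vectorize(
    "Open wound of face, unspecified site, without mention of complication", idf
)

score_with = cosine(query, vectorize(corpus[0], idf))
score_without = cosine(query, vectorize(corpus[1], idf))
print(f"with complication:    {score_with:.2f}")
print(f"without complication: {score_without:.2f}")
```

In this toy corpus the rare words “unspecified” and “site” carry more weight than the frequent word “without”, so the semantically wrong entry ranks first, which is the same effect described above.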

I want to use Usagi to build a mapping from ICD10CM to SNOMED. The problem is that after downloading and unzipping the SNOMED vocabulary v5.0 from ATHENA and then building the index from it, Usagi shows “Building index. This will take a while… Sorting vocabulary files.” and then stops responding. I waited about a day.
Please help with this.

Are Usagi and the ATHENA files all on a local drive? (Running from a network drive would take forever)

The Vocabulary already has an excellent ICD10 to SNOMED map, so why not use that one?

@schuemie:

@Dymshyts is the source of the “excellent ICD10 to SNOMED” mapping. :slight_smile: He is now building the next one. However, ICD10 and ICD10CM have subtle differences, even when the code is the same. I know. It sucks. Welcome to our world.

Thanks for your answer. The problem was that I used an old version. I have downloaded the latest one, and it works properly.

I like how it works for mapping to SNOMED. Now I want to map to ICD10, but it doesn’t find anything: the only target concept value is ‘0’. @Christian_Reich suggested the problem could be that ICD10 concepts are not defined as Standard, so I set Standard_concept = S in the ICD10 concept file, but it still doesn’t work. Is the problem the empty concept_synonym table for ICD10, or is there another reason?

The reason is I hard-coded the list of allowed vocabularies. Please try this new version where the list of vocabularies is derived from the vocab files instead. You’ll need to rebuild the index.
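As a rough idea of what “derived from the vocab files” can look like, here is a minimal Python sketch (Usagi itself is Java, and this is not its code). It assumes a tab-delimited concept file with a `vocabulary_id` column, as in the ATHENA download; the function name and the sample rows are made up for illustration.

```python
import csv
import io

def distinct_vocabularies(concept_file):
    # Collect every vocabulary_id that actually occurs in the concept file,
    # instead of relying on a hard-coded list of allowed vocabularies.
    reader = csv.DictReader(concept_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return sorted({row["vocabulary_id"] for row in reader})

# Made-up rows standing in for a real CONCEPT.csv (assumed tab-delimited).
sample = io.StringIO(
    "concept_id\tconcept_name\tvocabulary_id\tstandard_concept\n"
    "1\tExample SNOMED concept\tSNOMED\tS\n"
    "2\tExample ICD10 concept\tICD10\t\n"
    "3\tAnother SNOMED concept\tSNOMED\tS\n"
)
vocabularies = distinct_vocabularies(sample)
print(vocabularies)  # ['ICD10', 'SNOMED']
```

With a real download you would pass an opened file in place of `sample`, using whatever encoding the files are written in.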
