Usagi

wstephens · December 3, 2014, 4:42pm

I loved the demo of Usagi at the NY F2F. Any update on a potential release date for the tool? My team is starting several ETL processes and I think this tool would be a valuable addition.

Thanks,
Bill

Frank · December 3, 2014, 10:00pm

I believe @ericaVoss would also be able to point you to all things Usagi.

schuemie · December 4, 2014, 2:44am

The Usagi release is currently waiting for two things:

Documentation, which @ericaVoss has graciously offered to help write
The release of Vocab V5

Let me check with @Christian_Reich and his team on a release date for Vocab V5.

Vojtech_Huser · December 4, 2014, 3:27pm

Can someone please briefly describe what Usagi is, and on which day (and in which session, so it can be found in the recording) it was demoed?

(for people who were not at the F2F meeting)

ericaVoss · December 4, 2014, 4:28pm

I am planning on finishing up the documentation next week and will post to WIKI when I do. But I will paste the introduction here:

1. Introduction
Usagi is a software tool created by the Observational Health Data Sciences and Informatics (OHDSI) team. It helps with the process of mapping codes from a source system into terminologies, preferably standard ones, stored in the Observational Medical Outcomes Partnership (OMOP) Vocabulary ([Data Standardization – OHDSI]). The word Usagi is Japanese for rabbit, and the tool was named after the first mapping exercise it was used for: mapping source codes used in a Japanese dataset into OMOP Vocabulary concepts.

Mapping source codes into the OMOP Vocabulary is valuable for two main reasons:

When converting a raw dataset into the OMOP Common Data Model (CDM) ([Data Standardization – OHDSI]), translating source-specific codes into standard concepts (e.g. RxNorm or SNOMED) puts the source data into a “common language” that other CDMs follow.

Having source codes tied to OMOP Vocabulary concepts allows a researcher to find relevant source codes through the classification terminologies in the OMOP Vocabulary (e.g. find all antipsychotic medications or find all condition codes related to heart failure).

1.1. Scope and Purpose

A file of source codes from a dataset that needs to be mapped is loaded into Usagi (if the codes are not in English, additional translation rows are needed). An adapted version of Apache Lucene ([http://lucene.apache.org/]) is used to connect source codes to OMOP Vocabulary concepts. However, these code connections need to be manually reviewed, and Usagi provides an interface to facilitate that.

Usagi does not currently translate non-English codes to English. We suggest using Google Translate ([https://translate.google.com/]).
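
The term-matching step described in section 1.1 can be illustrated with a small Python sketch. This is not Usagi's code and not Lucene's scoring formula; it is a minimal TF-IDF cosine ranking, assuming a hypothetical `rank_concepts` helper and a made-up list of concept names, meant only to show the general idea of ranking concept names by similarity to a source term before manual review.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_concepts(source_term, concept_names):
    """Return (score, concept_name) pairs, best match first.

    Hypothetical helper: smoothed TF-IDF weights, cosine similarity.
    """
    docs = [Counter(tokenize(name)) for name in concept_names]
    n_docs = len(docs)
    # Number of concept names in which each token occurs.
    doc_freq = Counter(token for doc in docs for token in doc)

    def weigh(counts):
        # Rare tokens get a higher weight than common ones.
        return {
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        }

    query = weigh(Counter(tokenize(source_term)))
    scored = [(cosine(query, weigh(doc)), name)
              for doc, name in zip(docs, concept_names)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up concept names for illustration only.
concepts = ["Inspiratory time", "Expiratory time", "Heart failure"]
for score, name in rank_concepts("INSPIRATORY TIME", concepts):
    print(f"{score:.2f}  {name}")
```

A ranking like this only proposes candidates, which is why the introduction stresses that every suggested mapping still needs manual review.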

wstephens · December 5, 2014, 4:19pm

@ericaVoss

My team is moving forward on CDM v5 implementations at 2-3 locations. I happily volunteer to beta test the documentation against some real-world ETL if you’re interested.

Bill

ericaVoss · December 6, 2014, 8:10pm

@wstephens Sounds like a plan! I should have the documentation done sometime next week.

But I do think @schuemie needs to get it updated with the latest Vocab5 from Christian / Lee. There is a version of Vocab5 released but maybe Martijn is waiting for the newest update?

schuemie · December 18, 2014, 6:32am

Bill, I just created a [new release of Usagi](https://github.com/OHDSI/Usagi/releases/tag/v0.2.0).

And Erica has posted the manual in our Wiki.

I would have made this version 1.0.0, but I only want to do that when the Vocab V5 is officially released. You’ll need the Vocab V5 CSV files to start using Usagi. Ping @Christian_Reich if you don’t already have them.

Let me know if you want to give it a try, and if you have any issues.

Cheers,
Martijn

wstephens · December 18, 2014, 1:07pm

Excellent! Pulling it now.

wstephens · December 23, 2014, 1:35pm

OK, I’m running through a mapping exercise using Usagi. Some initial thoughts…

Convenience:
It would be great to be able to select a group of unapproved matches in the Overview Table and approve all with a single “approve all” click. I had a bunch of Match Score = 1.0, but had to iterate through all.

Issue:
When attempting to conceptually map “INSPIRATORY TIME” from Cerner to CDM v5 using SNOMED, I expected to find SNOMED code 250819002 as a mappable option. This entry is in the concept.csv file that I loaded into the Lucene index (4353947,Inspiratory time,Observation,SNOMED,Observable Entity, S, 250819002,19700101,20991231,). However, I cannot seem to find a way to get this value as a mappable option through any combination of Search or Filters.
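
One way to rule out a problem with the input file itself is to check that the concept really is present in concept.csv before rebuilding the index. The sketch below is an assumption-laden illustration: the column order follows the row quoted above and the CDM v5 CONCEPT table, the file is assumed to be comma-separated, and `find_by_code` is a hypothetical helper, not part of Usagi.

```python
import csv
import io

# Assumed column order, based on the quoted row and the CDM v5 CONCEPT table.
CONCEPT_COLUMNS = [
    "concept_id", "concept_name", "domain_id", "vocabulary_id",
    "concept_class_id", "standard_concept", "concept_code",
    "valid_start_date", "valid_end_date", "invalid_reason",
]


def find_by_code(lines, vocabulary_id, concept_code):
    """Return all rows matching the given vocabulary and concept code."""
    matches = []
    for row in csv.reader(lines):
        if len(row) != len(CONCEPT_COLUMNS):
            continue  # skip malformed or unexpected rows
        # Strip stray whitespace such as " S" and " 250819002".
        record = {col: val.strip() for col, val in zip(CONCEPT_COLUMNS, row)}
        if (record["vocabulary_id"] == vocabulary_id
                and record["concept_code"] == concept_code):
            matches.append(record)
    return matches


# The row quoted in the post, used here in place of the real file.
sample = ("4353947,Inspiratory time,Observation,SNOMED,Observable Entity,"
          " S, 250819002,19700101,20991231,\n")
for record in find_by_code(io.StringIO(sample), "SNOMED", "250819002"):
    print(record["concept_id"], record["concept_name"])
```

Against the real file, `io.StringIO(sample)` would be replaced by an open file handle, with the encoding chosen to match the vocabulary files.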

schuemie · December 23, 2014, 2:25pm

The first is easy: once you’ve selected all the matches you want to approve, you can go to Edit --> Approve selected and all selected items will be approved. I guess we need to add that to the Wiki.

The issue is harder: I’m unable to reproduce this. If I type in the manual query ‘inspiratory time’, that SNOMED concept is the first that pops up. Have you unchecked all filters? Can you tell me which version of the CSV files you used? I’m on Vocabulary5.0-20141013.

wstephens · December 31, 2014, 1:57pm

I’m using the same Vocab version. I’m going to try to reload the index.

Query: when selecting the “Query” radio button / text box it would be nice for the selected source term to auto populate in the query text box rather than the previous query text remaining.
When selecting a match from the results pane and clicking the “Add Concept” button for an entry that mapped to concept “0”, it would be great for the “0” concept to be automatically removed from the Target Concepts list.
Multiple mapped concepts: it’s possible to map multiple concepts to a single source term in the Target Concepts window. Shouldn’t this be limited to a single concept mapping?

Bill

schuemie · January 5, 2015, 3:06am

I agree keeping the text from the previous search is probably not what you want, but I’m not sure if we should start with the source term. Let me think about it.
I recommend you use the ‘Replace concept’ button instead, which does do exactly what you want.
We did this deliberately, since sometimes there’s no avoiding mapping to multiple concepts. We see, for example, source codes like ‘Disease A and B’, whereas the target vocabulary only has concepts for A and B separately. That being said, this feature should be used only as an absolute last resort. I think I will add a warning popup if somebody approves a code that maps to more than one concept, and will make sure to mention this in the Wiki as well.

wstephens · January 5, 2015, 2:31pm

Another issue:

It appears that there may be a character set issue. The Results pane has trouble displaying some characters, but only in the “Concept name” column:

meniere’s disease: Vestibular active M�ni�re’s disease
SJOGRENS SYNDROME: Sj�gren’s syndrome

Bill

schuemie · January 6, 2015, 4:06am

The Vocab files are using ISO-8859-1, but I was expecting UTF-8. In the next release of Usagi I’ll use ISO-8859-1.

Thanks!
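
The symptom reported above can be reproduced with a few lines of Python. This is only an illustration of the encoding mismatch, not Usagi's file-reading code; `read_as` is a hypothetical helper, and the sample name is built in memory rather than read from the vocabulary files.

```python
def read_as(raw_bytes, encoding):
    """Decode bytes, substituting U+FFFD for anything that cannot be decoded."""
    return raw_bytes.decode(encoding, errors="replace")


# "Meniere's disease" with accented e characters (U+00E9 and U+00E8),
# stored as ISO-8859-1 bytes the way the vocabulary files are said to be.
raw = "M\u00e9ni\u00e8re's disease".encode("iso-8859-1")

# Wrong assumption: the accented bytes are not valid UTF-8 sequences,
# so each one becomes the replacement character.
print(read_as(raw, "utf-8"))

# Correct assumption: the accented characters survive.
print(read_as(raw, "iso-8859-1"))
```

The first line printed shows a replacement character in place of each accented letter, which matches the garbled concept names in the Results pane; the second shows the name intact.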

wstephens · January 6, 2015, 1:35pm

Does OHDSI have a character set standard? I also assumed UTF-8 as the default because we always use that.

Bill

schuemie · January 7, 2015, 12:57am

Opened a discussion on that topic here.

wstephens · January 9, 2015, 4:10pm

Martijn,

I’m working through a mapping from ICD9 to SNOMED conditions. I’m seeing quite a few incorrect matches between terms containing “with” and terms containing “without”. I’ve hit this about 50 times on this run through 1600 ICD9 terms.

input: “Open wound of face, unspecified site, without mention of complication”
match (0.82 score): “Open wound of face with complication”
should be (0.75 score): “Open wound of face without complication”

thoughts?

Bill
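
Until the scoring itself handles this, mismatches of this kind could be flagged for closer review with a simple check outside the tool. The sketch below is a crude heuristic under stated assumptions: `polarity` and `polarity_conflict` are hypothetical helpers, they look only at the literal words “with” and “without”, and they say nothing about how Usagi computes its match score.

```python
import re


def polarity(term):
    """Return 'without', 'with', or None depending on which word appears."""
    tokens = re.findall(r"[a-z]+", term.lower())
    if "without" in tokens:
        return "without"
    if "with" in tokens:
        return "with"
    return None


def polarity_conflict(source_term, candidate):
    """True when one term says 'with' and the other says 'without'."""
    source, target = polarity(source_term), polarity(candidate)
    return source is not None and target is not None and source != target


source = ("Open wound of face, unspecified site, "
          "without mention of complication")
for candidate in ("Open wound of face with complication",
                  "Open wound of face without complication"):
    flag = "REVIEW" if polarity_conflict(source, candidate) else "ok"
    print(f"{flag:6}  {candidate}")
```

A check like this would only flag suspicious suggestions; the reviewer still decides which concept is correct.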

Brandon_Ulrich · January 9, 2015, 4:25pm

Hi Bill,

I suspect you are already aware of this, but in case not: You can get complete ICD-9 to SNOMED CT mappings from:

IHTSDO (SNOMED CT’s curator): Via the distributed ICD-9-CM equivalence complex map reference set
US NLM: http://www.nlm.nih.gov/research/umls/mapping_projects/icd9cm_to_snomedct.html

Brandon

wstephens · January 9, 2015, 4:39pm

Brandon,

I have the NLM mapping pulled up while running through this process with Usagi.

I’m not familiar with the IHTSDO mapping and haven’t found it yet (that site loves their PDFs…)

Bill