Enhancing Usagi to Look at Two Columns When Providing Concept Matches for Question/Value Pairs

Usagi was designed for mapping formal terminologies, i.e. long lists of codes and descriptions such as ICD10. Your example contains variable/value pairs, which Usagi handles less well. You might need to map variables and values separately (see also my last note below).

There are currently two options:

  • As you say, the way to make Usagi use all the information is to concatenate everything into one field and use that as the Source Term. You can add the separate pieces as ‘additional information’ columns and use these to join on.
  • Make a separate file per codelist: one for the variables (your ‘Source code’) and one for each set of values (your ‘Source terms’), e.g. a codelist for each of ‘SMOKING_STATUS’, ‘ALLERGEN_DESC’, ‘SOCIAL. ADL’. Assign a unique source code to each of your terms to identify them. You can then use the import filters in Usagi more effectively to specify which class of targets to use.
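To make the two options concrete, here is a minimal Python sketch of how the input files could be prepared before importing them into Usagi. The column names (`variable`, `value`, `source_code`, `source_term`), the sample rows, and the code formats are all assumptions for illustration; they are not prescribed by Usagi.

```python
import csv
import io
from collections import defaultdict

# Hypothetical variable/value pairs; names and values are illustrative only.
rows = [
    {"variable": "SMOKING_STATUS", "value": "Former smoker"},
    {"variable": "SMOKING_STATUS", "value": "Never smoked"},
    {"variable": "ALLERGEN_DESC", "value": "Peanut"},
]


def concatenated_rows(rows):
    """Option 1: a single list, with variable and value joined into the source term.

    The separate pieces are kept as extra columns so they can be imported
    as 'additional information' and used to join on later.
    """
    out = []
    for i, r in enumerate(rows, start=1):
        out.append({
            "source_code": f"{i:04d}",  # assumed format for the unique id
            "source_term": f"{r['variable']}: {r['value']}",
            "variable": r["variable"],
            "value": r["value"],
        })
    return out


def rows_per_codelist(rows):
    """Option 2: one list per variable, each value with its own unique source code."""
    codelists = defaultdict(list)
    for r in rows:
        n = len(codelists[r["variable"]]) + 1
        codelists[r["variable"]].append({
            "source_code": f"{r['variable']}_{n}",  # assumed code scheme
            "source_term": r["value"],
        })
    return dict(codelists)


def to_csv(records):
    """Serialise a list of dicts to CSV text, ready to be written to a file."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


combined = concatenated_rows(rows)
codelists = rows_per_codelist(rows)
print(to_csv(combined))
for name, records in codelists.items():
    print(name)
    print(to_csv(records))
```

With the second option, each codelist can be imported on its own, so the import filters can be set differently per file.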

Two general notes:

  • Usagi only looks at the Source Term when matching target concepts. The Source Code is meant to be the unique identifier of the source term. In many cases this is a meaningless alphanumeric id (e.g. I10 or 32547).
  • The source_to_concept_map table is designed to use only the unique source_code (and source_vocabulary_id) as the foreign key to join on. We have a separate discussion on cases where you have variable/value pairs to map: Wide MAPPING table (in vocabulary) (problems with relationship)
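The consequence of the second note can be sketched in a few lines of Python: because the lookup key is only (source_code, source_vocabulary_id), a variable/value pair has to be folded into a single source_code before it can be mapped. The rows, the vocabulary id `MY_SMOKING`, and the target_concept_id values below are placeholders, not real OMOP concepts; returning 0 for an unmapped code follows the usual OMOP convention but is treated here as an assumption.

```python
# Toy stand-in for the source_to_concept_map table.
# All values are hypothetical; target_concept_id values are placeholders.
source_to_concept_map = [
    {"source_code": "SMOKING_STATUS_1", "source_vocabulary_id": "MY_SMOKING",
     "target_concept_id": 1001},
    {"source_code": "SMOKING_STATUS_2", "source_vocabulary_id": "MY_SMOKING",
     "target_concept_id": 1002},
]


def build_lookup(stcm_rows):
    """Index the mapping rows by the only key the table joins on."""
    lookup = {}
    for row in stcm_rows:
        key = (row["source_code"], row["source_vocabulary_id"])
        lookup.setdefault(key, []).append(row["target_concept_id"])
    return lookup


def map_source(lookup, source_code, source_vocabulary_id):
    """Return the target concept ids for a source code, or [0] if unmapped."""
    return lookup.get((source_code, source_vocabulary_id), [0])


lookup = build_lookup(source_to_concept_map)
print(map_source(lookup, "SMOKING_STATUS_1", "MY_SMOKING"))
print(map_source(lookup, "SMOKING_STATUS_9", "MY_SMOKING"))
```

There is no slot in this key for a separate value column, which is why each variable/value combination needs its own unique source code.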

I am interested in what people think of the use case proposed here and how Usagi should handle it. @clairblacketer @Christian_Reich