Let me add some thoughts here:
- A wide mapping table should serve both OMOP vocabularies and custom project-related mappings.
That’s why there are 2 options for linkage:
- In addition to source_concept_id, add the source_vocabulary_id / source_code combination. However, that combination is not unique for some vocabularies, and supporting 2 approaches at once doesn’t seem consistent.
- Handle custom mappings using 2B+ source concepts and drop the source_to_concept_map table, which creates some difficulties in implementation.
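
To make the trade-off between the two options concrete, here is a minimal sketch of how one might check whether a source_vocabulary_id / source_code pair is safe to use as a link key. The rows, vocabulary names, and helper names are made up for illustration. The only convention assumed is that custom concepts get a concept_id above 2 billion.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, vocabulary_id, concept_code).
# All values are invented for illustration only.
concept_rows = [
    (1001, "VOCAB_A", "X1"),
    (1002, "VOCAB_A", "X1"),  # same vocabulary/code pair, different concept
    (1003, "VOCAB_B", "X1"),
    (2_000_000_001, "PROJECT_LOCAL", "lab_17"),  # custom 2B+ source concept
]

CUSTOM_CONCEPT_FLOOR = 2_000_000_000  # assumed boundary for custom concepts


def ambiguous_keys(rows):
    """Return vocabulary/code pairs that resolve to more than one concept_id."""
    by_key = defaultdict(set)
    for concept_id, vocabulary_id, concept_code in rows:
        by_key[(vocabulary_id, concept_code)].add(concept_id)
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}


def is_custom(concept_id):
    """True when the concept_id falls in the custom (2B+) range."""
    return concept_id > CUSTOM_CONCEPT_FLOOR


print(ambiguous_keys(concept_rows))
print([cid for cid, _, _ in concept_rows if is_custom(cid)])
```

Any pair reported by `ambiguous_keys` cannot link a mapping row on its own, which is the weakness of the first option. The second option avoids that by always linking through a single concept_id.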
A text string, which is a kind of source_code_description and sometimes lands in the value_as_string field, probably needs to be added. But wouldn’t it be better to have the source_code_description itself? It seems not, since it duplicates the concept_name from the concept table.
But once we introduce the source_string field, custom mappings are no longer processed through the 2B+ concepts. This conflicts with item 1.
Unit of measure. It may be reflected in the source in different ways:
- As part of the question or answer. This works well since we have the target_unit field.
- As a separate entity coming from another field. Isn’t the point of the wide mapping table to provide ETL with a comprehensive way of mapping (without using any additional custom vocabularies and logic, i.e. for unit)? But if we add the source_unit field, it leads to a combinatorial explosion for most real-world data sources, even though it might be useful (affecting the target concept) for clean vocabularies/sources.
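
A rough sketch of the combinatorial explosion mentioned above, assuming the worst case where any unit spelling may appear next to any source code. The codes and unit strings are invented examples, not taken from a real source.

```python
from itertools import product

# Hypothetical source codes and the unit spellings seen in a separate field.
source_codes = ["glucose", "weight", "height"]
source_units = ["mg/dL", "mmol/L", "kg", "lb", "cm", "in"]

# Without source_unit: one mapping row per source code.
rows_without_unit = len(source_codes)

# With source_unit as part of the key: one row per (code, unit) pair
# in the worst case, so the row count grows multiplicatively.
rows_with_unit = list(product(source_codes, source_units))

print(rows_without_unit)    # 3
print(len(rows_with_unit))  # 18
```

With thousands of codes and dozens of free-text unit spellings, the mapping table grows by the same factor, even if only a few of those pairs actually change the target concept.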