As a health plan (payer), I have a similar set of questions.
Are there documented best practices or an FAQ for mapping US claims data to OMOP? Several important business decisions need to be made about the mappings, and they may affect the reproducibility of analyses if different claims data owners/submitters choose different mapping strategies.
I reviewed the Book of OHDSI and the Wiki (https://www.ohdsi.org/web/wiki/doku.php?id=documentation:vocabulary:data_etl), but I don’t feel that I’ve seen a definitive answer on best practices for claims data. If such documentation already exists, please point me to it.
For example:
(1) When coded claims data (e.g. ICD-9, ICD-10, CPT4, HCPCS, NDC, GPI) do not map through concept_relationship to a standard concept, should the _concept_id field be set to 0, to an ancestor code, or to the source_concept_id (which then makes queries more complicated)?
(2) When a source concept has "Maps to" relationships to multiple SNOMED codes, should/must all of them be stored in the relevant tables, or can organizations pick a single SNOMED code to use?
(3) Are there quality assurance techniques to ensure that ICD, CPT, and HCPCS mappings to standard codes land in the correct tables (e.g. Observation vs. Measurement vs. Drug Exposure)?
(4) Are there plans to enhance the Data Quality Dashboard to assess the appropriateness of the mapping from source codes to standard concepts and target tables? Or are there tools planned to automate that process, to avoid the risk of inconsistent ETL strategies?
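
To make questions (1) to (3) concrete, here is a minimal sketch of the three situations I am asking about: a source concept with no "Maps to" target, one with a single target, and one with several targets that fall in different domains. It is only an illustration, not a proposed ETL. It assumes a toy in-memory SQLite copy of the concept and concept_relationship tables with a handful of their real columns; all concept IDs and rows are made up, and the function names are hypothetical.

```python
import sqlite3

# Toy vocabulary tables; IDs and rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    vocabulary_id TEXT,
    domain_id TEXT,
    standard_concept TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT,
    invalid_reason TEXT
);
INSERT INTO concept VALUES
    (1001, 'ICD10CM', 'Condition',   NULL),
    (1002, 'ICD10CM', 'Condition',   NULL),
    (1003, 'HCPCS',   'Procedure',   NULL),
    (2001, 'SNOMED',  'Condition',   'S'),
    (2002, 'SNOMED',  'Observation', 'S');
INSERT INTO concept_relationship VALUES
    (1001, 2001, 'Maps to', NULL),
    (1002, 2001, 'Maps to', NULL),
    (1002, 2002, 'Maps to', NULL);
""")


def standard_targets(conn):
    """Return {source_concept_id: [(standard_concept_id, domain_id), ...]}."""
    rows = conn.execute("""
        SELECT s.concept_id, t.concept_id, t.domain_id
        FROM concept s
        LEFT JOIN concept_relationship cr
               ON cr.concept_id_1 = s.concept_id
              AND cr.relationship_id = 'Maps to'
              AND cr.invalid_reason IS NULL
        LEFT JOIN concept t
               ON t.concept_id = cr.concept_id_2
              AND t.standard_concept = 'S'
        WHERE s.standard_concept IS NULL
        ORDER BY s.concept_id, t.concept_id
    """).fetchall()
    targets = {}
    for source_id, target_id, domain_id in rows:
        targets.setdefault(source_id, [])
        if target_id is not None:  # NULL means no valid standard target
            targets[source_id].append((target_id, domain_id))
    return targets


def classify(targets):
    """Label each source concept as 'unmapped', 'single', or 'multiple'."""
    labels = {}
    for source_id, mapped in targets.items():
        if not mapped:
            labels[source_id] = "unmapped"   # the case in question (1)
        elif len(mapped) == 1:
            labels[source_id] = "single"
        else:
            labels[source_id] = "multiple"   # the case in question (2)
    return labels


targets = standard_targets(conn)
labels = classify(targets)
```

In this toy data, source concept 1002 maps to targets in both the Condition and Observation domains, which is the situation behind question (3): the domain_id of each target, not of the source code, is what I would expect to drive the choice of table.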