We are in the procedure of finalizing the ETL for a data partner, yet I have some questions regarding the observation period calculation, as well as the Sql code for mapping ICD10 and ATC codes to Standard codes.
As regards the observation period, in the source database we have information about the date of visit on which each patient conducts a bunch of different medical examinations. Would it be wrong if we set as observation_period_id the code of the medical examination and then for each patient we take as start date the minimum date of visit on which the patient conducted the specific examination, and as an end date the maximum date of visit on which the patient conducted the specific examination?
As regards the sql code used for code mapping (e.g., ICD10 codes → sqlScripts.knit), I got a little bit confused with the tables that are being used. In particular, we have the CONCEPT table (which is named "c" at the beginning of the code). Then we have again the c1 and c2 tables. Are these tables the CONCEPT table again, or am I missing something?
As for the source_to_concept_map, should this be populated with the data from the original database containing the ICD10 codes. Right?
C, C1 and C2 all reference the Concept table. The aliases are needed because the concept relationship table has a reference to the source concept id and target concept id. This requires looking at two different rows in the concept table within the same query. This is accomplished by using an alias for the Concept table (C, C1, C2). The SQL statement treats each alias as if it refers to a separate copy of the Concept table.
Note in the query you reference, it should be “Maps to” with a capital “M”
The source_to_concept_table_map is an initially an empty table and will only have rows if you used it to map some source specific values into OMOP standard concepts. So unless you created custom mappings the source_to_concept_map table will be empty and the second part of the UNION in the SQL statement will be empty.
The observation_period_id is the primary key of the table, this is typically just incremented.
That could be a valid approach. Although, as Don referenced, there has been a lot of discussion around this. Especially considering a ‘washout’ period after the last visit in which you would expect new medical events to be reported (important for e.g. incidence rates). See the EHR obs period convention document here: Observation Period Considerations for EHR Data