Current Problem:
Observation periods are defined differently across data sets because each data set contains data collected over different periods of time. It is unknown whether future analytic use cases will require this “extra data”, meaning data that falls outside the period currently used as the observation period. Because OHDSI’s rule is to analyze only data within an observation period, this “extra data” affects how the observation period start and end dates are set. If users rely on the observation period to determine when a patient is observable, how can we include or exclude this extra data as needed?
Use Cases to Consider:
- CPRD, using the Up To Standard (UTS) variable. Under current ETL standards, this field determines the start of a patient’s observation period. However, data is collected before the UTS date, and analysts will want to review that data for evidence of baseline conditions. Current ETL standards suggest that data prior to UTS will never be considered for analysis and should be omitted from the CDM, which prevents analysts from ever reviewing it.
- U.S. claims data with different insurance enrollment periods, most notably medical and prescription drug coverage. Depending on the analysis question, claims data may need to be considered when a patient has medical coverage, prescription coverage, or both. For example, if the analysis requires only medical coverage (because all of the clinical events studied are covered under medical coverage), then the claims that exist outside of pharmacy coverage should be included. ETL guidelines currently change based on the claims dataset. If the dataset includes both medical and pharmacy data, the observation period reflects when both coverages are active. If the dataset specifically mentions medical coverage only, the observation period includes only medical coverage. Under this loose ETL guideline, the type of observation period created limits the types of analyses that analysts can perform on a CDM.
- Some European databases (Peter – please reference your dataset) will include data outside of the “data cut”. For example, if the dataset provides patients seen between 2011 and 2016, it is possible to get extra information for those patients from before 2011, described as “patient history”. Since this data is outside of the “data cut” (2011-2016) from which the observation period is created, where should the “patient history” data go, given that it is not technically within the observation period? This patient history data is useful for some analyses that look for evidence of certain conditions.
Proposal
Data ETL’d into OMOP CDM can have multiple overlapping observation periods, differentiated by a standard concept ID indicated in the observation_period.period_type_concept_id. Period_type_concept_ids will include the following:
- Medical coverage
- Prescription coverage
- Medical and prescription coverage
- Pre-qualified coverage
- Qualified coverage
- In practice network
- Includes out of practice data
Counting the items above in the order listed: items 1, 2, and 3 are specific to US claims data. Items 4 and 5 are specific to European EHR data. Items 6 and 7 are specific to NHS (UK) data, specifically CPRD and HES data.
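As a rough illustration of how an analyst might pick one period type, the sketch below filters observation periods by period_type_concept_id. The proposal does not assign concept IDs, so the negative numbers in PERIOD_TYPES are hypothetical placeholders, and the ObservationPeriod class and helper function names are assumptions made for this example only.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical placeholder IDs: the proposal does not define real concept IDs.
PERIOD_TYPES = {
    "medical": -1,
    "prescription": -2,
    "medical_and_prescription": -3,
    "pre_qualified": -4,
    "qualified": -5,
    "in_practice_network": -6,
    "includes_out_of_practice": -7,
}


@dataclass(frozen=True)
class ObservationPeriod:
    """Simplified stand-in for one row of the observation_period table."""
    person_id: int
    start_date: date
    end_date: date
    period_type_concept_id: int


def periods_of_type(periods, period_type):
    """Keep only the periods whose type matches the analyst's choice."""
    wanted = PERIOD_TYPES[period_type]
    return [p for p in periods if p.period_type_concept_id == wanted]


def is_observable(periods, period_type, person_id, on_date):
    """True if the person is inside a period of the chosen type on on_date."""
    return any(
        p.person_id == person_id and p.start_date <= on_date <= p.end_date
        for p in periods_of_type(periods, period_type)
    )
```

With overlapping periods stored side by side, the same person can be observable on a given date under one period type and not under another, which is the behaviour the proposal relies on.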
Proposal Applied to Use Cases
- Two overlapping observation periods will be created for CPRD data: one large observation period that includes all data, and one observation period that starts at the UTS date. Users will select the large observation period to include all data in the analysis. Users will select the UTS observation period, typed as “qualified coverage”, when considering only data after a practice’s UTS date.
- Three overlapping observation periods will be created for US claims databases (as applicable): one for medical coverage, one for prescription coverage, and one where medical and prescription coverage overlap. (There is a possibility of creating an overarching observation period where the patient can have medical and/or prescription coverage. Is this applicable for anyone?)
- Similar to #1 for CPRD, two overlapping observation periods will be created for the European databases: one that includes pre-qualified data (the “patient history”) and another that includes only “qualified” data (data within the visit date range 2011-2016 in the example above).
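The US claims case above can be sketched as follows: given a person’s medical and prescription enrollment spans, emit one period per coverage type plus one for each stretch where the two overlap. This is a minimal sketch, not a reference ETL. The function names, the string labels used in place of concept IDs, and the (start, end) tuple layout are all assumptions for illustration.

```python
from datetime import date


def overlap(a, b):
    """Return the intersection of two inclusive (start, end) spans, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


def claims_observation_periods(medical, prescription):
    """Build (period_type, start, end) rows for one person.

    medical and prescription are lists of inclusive (start, end) enrollment
    spans. The period_type strings are hypothetical labels standing in for
    the period_type_concept_id values the proposal would define.
    """
    rows = [("medical", start, end) for start, end in medical]
    rows += [("prescription", start, end) for start, end in prescription]
    for m in medical:
        for p in prescription:
            both = overlap(m, p)
            if both is not None:
                rows.append(("medical_and_prescription", both[0], both[1]))
    return rows


# Hypothetical enrollment spans, invented for this example.
rows = claims_observation_periods(
    medical=[(date(2010, 1, 1), date(2012, 12, 31))],
    prescription=[(date(2011, 6, 1), date(2013, 6, 30))],
)
```

Here the combined period runs from 2011-06-01 to 2012-12-31, the stretch where both coverages are active. The same pattern would apply to the CPRD and European cases, with the “qualified” period being the part of the full period on or after the UTS date or inside the data cut.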
Example ETL of US Claims data:
Raw Claims Data for Patient 123
Observation Period Table
Considerations:
OHDSI tools are built to consider only one observation period per database. The tools will need to be amended to allow users to select an observation period type of their choice. In the meantime, either:
- Have OHDSI tools use the min/max of all observation periods for analyses
- Implement a standard that, if there are multiple observation period types, the ETL’er must also create one large overarching observation period that OHDSI tools will work against.
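Both interim options reduce to the same calculation: one span per person running from the earliest start to the latest end across all of that person’s observation periods. A minimal sketch, assuming periods arrive as (person_id, start, end, period_type) tuples (a layout chosen for this example, not taken from any OHDSI tool):

```python
from datetime import date


def spanning_periods(periods):
    """Map each person_id to (earliest start, latest end) over all periods."""
    spans = {}
    for person_id, start, end, _period_type in periods:
        lo, hi = spans.get(person_id, (start, end))
        spans[person_id] = (min(lo, start), max(hi, end))
    return spans


# Hypothetical rows, invented for this example.
example = [
    (1, date(2010, 1, 1), date(2012, 12, 31), "medical"),
    (1, date(2011, 6, 1), date(2013, 6, 30), "prescription"),
    (2, date(2015, 3, 1), date(2016, 2, 29), "medical"),
]
spans = spanning_periods(example)
```

One caveat with this approach: if a person’s periods are disjoint, the min/max span also covers the gap between them, so the person would be treated as observable during time with no coverage of any type.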