
EHR Observation Period logic - please take a look

The EHR WG convened on July 24, August 7, and August 21 to discuss the creation of an Observation Period from EHR data. The current and future conventions are not prescriptive enough and leave room for various ways of interpretation. The goals of our discussions were to increase the standardization for the implementation of the Observation Period table by providing some general guidelines for determining the start, end, and gaps in Observation Periods. The suggestions we came up with are only “suggestions” at this point. If the community and CDM & Vocabulary WG agree with these suggestions, we suggest they be added to the Observation Period conventions, Themis rules, and DQD checks.

All of these decisions should be tempered by local understanding of patients in the EHR you are ETLing.

*Note - These suggestions are not intended for HMO EHR sites since HMO EHR Observation Periods more closely resemble claims data Observation Periods.

  1. Create general guidelines for the implementation of the Observation Period.start_date, Observation Period.end_date, and Observation Period Gaps.

Observation Period start date

  • Generally, an Observation Period does NOT begin before birth.
  • However, it might begin before birth IF the pregnant mother receives care recorded in your EHR. The child’s record is split from the mother’s record at birth. The recommendation would be birth_date minus 9 months for the Observation Period.start_date if these Persons are in your dataset.
  • Generally, an Observation Period does NOT begin before the implementation of the EHR at your site. Any records prior to implementation are probably “history of” record types and not a complete EHR record of clinical events.
  • Special consideration should be given to migration from previous EHR, implementation at different sites within your healthcare system, implementation of different modules, etc.

Observation Period end date

  • The first date from the following:
    • Date of death + 60 days. This is a Themis convention to allow events after death. This date should not exceed the date of the data pull.
    • Last clinical event + 60 days. The assumption is that a person will return to the same health provider if an adverse reaction, complication, or unresolved condition occurs. This date should not exceed the date of the data pull.
    • Date of the data pull

Observation Period Gaps and Persistence Windows (borrowing language from the Drug Era conventions)

Observation Period Gaps are periods of time when a Person won’t be receiving care from your institution and therefore the Person is not being observed and should not have an Observation Period. These gaps are usually hard to determine because most Persons don’t announce their departure from an EHR/healthcare institution. Therefore, a heuristic will need to be instituted to determine Observation Period Gaps where the information is not explicit.

Observation Period Persistence Windows are the maximum time allowed between two clinical events under the assumption a Person would have a clinical event recorded, if they are not healthy and seek care.

Example: Person 1 has a series of clinical events recorded from Jan. 1, 2010 to June 15, 2012. The time between clinical events is not greater than 2 months. The next clinical event for Person 1 after June 15, 2012 is on Oct. 1, 2018. Starting Oct. 1, 2018 Person 1 has clinical events occurring at least every 3 months up to the present date.

There is a 6+ year gap between groups of clinical events recorded in the CDM. After discussion in the EHR WG, we believe this 6+ year gap is indicative of a Person not being seen within our EHR/healthcare institution. Per convention #4 for Observation Period table, “As a general assumption, during an Observation Period any clinical event that happens to the patient is expected to be recorded. Conversely, the absence of data indicates that no clinical events occurred to the patient.”

Person 1 has two Observation Periods.

1st Observation Period.start_date = 01/01/2010 and Observation Period.end_date = 08/15/2012 (Per the end_date guideline above)

2nd Observation Period.start_date = 10/01/2018 and Observation Period.end_date = 09/01/2020 (Date of the data pull, per the end_date guideline above)

Now, there are cases where a Person only receives care within your EHR system when absolutely necessary. If your EHR doesn’t offer primary care services, if the majority of Persons lack healthcare insurance, or if there is any other reason why Persons are only seen in urgent or emergent situations, the above heuristic might be too restrictive. This is a guideline.

A question the WG debated was how long the time between clinical events can be while we still assume that any clinical event that happens to the Person would be recorded. When should we end one Observation Period and begin another? What should be the time between events for an Observation Period Persistence Window? Wellness checkups/Visits happen approximately every 12-18 months, depending on a multitude of factors. If Observation Period Gaps are 548 days (about 18 months) or more, then the previous Observation Period should end and another Observation Period should begin on the date of the next clinical event, as per the Person 1 example above.
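A minimal Python sketch of this heuristic, applied to the Person 1 example. It assumes the gap is measured between consecutive clinical event dates, that each period ends 60 days after its last event, and that no end date passes the data pull. The function name, the `every` helper, and the in-between event dates are illustrative and not part of the proposal.

```python
from datetime import date, timedelta

PERSISTENCE_DAYS = 548  # gap threshold from the post (roughly 18 months)
END_PAD_DAYS = 60       # days added after the last clinical event


def build_observation_periods(event_dates, data_pull_date):
    """Split one Person's clinical event dates into (start, end) periods."""
    # Ignore anything dated after the data pull; drop duplicates and sort.
    dates = sorted({d for d in event_dates if d <= data_pull_date})
    if not dates:
        return []

    runs = []
    start = prev = dates[0]
    for current in dates[1:]:
        if (current - prev).days >= PERSISTENCE_DAYS:
            # Gap is too long: close the running period, open a new one.
            runs.append((start, prev))
            start = current
        prev = current
    runs.append((start, prev))

    # Pad each end by 60 days, but never past the data pull.
    pad = timedelta(days=END_PAD_DAYS)
    return [(s, min(e + pad, data_pull_date)) for s, e in runs]


def every(start, stop, step_days):
    """Illustrative helper: dates from start to stop at a fixed spacing."""
    out, d = [], start
    while d <= stop:
        out.append(d)
        d += timedelta(days=step_days)
    return out


# Person 1: frequent events until June 15, 2012, then again from Oct. 1, 2018.
person_1 = (
    every(date(2010, 1, 1), date(2012, 6, 15), 30)
    + [date(2012, 6, 15)]
    + every(date(2018, 10, 1), date(2020, 8, 1), 90)
    + [date(2020, 8, 20)]
)
periods = build_observation_periods(person_1, date(2020, 9, 1))
for start, end in periods:
    print(start, end)
```

Under these assumptions the sketch yields two periods. The first ends on 2012-08-14, which is exactly 60 days after June 15, 2012; the 08/15/2012 in the example reads as two calendar months. The second is capped at the 2020-09-01 data pull.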

Observation Period.period_type_concept_id : The period_type_concept_id offers a means to give more information about how the observation period is determined. The current options include ‘standard algorithm’, ‘standard algorithm from claims’ and ‘standard algorithm from EHR’. We should better define these types.

  2. Cautions/things to consider when implementing the Observation Period
  • Implementation of your EHR - migration from previous EHR, implementation at different sites within your healthcare system, implementation of different modules
  • Coverage of your healthcare system within your area, population served by the healthcare system
  • All decisions should be tempered by local understanding of patients in the EHR you are ETLing.
  3. Metadata: what should be recorded and how should it be recorded? What is important for researchers to know to ensure the data is sufficient for the question?
  • We didn’t cover this in the WG, but I think the nuances of an EHR-based CDM are important to record in a standard manner.
  4. Unanswered questions:
  • How should records prior to implementation of the current ETL be recorded (this is making the assumption that these prior records are not complete, but still may have accurate dates and observations)? Options are to store in event tables but not include within observation periods, or to put in Observation table as ‘history of’.

@Christian_Reich and @clairblacketer,

Please take a look at the above.

Could you clarify how observation period types might impact having multiple observation periods covering the same time?

For example, using observation period types, is it possible to have a general observation period from 1/1/2010 to 1/1/2015 and an observation period of a different type from 6/1/2011 to 6/1/2012?

I would caution against this situation, if it is allowed, since it would have to be handled very carefully when determining time at risk for a given analysis. Although I think observation period types exist in the current CDM definition, I’m not aware of any analysis that checks that these observation periods do not overlap.

@Chris_Knoll, we have no intention of changing the convention that ‘Each Person can have more than one valid OBSERVATION_PERIOD record, but no two observation periods can overlap in time for a given person.’ In your example above the two periods overlap. Does that alleviate your concern?

As @DTorok said, per the conventions, overlapping observation periods are not allowed.

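The EHR WG would like further clarification from @Alexdavv and other Vocabulary team members on the differences between the period types ‘standard algorithm’, ‘standard algorithm from claims’, and ‘standard algorithm from EHR’.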

Is standard algorithm the parent of the other 2? Colorado has EHR data and death registry data in one CDM. Should we use ‘standard algorithm’ since it isn’t pure EHR data?

Hello, @MPhilofsky and @DTorok:

Yes, if no observation periods can overlap (regardless of observation period type), then that alleviates my concern. Thank you.



I would like to add the following to my EHR Observation Period logic convention proposal:

The Observation Period can be created from only one clinical event. However, that clinical event must NOT be from the Death table. If there are no other clinical records within the 18 months before or the 18 months after the death date, then an Observation Period will not be created for that death.

I believe this logic is needed because if a Person only has a death_date without other clinical event records, the Person was most likely not being “observed” when the death occurred. If a Person was being observed at their time of death, then other records (visit, condition, measurement, etc.) would have been created. This rule is most relevant for those with death registry data, since a Person who dies in the hospital has many clinical event records.
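A rough Python sketch of this proposed check. It assumes “18 months” is approximated as 548 days (the same figure used for the gap threshold above) and that a record exactly on the boundary counts as inside the window. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta

WINDOW_DAYS = 548  # assumption: "18 months" approximated as 548 days


def death_supports_observation(death_date, other_event_dates,
                               window_days=WINDOW_DAYS):
    """True if any non-death clinical event lies within the window
    before or after the death date."""
    window = timedelta(days=window_days)
    return any(abs(d - death_date) <= window for d in other_event_dates)


death = date(2020, 3, 1)

# Registry death with the last clinical record years earlier: not observed.
print(death_supports_observation(death, [date(2012, 6, 15)]))

# A visit a few months before death: the death falls within observed time.
print(death_supports_observation(death, [date(2019, 11, 10)]))
```

The first call returns False and the second True; a Person with a death date and no other records at all also returns False, so no Observation Period would be created.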

Is there value in a taxonomy of justifications for confidence in data completeness within observation periods?
That confidence is a key part of the definition of an observation period, no?
My prior 2 cents on that are here: [Observation_Period table - how do you generate this table at your site?]

I have 2 questions:

  1. When defining Observation Period end date, does the “first date” from the Date of death/Last clinical event/Date of data pull mean the earliest available date from these dates?
  2. Where can I get additional information on considerations when defining “Last clinical event”?

Hello @Rozeta and welcome to OHDSI!

This is the order of operations. If you have a date of death, then use the date of death + 60 days. If there isn’t a death date, then use last clinical event + 60 days. If there isn’t either of those, then use the date of the data pull.
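The order of operations can be sketched in Python as follows. This is only an illustration of the rule as described here, with the cap at the data pull taken from the original guideline; the function name and arguments are hypothetical, and `None` stands for a missing date.

```python
from datetime import date, timedelta
from typing import Optional

PAD = timedelta(days=60)


def end_by_priority(death_date: Optional[date],
                    last_event_date: Optional[date],
                    data_pull_date: date) -> date:
    """Pick the end date in priority order: death, last event, data pull."""
    if death_date is not None:
        candidate = death_date + PAD
    elif last_event_date is not None:
        candidate = last_event_date + PAD
    else:
        candidate = data_pull_date
    # Neither padded date may exceed the date of the data pull.
    return min(candidate, data_pull_date)


pull = date(2020, 9, 1)
print(end_by_priority(date(2020, 1, 10), date(2020, 1, 9), pull))  # death + 60 days
print(end_by_priority(None, date(2020, 8, 20), pull))              # capped at the pull
print(end_by_priority(None, None, pull))                           # the pull itself
```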

Not sure I understand your question. For the last clinical event date, take all the dates and choose the latest date from the clinical event tables, i.e. Visit, Condition, Measurement, etc.


If I have a clinical event date that is a drug_exposure_end_date, and it is 6 months in the future because of refills (say April 17, 2024, for argument’s sake), should I use that date + 60 days as the Observation Period end date?

Excellent question, @QI_omop! I don’t think we have this written down in our conventions. I’ll have to check.

You never use future dates in the OMOP CDM. With very few exceptions (e.g., non-smoker), we don’t store clinical event data which did not happen. And since the future hasn’t happened, you shouldn’t have any dates later than the date of the ETL to OMOP. I know it is very common in EHR source data to have a drug end date in the future, but when you build your CDM you need to change the future drug end dates to the date the data were last refreshed, or to the ETL date if your data are always current.

@MPhilofsky: So would you say not to use the drug_exposure_end_date or device_exposure_end_date as valid clinical event end dates?

@PriyaDesai my interpretation of what @MPhilofsky said is that the drug_exposure_end_date should be truncated at the data pull if it extends into the future in the drug_exposure table. If we impose restrictions based on the data pull date (and potentially also the death date, since I know we see cases where clinical events occur long after death) when we bring data into the main clinical tables, then when we later build the observation period table, we should see reasonable observation period end dates based on the logic above, even if we use drug_exposure_end_date or device_exposure_end_date.

Hi @MPhilofsky, even assuming that we have appropriately cleaned up our dates in the clinical tables to not exceed the data pull date, it seems to me like taking the earliest (minimum) date across the three dates of 1) death date + 60 days, 2) latest clinical event + 60 days, and 3) data pull date would be a more accurate representation of observable time periods rather than using the first available date in that order of priority.

The first two criteria as listed in this original forum post specify that the date calculated from 1) and 2) should not exceed the data pull date, although I noticed this detail is not included in the final document here: Observation Period Considerations for EHR Data. I am assuming that we do actually want to cut off at the data pull date if either of the first two calculations exceeds it, since we could definitely run into problems of claiming that someone is observable for 60 days past when we have any data available just because they had a visit or something on the day prior to the data pull. Implementing this as a minimum/earliest date across these three would ensure that the final observation_period_end_date would NEVER exceed the data pull date.

Also, with the ‘order of operations’ implementation, there could be a situation where the person’s last clinical event is many years before their death date (you mention in your comment above that this situation can arise as a result of death data registry import after the patient has left the health system), and in that case we would still use the death date + 60 days to calculate the observation period end date, which would then be many years after the last observed clinical event. If we instead said that we should use the minimum date across these three values, then the observation period end date would end up being the last clinical event + 60 days, which seems more reasonable in this case. I recognize that the persistence & gaps part of the recommendations may address this, but I don’t see any reason why it should not be handled in the basic calculation of observation_period_end_date as well, for sites that have not implemented gaps/persistence.
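A small Python sketch of the minimum-date alternative, using made-up dates for the registry-death scenario described here (last clinical event in 2012, death in 2020). The function name and the dates are illustrative only.

```python
from datetime import date, timedelta

PAD = timedelta(days=60)


def end_by_minimum(death_date, last_event_date, data_pull_date):
    """Earliest of death + 60 days, last event + 60 days, and the data pull.
    Missing dates (None) are simply left out of the comparison."""
    candidates = [data_pull_date]
    if death_date is not None:
        candidates.append(death_date + PAD)
    if last_event_date is not None:
        candidates.append(last_event_date + PAD)
    return min(candidates)


last_event = date(2012, 6, 15)
death = date(2020, 3, 1)       # e.g. imported from a death registry
pull = date(2020, 9, 1)

minimum_end = end_by_minimum(death, last_event, pull)
priority_end = min(death + PAD, pull)  # what death-first priority would give

print(minimum_end)   # last event + 60 days
print(priority_end)  # death + 60 days, almost eight years later
```

Because the data pull date is always one of the candidates, the result can never exceed it, which is the property argued for above.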

I should be more precise. Let me try to clarify.

You have to clean all your data before you ETL it into the OMOP CDM. This includes eliminating all clinical events after date of death + 60 days and eliminating or truncating any data with dates in the future. Tomorrow isn’t promised, so a person might have a drug ordered to start next week, but we don’t bring these data into the CDM. A person might currently be on a drug, with a drug end date of next week, but the data pull is today, so we can assume they took their drug today, so the drug end date will be truncated to today.
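As a sketch of this cleaning step in Python, assuming each record carries a start and end date: records that start in the future or more than 60 days after death are dropped, and end dates are truncated at the data pull. The function name and the example dates are hypothetical, and a real ETL would apply the same idea per table.

```python
from datetime import date, timedelta

DEATH_PAD = timedelta(days=60)


def clean_record(start, end, data_pull_date, death_date=None):
    """Return a cleaned (start, end) pair, or None if the record is dropped."""
    if start > data_pull_date:
        return None  # ordered for the future; it has not happened yet
    if death_date is not None and start > death_date + DEATH_PAD:
        return None  # more than 60 days after death
    # Truncate an end date that runs past the data pull.
    return (start, min(end, data_pull_date))


pull = date(2023, 10, 17)  # illustrative data pull date

# Ongoing drug with refills running to April 17, 2024: end is truncated.
print(clean_record(date(2023, 9, 1), date(2024, 4, 17), pull))

# Drug ordered to start next week: dropped.
print(clean_record(date(2023, 10, 24), date(2023, 11, 24), pull))

# Record dated long after a recorded death: dropped.
print(clean_record(date(2023, 6, 1), date(2023, 6, 1), pull,
                   death_date=date(2022, 1, 5)))
```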

Death is a clinical event, the ultimate clinical event. So, this is the first date to use as the end of your Observation Period OR if you have it, the clinical event within 60 days of death. The original thought at the Themis meeting years ago is that clinical event records relevant to death might appear in your source data after death. Example: an autopsy might be performed after death and the date of the report would be after death, however, the autopsy might reveal a condition of significance, so we want to keep the record.

If you don’t have a date of death, then you use the most recent, clean clinical event date (drugs, measurements, visits, etc.) + 60 days.

The Observation Period end date should not exceed date of the data pull.

With all of this said, non-HMO EHR data in the US is very tricky because we don’t really know when a person might have a clinical event record in the EHR. We don’t know when they are at risk until they go to the health system, and we don’t know when they choose to no longer seek care at a health system. The HSIG (formerly EHR WG) wrote this paper a few years ago. And I am very open to having another discussion on the recommendations presented in this paper. Perhaps it is time to revisit this topic.

@MPhilofsky So wanted to clarify a couple of things:

  1. the observation_period end date should never exceed the date of the data pull
  2. For a case like:
  • We have data for a person from 2010-2012, then nothing until 2020, when we have a date of death and nothing else. In that case we have only one observation period, from 2010-2012, and that’s it? Just having a death date does not constitute a clinical event.
  • Similar to above: we have data for a person from 2010-2015, then nothing until 2020, when we have a date of death and a couple of “observations” after that (maybe autopsies, etc.). In that case we have one observation period from 2010-2012 and another one which is DOD + 60 days?
  • It does seem like taking the minimum of
    * Date of death + 60 days
    * Last clinical event + 60 days
    * Date of the data pull
    should in general be the correct thing to do for the observation end date - with specific attention if the last clinical event is dod to make sure it really qualifies?

Correct. If this person died in your healthcare system, then you would have clinical events immediately prior to the actual death. IF you don’t have records from 2012 until their death in 2020 (8 years of no healthcare interaction in your system), then you most likely weren’t “observing” the person. Now, they could have been very healthy from 2012 - 2020 and there wasn’t a need for them to seek care. Or they could have changed healthcare systems (they changed insurance, they moved, or they decided to seek care elsewhere) and that’s why you don’t have any records. The thing with EHR data is that they are tough to fully decipher because we don’t know what we don’t know!

I would chart review a handful of persons with “observations” after death, but not immediately prior to death, before making a decision. However, my first thought is: IF you don’t have records immediately prior to the death record, then was this person being “observed” by your health system? Probably not.