
Observation Period Flavors (First THEMIS-Focus Group 2, now discussed everywhere)

  1. Data presence should be reliable, and data absence should be reliable

Thank you @Christian_Reich for stating this so clearly. This is my issue with the desire to include “old” information from before the observation period. The absence of data is particularly hard to explain. People often want to include a few extra bits of information to get a few more people with an exposure of interest, forgetting that a lot of similar people are missing.

As an analogy, the issue is very much the same as with outcomes. Just as missing data in the outcomes period can be informative, data outside the observation period can be informative. This can bias outcomes associated with the exposure if the exposure definition period extends outside a contiguous observation period for some people.

Note that this is a different issue from forcing everyone to have the same duration of observation period before an index date (for example, forcing everyone to have a 12-month look-back period when some have valid, longer ones available). In this case, for people with observation periods of different lengths, it is better to use all the information. See Gilbertson, et al. (I am ignoring the situation where you apply a definition that requires a specific duration of observation period, in which case you should use that.)

Observation periods can be constructed dynamically using the payer_plan_period, or through other modifications to the observation period table. We support that in our application and our data model, and we prefer that flexibility for our purposes. But I can also attest to the fact that dicing the data in this way can affect performance with large datasets. The OHDSI approach is to place information in the CDM that is ready to be analyzed and to consider analytical performance. This isn’t always explicitly stated with the OMOP CDM, so it is important that you clarified this. I think any change needs to be considered in the context of the breaking changes it would introduce.



Turns out the same discussion is popping up everywhere. I am consolidating. Please let’s have it here.

Since it is by far the hottest THEMIS issue, I suggest we nail it down at the next face-to-face on 8-9 March 2018, hosted by Amgen in Thousand Oaks.


Copy from the other conversation:

From email:



Related to this thread . . .

@Ajit_Londhe just pointed out a scenario where we are getting death records outside of the OBSERVATION_PERIOD for a claims database. @clairblacketer, @Ajit_Londhe, and I were going back and forth on what to do. Personally, I think I agree with the approach from above, although I guess this really isn’t history, but rather future. :) If we know they are no longer enrolled but know they died in the future, we are missing a snippet of time that might explain what truly happened to the patient.

We actually know this about everybody. :(


@Christian_Reich - other than being depressing on the matter, would you agree that it would be appropriate to delete a death that occurs let’s say 30+ days after enrollment ends?

If we move forward with the option that I outlined, which Peter advocated
for, then no data that falls outside of the observation period would need
to be deleted (in which case to your scenario, the death record would be
preserved, even though it falls outside of the observation period). A
given source would need to consider at ETL-time how you want to think about
an observation period in this context, but I don’t think it would require
death deletion in any circumstance.


Dear all

We are converting EMR data to CDM.
The Observation_period table is produced by the following method.

“Periods of continuous enrollment is calculated by combining monthly records and
recording the observation_period_startdate for the first period as the enrollment start date
and observation_period_enddate for the last period as the enrollment end date. If the
time between the end of one enrollment period and the start of the next is 30
days or less, treat this as continuous enrollment.”
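The quoted rule can be sketched in a few lines of Python. This is only an illustration, assuming each enrollment record is a hypothetical `(start, end)` pair of dates; the function name `merge_enrollment` and the `max_gap_days` parameter are made up for this example and are not part of any OHDSI tool.

```python
from datetime import date


def merge_enrollment(periods, max_gap_days=30):
    """Collapse (start, end) enrollment spans into continuous periods.

    Two spans are joined when the time between the end of one and the
    start of the next is max_gap_days or less (overlaps are joined too).
    """
    merged = []
    for start, end in sorted(periods):
        if merged and (start - merged[-1][1]).days <= max_gap_days:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical monthly records: two consecutive months, then a long gap.
monthly = [
    (date(2017, 1, 1), date(2017, 1, 31)),
    (date(2017, 2, 1), date(2017, 2, 28)),
    (date(2017, 6, 1), date(2017, 6, 30)),
]
for start, end in merge_enrollment(monthly):
    print(start, end)
```

Changing `max_gap_days` is the only thing needed to try a different window, which is why the choice of window deserves its own justification.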

After the conversion, the Achilles Heel report flags “events outside observation periods”.

We have checked the data to resolve this error.

In Korea, there are cases where a doctor prescribes medication without a visit registration,
or where a patient comes back to the hospital a few days after a treatment or examination was ordered, and the treatment or examination is then carried out without a visit registration.

An event that arrives without a visit registration is recorded as falling before or after a registered visit.

That is, the patient really did visit the hospital for this event,
but the visit registration was not recorded, so an Achilles Heel error occurs.

In order to include these data in the observation period, we are trying the following.
First, we generate observation_period using the drug, examination, treatment, and enrollment tables in the EMR.
Then, if the time between the end of one enrollment period and the start of the next is 6 months or less, we treat this as continuous enrollment.

First question: should we use other tables (drug, examination, treatment, etc.) as well as the enrollment tables to create observation_period?
Second question: should the window be 6 months? What length would be appropriate?
We checked the existing issues: a claims database has used a window of 1 year based on insurance registration, while the CCAE database has used 32 days and the GE database 12 months.

We would like to hear from OHDSI members whether this method is suitable for solving the invalid observation_period problem in Korea.


I think you are trying to use the rules for claims data, instead of those for EHR data. In EHR data, unless your healthcare system has a specific mechanism for “enrolling” patients with an institution, you don’t know if you “have” a patient in active treatment or not during the time when nothing happens. Those axiom 2 situations could mean the patient is healthy and happy like a fish in water, or gone with the wind. There is nothing you can do. Most EHR systems define the observation period as the time between the first occurrence of something (Condition, Drug, Procedure, etc.) and the last occurrence. Everything in between is part of the observation period.
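As a rough sketch of that first-to-last rule, assuming the clinical events have already been pulled from the various tables into hypothetical `(person_id, event_date)` pairs (the function name is invented for this example):

```python
from datetime import date


def ehr_observation_periods(events):
    """Return {person_id: (first_event_date, last_event_date)}.

    events: iterable of (person_id, event_date) pairs taken from any
    clinical table (conditions, drugs, procedures, and so on).
    """
    spans = {}
    for person_id, event_date in events:
        if person_id in spans:
            first, last = spans[person_id]
            spans[person_id] = (min(first, event_date), max(last, event_date))
        else:
            spans[person_id] = (event_date, event_date)
    return spans


# Hypothetical events for two patients.
events = [
    (1, date(2015, 3, 2)),   # condition
    (1, date(2017, 11, 20)), # drug
    (1, date(2016, 6, 15)),  # procedure
    (2, date(2016, 1, 5)),   # single visit only
]
print(ehr_observation_periods(events))
```

Note that this yields exactly one observation period per patient, and a patient with a single event gets a one-day period.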


Thank you for your suggestion.
However, we have a “missing observation period” problem.

The term “enrollment” doesn’t mean insurance enrollment in this case. Dahye and the rest of our team used “visit” information as the enrolled period because we can know for certain whether a patient was observed only while he/she is visiting the hospital.

In Korea especially, patients can freely go to any hospital they want.
For example, I use 4 hospitals independently: usually A for colds, B for dental care, C for surgery or emergencies, and D for physical examinations. This changes very frequently, and hospital A will never know that I had surgery last year. Likewise, C won’t know that I had a light fever right after surgery/discharge if I decide to go to the nearest hospital, A, for the fever.

This “blank period”, during which a patient is not visiting a particular hospital, could last one month or 10 years.

That’s why we’re not using the normal EHR rules.

  • “first occurrence of something to last occurrence of something”

Here are some additional questions you may be able to answer:

  • Do other countries’ EHR systems have similar problems with blank observation periods?
  • If they do, how do the “EHR rules” affect research results?
  • Which method is better for building the observation period table for “blank” data?


This is a very good question. Actually two:

  1. How short Observation Periods can be (longitudinal or horizontal separation of data)
  2. What do I do if I know I capture only a fraction of the data (vertical separation of data)

For 1): The idea with EHRs is that if you were sick you would return to the same hospital you went to last time, which is probably the closest to you. I know this is not a given. But the likelihood is there, so people make this approximation. They have nothing else to hang on to. Which means, if there are no data then you are healthy enough not to be in the hospital (even though you might be in a different one). For use cases that look at events inside a visit this is probably fine (people rarely get referred from one hospital to another just like that). For use cases that cover longer times (like studies with long-term follow-up) this might create “Axiom 2” errors: you underestimate the rate of events because you wrongly interpret the whitespace as time without events.

The idea of making mini-Observation Periods around each visit is dangerous, because you need the whitespace in between for the correct prevalence assessment. Otherwise it looks like a patient is “always” in the hospital, or “always” has an asthma exacerbation, because during an Observation Period they do, and outside you are not supposed to look. You can also never define a washout period. In the extreme case, you have one-day Observation Periods: everything has a prevalence of 100%. So, don’t throw away the whitespace.
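A toy calculation makes the prevalence point concrete. It is only a sketch with made-up dates: one hypothetical patient with two short asthma-exacerbation visits, measured once against a single first-to-last Observation Period and once against mini-periods that coincide with the visits. The helper names are invented for this example.

```python
from datetime import date, timedelta


def days_in(periods):
    """Expand inclusive (start, end) spans into a set of calendar days."""
    days = set()
    for start, end in periods:
        current = start
        while current <= end:
            days.add(current)
            current += timedelta(days=1)
    return days


def observed_fraction_with_event(event_periods, observation_periods):
    """Share of observed days on which the event is present."""
    observed = days_in(observation_periods)
    with_event = days_in(event_periods) & observed
    return len(with_event) / len(observed)


# Hypothetical exacerbation visits for one patient.
visits = [
    (date(2016, 1, 10), date(2016, 1, 12)),
    (date(2016, 7, 1), date(2016, 7, 2)),
]

# One period from first to last event keeps the whitespace.
whole = observed_fraction_with_event(visits, [(date(2016, 1, 10), date(2016, 7, 2))])

# Mini-periods around each visit throw the whitespace away.
mini = observed_fraction_with_event(visits, visits)

print(whole, mini)
```

With mini-periods every observed day is an exacerbation day, so the fraction is 1.0 regardless of how rare the visits actually are.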

That’s fine. I haven’t been in a hospital for 10 years thank God. But if something happened I would have gone.

For 2): This is really a problem for the Metadata and Annotation Workgroup. They need to solve the problem of how to capture the fact that there is only partial information available.



According to your explanation, the observation period runs from the first occurrence of something (Condition, Drug, Procedure, etc.) to the last occurrence.

So there will be one observation_period per patient.

Thank you for your detailed explanation.

A proposal for a new period_type_concept_id was considered in this discussion. What was the result: will new concepts be added, or was the idea rejected?

Thank you in advance


@Christian_Reich @mvanzandt
Are there any updates regarding this topic? There are still only non-overlapping periods in OBSERVATION_PERIOD. To avoid losing patients during the mapping into the OMOP CDM, claims databases tend to include any type of enrollment (so we need to use PAYER_PLAN_PERIOD to know whether a given enrollment period has only medical coverage, only pharmacy coverage, or both). If we want to analyze such a database, we would need to add our own intermediate step to restrict to patients with medical + pharmacy coverage during the analyzed period. But that means we would not be able to use ATLAS (I am not sure how the R packages would handle such an intermediate input; I haven’t checked yet).

Best regards,

Hello @Ewas,

You can use the payer_plan_period in Atlas as part of your cohort definition. You are able to add many attributes to the inclusion criteria. See this screenshot:

Hi @MPhilofsky,

Thank you for your answer! I am not an experienced user of ATLAS, so I might be wrong, but I cannot find a way to require, e.g., 365 days of continuous medical and pharmacy enrollment prior to the index date.
We would need to combine overlapping records in PAYER_PLAN_PERIOD into a continuous period of medical coverage, which I do not think is possible in this section of ATLAS.
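To make the requirement concrete, here is a minimal sketch of the logic outside ATLAS. It assumes each PAYER_PLAN_PERIOD row has already been reduced to a hypothetical `(start, end, kind)` tuple, where `kind` is `"medical"`, `"pharmacy"`, or `"both"`; that coverage label is an assumption of this example and would have to be derived from the plan information in the source data.

```python
from datetime import date, timedelta


def merge_spans(spans):
    """Union overlapping or directly adjacent inclusive (start, end) spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + timedelta(days=1):
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect_spans(a, b):
    """Intersect two sorted lists of non-overlapping spans."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start <= end:
            out.append((start, end))
        # Advance whichever span finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def has_prior_coverage(records, index_date, days=365):
    """True if medical AND pharmacy coverage is continuous for `days`
    before index_date, up to and including index_date."""
    medical = merge_spans([(s, e) for s, e, k in records if k in ("medical", "both")])
    pharmacy = merge_spans([(s, e) for s, e, k in records if k in ("pharmacy", "both")])
    both = intersect_spans(medical, pharmacy)
    window_start = index_date - timedelta(days=days)
    return any(s <= window_start and e >= index_date for s, e in both)


# Hypothetical patient: overlapping medical records, one pharmacy record.
records = [
    (date(2016, 1, 1), date(2016, 12, 31), "medical"),
    (date(2016, 6, 1), date(2017, 12, 31), "medical"),
    (date(2016, 1, 1), date(2017, 6, 30), "pharmacy"),
]
print(has_prior_coverage(records, date(2017, 3, 1)))
```

Whether adjacent plan records should count as continuous, and whether any gap should be tolerated, are design choices that this sketch fixes arbitrarily (adjacent days are joined, no gap is allowed).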
Am I missing something?

Thank you in advance!

I hope that we can avoid overloading the observation period with different flavors of OP and instead leverage something like the payer plan period to represent periods of special coverage.