
Complete vaccination records in OMOP databases

Hi there, has anyone tried to obtain a set of patients with complete vaccination records from an OMOP database? I understand that immunization databases are not always linked to the EHR, so not all patients will have complete vaccination records to analyze. Any advice is appreciated! Thanks!

Hello, @dqng

What do you mean by complete vaccination? I think the answer will depend on person-specific factors (age, sex) or on the geographical region.

In Athena you can find many vaccine concepts to build a cohort from. These concepts are in the ‘Drug’ domain and come from the CVX or RxNorm (+ RxNorm Extension) vocabularies. There are also vaccination-related concepts in other domains; you can use them to build the most detailed cohort, depending on the specifics of your data.
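As a rough illustration of the Drug-domain approach, here is a minimal sketch in Python with an in-memory SQLite database. The table and column names follow the OMOP CDM (`concept`, `drug_exposure`), but the concept IDs, concept names, and the helper `first_vaccination` are made up for this example; real vaccine concept IDs would have to be looked up in Athena, and a real cohort definition would probably also include descendant concepts.

```python
import sqlite3

# Toy OMOP-like tables. All concept IDs and names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    domain_id TEXT,
    vocabulary_id TEXT
);
CREATE TABLE drug_exposure (
    person_id INTEGER,
    drug_concept_id INTEGER,
    drug_exposure_start_date TEXT
);
""")
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?)", [
    (9000001, "toy influenza vaccine", "Drug", "CVX"),
    (9000002, "toy hepatitis B vaccine", "Drug", "RxNorm"),
    (9000003, "toy antihypertensive", "Drug", "RxNorm"),
])
conn.executemany("INSERT INTO drug_exposure VALUES (?, ?, ?)", [
    (1, 9000001, "2020-10-01"),
    (1, 9000001, "2021-10-05"),
    (2, 9000003, "2020-03-01"),
    (3, 9000002, "2019-06-15"),
])


def first_vaccination(conn, vaccine_concept_ids):
    """Return (person_id, first exposure date) for the given vaccine concepts."""
    placeholders = ",".join("?" for _ in vaccine_concept_ids)
    sql = f"""
        SELECT de.person_id, MIN(de.drug_exposure_start_date)
        FROM drug_exposure de
        JOIN concept c ON c.concept_id = de.drug_concept_id
        WHERE c.domain_id = 'Drug'
          AND c.vocabulary_id IN ('CVX', 'RxNorm', 'RxNorm Extension')
          AND de.drug_concept_id IN ({placeholders})
        GROUP BY de.person_id
        ORDER BY de.person_id
    """
    return conn.execute(sql, list(vaccine_concept_ids)).fetchall()


# Persons with at least one record of the (hypothetical) influenza vaccine.
flu_cohort = first_vaccination(conn, [9000001])
```

Note that this only finds people with a recorded vaccination; the absence of a record does not by itself mean the person was unvaccinated.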

Was that helpful? You can describe your case in more detail here, and the community will be glad to help you adapt existing solutions to your use case.


What’s your scientific question? You will be much more successful in getting answers like “I have a few” in this Forum if you make it interesting. If you just ask “who has data” folks are not likely to engage in the conversation because it will not be straightforward, as @zhuk demonstrated. What’s in it for them?

I am trying to build a cohort of patients who have received a specific vaccine (e.g. influenza, Hepatitis B) to compare against a cohort of patients who have not received it. However, to my knowledge, this is not possible because the data are incomplete: you cannot clearly define a group of patients who did not receive the vaccine of interest.

This leads to my question of whether we can accurately identify a group of patients with complete vaccination records. Any other solutions to the above problem are welcome as well. Thank you!


This is a good question, and an important one. We need to figure out how to compare patients with an intervention (vaccines, but other drugs or procedures, too) to those who did not have the intervention. It’s not clear how you would do that, because patients with the intervention have a clear index date (when they got the intervention), but when did something not happen? Today? Yesterday? A year ago? This requires methodological work. Do you want to engage in that?

That is true. From what I can tell, the problem is twofold:

  • defining a cohort of patients without the intervention (vaccine, procedure, drug etc.) → I am currently trying to figure this out for vaccinations, where it is complicated by incomplete vaccination records. Specifically for vaccinations, it seems to me that records whose source vocabulary is CVX might come from immunization databases.
  • obtaining the index date that starts the follow-up period in which events of interest are identified → I have yet to look into this in detail, but from what I know, researchers have approached it in a variety of ways, such as using the date of diagnosis or an arbitrary date as the index date.
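To make the index date problem concrete, here is a small Python sketch of one possible option: give each unexposed person an index date borrowed from the exposed group, but only if that date falls inside the person's own observation period. This is just an illustration of the idea, not a recommended design; the function `assign_index_dates`, the person IDs, and all dates are hypothetical.

```python
from datetime import date

# Hypothetical exposed cohort: person_id -> vaccination (index) date.
exposed_index = {1: date(2020, 10, 1), 3: date(2019, 6, 15)}

# Hypothetical observation periods: person_id -> (start, end).
observation = {
    2: (date(2018, 1, 1), date(2021, 12, 31)),
    4: (date(2020, 1, 1), date(2020, 6, 30)),
}


def assign_index_dates(exposed_index, observation):
    """Give each unexposed person the earliest exposed index date that
    falls inside their own observation period. People with no such date
    get no index date and drop out of the comparison."""
    candidate_dates = sorted(exposed_index.values())
    assigned = {}
    for person_id, (start, end) in sorted(observation.items()):
        if person_id in exposed_index:
            continue  # exposed persons keep their real index date
        for candidate in candidate_dates:
            if start <= candidate <= end:
                assigned[person_id] = candidate
                break
    return assigned


unexposed_index = assign_index_dates(exposed_index, observation)
```

Here person 2 receives an index date, while person 4 is not observed on any of the candidate dates and is left out. Taking the earliest qualifying date keeps the example deterministic; a real study would more likely sample or match dates, and would still need some way to know that the person was truly unvaccinated.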

I have access only to the All of Us OMOP database to work on the first methodological question. I am also new to the OHDSI community, so any help is appreciated! Let me know if anyone is interested in working on these issues!

When we do cohort studies that combine clinical data with longitudinal population study (LPS) data, we can draw conclusions about interventions by “round”. In the case of vaccinations, we might combine drug exposures reported in clinics with self-report from a questionnaire administered during a visit and/or some other data source like a vaccination registry. An LPS has visit occurrences that can be combined to create an observation period. This would be a time series study. During that observation period as a whole, or across one of its slices, it is possible to say that a patient didn’t get an intervention based on multiple data sources (clinical, registry, self-report, etc.). Maybe this is the “methodological work” that Christian was talking about.
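A toy Python sketch of that idea, with decision rules that are my own assumptions rather than an established method: a person counts as vaccinated in a window if any source shows a vaccination there, as not vaccinated if at least one source can actually speak to absence (a linked registry or a self-reported "no"), and as unknown otherwise. The function name `vaccination_status` and its parameters are hypothetical.

```python
from datetime import date
from typing import Optional, Sequence


def vaccination_status(
    window_start: date,
    window_end: date,
    clinical_dates: Sequence[date],
    registry_dates: Sequence[date],
    registry_linked: bool,
    self_report: Optional[bool],
) -> str:
    """Classify one person for one slice of their observation period.

    self_report is True/False for a questionnaire answer about the
    window, or None if no questionnaire was administered.
    """

    def in_window(d: date) -> bool:
        return window_start <= d <= window_end

    recorded = any(in_window(d) for d in list(clinical_dates) + list(registry_dates))
    if recorded or self_report is True:
        return "vaccinated"
    # Absence is only informative if some source would have captured it.
    if registry_linked or self_report is False:
        return "not vaccinated"
    return "unknown"


start, end = date(2021, 1, 1), date(2021, 12, 31)
status_a = vaccination_status(start, end, [date(2021, 3, 1)], [], False, None)
status_b = vaccination_status(start, end, [], [], True, None)
status_c = vaccination_status(start, end, [], [], False, None)
```

In this example `status_a` is "vaccinated", `status_b` is "not vaccinated", and `status_c` is "unknown". Only the people classified as "not vaccinated" would be candidates for an unexposed cohort.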

In practice, when it comes to COVID-19 vaccinations, the WHO has developed an instrument that can be administered one or more times by health care workers and that includes a section on vaccinations. It is intended for use with patients upon hospital discharge or after the acute illness to examine the medium- and long-term consequences of COVID-19. The Post COVID-19 CRF, as it is called, can be the basis for one kind of LPS. In low- and middle-income countries (LMICs), longitudinal “community” surveys that pre-date COVID-19 are often already in place. These studies may also cover vaccinations. I am part of a group that is in the process of annotating the WHO Post COVID-19 CRF for use with ETLs that push data collected with a Post COVID-19 CRF into OHDSI for use in network research studies.