
Proposal of CVX and CVX-RxNorm/CVX-ATC hierarchical crosswalks creation

(Violetta Komar) #1

Hello, Everyone!

Currently, the CVX vocabulary is considered standard and is often used as the mapping target for vaccines from many different drug vocabularies. However, CVX vaccines have a granularity deficit due to the absence of such defined attributes as an exact ingredient, dose form or drug strength. This can complicate cross-analysis of pharmaceutical information and lead to data loss during studies. Therefore, we have developed a new mapping logic for CVX and built relationships that provide hierarchical crosswalks to RxNorm and ATC, making CVX a fuller part of the ATC-RxNorm hierarchy.

Please, see our proposal below.


CVX is a coding system for the vaccine substance administered. The codes are used in immunization messages, and the system is maintained by the Centers for Disease Control and Prevention (CDC). CVX includes approximately 200 active and inactive vaccine terms. It also indicates each vaccine’s current availability and the date the vaccine code was last updated. Inactive vaccine codes allow users to transmit historical immunization data.


All CVX source information is obtained from the following:

  1. Centers for Disease Control and Prevention (CDC) Website:
  2. The National Library of Medicine (NLM) Website:

For comparative analysis, the NDC to CVX Lookup Crosswalk was used while creating the CVX hierarchical relationships.

Standard concept

  • If a CVX concept has a standard equivalent in "RxNorm", it is considered to be Non-standard.
  • If a CVX concept does not have a standard equivalent in "RxNorm", it is considered to be Standard.
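
The rule above can be sketched as a small helper. This is only an illustration, assuming that "has a standard equivalent in RxNorm" is read as "has a 'Maps to' link to a standard RxNorm concept"; the function name and the tuple layout are hypothetical and not part of the proposal.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical row layout:
# (relationship_id, target_vocabulary_id, target_standard_concept)
Relationship = Tuple[str, str, Optional[str]]


def cvx_standard_flag(relationships: Iterable[Relationship]) -> Optional[str]:
    """Return 'S' if the CVX concept stays Standard, None if it is Non-standard."""
    for relationship_id, vocabulary_id, target_standard in relationships:
        # Assumption: an equivalent mapping is a "Maps to" link whose target
        # is a standard ('S') RxNorm concept.
        if (
            relationship_id == "Maps to"
            and vocabulary_id == "RxNorm"
            and target_standard == "S"
        ):
            return None
    return "S"


# A concept with an equivalent RxNorm target becomes Non-standard.
equivalent = cvx_standard_flag([("Maps to", "RxNorm", "S")])
# A concept that only has an "uphill" link keeps its Standard flag.
uphill_only = cvx_standard_flag([("Is a", "ATC", "C")])
```

The `None` return mirrors how a non-standard concept carries an empty standard_concept value in the OMOP vocabulary tables.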


The majority of CVX concepts are in the "Drug" domain. Only “Tuberculin skin test”-related CVX concepts have the ‘Measurement’ domain (see 5.3 Mapping to the SNOMED).

Concept Class

All CVX concepts have Concept Class of "CVX".

Internal CVX relationships

"Is a" from CVX Vaccine concepts to CVX Vaccine Groups and reverse "Subsumes" relationships define Internal CVX Hierarchy.

External CVX relationship

Target vocabularies for the CVX were defined as the RxNorm*, ATC and SNOMED.

Each CVX concept must be assigned either a ‘Maps to’ or an ‘Is a’ relationship, except those described in paragraph 4 of the mapping logic (see below).

  • "Maps to" represents an equivalent mapping from a CVX concept to a standard RxNorm one. It can be only a one-to-one relationship.
  • "Is a" represents nonequivalent “uphill” mapping from a CVX concept to the closest (either single or multiple) standard logical ancestor(s) in the RxNorm or ATC.

The following relationships are optional:

  • "Subsumes" indicates “downward” mapping from the CVX to the closest (either single or multiple) standard logical descendant(s) in the RxNorm only
  • "CVX - RxNorm" relationships from the CVX concepts to the RxNorm, which have been made by the RxNorm team.

* Building of relationships from the CVX to the RxNorm Extension requires further assessment.
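
The constraints on external relationships can be checked mechanically. The sketch below is a hedged illustration, not the authors' tooling: the `Rel` layout and the function name are hypothetical, "either ‘Maps to’ or ‘Is a’" is read as "at least one of the two", and the codes "X1" and "X2" are made up for the example. Only external relationships are expected as input.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Rel(NamedTuple):
    cvx_code: str
    relationship_id: str       # "Maps to", "Is a", "Subsumes" or "CVX - RxNorm"
    target_vocabulary_id: str  # "RxNorm", "ATC" or "SNOMED"


def check_external_relationships(
    cvx_codes: Iterable[str],
    rels: Iterable[Rel],
    unmapped_ok: frozenset = frozenset(),
) -> List[str]:
    """Return human-readable violations of the relationship rules."""
    rels = list(rels)
    maps_to = Counter(r.cvx_code for r in rels if r.relationship_id == "Maps to")
    has_is_a = {r.cvx_code for r in rels if r.relationship_id == "Is a"}
    problems = []
    for code in cvx_codes:
        if code in unmapped_ok:  # concepts intentionally left unmapped
            continue
        if maps_to[code] == 0 and code not in has_is_a:
            problems.append(f"{code}: needs 'Maps to' or 'Is a'")
        if maps_to[code] > 1:
            problems.append(f"{code}: 'Maps to' must be one-to-one")
    for r in rels:
        # "Maps to" targets standard RxNorm; "Subsumes" goes to RxNorm only.
        if r.relationship_id in ("Maps to", "Subsumes") and r.target_vocabulary_id != "RxNorm":
            problems.append(f"{r.cvx_code}: '{r.relationship_id}' must target RxNorm")
    return problems


rels = [
    Rel("01", "Is a", "RxNorm"),
    Rel("X1", "Maps to", "RxNorm"),
    Rel("X1", "Maps to", "RxNorm"),
    Rel("X2", "Subsumes", "ATC"),
]
problems = check_external_relationships(
    ["01", "X1", "X2", "65"], rels, unmapped_ok=frozenset({"65"})
)
```

Here "65" (leprosy vaccine) is passed as an allowed exception, in line with the unmapped cases listed in the mapping logic.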

Mapping logic

  1. If a CVX concept has a standard semantic equivalent represented in the RxNorm by the following:
  • Ingredient - only a source ingredient of a CVX concept is defined,
  • Clinical Drug Component - while an ingredient is determined, a drug strength of a CVX concept, if not mentioned in the source, can be defined under the global immunization standards (5.2 Exact dosage ubiquity),
  • Clinical Drug Form - while an ingredient is determined, a dose form of a CVX concept, if not mentioned in the source, can be defined under the global immunization standards (5.2 Exact dosage ubiquity),

then “Maps to” relationship is assigned.

  2. If a CVX concept does not have a standard equivalent and is represented by the RxNorm Ingredient and, if present, other RxNorm attributes such as Drug Strength, Dose Form, or Quantity Factor, it is considered to be AT or BELOW the RxNorm Ingredient level according to the OMOP Drug Domain rules. In such a case, an "Is a" relationship is assigned to the RxNorm Ingredient and a "Subsumes" relationship to the RxNorm Clinical Drug Component(s) or Clinical Drug Form(s), if possible.

  3. If a CVX concept does not have a standard equivalent and cannot be covered by OMOP Drug Domain rules, it is considered to be ABOVE the RxNorm Ingredient level. The following use cases are distinguished:

3.1 A CVX vaccine contains attributes other than RxNorm ones and does not correspond to a Marketed Drug Product. In such an instance, a single "Is a" relationship to the ATC is built.

3.2 A monocomponent CVX vaccine has several relevant RxNorm Ingredients. In this case, a single "Is a" relationship to the ATC and "Subsumes" relationships to several RxNorm Ingredients are assigned.

3.3 A CVX vaccine has several relevant RxNorm Ingredients, Dose Forms, and Drug Strengths. In such a case, a single “Is a” relationship to the ATC and “Subsumes” relationships to the RxNorm Quantified Clinical Drugs are built. It works with vaccines against Hepatitis A and a combined vaccine against Hepatitis A and Hepatitis B. For example, “hepatitis A vaccine, adult dosage” has 2 relevant RxNorm Ingredients of “798361 Hepatitis A Vaccine (Inactivated) Strain HM175” and “253174 Hepatitis A Vaccine, Inactivated” as well as Dose Forms (Injection, Prefilled Syringe) and Drug Strengths for adults’ immunization (1440 UNT/ML for HM175 strain and 50 UNT/ML for Hepatitis A Vaccine, Inactivated).

3.4 A CVX vaccine contains "unspecified formulation" and resembles a category. For all such concepts single "Is a" relationships to the ATC are assigned.

  4. If a CVX concept does not have a standard equivalent and cannot be mapped via “Is a” to the closest semantic ancestor, it remains unmapped and does not participate in the CVX-RxNorm and CVX-ATC hierarchies. Absence of mapping is attributable to the following:
  • a vaccine does not exist yet (“65 - leprosy vaccine”)
  • a vaccine is currently under development or not approved by the FDA (“61 - human immunodeficiency virus vaccine”, “58 - hepatitis C vaccine”)
  • a CVX concept does not represent a vaccine (“99 - RESERVED - do not use”)
  5. Particular cases:

5.1 Vague RxNorm Ingredient formulation

Some of the RxNorm Ingredients have semantic duplicates with insufficient details. The attached checklist contains such RxNorm Ingredients. The reason why an Ingredient was treated as less appropriate is given in the "deprecation_reason" column.

Incorrect RxNorm Ingredients.xlsx (5.8 KB)

5.2 Exact dosage ubiquity

For some CVX vaccines, exact dosages not mentioned in the source are determined according to the global immunization standards established by WHO and CDC. For example, "121 - zoster vaccine, live" is used only for the prevention of shingles in an exact dosage of 29800 UNT/ML. Meanwhile, "21 - varicella virus vaccine" is used for the prevention of chickenpox (varicella) in an exact dosage of 2700 UNT/ML. So, both are mapped to the corresponding RxNorm Clinical Drug Components.

5.3 Mapping to the SNOMED

Among CVX codes, there are 3 exceptions, which are mapped through "Is a" relationships both to the RxNorm Ingredient (a "Drug" domain) and the SNOMED Procedure (a "Measurement" domain) because such a procedure is mentioned in the concept name.

5.4 Relationships for Influenza vaccines

The majority of CVX Influenza vaccines are ambiguous in meaning because they do not have information about an Influenza strain and a year of utilization, while this is always reflected by the RxNorm. Thus, such vaccines cannot be embedded in the RxNorm hierarchy. However, to preserve the links given by the source (NLM), the non-hierarchical "CVX - RxNorm" relationships have been used. Meanwhile, to place CVX Influenza vaccines in the ATC hierarchy, "Is a" relationships to the ATC have also been built.

Only one Influenza vaccine (CVX code of “160”) has an "Is a" link to the RxNorm Ingredient because it semantically resides AT the RxNorm Ingredient level.


Depending on a semantic localization towards the RxNorm Ingredient, CVX concepts are embedded either in the RxNorm or ATC hierarchy, as shown in a diagram below.

As a result, a hierarchical crosswalk between CVX, ATC, and RxNorm will be available after an ATC to RxNorm mapping improvement.

A CVX concept is excluded from hierarchy construction if it has no mapping to either the RxNorm or the ATC.
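
Items 1 to 4 of the mapping logic can be condensed into a decision sketch. This is a simplified reading under stated assumptions, not the actual mapping procedure: the `CvxProfile` fields and `plan_relationships` are hypothetical names, and each boolean stands in for a judgment that a human mapper makes when reviewing a concept.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CvxProfile:
    has_rxnorm_equivalent: bool            # item 1: Ingredient / Component / Form match
    at_or_below_ingredient: bool           # item 2: covered by OMOP Drug Domain rules
    has_atc_ancestor: bool                 # item 3: a logical ATC parent exists
    has_rxnorm_descendants: bool = False   # optional "Subsumes" targets exist


def plan_relationships(p: CvxProfile) -> List[Tuple[str, str]]:
    """Return (relationship_id, target_vocabulary_id) pairs to build."""
    if p.has_rxnorm_equivalent:
        return [("Maps to", "RxNorm")]
    if p.at_or_below_ingredient:
        plan = [("Is a", "RxNorm")]
    elif p.has_atc_ancestor:
        plan = [("Is a", "ATC")]  # ABOVE the RxNorm Ingredient level
    else:
        return []  # item 4: unmapped, excluded from both hierarchies
    if p.has_rxnorm_descendants:
        plan.append(("Subsumes", "RxNorm"))
    return plan


equivalent = plan_relationships(CvxProfile(True, False, False))
below = plan_relationships(CvxProfile(False, True, False, True))
above = plan_relationships(CvxProfile(False, False, True, True))
unmapped = plan_relationships(CvxProfile(False, False, False))
```

An empty plan corresponds to the exclusion criterion: the concept has no mapping to either vocabulary and takes no part in hierarchy construction.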

CVX current mappings.xlsx (103.0 KB)

Vaccine concept mapping improvement
Vaccines standard - CVX or RxNorm?
(Violetta Komar) #2

@Christian_Reich, @Dymshyts, @aostropolets what do you think about that?

Also, we want to draw your attention to paragraph 5.3 Mapping to the SNOMED. Do you agree with this approach?

(Christian Reich) #3

@Violetta_Komar: Wow. This is a big bite we need to chew on. Give us a minute. :slight_smile:

(Dmytry Dymshyts) #4

So, the question is:
“Can we map a diagnostic test to the Drug domain?”
According to our documentation:
A Drug is a biochemical substance formulated in such a way that when administered to a Person it will exert a certain physiological or biochemical effect.

So, the tuberculin skin test is a Drug, not a Procedure, because in this case CVX encodes a product rather than a procedure.

(Anna Ostropolets) #5

I think so far we’ve been considering diagnostic tests as devices or procedures. Probably, this is because of the logic that is not explicitly stated in the definition above but is implied: these substances not only have a certain physiological effect but also are administered because of that. So, the primary reason for ordering tuberculin skin test is to test and not to treat; the effect of the test is therefore somehow ‘side’. Otherwise, we could also classify foods as drugs as they have a physiological and biochemical effect on the body.

(Dmytry Dymshyts) #6

Right, and what should we do with homeopathy - put it under Procedure, like psychotherapy? :slight_smile:

LPD_Belgium, JMDC, DA_France have these tuberculins stated as Devices,
while RxNorm and RxNorm Extension(!) have a lot of Drug concepts.
So, the correct decision is to assign the Device domain to some RxNorm and, subsequently, RxNorm Extension Ingredients.
It might be a relatively big effort as we need to modify all our drug scripts.
So, I would like to keep the CVX mapping for the tuberculin test as is for now.

(Vojtech Huser) #7

However, CVX vaccines have a granularity deficit due to the absence of such defined attributes as an exact ingredient, dose form or drug strength

I agree that ideal standard vaccine terminology should have those.

A nice vaccine attribute would also be type: either live virus or not live (=chopped virus particles).

As a personal hobby - I like to collect boxes of influenza vaccines from past years (the part of the box with the NDC code). Then, a few years later, I challenge all the fancy systems by searching for the NDC codes from those boxes.

Do I also understand correctly that the specific vendor of a vaccine may not be captured if we exchange CVX codes? (In other words - vaccines may have the same problem as devices or “procedure-drugs”.)

(Violetta Komar) #8

Hello, we updated the file with mappings

(Andrew S. Kanter, MD MPH FACMI FAMIA) #9

Violetta, sorry to nit pick but happened upon this and looked at the cross-map table:
  • cvx_id: 40213291
  • cvx_code: 01
  • cvx_name: diphtheria, tetanus toxoids and pertussis vaccine
  • cvx_domain_id: Drug
  • cvx_standard_concept: S
  • relationship_id: Is a
  • target_concept_id: 529218
  • target_concept_code: 798302
  • target_concept_name: acellular pertussis vaccine, inactivated
  • target_vocabulary_id: RxNorm
  • target_domain_id: Drug
  • target_concept_class_id: Ingredient
  • target_standard_concept: S

Difference between CVX 1 and 20 was the acellular pertussis… which seems to be mapped to CVX 1 when it was supposed to be mapped to CVX 20. Not sure it matters much, but didn’t want the observation to be lost :slight_smile:

(Karthik) #10

hi @Violetta_Komar, I agree w/ @Andy_Kanter. I don’t think that row should be mapped to “acellular pertussis vaccine, inactivated”. It probably should be mapped to “pertussis vaccine”.

@Christian_Reich @Dymshyts do you know when this mapping will get incorporated into the vocabulary? This could be helpful for studying the covid vaccines.

(Karthik) #11

It looks like these relationships might be in the ancestor table. The FeatureExtraction package groups drugs via ATC and Ingredient class. @Christian_Reich or @Dymshyts, would it make sense for CVX codes to have a class of Ingredient, so that the query pulls in CVX codes from the ancestor table when looking at drug_eras, or does that not make sense?

(Dmytry Dymshyts) #12

Hi @cukarthik
The Ingredient concept class stands for a single active substance, while vaccines are quite often multi-ingredient.
I like the idea of including CVX as a group for the FeatureExtraction package. Perhaps it would be better to add OR vocabulary_id = 'CVX' AND standard_concept = 'S' to the FeatureExtraction package.
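
To make the suggested filter concrete, here is a toy sketch using Python's built-in sqlite3 module. It is not the FeatureExtraction package's actual SQL: the baseline WHERE clause (ATC concepts plus RxNorm Ingredients) is an assumption based on the description in this thread, and all concept IDs and names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    vocabulary_id TEXT,
    concept_class_id TEXT,
    standard_concept TEXT
);
-- Toy rows; IDs and names are made up.
INSERT INTO concept VALUES
    (1, 'toy ingredient',               'RxNorm', 'Ingredient',    'S'),
    (2, 'toy ATC class',                'ATC',    'ATC 3rd',       'C'),
    (3, 'toy standard CVX vaccine',     'CVX',    'CVX',           'S'),
    (4, 'toy non-standard CVX vaccine', 'CVX',    'CVX',           NULL),
    (5, 'toy clinical drug',            'RxNorm', 'Clinical Drug', 'S');
""")

# Assumed baseline grouping (ATC or RxNorm Ingredient) extended with the
# proposed clause for standard CVX concepts.
GROUPING_SQL = """
SELECT concept_id
FROM concept
WHERE vocabulary_id = 'ATC'
   OR (vocabulary_id = 'RxNorm' AND concept_class_id = 'Ingredient')
   OR (vocabulary_id = 'CVX' AND standard_concept = 'S')
ORDER BY concept_id
"""

group_ids = [row[0] for row in conn.execute(GROUPING_SQL)]
```

The explicit parentheses matter: in SQL, AND binds tighter than OR, so the clause as written in the post would parse the same way, but spelling it out avoids surprises when it is appended to a longer condition.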

(Dmytry Dymshyts) #13

On the other hand, CVXs might be rolled up to RxNorm (extension) ingredients or ATCs.
Let’s first see what the result of @Violetta_Komar’s work will be.

(Karthik) #14

That’s what I was thinking as well, but I wasn’t sure if all CVX codes would get rolled up into RxNorm (Extension) codes. Looking at what @Violetta_Komar mentioned, there might be some cases that don’t map (e.g. leprosy). In those cases, adding CVX as a vocabulary to the FeatureExtraction package would still be needed.

(Alexander Davydov) #15

CVX (as well as the vaccines themselves) is complex.

Sometimes it’s a combination of ingredient (or multiple ingredients) and some other property meaning that it can be either placed below the ingredient level or mapped over to RxNorm.

Sometimes it gives you an ingredient, but not the purpose or type of the vaccine meaning that there’s no way to link to specific RxNorm ingredients and understand what exactly was done with the patient.

Sometimes there is just a scanty idea of the ingredient used, but you explicitly know the purpose (and sometimes the type) of the vaccine.

RxNorm is different. Sometimes it’s very specific and provides you the concepts only for the clinical drug products implied (and you can guess the type of the vaccine):

Sometimes it’s more general and on the ingredient level, you actually can’t know the exact vaccine type.

I’m not really sure that this puzzle can be efficiently addressed with the modeling we have in the Drug domain.
The point is that many of these things can’t be described using trivial hierarchical relationships (even if we link to the Drug forms / Drug products or Branded Drugs as we did while implementing ATC). Given this fact, the cohort building can’t rely on the concept_ancestor table and all the options should be always reviewed and critically evaluated in both RxNorm/RxE and CVX.

Another question is the use cases we need to support:

  • Are they on the ingredient level only? I don’t think so. The vaccines of the different types/ways of preparation/adjuvants are different products with different effectiveness. Meaning that ATC and CVX would be a better choice rather than Ingredient.
  • Do we need vaccine ingredient dosages (in general), or does a CVX-like vocabulary support most of the use cases?

But eras won’t be built unless you make them Ingredients.

Definitely not all of them. Many of them are placed above the RxNorm, and ATC would be the only possible parent.

(Chris Knoll) #16

Sorry for my ignorance, but I have some questions:

What’s the dependency on identifying the purpose for the administration? For example, chemotherapy can treat things other than cancers, but the components of chemotherapy are ingredients…so I’m wondering why it’s important to know if something was used for chemotherapy (for example) in order to determine ingredient?

In these cases wouldn’t it make sense to call those procedures? If it’s not talking about a drug exposure, but rather the person underwent vaccination, isn’t that where we could draw a line between procedure and exposure?

It seems that you would need ‘purpose of administration’ in order to assign an ATC class, so that much I think I understand. But would it make sense to identify specific vaccines as ingredients, combo-vaccines as combinations of the underlying ‘component’ vaccines, and maybe children of these ‘component vaccines’ can contain how they were prepared? I think from a concept hierarchy perspective, tho, it makes it complicated to look for descendants of a combo but then only descendants of a combo which one of the components was prepared in a certain way…but the cost of addressing this I think would just be a larger set of concept ancestor records.

Maybe the term ‘ingredient’ is a loaded one (with respect to branded drugs, etc) but isn’t a vaccine at a certain level the same as an ingredient from a ‘molecular structure’ perspective? And if the preparation of the vaccine lends to the molecular structure of the compound, maybe the answer is to place the vaccines + preparation at the ingredient level…and how they maybe used can put them under an ATC class?

(Oleg Zhuk) #17

Just one small note here: we don’t really need to build eras for vaccines.
In most use cases you’ve just ‘got a shot’ a very few times per lifetime.

(Karthik) #18

True, @zhuk. I don’t know if there’s a discussion around how to handle vaccines in the era table. I was under the assumption that they would be one data row. The main concern is that FeatureExtraction groups the drugs via ATC. In these cases, CVX-encoded vaccines are missed.
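
The concern can be illustrated with a toy concept_ancestor lookup, again using sqlite3. This is a sketch with invented IDs, assuming that grouping by ATC is done by joining exposures to concept_ancestor: a CVX vaccine is only picked up if an ancestor row links it to the ATC class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
CREATE TABLE drug_exposure (
    person_id INTEGER,
    drug_concept_id INTEGER
);
-- Toy IDs: 900 = an ATC vaccine class,
--          101 = a CVX vaccine linked under that class,
--          102 = a CVX vaccine with no link to any ATC class.
INSERT INTO concept_ancestor VALUES (900, 900), (900, 101), (101, 101), (102, 102);
INSERT INTO drug_exposure VALUES (1, 101), (2, 102);
""")

rows = conn.execute("""
SELECT de.person_id
FROM drug_exposure de
JOIN concept_ancestor ca
  ON ca.descendant_concept_id = de.drug_concept_id
WHERE ca.ancestor_concept_id = 900
ORDER BY de.person_id
""").fetchall()

covered_persons = [r[0] for r in rows]
```

Person 2 received a vaccine, but the exposure is invisible to the ATC-based grouping because concept 102 has no ATC ancestor row.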