Vaccine concept mapping improvement

I would definitely like to be involved with this group. I manage the CIEL dictionary for OpenMRS and have published an open source concept dictionary (mapped to SNOMED, ICD-10, CVX, RxNorm) for COVID, including vaccination concepts (global vaccines included), here: https://app.openconceptlab.org/#/orgs/CIEL/collections/COVID-19-Starter-Set/

There are also other vaccine concepts (unrelated to COVID) which are included in the core CIEL concept database: https://app.openconceptlab.org/#/orgs/CIEL/sources/CIEL/

Someone should update the CIEL load in Athena.

April 21 meeting summary

  • Adam presented the decomposition of four pneumo CVX codes.
  • Why are we decomposing? We want a use-case-driven hierarchy, and decomposition will help us identify the important attributes to consider when building it.
  • In past work on vaccines, the vocabulary team did not infer attributes that were not explicitly stated in the CVX concept name. It may be better to decompose branded drugs rather than CVX codes.
  • Where a CVX code has only one branded drug form, the decomposition is simple. For the “unspecified” case the decomposition is less clear.
  • Application (1st, 2nd, 3rd dose) should be handled from the cohort-building perspective and not encoded in the vocabulary.
  • Ingredient, brand, and vaccine type are important attributes for use cases
  • RxNorm separates dose of each individual ingredient. Is the dose of each individual ingredient important for vaccine use cases?
  • A lot of vaccine records are recorded in source data as procedures with very limited attribute information.
  • Ideas for improvements that could be made in the near future
  • How much vaccine content is not represented as CVX?
  • CVX ‘maps to’ RxNorm in a small number of cases: when there is an exact equivalent

April 28 meeting agenda

  • Decomposition update: Denys will present decomposition of Merck branded drugs
  • Review use cases and attributes required by use cases
  • Roadmap update and discussion:
    • Fix incorrect vaccine mappings identified by Denys
    • Introduce CVX Vaccine Group
    • ATC-RxNorm (CVX) hierarchy improvements – where to start
    • Review CVX “maps to” RxNorm relationships
    • Refresh CVX descriptions

Hi @Adam_Black ,

I’ve been following this thread and the vocabulary work from afar. The 6am MT meeting time is just too early for me. Here in Colorado, and in other Epic data I’ve seen, most vaccine data come across as string text rather than coded values. Is providing guidance on mapping these data to standard concept_ids part of this WG’s role? By “guidance” I mean a process for mapping string text, something along the lines of: map a drug based on the following attributes, with #1 being the most important:

  1. active ingredient
  2. disease prevented
  3. amount of ingredient
  4. dose form
  5. route
  6. preservative or preservative free
  7. live (attenuated) or inactivated
  8. etc
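
One possible reading of a priority list like this is a lexicographic comparison: a match on a higher-priority attribute outweighs any number of matches further down. Below is a minimal sketch of that idea, not an agreed WG method. The attribute keys, the parsed record, and the two candidates are all hypothetical placeholders, and parsing the source string into attributes is assumed to have happened already.

```python
# Priority order follows the list above; the key names are hypothetical.
ATTRIBUTE_PRIORITY = [
    "active_ingredient",
    "disease_prevented",
    "ingredient_amount",
    "dose_form",
    "route",
    "preservative",
    "live_status",
]


def match_key(parsed, candidate):
    """One boolean per attribute, in priority order.

    True means the attribute was found in the source string and equals the
    candidate's value. Tuples compare left to right, so a match on an earlier
    attribute always wins over matches on later ones.
    """
    return tuple(
        parsed.get(attr) is not None and parsed.get(attr) == candidate.get(attr)
        for attr in ATTRIBUTE_PRIORITY
    )


def best_candidate(parsed, candidates):
    """Return the candidate with the highest-priority attribute matches."""
    return max(candidates, key=lambda c: match_key(parsed, c))


# Made-up example: attributes parsed from a free-text vaccine string.
parsed = {
    "active_ingredient": "tetanus toxoid",
    "dose_form": "injection",
    "route": "intramuscular",
}
candidates = [
    {"name": "candidate A", "active_ingredient": "tetanus toxoid",
     "dose_form": "injection", "route": "subcutaneous"},
    {"name": "candidate B", "active_ingredient": "hepatitis B surface antigen",
     "dose_form": "injection", "route": "intramuscular"},
]

print(best_candidate(parsed, candidates)["name"])
```

Candidate B matches two lower-priority attributes, but candidate A matches the active ingredient, so A is chosen. A weighted score would be an alternative if strict priority turns out to be too rigid.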

@MPhilofsky:

Short answer: Yes.

Right now, we are building the grand truth table with all codes and all their attributes, so we can create a target structure. It’s a mess.

Hi everyone,

Given that we have a significant amount of decomposition work to complete I would like to adjust this meeting to every two weeks instead of weekly. This will give us more time to make progress on the decomposition work before we check in again. Our next meeting will be 5/19. Denys Kaduk and I will continue our work decomposing branded vaccine products. If anyone else wants to coordinate with us on this work please reach out to me or Denys directly.

Thanks again for your participation in this group! I’ll post the agenda for the 5/19 meeting here a few days before we meet again.

Hello Everyone!
Here’s the agenda for tomorrow’s meeting:

  • Decomposition results so far:
    • Semi-automated CVX decomposition: Pneumo, HPV, DTaP
    • Automated RxNorm branded drug decomposition
  • Consensus on minimal set of attributes
  • Exactly which concepts need decomposition? (CPT, HCPCS, Other standard vaccine codes)
  • Are we building a custom OMOP vaccine vocabulary?
  • Presentation of sample hierarchy and related issues

May 19 meeting summary

  • Adam presented a semi-automated way of decomposing CVX codes, based on the ‘CVX-RxNorm’ relationship.
  • Discussion about packages not working properly, especially FeatureExtraction, when CVX codes are present in the data.
  • Polina presented the ATC-CVX relationship table as part of the work on ATC.
  • Alexander raised a concern that there may be many gaps when implementing CVX in the hierarchy.
  • In the discussion, Mik proposed several solutions:
    1. Implement CVX, leaving the gaps in the hierarchy unresolved
    2. Ingest CVX and create concepts in ‘OMOP Extension’ to fill the gaps in the hierarchy
    3. Create a new Vaccine Extension vocabulary, which would take a lot of effort

June 2 meeting agenda

  • Gaps in implementation CVX into hierarchy. (Alexander Davydov)
  • Proposal for the implementation of combined vaccines using the example of TD vaccines (Denys Kaduk)
  • Vaccine procedure codes. How to handle them?

June 2 meeting summary

  • Alexander Davydov presented “Gaps in implementation CVX into hierarchy”
    1. using the examples of measles, Hep B, COVID-19, and influenza vaccines
    2. with proposed solutions: (1) use ATC and create ATC-CVX links, without CVX-RxNorm; (2) create a new OMOP vocabulary; (3) do not precoordinate
  • Christian Reich mentioned that researchers interested in specific cases can use the ingredients/forms of Standard RxNorm.
  • Christian Reich proposed creating generic ingredients without links to RxNorm.

Agenda for June 16

  • CVX implementation on DT, Hep combination, Hep A cases (Denys Kaduk)
  • Discussion on suggestion to create generic ingredients
  • Vaccine procedure codes. How to handle them?

June 19 meeting summary
Denys K presented ‘CVX implementation on DT, Hep combination, Hep A cases’.
The final points from the discussion are:

  1. For now, CVX will have relationships up to ATC, without creating new relationships to RxNorm
  2. Generic ingredient cases will be presented by Alexander at an upcoming meeting
  3. Specific vaccine procedure codes will be mapped to the Drug domain, losing some attributes
  4. Non-specific vaccine procedure codes will be left as Standard

Agenda for June 30

  • Automated hierarchy building using CVX (Rashmie A.)
  • Discussion Generic Ingredients for vaccines (Alexander D)

This is what I posted in MS Teams after the discussion:

The current solution (already implemented for COVID; we need to apply it to the whole list of vaccines):

  1. Map them over to the Drug Domain concepts. The context of whether it’s a first, second or booster dose will be lost, but may be addressed in the cohort definition.
  2. Unclear historical or current vaccinations are mapped to the History of drug therapy concept, while the actual vaccine is mapped to the Drug Domain concept (value_as_concept_id field).
  3. Unspecific vaccine / serum administrations are mapped over to the appropriate generic CVX concepts, unless we find that they are redundant or corrupt something.

I have been looking at the vaccine tradename to CVX mapping from the CDC (link) and it looks to me like different manufacturers sometimes produce vaccines under the same tradename perhaps at different points in time. For example:
Columns are: tradename, cvx name, cvx code, manufacturer
[image: three screenshots from the CDC tradename-to-CVX mapping]

Can I safely assume that vaccines with the same tradename contain the same ingredients even if they were created by different manufacturers?

No, there is no clear rule.
Even within the same brand name and manufacturer, the composition of ingredients does change.
For flu vaccines it happens every year.
Another example is Prevnar, Prevnar 13 and Prevnar 20, which have nearly the same brand name.
The rest should be more or less stable, unless you compare vaccines from different decades.

And we didn’t observe composition differences due to a manufacturer change. Such a change mostly happens for administrative reasons: companies collaborate or sell the technology to each other.

Does anyone have an example of two vaccines with the same brand name but with different (active) ingredients?

Can we assume that each CVX code has one unique set of ingredients and that those ingredients do not change over time?

One thing that I’m struggling with regarding CVX is when to assign attributes to a CVX code that are not explicitly stated in the description. For example, what is the difference between

CVX code 20 - diphtheria, tetanus toxoids and acellular pertussis vaccine
and
CVX code 107 - diphtheria, tetanus toxoids and acellular pertussis vaccine, unspecified formulation

The latter implies that the former has a specified formulation. But what then is the formulation of CVX code 20?
There are at least four different branded products that map to CVX code 20 according to the product to CVX code crosswalk from CDC.
[image: the four branded products mapped to CVX code 20 in the CDC crosswalk]

If these four products have different formulations then the formulation of CVX code 20 is ambiguous which is, for our purposes, the same as unspecified since we cannot say what the formulation is. (I actually can’t tell if all of these have the same formulation and have found it difficult to find information on discontinued vaccines.)

Shouldn’t CVX:20 and CVX:107 actually be represented by a single general DTaP concept in the OMOP vocabulary?

I’d ask the CDC, @Adam, what their intention was in creating 107 when they already had 20. Usually there is some specific reason, which may already be obsolete. I know it sucks.

It mostly depends on how you define the brand name.
I just checked in OMOP vocabulary. In RxNorm it’s all clean since even for the flu vaccines they made the following attributes a part of the brand name: season (year), dose potency, route of administration, number of ingredients, hemisphere it was recommended for.
And now the branded names look like:

In many vocabulary sources (as well as in OMOP) such things are not considered to be a property of the brand name.
That’s why, when I ran the same query against RxNorm Extension, I found many candidates.
However, most of them are just the same thing written differently (and this is another problem we have). Also, you’ll find some erroneous combinations (like “Cholera Vaccine / Vibrio cholerae Oral Suspension”).
But this is anyway a good list to start with:

The query
with inclusion as (SELECT
'vaccine|virus|Microb|Micr(o|)org|Bacter|Booster|antigen|serum|sera|antiserum|globin|globulin|strain|antibody|antitoxin|toxoid'
),

exclusion as (SELECT
'Drosera'
),

products as (
SELECT *
FROM concept c
WHERE c.vocabulary_id IN ('RxNorm', 'RxNorm Extension')
    AND c.concept_class_id ~* 'branded|marketed'
    AND c.concept_name ~* (select * from inclusion)
    AND c.concept_name !~* (select * from exclusion)
),

a as (
SELECT c2.concept_name as brand_name,
       c.concept_id as product_concept_id,
       c.concept_name as product_concept_name,
       array_agg(DISTINCT c3.concept_id ORDER BY c3.concept_id) as ingredients

FROM products c

JOIN concept_relationship cr
    ON c.concept_id = cr.concept_id_1
        AND cr.relationship_id IN ('Has brand name')
        AND cr.invalid_reason IS NULL

JOIN concept c2
    ON cr.concept_id_2 = c2.concept_id

JOIN concept_ancestor ca1
    ON c.concept_id = ca1.descendant_concept_id

JOIN concept c3
    ON ca1.ancestor_concept_id = c3.concept_id
        AND c3.concept_class_id = 'Ingredient'

GROUP BY 1,2,3
),

brand as (
SELECT brand_name
FROM a
GROUP BY 1
HAVING COUNT(DISTINCT ingredients) > 1
)

SELECT DISTINCT b.brand_name,
                product_concept_id,
                product_concept_name,
                ingredients
FROM brand b

JOIN a
    ON b.brand_name = a.brand_name

ORDER BY b.brand_name,
         product_concept_name,
         product_concept_id,
         ingredients;
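
For readers without a vocabulary database at hand, the core of the query is its last two steps: collect the ingredient set of each product, then keep the brand names that have more than one distinct set. Here is a minimal Python sketch of that logic on made-up rows. The brand, product, and ingredient names are placeholders, not vocabulary content, and the regular-expression filtering of product names is left out.

```python
from collections import defaultdict

# (brand_name, product_name, ingredient) rows, standing in for the join of
# concept, concept_relationship and concept_ancestor. All values are made up.
rows = [
    ("BrandX", "BrandX 0.5 mL injection", "antigen A"),
    ("BrandX", "BrandX 0.5 mL injection", "antigen B"),
    ("BrandX", "BrandX prefilled syringe", "antigen A"),
    ("BrandY", "BrandY 1 mL injection", "antigen C"),
    ("BrandY", "BrandY prefilled syringe", "antigen C"),
]


def brands_with_varying_ingredients(rows):
    # Step 1 (CTE "a"): one ingredient set per (brand, product).
    per_product = defaultdict(set)
    for brand, product, ingredient in rows:
        per_product[(brand, product)].add(ingredient)

    # Step 2 (CTE "brand"): the distinct ingredient sets seen under each brand.
    per_brand = defaultdict(set)
    for (brand, _product), ingredients in per_product.items():
        per_brand[brand].add(frozenset(ingredients))

    # Keep only brands with more than one distinct ingredient set.
    return sorted(b for b, sets in per_brand.items() if len(sets) > 1)


print(brands_with_varying_ingredients(rows))
```

On this toy input only BrandX is reported, because its two products carry different ingredient sets, while both BrandY products share one.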

Look here, this 20 guy is what is typically recognized as DTaP, while 106 has more pertussis antigens and is recognized as DTaP(5) / Daptacel. And the 107 guy is a common grouper for them and for all other DTaP-containing combined vaccines.

What scares me more is that 01 diphtheria, tetanus toxoids and pertussis vaccine and some other “cellular” guys are all linked to DTaP (acellular), which is wrong by definition but was probably done because the “cellular” DTPs are no longer used on the U.S. market. So they used the current but more specific grouper, which creates a lot of confusion.

Please note that 102 DTP- Haemophilus influenzae type b conjugate and hepatitis b vaccine is “non-US”, but that doesn’t affect how CVX is organized. It is also linked to the closest (but more specific) grouper.

Yes, they are all different, containing 1 (CERTIVA), 2 (TRIPEDIA), 3 (INFANRIX) or 4 (ACEL-IMUNE) pertussis antigens. But this is what DTaP is :smiley:

If we want to distinguish 1-4 vs 5 pertussis antigens, then no.
If we want to distinguish better, then we need even more groupers.
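
To make the ambiguity concrete, here is a small sketch that groups products by CVX code and flags the codes whose products disagree on an attribute. It uses only the pertussis antigen counts quoted in this thread; treating the antigen count as the distinguishing attribute, and the table layout itself, are assumptions made for illustration.

```python
from collections import defaultdict

# (product, CVX code, number of pertussis antigens), as quoted in this thread.
PRODUCTS = [
    ("CERTIVA", "20", 1),
    ("TRIPEDIA", "20", 2),
    ("INFANRIX", "20", 3),
    ("ACEL-IMUNE", "20", 4),
    ("DAPTACEL", "106", 5),
]


def antigen_counts_by_cvx(products):
    """Collect the distinct antigen counts seen under each CVX code."""
    counts = defaultdict(set)
    for _name, cvx, n_antigens in products:
        counts[cvx].add(n_antigens)
    return dict(counts)


def ambiguous_codes(products):
    """CVX codes whose products do not agree on the antigen count."""
    by_cvx = antigen_counts_by_cvx(products)
    return sorted(cvx for cvx, counts in by_cvx.items() if len(counts) > 1)


print(antigen_counts_by_cvx(PRODUCTS))
print(ambiguous_codes(PRODUCTS))
```

With these rows, code 20 is flagged because its products span 1 to 4 antigens, while 106 is not. Splitting 20 any further would require one grouper per distinct value.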

In general I think the reliance on non-explicit information in the vocabulary is very bad. Vocabulary users should be able to take concepts at face value and not have to consider the intention of the creators. If a user can’t take CVX codes at face value, how will they know they can take any standard OMOP concept at face value?

Exactly! That is what we meant by “we can’t use CVX as Standard any longer”.

The above is useful information for those of us who have to custom map our US vaccine data. How do we tease out these attributes?

Agree, but how would you know when this elimination from the market happened? In the sources you have both current and old data (from when it was still used in the U.S.), while vocabularies can show only one (the current) state for the concept, e.g. DTP was widely used in the U.S. before they switched completely to DTaP.

Actually, we add it to the concept name (or synonym), look.
However, it looks like there are some more:

Maybe CVX just recently changed something, @Dymshyts @Violetta_Komar?
We need to fix it anyway, and this is one of the low-hanging fruits we discussed in the vaccine WG.
