Adding strength and form to the drug_exposure table

schuemie · February 26, 2015, 6:11am

We use RxNorm as the standard for drugs, and RxNorm encodes drug strength in its drug identifiers (e.g. concept ID 19041845 stands for ‘Carbocysteine 250 MG Oral Tablet’). However, especially outside of the US, many drugs are prescribed in a strength that does not exist in RxNorm. For example, in Japan clarithromycin is most often prescribed as 200mg tablets, and there is no corresponding code in RxNorm (only 50mg, 250mg, 500mg, and 1000mg tablets). As a consequence, we are forced to map to the ingredient level and lose the strength information, even though it is available and structured in the source data. In general, this makes the idea of deriving a dose_era table pretty useless: only part of the prescriptions of an ingredient will have an associated strength as encoded in RxNorm clinical (or branded) drugs, but for the most part the information is lost, and the loss is non-random.

To fix this, I suggest we copy some of the fields of the drug_strength table to the drug_exposure table:

amount_value
amount_unit_concept_id
numerator_value
numerator_unit_concept_id
denominator_unit_concept_id

And we’ll probably also need to add

dose_form_concept_id (indicating whether it’s a tablet, solution, cream, etc.)

This way, if we’re unable to map to a clinical drug, we can map to the ingredient and use these fields to keep all the relevant information. For drugs with multiple ingredients, we could use one row per ingredient.
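
A minimal sketch of what such a row could look like, written in Python only to make the shape concrete. The field names are the ones listed above; the concept IDs are made-up placeholders (not real vocabulary entries), and `person_id` is the only other column shown.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder concept IDs, invented for this sketch only.
CLARITHROMYCIN = 900001  # hypothetical ingredient concept
MILLIGRAM = 900002       # hypothetical unit concept
ORAL_TABLET = 900003     # hypothetical dose form concept


@dataclass
class DrugExposure:
    person_id: int
    drug_concept_id: int  # ingredient or clinical drug
    # Proposed optional fields copied from drug_strength:
    amount_value: Optional[float] = None
    amount_unit_concept_id: Optional[int] = None
    numerator_value: Optional[float] = None
    numerator_unit_concept_id: Optional[int] = None
    denominator_unit_concept_id: Optional[int] = None
    # Proposed additional field:
    dose_form_concept_id: Optional[int] = None


# The Japanese 200mg clarithromycin tablet: mapped to the ingredient,
# with strength and form kept in the new fields.
row = DrugExposure(
    person_id=1,
    drug_concept_id=CLARITHROMYCIN,
    amount_value=200.0,
    amount_unit_concept_id=MILLIGRAM,
    dose_form_concept_id=ORAL_TABLET,
)
```

The numerator and denominator fields stay empty here because a tablet has a fixed amount rather than a concentration.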

@Christian_Reich: what do you think? Something to add to CDM v5.1?

Christian_Reich · February 26, 2015, 8:29am

@schuemie:

Completely understood your problem. Can I make another suggestion? Let’s get the Japanese Drugs in and add them as pseudo-RxNorm, having all the right connections. Can you provide a table? We are doing the same thing with Multilex and going forward DM+D/Gemscript.

schuemie · February 26, 2015, 9:21am

My god, you want to take over from NLM?

I think we can share the Japanese drug codes (I’ll have to ask), but just out of curiosity: what does your suggestion achieve that mine doesn’t, except generate a lot of overhead?

Christian_Reich · February 26, 2015, 10:39am

I am the NLM.

Because we don’t want two separate ways to represent the same information. Standard database rules. The analytical tool would have to go to two different places to cobble together the strength information. You don’t want that. Nobody (apart from the two of us) would see through this.

Also: It isn’t that bad. I am building an automated script that if you have the information for each product of what the ingredients and their strengths are (in RxNorm lingua), it will automatically build up all the rest, including relationships to drug classes.

Patrick_Ryan · February 26, 2015, 12:06pm

I think Martijn’s proposal is something we should seriously consider for
the next CDM version. We’ve encountered numerous ex-US databases with
different doses and forms of the same ingredient, and I don’t think it’s
fair or reasonable to expect that the OHDSI vocabulary curators will be
able to accommodate, in a timely manner, all of the requests that this
would generate. Our experience has been that RxNorm ingredients have
very good (but not fully complete) coverage of western pharmaceutical
active ingredients, but can be missing a majority of dose/form combinations
used in other geographies. Allowing a data holder to populate the required
drug_concept_id with either an RxNorm ingredient or clinical drug (as is
our current convention) but then adding optional dose and form fields
would accommodate the various situations that have arisen. And, much like
with drug era logic, which requires knowing which optional fields are
populated, we could apply similar logic to the construction of the dose era
table to coalesce values in these new optional fields with dose/form information
that may be available within the drug_strength table when an RxNorm clinical
drug concept is applied. These fields would not break current conventions
or require any modification to existing tools, but would enable capture of
custom dose information, in a manner similar to what Charlie argued for
with effective dosing in CDM v5.
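
One way to picture the coalescing step, as a rough Python sketch. The order of preference (exposure row first, then drug_strength) is an assumption, as are the function name `resolve_strength`, the in-memory `DRUG_STRENGTH` lookup, and the placeholder concept IDs.

```python
from typing import NamedTuple, Optional


class Strength(NamedTuple):
    amount_value: Optional[float]
    amount_unit_concept_id: Optional[int]


# Hypothetical stand-in for the drug_strength table, keyed by drug concept ID.
# 800001 plays the role of a clinical drug; 900002 a unit concept.
DRUG_STRENGTH = {800001: Strength(250.0, 900002)}


def resolve_strength(
    drug_concept_id: int,
    exposure_amount_value: Optional[float] = None,
    exposure_amount_unit_concept_id: Optional[int] = None,
) -> Optional[Strength]:
    """Use the strength on the exposure row if present, else drug_strength."""
    if exposure_amount_value is not None:
        return Strength(exposure_amount_value, exposure_amount_unit_concept_id)
    # Returns None when the concept is an ingredient with no strength record.
    return DRUG_STRENGTH.get(drug_concept_id)
```

A dose era builder could call this once per exposure and skip rows where the result is `None`.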

Christian_Reich · February 26, 2015, 2:21pm

@Patrick:

Hang on a sec. There is a difference between the strength of the product and the intended or administered dose a patient receives from a provider. I would not mix them.

The mapping: As I said. We are building an automated tool. The steps are:

Format your drugs for input (product - name - ingredient(s) - strength(s) - formulation)
Create mapping RxNorm Ingredient - target ingredient (usually around a dozen weirdo compounds)
Create mapping RxNorm formulation - target formulation (usually a few dozen of them)
Run the script and issue new concepts and mappings to RxNorm if exists
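
The steps above can be sketched roughly as follows. This is not the actual script; it is a toy illustration in Python that reads the last step as "map to an RxNorm concept where an equivalent exists, otherwise issue a new one". All names, codes, and identifiers (`SourceProduct`, `INGREDIENT_MAP`, `FORM_MAP`, `EXISTING`, the `pseudo:` and `rxnorm:` strings) are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceProduct:
    code: str  # local product code
    name: str
    ingredients: Tuple[Tuple[str, float, str], ...]  # (local ingredient, strength, unit)
    formulation: str  # local formulation name


# Steps 2 and 3: hand-built mappings (contents are illustrative only).
INGREDIENT_MAP = {"clarithromycin (JP)": "clarithromycin"}
FORM_MAP = {"tab": "Oral Tablet"}

# Hypothetical index of drugs that already exist, keyed by (ingredients, form).
EXISTING = {
    ((("clarithromycin", 250.0, "MG"),), "Oral Tablet"): "rxnorm:existing-250",
}


def run(products: List[SourceProduct]):
    """Step 4: reuse an existing concept where possible, else issue a new one."""
    mappings: Dict[str, str] = {}
    new_concepts: List[str] = []
    for p in products:
        key = (
            tuple(sorted((INGREDIENT_MAP[i], s, u) for i, s, u in p.ingredients)),
            FORM_MAP[p.formulation],
        )
        target = EXISTING.get(key)
        if target is None:
            target = f"pseudo:{p.code}"
            EXISTING[key] = target
            new_concepts.append(target)
        mappings[p.code] = target
    return new_concepts, mappings


# Step 1: drugs formatted for input.
products = [
    SourceProduct("JP-1", "Clarithromycin 200mg tab", (("clarithromycin (JP)", 200.0, "MG"),), "tab"),
    SourceProduct("JP-2", "Clarithromycin 250mg tab", (("clarithromycin (JP)", 250.0, "MG"),), "tab"),
]
new_concepts, mappings = run(products)
```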

schuemie · February 27, 2015, 9:10am

@Christian_Reich: I see why you would say that, but it does seem like a lot of work if we want to become our own data curators.

It means that every time there is a data refresh, we’ll need to find out which drugs are new and add them to the OHDSI Encyclopedia Of All Drugs.

Meanwhile, for every refresh of RxNorm we’d need to move drugs from the pseudo-RxNorm to RxNorm vocabulary when appropriate. When that happens, do they get a new concept ID, or would we just give them the RxNorm identifier and vocabulary_id and keep the same concept ID?

I would think that just remembering there are two ways to find drug strength information in the CDM is a lot less work.

Christian_Reich · February 27, 2015, 10:02am

Correct, but if we want to use the rich and well-organized world of Concepts in the Drug domain we will have to do that. We can’t have a two-class society - the RxNorm drugs and the rest being junk.

[quote=“schuemie, post:7, topic:355”]
Meanwhile, for every refresh of RxNorm we’d need to move drugs from the pseudo-RxNorm to RxNorm vocabulary when appropriate. When that happens, do they get a new concept ID, or would we just give them the RxNorm identifier and vocabulary_id and keep the same concept ID?
[/quote]

If a product comes to the market in the US, but it already existed internationally and we had a pseudo-RxNorm concept for it, we will have to turn the latter into a non-standard concept and add a “Maps to” record in the concept_relationship table. With the next update of the ETL everything is gravy. That is what’s happening to SNOMED, for example, all the time.
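
A small Python sketch of that deprecation step, using dictionaries as toy stand-ins for the concept and concept_relationship tables. The concept IDs and the "pseudo-RxNorm" vocabulary label are placeholders; the `S` flag for standard concepts and the “Maps to” relationship follow the existing vocabulary conventions.

```python
# Toy stand-ins for the concept and concept_relationship tables.
concept = {
    700001: {"vocabulary_id": "pseudo-RxNorm", "standard_concept": "S"},
    700002: {"vocabulary_id": "RxNorm", "standard_concept": "S"},
}
concept_relationship = []  # rows of (concept_id_1, concept_id_2, relationship_id)


def supersede(old_id, new_id):
    """Make old_id non-standard and point it at new_id with 'Maps to'."""
    concept[old_id]["standard_concept"] = None
    concept_relationship.append((old_id, new_id, "Maps to"))


def to_standard(concept_id):
    """What an ETL refresh would do: follow 'Maps to' for non-standard concepts."""
    if concept[concept_id]["standard_concept"] == "S":
        return concept_id
    for c1, c2, rel in concept_relationship:
        if c1 == concept_id and rel == "Maps to":
            return c2
    return None


# The product reaches the US market and gets a real RxNorm concept.
supersede(700001, 700002)
```

Data already loaded with 700001 keeps working until the next ETL run, at which point `to_standard` would rewrite it to 700002.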

The alternative is a big mess, and you never know where you should look for information. A drug product can exist as a Concept or as a mish-mash of Ingredient concept with some dosing information in a data table. The latter is not controlled. For example, the strength you want to put into drug_exposure will be a mess: You got mg/dl, mg/ml, g%, vol%, all for the same product. The forms are equally messy. Do you want to normalize that stuff during data ETL?
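
To make the unit problem concrete, here is a rough Python sketch of what that normalization could involve. It assumes "g%" means weight per volume (grams per 100 mL), which is not always true in source data, and it deliberately refuses "vol%" because a volume fraction cannot be turned into a mass concentration without knowing the density. The function and table names are invented for the example.

```python
# Multipliers to convert a mass concentration to mg/mL.
# Assumption: "g%" is weight/volume, i.e. grams per 100 mL.
TO_MG_PER_ML = {
    "mg/ml": 1.0,
    "mg/dl": 1.0 / 100.0,  # 1 dL = 100 mL
    "g/dl": 10.0,          # 1000 mg per 100 mL
    "g%": 10.0,            # same as g/dL under the assumption above
}


def to_mg_per_ml(value, unit):
    """Convert a concentration to mg/mL, or raise if the unit is not convertible."""
    key = unit.strip().lower()
    if key not in TO_MG_PER_ML:
        raise ValueError(f"cannot convert {unit!r} to mg/mL")
    return value * TO_MG_PER_ML[key]
```

Every source system that records strength would need a table like this, which is the ETL burden in question.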

Chris_Knoll · February 27, 2015, 3:54pm

What if we just add a ‘prescribed strength’ as a % of the actual drug exposure row to handle cases where the prescriber says ‘take only half this tablet’ or ‘I’m giving you a double strength shot of this exposure’? You wouldn’t really override the dose form, would you? Unless they took a tablet, smashed it up into a paste, and then applied it to the skin? Is that one of the use cases we’re thinking about when specifying the dose form in the drug exposure?

On the other hand, reading more closely, Martijn seems to be talking about exposures that are defined in regions that know nothing of US standards. On the other other hand, standardization is what the CDM is all about so yes, @Christian_Reich, we do want to normalize that stuff during the ETL so that we all are talking in the same ‘language’.

I do not think adding new concepts to a vocabulary based on new dosage procedures we see in the wild is the right way to go…

Christian_Reich · March 2, 2015, 2:41pm

Guys:

You are implying that this is a new idea. But we already have that! Multilex is exactly that. About 25% of the products don’t exist in that form in the US. For example, in the UK they sell Nicorettes not just as patches and tablets, but also as inhalers (so it feels more like a cigarette). So, the ingredient is the same, but there is a combination with a form that doesn’t exist in the US. So, all we do is add that product, and the relationships to the form (solution for inhalation or so) and the strength. This really is not rocket science.

If we don’t do that, we have well-defined products with relationships to forms, drug_strength and drug classes, and the “ugly” ones which only exist as ingredients. I just did a query at AZ for inhalant steroids (we have a product like that), and it would have been impossible to run this in CPRD without it.

Vojtech_Huser · March 2, 2015, 3:52pm

To provide another voice - I kind of agree with Patrick that we should consider Martijn’s proposal to extend the drug table seriously.

Updating the vocabulary takes time.
How many releases would we realistically have per year?
Managing all combinations for several world countries is a big task.

schuemie · March 4, 2015, 1:09am

@Christian_Reich: why would it be a ‘big mess’? We’d be coding ingredients, drug form and units using concepts in the vocabulary.

I agree that having two ways to code the same thing might be a hassle, so let me go in the complete opposite direction: why don’t we demote RxNorm clinical drugs and branded drugs from standard to source concepts, and code everything as I suggested, including data where the source has RxNorm clinical drugs?

This would make life a lot easier for people doing an analysis. Instead of first having to query the vocabulary for what they’re looking for (‘give me all drugs with this ingredient, with this strength and form…’), they can query the CDM directly.
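
As a rough illustration of "query the CDM directly", here is a self-contained Python sketch using an in-memory SQLite table. The table holds only the columns needed for the example, and all concept IDs are made-up placeholders (900001 for the ingredient, 900002 for the unit, 900003 and 900004 for two dose forms).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE drug_exposure (
        person_id              INTEGER,
        drug_concept_id        INTEGER,  -- always the ingredient
        amount_value           REAL,     -- proposed field
        amount_unit_concept_id INTEGER,  -- proposed field
        dose_form_concept_id   INTEGER   -- proposed field
    );
    INSERT INTO drug_exposure VALUES (1, 900001, 200, 900002, 900003);
    INSERT INTO drug_exposure VALUES (2, 900001, 250, 900002, 900003);
    INSERT INTO drug_exposure VALUES (3, 900001, 200, 900002, 900004);
    """
)

# "All exposures to this ingredient, with this strength and form",
# answered without first looking anything up in the vocabulary tables.
rows = conn.execute(
    """
    SELECT person_id
    FROM drug_exposure
    WHERE drug_concept_id = ?
      AND amount_value = ?
      AND amount_unit_concept_id = ?
      AND dose_form_concept_id = ?
    """,
    (900001, 200, 900002, 900003),
).fetchall()
```

With clinical drugs as standard concepts, the same question first needs a vocabulary lookup to collect every drug concept with that ingredient, strength, and form.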

Patrick_Ryan · March 4, 2015, 1:23am

This is a very interesting proposal. It’d be good to think more about the
consequences. For example, how would we handle combination products?

Christian_Reich · March 4, 2015, 4:11am

@schuemie:

It is an interesting proposal indeed. The advantages are big: No matter how you get the information (from an NDC with strength and form built in, or from a prescription order system with those separate), we could easily handle the difference. We would also be on top of situations where we only got the ingredients, or only 2 out of the 3 components (we need two extra concept_classes for those today).

The main disadvantage would be that we would lose the concept of a product. We couldn’t ask the question “In what forms is Nicorette sold in the UK vs. the US?” We couldn’t do product-based queries in general. We also would lose combination products. We would push the normalization of 2 of the 3 components to the ETL (when is a “solution for injection” the same as a “prefilled syringe”, is g% the same as g/dl?). We would need a special type of mapping to combinations of units, amounts and forms.

At any rate. Nothing will happen before this release.

schuemie · March 4, 2015, 8:23am

Combination products: Add an extra field that links ingredients of the same product (every exposure to a product gets a unique identifier that links across rows). Let’s call this field exposure_id.

Brands: Add a brand field (e.g. ‘Nicorette’)

You could then answer your Nicorette question (although I guess you need quite a complex SQL statement with a pivot).
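
A small Python sketch of how those two fields could be used, with plain dictionaries standing in for drug_exposure rows. Ingredient and form are shown as readable strings instead of concept IDs, and the data is invented for the example.

```python
from collections import defaultdict

# One row per ingredient; rows of the same product share an exposure_id.
rows = [
    {"exposure_id": 1, "ingredient": "nicotine", "form": "Patch", "brand": "Nicorette"},
    {"exposure_id": 2, "ingredient": "nicotine", "form": "Inhaler", "brand": "Nicorette"},
    {"exposure_id": 3, "ingredient": "amoxicillin", "form": "Oral Tablet", "brand": None},
    {"exposure_id": 3, "ingredient": "clavulanate", "form": "Oral Tablet", "brand": None},
]

# Reassemble products: the set of ingredients behind each exposure_id.
products = defaultdict(set)
for r in rows:
    products[r["exposure_id"]].add(r["ingredient"])

# The Nicorette question: which forms does each brand appear in?
forms_by_brand = defaultdict(set)
for r in rows:
    if r["brand"] is not None:
        forms_by_brand[r["brand"]].add(r["form"])
```

Running this separately on a UK and a US database and comparing `forms_by_brand["Nicorette"]` would give the UK vs. US answer; in SQL the grouping step is where the pivot comes in.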

Chris_Knoll · March 4, 2015, 6:59pm

Perhaps we can add a product column (product_concept_id) which can link to a new domain vocabulary ‘Product’ which contains lists of known products. Then you can associate a drug exposure with the product that it originated from. In cases where the drug exposure came from just a raw ingredient, the product_concept_id could be null.

We could also introduce a product_detail vocabulary table that allows you to dig in deeper to the construction of a product by ratio of ingredients or something (which would be concept IDs linking back to Drug).

I’m not a big fan of storing a text value as an indicator, that’s my only concern.

-Chris

Christian_Reich · March 5, 2015, 3:13pm

Friends:

What problem are we trying to solve? If the problem is the addition of a new international drug database - we got that covered, or we will. It will be very easy. The only job we will have to do manually is the mapping of forms and ingredients, and we have to do that no matter what. To push all the complexity from the vocabulary to the data tables, making them a lot more difficult to understand, with a lot of new fields and rules, makes very little sense to me.

Martijn: Do you have a specific international drug vocabulary you need to cover?

Patrick_Ryan · March 5, 2015, 3:38pm

I think the issue isn’t producing a complete list of active ingredients
(though we are seeing some missing in RxNorm)…the primary issue we’re
trying to address is that there are a wide array of strengths and
formulations for the same set of ingredients, and it would be difficult to
maintain a universal set.

Many of the collaborators we’re working with in the Asia-Pacific region
have local codes to represent their drugs, and we’ve been unable to find
any mappings (though some had manually created mappings to ATC
ingredients), so we’ve been using Usagi and other approaches to build the
mappings. That’s where this issue of unavailable strength/form concepts
has really come to a head.

Brandon_Ulrich · March 5, 2015, 4:34pm

The size of some of the ontologies and drug dictionaries might also be larger than you expect. For example, in Singapore, their drug dictionary alone is much larger than SNOMED CT, while their national extension contains additional ingredients, dose forms, etc. not present in the international release (see below). It seems to me that this would be quite a maintenance burden given the frequency of drug product updates.

Christian_Reich · March 5, 2015, 6:36pm

You would have to have

a local code
the ingredient mapped to RxNorm (or an additional pseudo-RxNorm)
the strength of that ingredient
the above for all ingredients
the form mapped to RxNorm forms
to create the content of the table. Correct?

If you have those, you have all you need to update the vocabulary. It’s the same. Except, you have the advantage to know what products exist on a certain market, which has proved very useful.

I really don’t see the point in adding those to the CDM tables.

Same in SNOMED UK. I wouldn’t worry about the number of codes.