
Drug Dose: Fixes required to allow a standardized calculation - Call for input

Community:

Some of our use cases involve drug dose, which is the amount of active ingredient per day or per exposure period. Currently we have no good way to calculate it, even though all the information should be available in DRUG_EXPOSURE and the corresponding reference tables CONCEPT and DRUG_STRENGTH. Folks keep bringing this up. A few of us (@tburkard, @mdewilde, @aostropolets, @Klaus, @MPhilofsky et al.) have been discussing a potential solution to this issue and would like to bring it to the community.

Read on if you are interested. If you do not care about drug dose, this is unlikely to affect your use cases.


Problem Space

Drugs are not administered to patients as pure ingredients. Instead, they are formulated into drug products with various dose forms, such as tablets and solutions for injection. Therefore, doses mostly do not occur directly in observational data. Instead, they can be calculated from three parameters: (i) how much drug product was provided to the patient, (ii) during what exposure time, and (iii) how much active ingredient is in that product. In the OMOP CDM, the first two live in DRUG_EXPOSURE and the third in DRUG_STRENGTH.

So, what’s the problem?

  1. Drug product over time - that information is spread over three fields: quantity, days_supply and sig. You only need two out of the three as days_supply = quantity/sig. But they have problems:
    • Days_supply is straightforward, except it is not defined for one-off administrations.
    • Quantity is overloaded, meaning, its definition is conditional on information in another field, in this case drug_concept_id. It can mean:
      • The number of fixed size elements of mostly solid drug formulations (e.g. tablets, inhaler puffs),
      • The volume of a mostly liquid divisible drug formulation, or
      • The weight or volume of an ingredient, depending whether it is solid or liquid.
    • Quantity also collides with the quantity information (denominator_amount in DRUG_STRENGTH) for Quantified Drugs. Likewise, it collides with the box size information (only in RxNorm Extension).
    • Sig is not computable. It’s free text. We have gotten away with that only because many databases provide quantity and days_supply.
    • Drug Strength is defined for all drug concept classes except Drug Forms.
  2. The representation of the drug forms is not explicitly defined:
    • Solid fixed formulation drugs is what is implicitly assumed.
    • Liquid drugs can be confusing because, apart from the concentration, they can have a flask volume and a number of flasks (e.g. Quantified Branded Box 1.6 ML Filgrastim 0.3 MG/ML Injection [Neupogen] Box of 20). Alternatively, some of them lack volume and concentration and are handled like fixed formulations (e.g. Clinical Drug leuprolide 42 MG Prefilled Syringe).
    • For slow release drugs (e.g. patches), the intended duration is currently treated as a denominator, similar to volume for liquids. But it does not work that way: if you leave a patch on the arm twice as long, it will elute more drug, but not twice as much.
    • For packs, which are combinations of drug products at fixed amounts, the calculation of their relative contribution is not defined.
    • Drug pumps, which elute a certain amount of ingredient over time, are not defined.
    • More exotic drug forms, some of them even living in other domains, are not covered at all: chemotherapy regimens, stents, radiopharmaceuticals, homeopathics, etc.

For these reasons, it is impossible to create a standard script that will produce daily or total dose per ingredient. For the same reason, nobody has filled the DOSE_ERA table.

If we want to fix this we need to solve the following problems:

  1. Fix the definition of days_supply for single administrations.
  2. Fix the definition of quantity and consolidate all quantity information in one place.
  3. Get the relevant information out of sig and model it accordingly.
  4. Clean up how all drug forms are represented correctly.

Proposal

Drug doses for all exposures will be calculated by simple formulas:

total dose = quantity * drug strength for each ingredient
daily dose = quantity * drug strength for each ingredient / days supply
or
daily dose = sig * drug strength for each ingredient

This will work if:

  • For drugs formulated into fixed elements (solids, denominator unit in DRUG_STRENGTH = NULL), quantity contains the number of these elements. Drug strength already contains the amount of ingredient per element.
  • For divisible drugs (liquids, denominator unit in DRUG_STRENGTH is not NULL), quantity should contain the volume or other measure that can be divided arbitrarily. Drug strength contains the concentration (numerator, denominator).
  • For combination drugs, these formulas must be applied to each ingredient separately, resulting in multiple records.
  • For drug records provided as ingredient as well as Clinical/Branded Drug Form, quantity should contain the amount of ingredient and drug strength = 1.
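
As a minimal sketch of these formulas (not an agreed implementation), the Python below applies them per ingredient. The `Strength` class and the function names are hypothetical; `value` stands for the amount per element or the concentration numerator, and the order in which `days_supply` and `sig_amount` are tried is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Strength:
    """One ingredient of a drug product (hypothetical, simplified DRUG_STRENGTH row)."""
    ingredient: str
    value: float                     # amount per element, or concentration numerator
    denominator_unit: Optional[str]  # None = fixed element, e.g. "mL" = divisible


def total_dose(quantity: float, strengths: List[Strength]) -> Dict[str, float]:
    """total dose = quantity * drug strength, computed for each ingredient."""
    return {s.ingredient: quantity * s.value for s in strengths}


def daily_dose(strengths, quantity=None, days_supply=None, sig_amount=None):
    """daily dose = quantity * strength / days_supply, or sig_amount * strength."""
    if quantity is not None and days_supply:
        totals = total_dose(quantity, strengths)
        return {name: dose / days_supply for name, dose in totals.items()}
    if sig_amount is not None:
        return total_dose(sig_amount, strengths)
    return None  # not enough information to derive a daily dose


# 30 capsules of acetaminophen 300 MG Oral Capsule, supplied for 10 days
capsule = [Strength("acetaminophen", 300.0, None)]
print(total_dose(30, capsule))                           # {'acetaminophen': 9000.0}
print(daily_dose(capsule, quantity=30, days_supply=10))  # {'acetaminophen': 900.0}
print(daily_dose(capsule, sig_amount=3))                 # {'acetaminophen': 900.0}
```

For a combination drug, passing several `Strength` rows yields one result per ingredient, which corresponds to the multiple records mentioned above.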

It is worth mentioning that this will give us the maximal possible dose. Obviously, we know that patients show less than 100% adherence and the doctors sometimes oversupply, especially for “on demand” drugs, like asthma medication. Methods of establishing the actual dose exist but are not addressed here.

To get us there, we propose the following changes to the OMOP CDM, Themis conventions and Vocabularies.

Changes to the OMOP CDM

Changes to DRUG_EXPOSURE:

# | Solution | Solves the problem | Note
1 | Add field sig_amount | not computable sig | Just like quantity, it should contain the amount of drug product, but per day. For example, a text “3 tablets per day” will result in sig_amount=3. “2 tablespoons twice a day” will result in sig_amount=20, provided the denominator unit is mL and a tablespoon is 5 mL. The reference duration is always one day, and therefore there is no need for an equivalent to days_supply. This change will require NLP, but the LLMs should have no problem with that.
2 | Rename sig to sig_source_value | not computable sig | It contains verbatim text and should not be something a standardized analytic ever uses.
3 | Rename quantity to total_amount | quantity definition | This change will make sure ETLers fill in the correct value, instead of just copying and pasting from a quantity field in the source. This change is optional.
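
To illustrate what sig_amount would hold, here is a small sketch that assumes the free-text sig has already been parsed into numbers by some upstream NLP step, which is not shown. The `ParsedSig` structure and its field names are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class ParsedSig:
    """Hypothetical structured form of a free-text sig, e.g. produced by an NLP step."""
    amount_per_administration: float  # e.g. 1 (tablet) or 10 (mL)
    administrations_per_day: float    # e.g. 3 for "three times a day"
    unit_factor: float = 1.0          # converts the sig unit into the denominator unit


def sig_amount(sig: ParsedSig) -> float:
    """Amount of drug product per day, in elements or in denominator units."""
    return sig.amount_per_administration * sig.administrations_per_day * sig.unit_factor


print(sig_amount(ParsedSig(3, 1)))                   # "3 tablets per day" -> 3.0
print(sig_amount(ParsedSig(10, 3)))                  # "10 mL three times a day" -> 30.0
print(sig_amount(ParsedSig(1, 3, unit_factor=5.0)))  # one 5 mL spoon, three times a day -> 15.0
```

Keeping the two factors separate until the final multiplication would also preserve the "how many times a day" information, should the community decide it matters.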

Changes to DRUG_STRENGTH:

# | Solution | Solves the problem | Note
4 | Merge amount and numerator value and unit field pairs | quantity definition | This would combine amount_value with numerator_value and amount_unit_concept_id with numerator_unit_concept_id. Currently, the contents of the amount and numerator field pairs are mutually exclusive, indicating whether a drug is a fixed element or a divisible formulation. But we already know this from the denominator unit (the amount might also be empty). This change will make things easier, removing one coalesce() function and avoiding internal contradictions, without losing any information. However, this change is optional.
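
A rough sketch of what the merge amounts to, using the existing DRUG_STRENGTH column names as dictionary keys. The output keys `value` and `unit_concept_id` are hypothetical names for the merged pair, and the unit concept IDs in the example are placeholders, not real concept IDs.

```python
def merged_strength(row: dict) -> dict:
    """Collapse amount_* and numerator_* into one value/unit pair (hypothetical names)."""
    value = row.get("amount_value")
    unit = row.get("amount_unit_concept_id")
    if value is None:  # divisible formulation: the numerator pair is filled instead
        value = row.get("numerator_value")
        unit = row.get("numerator_unit_concept_id")
    return {
        "value": value,
        "unit_concept_id": unit,
        "denominator_unit_concept_id": row.get("denominator_unit_concept_id"),
    }


def is_divisible(merged: dict) -> bool:
    """The denominator unit alone tells fixed elements and divisible drugs apart."""
    return merged["denominator_unit_concept_id"] is not None


MG, ML = 1, 2  # placeholder unit concept IDs, for illustration only

tablet = merged_strength({"amount_value": 300, "amount_unit_concept_id": MG})
solution = merged_strength({
    "numerator_value": 0.3,
    "numerator_unit_concept_id": MG,
    "denominator_unit_concept_id": ML,
})
print(tablet["value"], is_divisible(tablet))      # 300 False
print(solution["value"], is_divisible(solution))  # 0.3 True
```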

Changes to Themis Conventions

To support the ETL process, the following should be well documented and enforced:

# | Solution | Solves the problem | Note
5 | days_supply = NULL or 1 for one-off administrations | days_supply definition | We just need to set the convention.
6 | quantity = total number of individual elements for fixed element drugs | quantity definition | This is most common for solid drugs such as tablets and capsules, but it could also be used for powders for solutions or other liquid substances that are formulated to be used all at once (e.g. Clinical Drug leuprolide 42 MG Prefilled Syringe). The quantity can be obtained from the source (often from a field called “quantity”) or from the box size in DRUG_STRENGTH. E.g., a prescription of 30 capsules of acetaminophen 300 MG Oral Capsule would get 30 in the quantity field, and a record of Acetaminophen 1000 MG Oral Capsule [Paracetabs] Box of 10 would receive 10. The quantity must be an integer, with the exception of simple fractions (½, ⅓, ¼, written as 0.5, 0.333, 0.25).
7 | quantity = total volume or other measure for divisible drug formulations | quantity definition | The volume can be taken from the source data, or from the denominator in DRUG_STRENGTH. This is most commonly measured in milliliters, but there are also actuations (puffs) of inhalers, square centimeters of patches, units and liters, which work the same way. If there is more than one unit in the product (e.g. several flasks or prefilled syringes), the volume should be denominator * box size. E.g. a prescription of 1.6 ML Filgrastim 0.3 MG/ML Injection [Neupogen] Box of 20 would result in 1.6*20=32 for the quantity field.
8 | quantity = total amount for ingredients | quantity definition | Ingredients are measured mostly in milligrams, but there are also cells, milliliters, international units etc., which are defined in DRUG_STRENGTH.
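
Conventions 6 and 7 could be sketched for the ETL roughly as follows. The function name and its parameters are hypothetical, and the rule that a value from the source always takes precedence over the reference data is an assumption of this sketch.

```python
from typing import Optional


def quantity_for_record(
    source_quantity: Optional[float],
    denominator_value: Optional[float],
    box_size: Optional[int],
    divisible: bool,
) -> Optional[float]:
    """Derive DRUG_EXPOSURE.quantity from source data, falling back to DRUG_STRENGTH."""
    if source_quantity is not None:
        return source_quantity  # assumption: source information wins
    if divisible:
        if denominator_value is None:
            return None  # no volume known anywhere
        # total volume = denominator * box size (a missing box size counts as 1)
        return denominator_value * (box_size if box_size is not None else 1)
    # fixed elements: the box size is the number of elements, if known
    return float(box_size) if box_size is not None else None


# 30 capsules of acetaminophen 300 MG Oral Capsule, count taken from the prescription
print(quantity_for_record(30, None, None, divisible=False))
# Acetaminophen 1000 MG Oral Capsule [Paracetabs] Box of 10, nothing in the source
print(quantity_for_record(None, None, 10, divisible=False))
# 1.6 ML Filgrastim 0.3 MG/ML Injection [Neupogen] Box of 20: 1.6 * 20
print(quantity_for_record(None, 1.6, 20, divisible=True))
```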

Changes to the Vocabularies

These changes are substantial: they affect the RxNorm and RxNorm Extension vocabularies and their hierarchy. But they might make the whole system simpler and more concise.

# | Solution | Solves the problem | Note
9 | The time denominators (12 hours, 24 hours) of slow release drugs become part of the dose form | quantity definition | In contrast to drug pumps, the release of slow release formulations is not linear over time and can therefore not be treated like divisible amounts of drug. For example, 24 HR gabapentin 300 MG Extended Release Tablet will change to “gabapentin 300 MG 24-HR Extended Release Tablet”.
10 | All liquid Quantified classes are destandardized and mapped over to their Clinical or Branded counterpart | quantity definition | This will remove the denominator value from DRUG_STRENGTH and put it into the quantity field, unless it is overwritten with better information from the source. For example, Quant Branded Drug 1.6 ML Filgrastim 0.3 MG/ML Injectable Solution [Neupogen] will be mapped to Branded Drug filgrastim 0.3 MG/ML Injection [Neupogen], and Quantified Branded Box 1.6 ML Filgrastim 0.3 MG/ML Injection [Neupogen] Box of 20 goes to Branded Drug Box Filgrastim 0.3 MG/ML Injection [Neupogen] Box of 20. This change is desirable, since it will make the drug hierarchy much more concise, but at the end of the day it is optional.
11 | Add Drug Form records to DRUG_STRENGTH | drug_strength definition | Currently, we have records for all drug concept classes except Clinical/Branded Drug Form. But these should be treated like ingredients, providing the unit of the ingredient (mostly mg, sometimes mL or more exotic units).

Open issues

There are a number of additional issues, the solution for which we could decide to adopt or leave alone:

  • Right now, we have quantity (maybe renamed total_amount) and sig_amount. Alternatively, we could name them exposure_amount and daily_amount.
  • With the above definitions for quantity (maybe renamed total_amount), it still can mean 3 different things. We could decide to not overload it by splitting it up into _of_doseform, _of_denominator and _of_ingredient. Same thing for sig_amount. Doing so we avoid the ugly “case when then” clauses, but we would add six new fields.
  • We might have simplified sig_amount too much. “Two tablespoons twice a day” indeed is 4 tablespoons a day, but we are throwing away the fact that it is 2*2. How much do we care about that?
  • What do we do with Packs? Contraceptives have packs with 21 products, 7 containing estrogen and 14 with progesterone. Now what?
  • When we resolve the sig, what do we do with “as needed”? Similarly, do we need to encode “up to”?
  • We also need to solve the problems of drug pumps, and we have a proposal, but let’s table that for now.
  • We also need to solve the problem for the exotic drugs, but let’s table that for now.

This would be major surgery indeed, but we believe that if we get it right we will have solved the problem, and we will be the only global initiative to achieve that.

What do you think? Anything we forgot? Anything that won’t work?


@Christian: Many thanks for bringing this up again and for the excellent summary.
I would like to share how I have handled this in the past, which is pretty similar to what you suggested.

The biggest problem was that the “quantity” field in the “drug_exposure” table refers to something undefined or, in the best case (after our proposal), to different things depending on the drug_concept_id.

My approach was to define that quantity always refers to the (entire) entry in the “drug_strength” table.

To be more precise, the formula for total dose should be:

Total_dose = quantity * amount_value * box_size

The daily dose can be derived as total dose / days_supply. However, as you clearly pointed out, this depends on the patient’s compliance, and this topic is out of scope for the time being.

So, what needs to be done in order to enable the simple formula above?

Because we need the drug_strength table, drugs need to be mapped to standardized concepts and we need the dosage information; if we only have a concentration without specifying the total amount, the denominator of the concentration must equal the total amount.

I worked on the German market, so some of this may not transfer to other markets, but all the examples listed in the notes work under the following assumptions:

  • Your proposal #4: Combination of amount_value and numerator_value into one single field.
  • Your proposal #5: Days_supply is 1 for a one-off administration. This was always the case.
  • Additional proposal: box_size will be set to 1 in case it is NULL

Let’s look at the examples:

  1. 1.6 ML Filgrastim 0.3 MG/ML Injection [Neupogen] Box of 20
    The amount_value in the drug_strength table is 0.48 and box_size is 20. The formula can be applied.
    Quantity refers to the entry, i.e. the entire box of 20 flasks.
    Proposal #7 is not needed.

  2. leuprolide 42 MG Prefilled Syringe
    The amount_value in the drug_strength table is 42, box_size was empty but replaced with 1. The formula can be applied.
    Quantity refers to the entry, i.e. one syringe.

  3. acetaminophen 300 MG Oral Capsule
    The amount_value in the drug_strength table is 300, box_size was empty but replaced with 1. The formula can be applied.
    Quantity refers to the entry, i.e. one capsule.

  4. Acetaminophen 1000 MG Oral Capsule [Paracetabs] Box of 10
    The amount_value in the drug_strength table is 1000, box_size is 10. The formula can be applied.
    Quantity refers to the entry, i.e. one box containing 10 capsules.

  5. 1.6 ML Filgrastim 0.3 MG/ML Injectable Solution [Neupogen]
    Because this is not a standard concept, we don’t have an entry in the drug_strength table. The proposed mapping in proposal 10 to filgrastim 0.3 MG/ML Injection [Neupogen] would result in the amount value being lost, so that the derivation of the dosage can’t be achieved. So, we need a mapping similar to that in the first example where the amount_value is 0.48.
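
As a sketch of this approach, the snippet below applies Total_dose = quantity * amount_value * box_size to examples 1 to 4, with a NULL box_size replaced by 1. The function name is hypothetical; the amount values are the ones quoted in the examples, and quantity counts whole drug_strength entries.

```python
from typing import Optional


def total_dose_per_entry(quantity: float, amount_value: float,
                         box_size: Optional[int]) -> float:
    """Total_dose = quantity * amount_value * box_size, treating a NULL box_size as 1."""
    return quantity * amount_value * (box_size if box_size is not None else 1)


# 1. One box of 20 filgrastim flasks, 0.48 mg per flask
filgrastim_box = total_dose_per_entry(1, 0.48, 20)
# 2. One leuprolide 42 MG Prefilled Syringe
leuprolide = total_dose_per_entry(1, 42, None)
# 3. 30 capsules of acetaminophen 300 MG Oral Capsule
acetaminophen = total_dose_per_entry(30, 300, None)
# 4. One box of 10 capsules of acetaminophen 1000 MG
paracetabs_box = total_dose_per_entry(1, 1000, 10)

print(filgrastim_box, leuprolide, acetaminophen, paracetabs_box)
```

The results are in the unit of amount_value, here milligrams: roughly 9.6, 42, 9000 and 10000.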

Maybe I am overlooking something here, but we may not need proposals 6 to 8 if we use the box_size information as proposed above?

Right, that’s the plan. Except “quantity” isn’t only ingredient amount, as you pointed out, and cannot be forced that way. Look: some of the information is in the reference: amount, concentration, total volume, box size. But other information is coming from the source data: administered or filled (from prescription) element numbers, size of bolus of injection, etc. And the distribution of this information depends on the drug, where it is handed out (by provider or through pharmacy), institution (what their EHR is capturing) and country (whether you have standard box sizes).

As a result, it is the task of the poor ETLer to collect that information from both reference and data records and consolidate. What we need to do is to say what we want in quantity and what we provide in DRUG_STRENGTH.

I know some of them already are in place, but we have to make them very explicit.

Correct. Except you are missing the data in the records. For 1), everything seems to be in the reference. For 2), that’s probably a single injection, but maybe it is an every-other-day thing over two weeks (quantity = 7 and days_supply = 14). For 3), there is a quantity coming from the prescription (in the US, box sizes are often not standardized; the pharmacist grabs the tablets from a big keg and puts them into the yellow plastic flask). 4) is probably OK as is. 5) also needs information from the data.

Thanks @Christian_Reich for the proposal. Further standardising, and documenting, the drug dose would massively help.

I only have one comment, on point #3 (renaming quantity to total_amount). A new name would help bring this change to the attention of ETL developers. However, since renaming is a breaking change, I would propose adding a new field instead. If we follow semantic versioning, a rename would give us, God forbid, OMOP v6.

For point #10, just to confirm, will this also include inhalers / pen injectors? e.g. 200 ACTUAT albuterol 0.09 MG/ACTUAT Metered Dose Inhaler?

I like these alternative names (exposure_amount and daily_amount). And if I understood correctly, these will always have the same unit (e.g. pieces, mg, ml), so it makes sense to name them similarly.

I don’t see where the current proposal introduces case when then. It seems very elegant, the amount in drug_exposure will always be the ‘multiplier’ of the numerator/amount_value in drug_strength.

Related: it would be awesome if Athena displays the information in the drug strength table. I will add it as a feature request on Github.

That’s exactly the idea.

Makes sense.

Yes, Sir. The 200 shots would have to go into the quantity or total_amount. We can live without #10. @mdewilde would send us hate mail, and for cause, but I am not sure the Vocab team has the bandwidth to refactor the entire drug machinery right now.

Agreed, and yes. One is for the total exposure, the other one for the day. Of course, you wouldn’t need both if you reliably had days_supply, but we don’t.

I know. The dose formula doesn’t care, the result will be an ingredient amount one way or another. Still, from a religiously clean perspective on data modelling you would split them up since the information is different. I am not married to either solution. Clean is good, a concise model is good. Let’s see what the crowd says.

Agreed 100%. We need the resources.

@Christian_Reich: Me sending you hate mail??? Haha… Where is that coming from? I know that during our very long meeting on this topic we were both trying pretty hard to get our points across, and it was pretty intense at some points. No problem at all. I actually love these kinds of passionate discussions: taking the time to really get all the way down to the core of the topic. At the end we had this nice proposal that we were pretty proud of. So, instead of the hate mail you expected, let me give you compliments and thank you for how you worked it out, resulting in this post!!!

Regarding #10:

Not only for “liquid Quantified classes” and “actuats”, I believe we can simply do this for all the quant drug classes. Since the idea is to store the total amount in the drug_exposure.total_amount, the box classes are also redundant. If we want to do this consistently, we also have to destandardize the quant and also the box levels. This makes the whole construction even more explicit and easier to handle. I believe that was also the conclusion at the end of our meeting in Rotterdam (simpler and more powerful at the same time).

Some of the benefits when having less of the drug classes:

  • easier, cleaner and more exact total amounts in ETL (at least for our own ETL).

  • fewer ways to store the same total amount of an exposure, with far fewer concepts needed

  • drug vocab will be easier to understand and explain to ETL builders/concept mappers and researchers how to deal with dosages

  • less maintenance for the vocab team (about 1/3 !!! of the standard drug concepts are on quant and box level)

  • less need for having RxNorm extensions. If a box of 6 with 123 actuat ml bottles is missing, we no longer need a new RxNorm extension if the (branded) clinical drug is already

  • ancestor table will be reduced enormously (bonus)

  • removing ALL quant and box levels will probably make it easier to change this in “the boiler” … (giving back some of the resources to make the change…)

  • …

If we are changing OMOP things like above (breaking some things), I think we must take this opportunity to combine this with other OMOP CDM/Vocab/Themis changes and improvements that are in the community pipeline. I picked up something about concept mappings with a factor (which could be used for the quant and box level mappings to clinical drugs).

Btw: I hope that because of the table changes we move to OMOP CDM v7 (as a bonus we can get rid of the whole v6 confusion).

Of course, the resources are limited (also for the data partners to get the necessary changes implemented). If the next OMOP release has enough improvements, there will also be more motivation to make these changes. “No pain no gain”.

I understood that “The Boiler” has become a very complex piece of machinery that is difficult to maintain and/or change. Changing some classes to non-standard looks(!!) like an easy change. As an escape, it could maybe be done as a post-processing step after the Boiler (remove standard concept flags, add extra maps_to relations, and regenerate the ancestor table). If this proposal gets enough support, I hope #10 can be considered to make this possible.

Intermediate solution: If #10 cannot be done easily or in time in the vocab, we can also use it as a Themis rule (not allowed to use quant and box level drugs) for the time being. DQD can have an extra check on this as well. Then we can start with this in the next OMOP CDM release, and the change in the vocab can be done at a later stage.

As you know, I’m always available to think along…
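
Such a check could look roughly like the sketch below. It is not an existing DQD check, the function name is hypothetical, and the concept class names are assumed to match the concept_class_id values in CONCEPT, so they should be verified against the current vocabulary before use.

```python
# Concept classes the interim Themis rule would disallow (assumed names, please verify).
QUANT_AND_BOX_CLASSES = {
    "Quant Clinical Drug", "Quant Branded Drug",
    "Clinical Drug Box", "Branded Drug Box",
    "Quant Clinical Box", "Quant Branded Box",
}


def flag_quant_and_box(exposures, concept_class_by_id):
    """Return the drug exposure records whose drug_concept_id is a quant or box level concept."""
    return [
        record for record in exposures
        if concept_class_by_id.get(record["drug_concept_id"]) in QUANT_AND_BOX_CLASSES
    ]


# Placeholder concept IDs, for illustration only
concept_class_by_id = {101: "Clinical Drug", 102: "Quant Branded Box"}
exposures = [
    {"drug_exposure_id": 1, "drug_concept_id": 101},
    {"drug_exposure_id": 2, "drug_concept_id": 102},
]
print(flag_quant_and_box(exposures, concept_class_by_id))
# [{'drug_exposure_id': 2, 'drug_concept_id': 102}]
```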

Cheers from The Netherlands

We should also add a drug_dose field. It pairs well with Drug_Exposure.dose_unit_source_value.

This will lower the entry bar and lessen the burden for those with EHR data, since many EHRs contain a field for drug dose. It’s also a very simple solution for implementers and for end users.