Mapping ATC/Clinical Drug concepts to RxNorm Ingredient for drug_era

jlieverse · April 15, 2026, 12:21pm

Hello OHDSI community,

We are building an ETL to transform medication administration records into OMOP drug_exposure.

In our source table, we have, among other things, ATC codes. OHDSI provides ATC to RxNorm ingredient mappings which can be used in the drug_exposure table.

Let’s assume the example of having concept abacavir – 300 mg – J05AF06 in the source.

In drug_exposure we do not have an “amount_value” field because this is derived from drug_strength. It does not work for RxNorm Ingredient concepts (i.e., abacavir - 1736971) since such concepts do not have a value for “amount_value”, hence we cannot preserve the dosage information attached to the drug (i.e., 300 mg)

On the OHDSI OMOP Github for the drug_concept_id field in drug_exposure table, it is mentioned that the preferable hierarchy is “Marketed Product, Branded Pack, Clinical Pack, Branded Drug, Clinical Drug, Branded Drug Component, Clinical Drug Component, Branded Drug Form, Clinical Drug Form and only if no other information is available Ingredient.” Of note: some of these include concepts from RxNorm Extension as well.

So we now made these mappings with Usagi get to a Clinical Drug (in this example, where available, we also map to higher-level hierarchical concepts) = concept ID 1736994.

Now onto our questions. Next, we want to populate drug_era but the script to create it requires RxNorm Ingredient concepts.

Do we need to make an additional mapping table in which we use the ATC to Ingredient mapping to populate the drug_era table?

Or is there any other way to get from this Clinical Drug (concept ID 1736994) to Ingredient? In Athena we noticed how this concept relates through the “Consists of (RxNorm)” link to concept ID 1736997, which next relates through “Has ingredient (RxNorm)” to concept ID 1736971 = Ingredient. However, we expect major performance issues if we build that in to the query…

We would be happy to hear your considerations, thanks!

MaximMoinat · April 15, 2026, 7:17pm

See this default drug era derivation query:
https://ohdsi.github.io/CommonDataModel/sqlScripts.html#Drug_Eras

The first step is indeed to retrieve the ingredient(s) from the drug_concept_id, then collapses exposures of the same ingredient within 30 days. This query can indeed run for quite some time depending on your data size. I know @Chris_Knoll and @Adam_Black have been working on optimisations of this query.

Adam_Black · April 15, 2026, 9:14pm

Yes here is the updated query Chris came up with. I think it should be faster on all dbms but I have not tested it. It’s just for the era collapse step.

select person_id, min(start_date) as era_start, max(end_date) as era_end, count(*)
from (
  select id, person_id, start_date, end_date, sum(is_start) over (partition by person_id order by start_date, id rows unbounded preceding) group_idx
  from (
    select id, person_id, start_date, end_date, 
      max(end_date) over (partition by person_id order by start_date, id rows between unbounded preceding and 1 preceding) latest_end,
      case when max(end_date) over (partition by person_id order by start_date, id rows between unbounded preceding and 1 preceding) >= start_date then 0 else 1 end is_start
    from #RAW_EVENTS
  ) E
) G
GROUP BY person_id, group_idx

See Possibly simplify drug_era SQL · Issue #464 · OHDSI/CommonDataModel · GitHub