
ATC release


Together with @Christian_Reich, @abedtash_hamed, @Alexdavv, @Dymshyts and other contributors we revised the content of ATC and its relationship to RxNorm Drug Concepts. We learned a lot in the process, which may be useful for you to understand if you want to use ATC as a classification, or if ATC is part of your source data.

  1. ATC is much more complicated than it looks.
  2. We had to fix the ATC-RxNorm/Extension hierarchy.
  3. We added a mapping from ATC to RxNorm/Extension to support ETLing.

If you want to understand the details, please read on.

ATC, despite being the most popular drug classification system, has a number of non-obvious problems:

  • Some ATC concepts are WYSIWYG, but others incorporate attributes that are not immediately visible: route of administration, indication, mechanism of action, dosage and combinations. Therefore, not every ATC code containing the name of a drug ingredient is automatically the correct classifier for that drug. For example, prednisolone is the ingredient of 6 ATC concepts that have the same name but different routes of administration, dose or indication.
  • Some drug classes, such as insulins and vaccines, have even more exotic attributes. For example, A10AB04 “insulin lispro” is fast-acting and A10AC04 “insulin lispro” is intermediate-acting.
  • Many ATC concepts have no drugs in any of the markets we currently support. Often, these represent historic drug products that have left the market.
  • Many drugs have no ATC class. This is predominantly the case for traditional medicines, extracts, allergenic preparations, but a few “good” ones are there as well, such as thyroglobulin.

What is wrong with the way ATC was implemented till now:

  • ATC 5-level concepts were mapped to RxNorm ingredients. As a consequence, all those additional attributes were not considered. E.g., ophthalmologic prednisolone was an ancestor to all drugs containing prednisolone. We took out all combinations a year ago to prevent complete chaos, but obviously that was not really solving the problem.
  • Many people have ATC in source data, essentially using it as a way to define ingredients. We provided no support to convert these data into standard OMOP.

Here is what we did to fix the hierarchy:

  • We revised the mapping to Ingredients.
  • We added some of the missing attribute information to the ATC 5 from the ATC website. For example, H02AB06 “prednisolone” became H02AB06 “prednisolone, systemic”.
  • Alternatively, we extended the attributes from ATC 3 or 4 levels down to ATC 5. For example, ATC 5 R03AC07 “isoetarine” doesn’t have route specified, but we inferred its route from an ancestor R03A “ADRENERGICS, INHALANTS”.
  • We rebuilt the entire ATC-RxNorm hierarchy taking into account the tacit attributes. See the picture below. The above H02AB06 “Prednisolone” with its systemic route of administration is connected to the Ingredient Prednisolone in CONCEPT_ANCESTOR, and from there to only those descendants which have a form intended for systemic use (e.g. Oral Tablets or Injections). Combinations are also correctly handled:

Here is what we did to support mapping ATCs during ETL:

  • New “Maps to” relationships connect ATC 5 to RxNorm and RxNorm Extension. See picture below. Single ingredient ATCs are mapped to RxNorm Ingredients. Combinations are only mapped to the explicit components.
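
As a rough illustration of how an ETL step might use these “Maps to” links, here is a minimal sketch in Python with SQLite. The tables are cut-down toy versions of CONCEPT and CONCEPT_RELATIONSHIP, and every concept_id, the combination code 'TOY-COMBO' and the helper name map_atc_code are made up for the example; only H02AB06 and the relationship name come from the post above.

```python
import sqlite3

# Toy vocabulary tables. All concept_ids and the 'TOY-COMBO' code are
# hypothetical; a real OMOP vocabulary has more columns and real ids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY, concept_name TEXT,
    vocabulary_id TEXT, concept_class_id TEXT, concept_code TEXT);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT);
INSERT INTO concept VALUES
    (1, 'prednisolone, systemic', 'ATC', 'ATC 5th', 'H02AB06'),
    (2, 'prednisolone', 'RxNorm', 'Ingredient', 'toy-rx-1'),
    (3, 'ambrisentan and tadalafil', 'ATC', 'ATC 5th', 'TOY-COMBO'),
    (4, 'ambrisentan', 'RxNorm', 'Ingredient', 'toy-rx-2'),
    (5, 'tadalafil', 'RxNorm', 'Ingredient', 'toy-rx-3');
INSERT INTO concept_relationship VALUES
    (1, 2, 'Maps to'),
    (3, 4, 'Maps to'),
    (3, 5, 'Maps to');
""")


def map_atc_code(conn, atc_code):
    """Return (concept_id, concept_name) targets of 'Maps to' for one ATC code."""
    return conn.execute(
        """
        SELECT t.concept_id, t.concept_name
        FROM concept s
        JOIN concept_relationship r ON r.concept_id_1 = s.concept_id
        JOIN concept t ON t.concept_id = r.concept_id_2
        WHERE s.vocabulary_id = 'ATC'
          AND s.concept_code = ?
          AND r.relationship_id = 'Maps to'
        ORDER BY t.concept_id
        """,
        (atc_code,),
    ).fetchall()


single = map_atc_code(conn, "H02AB06")    # single-ingredient ATC
combo = map_atc_code(conn, "TOY-COMBO")   # combination ATC, explicit components
unknown = map_atc_code(conn, "X99XX99")   # code with no mapping at all
print(single, combo, unknown)
```

A source record coded with a combination ATC can come back with more than one target, and a code without a mapping comes back empty, so an ETL would need to decide how to handle both cases.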

What we haven’t done and needs to be addressed in the future:

  • Independent QA. We are planning to integrate the Norwegian drug repository, for which there is a definitive ATC mapping from the same WHO Collaborating Centre for Drug Statistics Methodology (WHOCC) which maintains the ATC, and then compare their assignments to ours. It won’t include all drugs though.
  • Missing attributes. There are still attributes to be modelled and correctly implemented, even though they are not common. For example:
    • We did not differentiate corticosteroids based on their potency: D07AC14 “methylprednisolone aceponate” is a potent corticosteroid and D07AA01 “methylprednisolone” is a weak corticosteroid, but currently they have the same RxNorm descendants.
    • We didn’t implement mechanisms of action. For example, ATC B01AC06 “acetylsalicylic acid (Platelet aggregation inhibitors)” and N02BA01 “acetylsalicylic acid (OTHER ANALGESICS AND ANTIPYRETICS)” have the same RxNorm descendants.
  • Orphan drugs. We have to think what to do with the RxNorm ingredients and corresponding drugs for which no ATC exists. One way is to bring it up with the WHOCC. Another way is to create a few pseudo-ATC catch-all classes.

As usual, we are happy to take comments, suggestions, bug reports and congratulations.


Thanks so much @aostropolets and team for all your hard work, and for the clear description of the progress.

Could you please help me understand how this may impact our use of CONCEPT_ANCESTOR, particularly when creating conceptsets for drug classes with the aim to find all standard concepts (ingredient and below) within the class?


Good question. First, Atlas needs to stop second-guessing the CONCEPT_ANCESTOR table with the CONCEPT_RELATIONSHIP table. That will work even less than before. Instead, just rely on CONCEPT_ANCESTOR. Blindly. :slight_smile:

Things to consider:

  • ATC is redundant, and the same ingredient might show up in more than one ATC. So, people really need to pick the right ATC, or collect them all if they want. Or, they find the ingredient and first go up the hierarchy.
  • All ingredients are correctly listed. This is important for the combination ATCs. There are a lot of ingredients hidden in combo ATCs, which are not explicitly listed in the name. But the hierarchy knows which ones they are, and you can see them in ATLAS.
  • All drug components, drug forms and drug products are the correct descendants of the ATC. So, “Prednisolone, systemic” will not list creams, inhalants and other local forms.

This should all work nicely, with the little caveats Anna mentioned above. We don’t think this will be a problem very often.

To clarify, a conceptset expression, as created in ATLAS and generated from webAPI, only relies on CONCEPT_ANCESTOR to expand out into an ‘included concepts’ list. It does not, and never did, use the CONCEPT_RELATIONSHIP for this purpose. To create the ‘included source concepts’, we must use CONCEPT_RELATIONSHIP and the ‘Maps to’ relations to identify non-standard concepts that will be contained within the ‘included concepts’ list.
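
To make the two steps concrete, here is a minimal sketch in Python with SQLite. It is not the WebAPI implementation; the tables are cut-down toy versions of the vocabulary tables, and all concept_ids, the 'ToySource' vocabulary and the function names are made up for illustration.

```python
import sqlite3

# Toy data: an ATC class (100), an ingredient (200), a systemic product (201),
# a topical product (202) and two non-standard source codes (900, 901).
# All ids are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY, concept_name TEXT,
    vocabulary_id TEXT, standard_concept TEXT);
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER, descendant_concept_id INTEGER);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT);
INSERT INTO concept VALUES
    (100, 'prednisolone, systemic', 'ATC', 'C'),
    (200, 'prednisolone', 'RxNorm', 'S'),
    (201, 'prednisolone Oral Tablet', 'RxNorm', 'S'),
    (202, 'prednisolone Topical Cream', 'RxNorm', 'S'),
    (900, 'local code for prednisolone tablet', 'ToySource', NULL),
    (901, 'local code for prednisolone cream', 'ToySource', NULL);
INSERT INTO concept_ancestor VALUES
    (100, 100), (100, 200), (100, 201),
    (200, 200), (200, 201), (200, 202),
    (201, 201), (202, 202);
INSERT INTO concept_relationship VALUES
    (900, 201, 'Maps to'), (901, 202, 'Maps to');
""")


def included_concepts(conn, concept_ids):
    """Expand concept ids to all their descendants using concept_ancestor only."""
    marks = ",".join("?" * len(concept_ids))
    rows = conn.execute(
        "SELECT DISTINCT descendant_concept_id FROM concept_ancestor "
        f"WHERE ancestor_concept_id IN ({marks}) "
        "ORDER BY descendant_concept_id",
        list(concept_ids),
    ).fetchall()
    return [row[0] for row in rows]


def included_source_concepts(conn, included_ids):
    """Find other concepts that 'Maps to' any of the included concepts."""
    marks = ",".join("?" * len(included_ids))
    rows = conn.execute(
        "SELECT DISTINCT concept_id_1 FROM concept_relationship "
        "WHERE relationship_id = 'Maps to' "
        "  AND concept_id_1 <> concept_id_2 "
        f"  AND concept_id_2 IN ({marks}) "
        "ORDER BY concept_id_1",
        list(included_ids),
    ).fetchall()
    return [row[0] for row in rows]


included = included_concepts(conn, [100])
sources = included_source_concepts(conn, included)
print(included, sources)
```

In this toy data the topical cream (202) and its source code (901) stay out of the result because the ATC class has no ancestor row pointing to the cream.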

The first point is the one I am just trying to understand: you say ‘find the ingredient and first go up the hierarchy’, does that mean that a RxNorm ingredient will have an ancestor record to ALL ATC concepts which contain that ingredient now? If so, that’s amazing and extraordinarily useful!


Someone needs to attend more Atlas training classes :slight_smile:

Enlighten us, @Chris_Knoll. Would be wonderful if that were fixed. The ATLAS I am using and the one at ohdsi.org returns “No hierarchy found for non-standard concepts” if you click on the Hierarchy tab of an ATC concept. I was told the join CONCEPT_RELATIONSHIP is the problem.

How is it working?

Please let's not hijack this thread, which should stay focused on the ATC release. It is important for this thread to stay on point, given the value of this new contribution and its application across the OHDSI community.

@Christian_Reich you are talking about ATLAS hierarchy view, which is not relevant to discussion of conceptset expression expansion to included concepts. I am happy to take that offline with you.


Correct. That is now supposed to be working properly.

One feature that @aostropolets’s post hinted at that I want to bring forward was the importance of having domain experts collaborate with vocabulary experts side-by-side in the review/analysis/correction of the ATC hierarchy. Folks from the University of Colorado School of Pharmacy added detailed domain knowledge that highlighted the existing deficiencies and potential reorganization that would be in line with intended use. To the degree possible, this should be the approach used for reviewing all highly specialized terminology hierarchies. As the OHDSI community grows, we hopefully have access to an expanding set of domain experts in addition to our amazing terminology experts to replicate this approach.

@mgkahn: Are you offering up your folks to help? Would be nice!

Hi @mgkahn, thanks for the comment! Our approach is very similar to what you’ve described: we fill the gaps in the knowledge about each drug class and source code (we call this metadata). This collection of “metadata” has helped us tremendously to match source drug codes to the correct ATC class, one that represents not only the active ingredient but also ROA, dose form, indication, and in some cases the right isomer of the molecule. We are using a variety of drug information and terminology resources throughout this process.

We had also a successful experience with @callahantiff and her team at UC last year and they were such a great help to move this effort forward. It would be nice if we could resume the collaboration to finalize some remaining issues. Looking forward to it!


The faculty member who has been directing the inclusion of domain experts from the School of Pharmacy in the ATC validation work is @trinklek. So @Christian_Reich – Colorado SMEs are already engaged in the ATC work.

Hi - does anyone know if the PLP and FeatureExtraction methods will be affected by the ATC improvements or if they might be using an incorrect approach? Specifically, @sanyabt and I think that both packages use ATC levels 3 and 5 to group the ingredients as covariates, but we don’t know if they do the grouping in a way that avoids issues.

The query below returns what seems to be an incorrect result. Acetaminophen is returned as a descendant concept of the ATC 4th-level class ‘Other agents against amoebiasis and other protozoal diseases’ (P01AX), which is a mapping not present in the ATC browser (https://www.whocc.no/atc_ddd_index/?code=P01AX). The strange mapping did not appear in the hierarchy shown in Atlas, so I assume it is because the query is too simplistic and does not account for the issues mentioned above. Should we first go to descendant concepts in ATC and then use the concept_relationship table to go to RxNorm concepts?

SELECT *
FROM vocabulary.concept c1
 inner join vocabulary.concept_ancestor ca on c1.concept_id = ca.ancestor_concept_id
 inner join vocabulary.concept c2 on c2.concept_id = ca.descendant_concept_id
WHERE c1.vocabulary_id = 'ATC'
 and c1.concept_code = 'P01AX'
 and c2.vocabulary_id in ('RxNorm','RxNorm Extension')
 and c2.concept_name ilike '%acetaminophen%';

Finally, Achilles also seems to be affected. When I updated our vocabulary two days ago (11/22/20), I noticed a lot of strange ancestor mappings in the Achilles drug reports that were not present earlier this year.

It is complicated. Will try to disentangle a couple of issues here:

  1. Acetaminophen (ingredient) is a descendant of P01AX because it exists as part of a combination drug with emetine (an anti-protozoal). All combinations with emetine get picked up, and so does acetaminophen.
    In fact, emetine drugs seem to be antipyretics rather than anti-protozoals, but WHO only classifies emetine as anti-protozoal.

  2. The hierarchy is different when you go uphill or downhill. In this case (I assume) you are moving from RxNorm drug to ATC class.

If you are moving from RxNorm drugs (want to see to which ATC class a drug belongs), everything is fine as in this example:

select * from concept_ancestor
join concept on ancestor_concept_id=concept_id
where descendant_concept_id = 1125315 -- acetaminophen 1000 MG Oral Tablet
and vocabulary_id = 'ATC';

But if you are moving from RxNorm ingredients, you’ll see a bunch of things, simply because this ingredient belongs to multiple drugs and, therefore, to multiple classes:

select * from concept_ancestor
join concept on ancestor_concept_id=concept_id
where descendant_concept_id = 1125315 -- acetaminophen
and vocabulary_id = 'ATC';

So whenever Drug Class covariates are constructed based on ingredients and not on drugs there will be all sorts of funny things.
We need to look at the code logic to figure out what, how and where to modify. Does Achilles use FeatureExtraction or its own logic?
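
The difference between the two starting points can be sketched on toy data, here in Python with SQLite. The ancestor rows below simply mirror the behaviour described in this post; all concept_ids, the combination product and the helper name atc_classes_of are made up, and N02BE01 is included only as a plausible analgesic class for the example.

```python
import sqlite3

# Toy hierarchy: a single-ingredient tablet sits under one ATC class, while
# the ingredient itself is also reachable from the class of a combination
# product. All ids are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY, concept_name TEXT,
    vocabulary_id TEXT, concept_code TEXT);
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER, descendant_concept_id INTEGER);
INSERT INTO concept VALUES
    (10, 'acetaminophen', 'RxNorm', 'toy-ing'),
    (11, 'acetaminophen 1000 MG Oral Tablet', 'RxNorm', 'toy-drug'),
    (12, 'acetaminophen / emetine toy product', 'RxNorm', 'toy-combo'),
    (20, 'paracetamol', 'ATC', 'N02BE01'),
    (21, 'Other agents against amoebiasis and other protozoal diseases',
         'ATC', 'P01AX');
INSERT INTO concept_ancestor VALUES
    (20, 10), (20, 11),   -- analgesic class: ingredient and plain tablet
    (21, 10), (21, 12);   -- anti-protozoal class: ingredient and combination
""")


def atc_classes_of(conn, descendant_id):
    """Return the ATC codes that are ancestors of the given concept."""
    rows = conn.execute(
        """
        SELECT c.concept_code
        FROM concept_ancestor ca
        JOIN concept c ON c.concept_id = ca.ancestor_concept_id
        WHERE ca.descendant_concept_id = ?
          AND c.vocabulary_id = 'ATC'
        ORDER BY c.concept_code
        """,
        (descendant_id,),
    ).fetchall()
    return [row[0] for row in rows]


from_drug = atc_classes_of(conn, 11)        # start from the drug product
from_ingredient = atc_classes_of(conn, 10)  # start from the ingredient
print(from_drug, from_ingredient)
```

Starting from the product gives one class; starting from the ingredient also pulls in the class that only the combination belongs to, which is the kind of surprise a class covariate built on ingredients would inherit.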


@aostropolets can you explain to me what these relationship_ids mean:

Why is there an ATC - RxNorm name?

What does lateral mean here?

What do Primary and Secondary mean?

Why is there ATC - SNOMED? Are there ATCs that do not map to RxNorm?


ATC to RxNorm and ATC - SNOMED are relationships coming from the other vocabularies we import (UMLS, RxNorm). They do not participate in the hierarchy.
Primary and secondary relationships represent the first axis we introduced to organize ATC relationships. Primary reflects the main ingredient (RxNorm counterpart); secondary reflects other ingredients (or groups of ingredients), if present.
A lateral relationship is assigned to an unambiguous ingredient, an upward relationship to ambiguous ingredients (i.e. groups).
For example, “ambrisentan and tadalafil” has a primary lateral relationship to ambrisentan and a secondary lateral relationship to tadalafil.
“amitriptyline and psycholeptics” has a primary lateral relationship to amitriptyline and a secondary upward relationship to all possible psycholeptics: allobarbital, buspirone etc.


Dear all, I am pleased to inform the community about the ATC refresh. In a nutshell, there are the following changes:

  1. new and missing ATC codes have been added to the ontology
  2. the conversion of ATC Administration Routes to OMOP has been extended and improved; as a consequence, the hierarchical coverage of ATC has been enriched and cleaned up
  3. for the ETL process, “Maps to” relationships have been revised, and ambiguous one-to-many mappings have been deprecated
  4. COVID-19 vaccines have been embedded into the ATC hierarchy
  5. to simplify further maintenance, SQL-based automation of the ATC refresh, based on the atomic approach, and technical documentation have been developed
  6. the formalization of the QA process has been initiated.

To see all semantic amendments, click here.

Finally, the OMOPized ATC is one of the biggest open-source hierarchies available for the scientific community and its deployment in OMOP is quite sophisticated. Thus, we are looking forward to the collaboration and feedback about the work done by the Vocabulary team.


Dear all,

First, I would like to congratulate you on the hard and excellent work done.

We are mapping our drug vocabulary to RxNorm/Extension and are having trouble getting the correct hierarchy (from ATC), even after reading this whole thread and some documents.

Our mapping was done mostly to Ingredients, for example:

dipyrone → concept_id 19031397

prednisone → concept_id 1551099

After building the hierarchy, it seems that dipyrone is being mapped to S02DA instead of N02BB or N02BB02,

and prednisone to N05CX instead of H02AB07.

Drug Exposure: Dipyrone

Drug Exposure: prednisone

We downloaded the vocabularies just a few days ago.
What could be going wrong in our work?

Dear @Mateus,

Thank you for noticing and appreciating the Vocab team’s contribution!

Regarding your issue, in the concept_ancestor table:

  • the hierarchy between ATC and RxN/RxE is built between ATC Drug Classes (as ancestors) and RxN/RxE Drug Products (as descendants), but not Ingredients.
  • RxN/RxE Ingredients (as ancestors) are connected with RxN/RxE Drug Products (as descendants), but not with ATC.

But how can you get precise mappings from RxNorm Ingredients to the closest parent ATC codes?

For example, through the concept_relationship table:

SELECT c.concept_id AS rx_id,
      c.concept_code AS rx_code,
      c.concept_name AS rx_name,
      d.concept_id AS atc_id,
      d.concept_code AS atc_code,
      d.concept_name AS atc_name
FROM concept c
 JOIN concept_relationship r ON r.concept_id_1 = c.concept_id
 JOIN concept d ON d.concept_id = r.concept_id_2
WHERE c.concept_class_id = 'Ingredient'
AND   c.standard_concept = 'S'
AND   d.vocabulary_id = 'ATC'
AND   r.relationship_id = 'Mapped from'
AND   c.concept_id IN (19031397, 1551099);

Or via the concept_relationship and concept_ancestor tables (if you want to look at all related ATC codes from the ancestry):

SELECT c.concept_id AS rx_id,
      c.concept_code AS rx_code,
      c.concept_name AS rx_name,
      d.concept_id AS atc_id,  -- closest relative
      d.concept_code AS atc_code,
      d.concept_name AS atc_name,
      x.concept_id AS parent_atc_id, -- distant relative
      x.concept_code AS parent_atc_code,
      x.concept_name AS parent_atc_name
FROM concept c
 JOIN concept_relationship r ON r.concept_id_1 = c.concept_id
 JOIN concept d ON d.concept_id = r.concept_id_2
 JOIN concept_ancestor ca ON ca.descendant_concept_id = d.concept_id 
 JOIN concept x ON x.concept_id = ca.ancestor_concept_id
WHERE c.concept_class_id = 'Ingredient'
AND   c.standard_concept = 'S'
AND   d.vocabulary_id = 'ATC'
AND   r.relationship_id = 'Mapped from'
AND   c.concept_id IN (19031397, 1551099)
ORDER BY d.concept_code;

Dear @Polina_Talapova,

Thank you for your prompt answer. I had some time to think about it and try a solution based on what you wrote. It helped us a lot, despite some ambiguities inherent in mapping drugs from ingredients to ATC.