The FDA doesn’t support ATC classes; they have their own classes, as Fred pointed out. You can use the OMOP Standardized Vocabularies to do this. This subject has been discussed before, like here. Look at my comment.
The query in the comments has two columns with identical names (atc_id).
The corrected query is below.
The mistake was in the second line: ndc.concept_id as atc_id.
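select
  atc.concept_id as atc_id, atc.concept_name as atc_name, atc.concept_code as atc_code, atc.concept_class_id as atc_class,
  ndc.concept_id as ndc_id, ndc.concept_name as ndc_name, ndc.concept_code as ndc_code, ndc.concept_class_id as ndc_class
from concept atc
-- every concept that sits below the ATC concept in the hierarchy
join concept_ancestor a on a.ancestor_concept_id = atc.concept_id
-- valid 'Mapped from' links from those descendants back to their source codes
join concept_relationship r on r.concept_id_1 = a.descendant_concept_id and r.invalid_reason is null and r.relationship_id = 'Mapped from'
join concept ndc on ndc.concept_id = r.concept_id_2 and ndc.vocabulary_id = 'NDC'
-- assumed filter, not in the posted query: keep only ATC concepts as ancestors
where atc.vocabulary_id = 'ATC'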
This is the right query. But we are in the process of fixing the ancestry relationship between ATC and RxNorm (which is what the first half of the query relies on). Give us a week. After that, the query will return high-quality results, in both recall and precision. The whole communication about that is here.
a poster with statistics about the mapping process, presented at AMIA 2017
a paper comparing drug classification systems
the FDA NDC database (as of earlier this year) with ATC-4 classes already joined
If you dig into the R code, you can see that it first queries RxNorm (https://rxnav.nlm.nih.gov/APIsOverview.html) for the RxNorm CUI of each NDC, then queries for the ATC classes (if any) of each CUI. RxNorm also offers other drug classification systems (DCSs) besides ATC, which you can get by changing the second step. However, ATC and the Veterans Affairs’ Drug Classes (“VA classes”) are substantially better for large dataset analyses than the other DCSs, as shown in the paper.
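The two-step lookup described above can be sketched in Python. This is not the R script itself, only an illustration of the same idea. The endpoint paths (`rxcui.json`, `rxclass/class/byRxcui.json`), the `idtype`, `id`, `rxcui` and `relaSource` parameters, and the JSON field names are written as I recall them from the RxNav documentation, so treat them as assumptions and check them against the API overview before relying on them. The helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the RxNav REST API (assumed; see the API overview page).
BASE = "https://rxnav.nlm.nih.gov/REST"


def rxcui_url(ndc):
    """Step 1 URL: ask RxNorm which CUI(s) an NDC code belongs to."""
    return f"{BASE}/rxcui.json?" + urlencode({"idtype": "NDC", "id": ndc})


def atc_url(rxcui):
    """Step 2 URL: ask RxClass for the ATC classes of one RxNorm CUI."""
    query = urlencode({"rxcui": rxcui, "relaSource": "ATC"})
    return f"{BASE}/rxclass/class/byRxcui.json?" + query


def parse_rxcuis(payload):
    """Pull the list of CUIs out of a step 1 response (empty if none)."""
    return payload.get("idGroup", {}).get("rxnormId", []) or []


def parse_classes(payload):
    """Pull distinct (class id, class name) pairs out of a step 2 response."""
    infos = payload.get("rxclassDrugInfoList", {}).get("rxclassDrugInfo", [])
    seen = {}
    for info in infos:
        item = info.get("rxclassMinConceptItem", {})
        if "classId" in item:
            seen[item["classId"]] = item.get("className", "")
    return sorted(seen.items())


def fetch_json(url):
    """Download and decode one JSON response."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def atc_classes_for_ndc(ndc):
    """Run both steps for one NDC; returns sorted (class id, name) pairs."""
    classes = set()
    for rxcui in parse_rxcuis(fetch_json(rxcui_url(ndc))):
        classes.update(parse_classes(fetch_json(atc_url(rxcui))))
    return sorted(classes)
```

Nothing here touches the network until `atc_classes_for_ndc` is called. Switching to another classification system would, under the same assumptions, mean changing the `relaSource` value in `atc_url`.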
RxNorm to ATC mapping goes through the ingredients. That’s the whole problem we are trying to solve. ATC ingredients are ambiguous. For example, aspirin has 3 ATC 5th-level concepts, and with all combinations it has 20. The RxNorm to ATC mapping cannot distinguish them, so it maps to all of them.
VA Class is no longer maintained. I agree, it was a nice one.
Just to be clear, that file is merely the result of running the FDA NDC database file through the script.
That is right about ATC in RxNorm. RxClass provides ATC-4 for drug products, but I don’t know for sure whether that is a separate source of information that doesn’t suffer from ambiguities. VA classes, by the way, are attached to drug products as well, as can be seen in the R script.
I have seen tailored disambiguation efforts that leverage other data provided by RxNorm. For example, the strength of timolol as an eye drop (under ATC-2 S01) is a percentage (a concentration), while the strength of timolol as a cardiovascular drug (under ATC-2 C07) reads as a number of milligrams. The dose form variable can also help sometimes.
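As a rough illustration of the timolol case, here is a minimal Python sketch of a strength-based rule. The function name and the regular expressions are hypothetical, the rule covers only the two readings mentioned above, and real RxNorm strength strings may be formatted differently, so anything it does not recognise is left undecided.

```python
import re


def timolol_atc2_from_strength(strength):
    """Guess the ATC-2 group of a timolol product from its strength text.

    Hypothetical rule: a percentage suggests an eye drop (S01), a plain
    milligram amount suggests a cardiovascular product (C07). Anything
    else, including per-volume amounts such as mg/ml, returns None.
    """
    text = strength.strip().lower()
    if re.search(r"\d\s*%", text):
        return "S01"
    if re.search(r"\d\s*mg\b(?!\s*/)", text):
        return "C07"
    return None


print(timolol_atc2_from_strength("0.5 %"))    # S01
print(timolol_atc2_from_strength("10 mg"))    # C07
print(timolol_atc2_from_strength("5 mg/ml"))  # None
```

Returning `None` for unrecognised strengths keeps ambiguous products visible instead of silently assigning them to one class.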
“Map” is the wrong term. We map NDC to RxNorm. And RxNorms are members of Drug Classes (e.g. ATC). So, you could classify NDCs into these classes with a double hop “Maps to” Concept Relationship and then the hierarchy in Concept Ancestor. Works fine.
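The classification path just described (NDC "Maps to" RxNorm, then up the hierarchy in Concept Ancestor to ATC) can be tried on a toy database. The sketch below uses Python's built-in sqlite3 with cut-down versions of the three OMOP tables. Every concept ID, code and name in it is invented for illustration and does not come from the real vocabularies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table concept (
  concept_id integer primary key, concept_name text,
  concept_code text, vocabulary_id text);
create table concept_relationship (
  concept_id_1 integer, concept_id_2 integer,
  relationship_id text, invalid_reason text);
create table concept_ancestor (
  ancestor_concept_id integer, descendant_concept_id integer);

-- toy rows: one NDC product, its RxNorm drug, one ATC class, one ingredient
insert into concept values
  (1, 'toy NDC product',   'NDC-TOY-1', 'NDC'),
  (2, 'toy RxNorm drug',   'RX-TOY-1',  'RxNorm'),
  (3, 'toy ATC class',     'ATC-TOY-A', 'ATC'),
  (4, 'toy RxNorm ingred', 'RX-TOY-IN', 'RxNorm');
insert into concept_relationship values (1, 2, 'Maps to', null);
insert into concept_ancestor values (3, 2), (4, 2), (2, 2);
""")

# Hop 1: NDC -> RxNorm through a valid 'Maps to' relationship.
# Hop 2: RxNorm -> ancestors, keeping only those in the ATC vocabulary.
rows = conn.execute("""
select ndc.concept_code as ndc_code, atc.concept_code as atc_code
from concept ndc
join concept_relationship r
  on r.concept_id_1 = ndc.concept_id
 and r.relationship_id = 'Maps to'
 and r.invalid_reason is null
join concept_ancestor a on a.descendant_concept_id = r.concept_id_2
join concept atc
  on atc.concept_id = a.ancestor_concept_id
 and atc.vocabulary_id = 'ATC'
where ndc.vocabulary_id = 'NDC'
""").fetchall()

print(rows)  # [('NDC-TOY-1', 'ATC-TOY-A')]
```

The RxNorm ingredient is also an ancestor of the toy drug, so the `vocabulary_id = 'ATC'` condition is what keeps it out of the result.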