Dear all,
I would like to know of any database, paid or free, that would help me map drugs to their higher-level classes (antidepressants, anti-inflammatories, etc.) through NDC codes.
Also, does the CDM support prescription data? I am very new to this community but would like to contribute as well.
I am sorry, but I wasn’t clear with my question. I’m looking for the
therapeutic classes (ATC) corresponding to prescription drug strings, which
all have an NDC code as their only identifier.
The FDA doesn’t support ATC classes; they have their own classes, as Fred pointed out. You can use the OMOP Standardized Vocabularies for this. This subject has been discussed before, like here. Look at my comment.
The query in the comments has two columns with identical names (atc_id).
The mistake was in the second line: ndc.concept_id as atc_id.
The corrected query is below:
select
  atc.concept_id as atc_id, atc.concept_name as atc_name, atc.concept_code as atc_code, atc.concept_class_id as atc_class,
  ndc.concept_id as ndc_id, ndc.concept_name as ndc_name, ndc.concept_code as ndc_code, ndc.concept_class_id as ndc_class
from concept atc
-- walk from each ATC class down to its descendants in the hierarchy
join concept_ancestor a on a.ancestor_concept_id = atc.concept_id
-- follow the reverse of 'Maps to' from each descendant to the NDCs mapped to it
join concept_relationship r on r.concept_id_1 = a.descendant_concept_id and r.invalid_reason is null and r.relationship_id = 'Mapped from'
join concept ndc on ndc.concept_id = r.concept_id_2 and ndc.vocabulary_id = 'NDC'
where atc.vocabulary_id = 'ATC';
This is the right query. But we are in the process of fixing the ancestry relationship between ATC and RxNorm (which is what the first half of the query relies on). Give us a week. After that, the query will give high-quality results in both recall and precision. The whole discussion about that is here.
Hello, this R script will map NDCs to ATC level 4 classes using RxNorm: https://github.com/fabkury/ndc_map. The repository also contains:
a poster with statistics about the mapping process, presented at AMIA 2017
a paper comparing drug classification systems
the FDA NDC database (as of earlier this year) with ATC-4 classes already joined
If you dig into the R code, you can see that it first queries RxNorm (https://rxnav.nlm.nih.gov/APIsOverview.html) for the RxNorm CUI of each NDC, then queries for the ATC classes (if any) of each CUI. RxNorm also offers other drug classification systems (DCSs) besides ATC, which you can get by changing the second step; but ATC and the Veterans Affairs Drug Classes (“VA classes”) are substantially better for large dataset analyses than the other DCSs, as shown in the paper.
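For readers who do not use R, the two lookups described above can be sketched in Python. This is not a port of the script in the repository, only a minimal illustration. It assumes the public RxNav endpoints `rxcui.json` (lookup by NDC) and `rxclass/class/byRxcui.json` (classes for a CUI, restricted with `relaSource=ATC`), and it assumes the JSON field names shown in the parsers; please check both against the current RxNav documentation before relying on them. All function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

RXNAV = "https://rxnav.nlm.nih.gov/REST"


def rxcui_url(ndc):
    """URL for step 1: look up the RxNorm CUI(s) for one NDC."""
    return f"{RXNAV}/rxcui.json?" + urllib.parse.urlencode({"idtype": "NDC", "id": ndc})


def atc_url(rxcui):
    """URL for step 2: look up the ATC classes attached to one CUI."""
    query = urllib.parse.urlencode({"rxcui": rxcui, "relaSource": "ATC"})
    return f"{RXNAV}/rxclass/class/byRxcui.json?{query}"


def parse_rxcuis(payload):
    """Extract CUIs from a step 1 response (assumed shape: idGroup.rxnormId)."""
    return (payload.get("idGroup") or {}).get("rxnormId") or []


def parse_atc_classes(payload):
    """Extract unique (classId, className) pairs from a step 2 response (assumed shape)."""
    infos = (payload.get("rxclassDrugInfoList") or {}).get("rxclassDrugInfo") or []
    pairs = {
        (info["rxclassMinConceptItem"]["classId"], info["rxclassMinConceptItem"]["className"])
        for info in infos
    }
    return sorted(pairs)


def fetch_json(url):
    """Fetch one URL and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def ndc_to_atc(ndc):
    """Chain both steps for one NDC; returns an empty list when nothing is found."""
    classes = set()
    for rxcui in parse_rxcuis(fetch_json(rxcui_url(ndc))):
        classes.update(parse_atc_classes(fetch_json(atc_url(rxcui))))
    return sorted(classes)
```

Calling `ndc_to_atc` with an NDC string would hit the live service twice per CUI, so for anything beyond a handful of codes you would want caching and rate limiting, which this sketch leaves out.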
About my script above: a new version is now online and provides much more about each NDC, including:
ATC-5,
extra ATC-5 data: Defined Daily Dose, Administration Route, Note (if any),
drug strength and generic ingredients,
whether the NDC is a brand name or a generic,
SNOMED CT and MeSH Pharmacological Action codes,
other drug classification systems from RxNorm: Veterans Affairs Classes (VA classes) and Established Pharmacological Classes (EPC).
All content comes from merely querying RxNorm, except for the extra ATC-5 data which is web-scraped from the official index at https://whocc.no/atc_ddd_index/.
RxNorm to ATC mapping goes through the ingredients. That’s the whole problem we are trying to solve. ATC ingredients are ambiguous. For example, aspirin has 3 ATC 5th-level concepts, and counting all combinations it has 20. The RxNorm to ATC mapping cannot distinguish between them, so it maps to all of them.
VA Class is no longer maintained. I agree, it was a nice one.
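To make the aspirin ambiguity concrete, here is a tiny sketch. The three single-ingredient ATC-5 codes for aspirin are real; the lookup table and the function name are hypothetical and only stand in for an ingredient-level mapping.

```python
# Single-ingredient ATC 5th-level codes for aspirin (acetylsalicylic acid).
ATC5_BY_INGREDIENT = {
    "aspirin": [
        "A01AD05",  # stomatological preparations
        "B01AC06",  # antithrombotic agents
        "N02BA01",  # analgesics
    ],
}


def atc4_candidates(ingredient):
    """Return every ATC-4 class an ingredient-level mapping would assign.

    An ATC-4 code is the first five characters of the ATC-5 code. Because
    the lookup only knows the ingredient, it cannot tell which class a
    particular product belongs to and returns all of them.
    """
    codes = ATC5_BY_INGREDIENT.get(ingredient.lower(), [])
    return sorted({code[:5] for code in codes})
```

With this table, any aspirin product ends up in A01AD, B01AC and N02BA at once, which is the loss of precision described above.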
Just to be clear, that file is merely the result of running the FDA NDC database file through the script.
That is right about ATC in RxNorm. RxClass provides ATC-4 for drug products, but I don’t know for sure whether that is a separate source of information that doesn’t suffer from ambiguities. VA classes, by the way, are attached to drug products as well, as can be seen in the R script.
I have seen tailored disambiguation efforts that leverage other data provided by RxNorm. For example, the strength of timolol as an eye drop (under ATC-2 S01) is a percentage (a concentration), while the strength of timolol as a cardiovascular drug (under ATC-2 C07) reads as a number of milligrams. The dose form variable can also help sometimes.
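As a rough illustration of the timolol case, the strength string alone can already separate the two ATC-2 groups. This is only a sketch of the idea: the function name is hypothetical, the strength formats are assumptions about what the data might look like, and treating a per-volume strength such as "5 MG/ML" as a concentration is my own addition to the percentage rule mentioned above.

```python
import re


def timolol_atc2(strength):
    """Guess the ATC-2 group of a timolol product from its strength string.

    Returns "S01" for a concentration (eye drops), "C07" for a plain
    milligram amount (cardiovascular), and None when the string fits neither.
    """
    s = strength.strip().lower()
    # Concentrations: a percentage, or (assumed) a mass-per-volume strength.
    if "%" in s or re.search(r"mg\s*/\s*ml", s):
        return "S01"
    # A bare number of milligrams, e.g. "10 mg".
    if re.fullmatch(r"\d+(\.\d+)?\s*mg", s):
        return "C07"
    return None
```

A real rule set would also need the dose form, since strength alone will not separate every ingredient this cleanly.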
“Map” is the wrong term. We map NDC to RxNorm. And RxNorms are members of Drug Classes (e.g. ATC). So, you could classify NDCs into these classes with a double hop “Maps to” Concept Relationship and then the hierarchy in Concept Ancestor. Works fine.
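The double hop can be tried on a toy vocabulary without a full CDM instance. The sketch below builds three stripped-down tables in an in-memory SQLite database; the concept IDs, codes and names are invented, and the real tables have more columns than shown here. Only the join pattern ("Maps to" from NDC to RxNorm, then Concept Ancestor up to ATC) is the point.

```python
import sqlite3

CLASSIFY_SQL = """
select ndc.concept_code as ndc_code, atc.concept_code as atc_code
from concept ndc
-- hop 1: NDC 'Maps to' its standard RxNorm concept
join concept_relationship r
  on r.concept_id_1 = ndc.concept_id
 and r.relationship_id = 'Maps to'
 and r.invalid_reason is null
-- hop 2: climb the hierarchy from the RxNorm concept to its ATC ancestors
join concept_ancestor a on a.descendant_concept_id = r.concept_id_2
join concept atc on atc.concept_id = a.ancestor_concept_id and atc.vocabulary_id = 'ATC'
where ndc.vocabulary_id = 'NDC'
order by 1, 2
"""


def build_toy_vocab():
    """Create a tiny in-memory vocabulary with invented rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table concept (
            concept_id integer primary key, concept_name text,
            vocabulary_id text, concept_code text);
        create table concept_relationship (
            concept_id_1 integer, concept_id_2 integer,
            relationship_id text, invalid_reason text);
        create table concept_ancestor (
            ancestor_concept_id integer, descendant_concept_id integer);
    """)
    con.executemany("insert into concept values (?, ?, ?, ?)", [
        (1, "toy NDC package", "NDC", "00000000001"),
        (2, "toy RxNorm clinical drug", "RxNorm", "R1"),
        (3, "toy ATC class", "ATC", "X01AA"),
        (4, "toy NDC with no mapping", "NDC", "00000000002"),
    ])
    con.execute("insert into concept_relationship values (1, 2, 'Maps to', null)")
    con.execute("insert into concept_ancestor values (3, 2)")
    return con


def classify(con):
    """Return (ndc_code, atc_code) pairs for every NDC that reaches an ATC class."""
    return con.execute(CLASSIFY_SQL).fetchall()
```

Running `classify(build_toy_vocab())` returns only the first NDC with its class; the second NDC has no "Maps to" row and silently drops out, which is how unclassified NDCs show up in practice.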
I am working on calculating the spend of pharmacy claims per therapeutic class, but I’m unable to find a list of all classes with the drugs in each class.
I was using VA classes from NLM RxNorm, but they do not have a complete drug-to-class mapping either.
I have 17,668 unique NDCs in claims. However, I found only 37% of them in the VA classes, which means 63% of NDCs have no class.
Total unique NDCs in claims: 17,668
Total unique NDCs in VA classes: 107,471
Matched NDCs with classes: 6,528
Match rate: 37%
Does the FDA or any other site publish a list that maps drugs to classes?
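For anyone wanting to reproduce this kind of coverage check, here is a minimal sketch. The function name is hypothetical, and it assumes both inputs use the same NDC string format, since any formatting difference would count as a miss.

```python
def match_rate(claim_ndcs, classified_ndcs):
    """Return (matched, total, rate) for the unique NDCs seen in claims.

    `claim_ndcs` are the NDCs found in claims; `classified_ndcs` are the
    NDCs that carry a class. Duplicates in either input are ignored.
    """
    claims = set(claim_ndcs)
    matched = claims & set(classified_ndcs)
    rate = len(matched) / len(claims) if claims else 0.0
    return len(matched), len(claims), rate
```

With the figures above, 6,528 matched out of 17,668 unique claim NDCs works out to about 36.9%, which rounds to the 37% reported.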
VA classes are not the best choice for one simple reason: they are far from comprehensive.
We suggest using ATC (which will be updated together with RxNorm, VANDF and VA Class in the upcoming August release - just keeping you posted).