OHDSI Home | Forums | Wiki | Github

Drug approval years for every ingredient in RxNorm and RxNorm extension

For some OMOP related work, we are in need of knowledge base that would tell us for each OHDSI drug standard concept (of ingredient class) its year when it was approved. We are fine with US approval dates (but can be also from EMA or other agencies).

We used the DRUG_STRENGTH table from OMOP but many drugs get start year of 1970 and we need better data for start of drug use between 1970 and 2003.

Has anyone else come accross this problem? (get drug approval year for every ingredient in RxNorm and RxNorm extension)

Our experimens with the table above are below (R code)

#year of drug ingredient
dfname='local/DRUG_STRENGTH.CSV'
d<-read.delim(file.path(dfname),as.is=T,quote = "")
str(d)
cid=42873636 #mirabegron rxnorm code
library(tidyverse)
#inspect the data for a sample drug
d %>% filter(DRUG_CONCEPT_ID==cid)
d %>% filter(INGREDIENT_CONCEPT_ID==cid) %>% View()

library(stringr)
#take only some columns, make year
dd<-d %>% select(CONCEPT_ID=INGREDIENT_CONCEPT_ID,VALID_START_DATE,INVALID_REASON) %>% mutate(year=str_sub(VALID_START_DATE,1,4)) %>% select(CONCEPT_ID,year) %>% unique()
#pick lowest year for each concept
library(sqldf)
dd2<-sqldf('select concept_id, min(year) from dd  group by concept_id')
library(OHDSIVocab)
ddd<-dd2 %>% left_join(concept,by='CONCEPT_ID')
View(ddd)

For example for drugs below it should tell years like those below

tadalafil 2003
mirabegron 2012
atenolol 1976
avanafil 2012

the knowledge base should use coded concepts (idealy OMOP concept_id’s)

I don’t know if it would be helpful to you but the drugs@fda database has been useful for me for something somewhat similar. You can download the whole databases here:
https://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm

About the DB here: https://www.fda.gov/Drugs/InformationOnDrugs/ucm135821.htm

What I did in the past is grep the active ingredient column of the Products table and the join that with some information in the RegActionDate file. The R code looks like what I pasted below but you could probably easily load the data tables in a way that you could join from concept table’s concept_name field the Products table activeingredient field (or whatever it is currently labeled). I imagine that you could get more sophisticated but the ingredient names should be pretty standard between RxNorm and drugs@fda. If you find that allot fall out, you might try to use the UNII data file (available from https://www.fda.gov/forindustry/datastandards/substanceregistrationsystem-uniqueingredientidentifierunii/) to see if I could grep out a better mapping. Or, bring up the RxNorm DB an play around with other ways to crosswalk to drugs@FDA.

hope it helps,
-R

# Determine the approval dates for all drugs in a list

all_prods <- read.csv("Product.txt",header=TRUE, sep=",")
prods <- all_prods[ all_prods$ProductMktStatus == 1,]
appdocs <- read.csv("AppDoc.txt",header=TRUE, sep=",")
reg_acts <- read.csv("RegActionDate.txt",header=TRUE, sep=",")

## active ingredients from search of antidepressants in Micromedex,
## Medical Letter on Drugs and Therapeutics, MeSH, and DrugBank (see research notes for
## November and December 2010)
#act_ingrds <- c("METHYLPHENIDATE", "AMINEPTIN", "AMITRIPTYLINE", "AMOXAPINE", "CLOMIPRAMINE", "CYCLOBENZAPRINE", "DESIPRAMINE", "DESMETHYLDOXEPIN", "DIBENZEPIN", "DOSULEPIN", "DOTHIEPIN", "DOXEPIN", "IMIPRAMINE", "IPRINDOLE", "LEVOPROTILINE", "LOFEPRAMINE", "MAPROTILINE", "METAPRAMINE", "MIRTAZAPINE", "MELITRACEN", "MELITRACENE", "NORTRIPTYLINE", "NOXIPTILIN", "OPIPRAMOL", "OXAPROTILINE", "PIPOFEZINE", "PROTRIPTYLINE", "TIANEPTINE", "TRIMIPRAMINE", "MIANSERIN", "TECIPTILINE", "SETIPTILINE", "AAPTAMINE", "ALMOXATONE", "AMEZINIUM", "AMIFLAMINE", "BAZINAPRINE", "BEFLOXATONE", "BETA FLUOROMETHYLENE META TYROSINE", "BIFEMELANE", "BROFAROMINE", "CHLORPHENAMIDINE", "CIMOXATONE", "CLORGYLINE", "CUPRIZONE", "DESMETHYLSELEGILINE", "ETRYPTAMINE", "FURAZOLIDONE", "HARMALINE", "HARMINE", "IPROCLOZIDE", "IPRONIAZID", "ISOCARBOXAZID", "LADOSTIGIL", "LAZABEMIDE", "MILACEMIDE", "MOCLOBEMIDE", "MOFEGILINE", "MONOCROTOPHOS", "NIALAMIDE", "NORSELEGILINE", "PARGYLINE", "PHENELZINE", "PHENIPRAZINE", "PIRLINDOLE", "RASAGILINE", "RO 41-1049", "SAFINAMIDE", "SAFRAZINE", "SELEGILINE", "SERCLOREMINE", "TETRINDOL", "TOLOXATONE", "TRANYLCYPROMINE", "TRIBULIN", "ALAPROCLATE", "ATOMOXETINE", "CERICLAMINE", "CITALOPRAM", "DAPOXETINE", "ESCITALOPRAM", "FEMOXETINE", "FLUOXETINE", "FLUVOXAMINE", "HYDROXYNEFAZODONE", "HYPERFORIN", "IFOXETINE", "INDALPINE", "LITOXETINE", "MANIFAXINE", "MAPROTILINE", "MEDIFOXAMINE", "MOXIFETIN", "NEFOPAM", "NISOXETINE", "NOMELIDINE", "NORCITALOPRAM", "NORFENFLURAMINE", "NORFLUOXETINE", "NORSERTRALINE", "NORTRIPTYLINE", "PAROXETINE", "RADAFAXINE", "REBOXETINE", "SERTRALINE", "TAMETRALINE", "THIONISOXETINE", "VILAZODONE", "ZIMELDINE", "DESVENLAFAXINE", "DULOXETINE", "MILNACIPRAN", "VENLAFAXINE", "AGOMELATINE", "BUPROPION", "MIRTAZAPINE", "NEFAZODONE", "OXYTRIPTAN", "TRAZODONE", "VILOXAZINE")
act_ingrds <- c("METHYLPHENIDATE")

## a search ran to estimate the distribution of market release dates
## for currently marketed drugs
con <- file("product-approval-dates.txt", "w")
for (elt in act_ingrds){
  writeLines("", con)
  writeLines("############################################################", con)
  writeLines(paste("ACTIVE INGREDIENT:", elt), con)
  writeLines("", con)
  
  mtchs <- grep(elt, prods$activeingred, fixed=TRUE)
  if (length(mtchs) == 0){
    writeLines(paste("No match found for ", elt), con)
    writeLines("", con)
    next
  }
  
  for (pd in mtchs){
    writeLines(paste("\tApplNo", prods[pd,]$ApplNo), con)
    writeLines(paste("\tProduct name:", prods[pd,]$drugname), con)
    writeLines(paste("\tProduct form:", prods[pd,]$Form), con)
    writeLines(paste("\tProduct dose:", prods[pd,]$Dosage), con)

    approval_date <- subset(reg_acts, ApplNo == prods[pd,]$ApplNo & DocType == "N", select = ActionDate)$ActionDate

    writeLines(paste("\tProduct approval date: ", approval_date), con)
  }
}
close(con)



## ## code to analyze product release dates and determine their distribution
## datesIn <- read.delim("reduced-product-approval-dates-Aug2010.txt", header=FALSE, sep="\n")
## dates <- as.integer(datesIn[[1]])

## summary(dates)
## #   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## #   1911    1992    2002    1998    2007    2010 


## jpeg("histogram-of-product-release-dates.jpg")
## hist(dates, xlab="Year", ylab="No. products", main="Histogram of approval dates for currently-marketed drug products")
## dev.off() 
t