
Missing NDC codes for ONE TOUCH ULTRA TEST STRIPS

@Christian_Reich:

I’m working on my ETL for the CDMv5 on Optum, and I have a couple of NDC codes that I would like to know if they were simply omitted or if I should not use them for some reason:

NDCs:
53885024510 (n= 2,164,418)
53885024450 (n= 1,265,861)

Those are the top 2, and you can see that together they make up about 3.4 million records in our dataset, so I’m thinking I’d like to address it.

To make sure that those could be valid NDCs, I searched for them, and I found both in the same document here:
http://www.hfs.illinois.gov/assets/teststrips.pdf

In this situation, would you create new concepts for these? Or is there something wrong with these NDCs such that they wouldn’t be identified by the vocabulary?

-Chris

Follow up:

Third in my list is another NDC code, but this time for an insulin needle:
https://www.betterlivingnow.com/products/product-detail.cfm?ndc=08290320109

Is part of the problem that these NDCs refer to what we might consider ‘devices’ and therefore we might not see them in the vocabulary?

-Chris

Those look like valid NDCs, so they deserve their own concept. Then,
hopefully those concepts will successfully map to an appropriate standard
concept in the DEVICE domain.

The challenge we’ll need to dig into is, why didn’t these NDCs get
identified by our current sources for NDCs, since a code with the same
9-digit prefix (53885024510) did make it in (but didn’t get mapped to a
standard concept). @christian_reich, when these situations occur, how do
we want to address them?

My immediate proposal: set SOURCE_CONCEPT_ID = 0 and then map the standard
_CONCEPT_ID field to the appropriate place. That appropriate place could
be identified by first looking to see if the 9-digit prefix has a home,
and if not, by using a tool like Usagi to attempt to find a map.
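The prefix fallback proposed above could be sketched roughly as follows. This is a minimal illustration, not the vocabulary team's method: it assumes NDCs are normalized to 11 digits (so the first 9 digits are the labeler and product segments), and the function name `find_standard_concept` and the dictionary of known mappings are hypothetical stand-ins for a real lookup against the vocabulary tables.

```python
from typing import Dict, Optional


def find_standard_concept(ndc: str, ndc_to_standard: Dict[str, int]) -> Optional[int]:
    """Return a standard concept_id for an 11-digit NDC, or None.

    ndc_to_standard is a hypothetical in-memory map of known NDCs to
    standard concept_ids; a real ETL would query the vocabulary instead.
    """
    # 1. Exact match on the full 11-digit code.
    if ndc in ndc_to_standard:
        return ndc_to_standard[ndc]

    # 2. Fall back to any known NDC sharing the 9-digit prefix
    #    (labeler + product), ignoring the 2-digit package segment.
    prefix = ndc[:9]
    for known_ndc in sorted(ndc_to_standard):
        if known_ndc[:9] == prefix:
            return ndc_to_standard[known_ndc]

    # 3. Nothing found: a candidate for manual mapping (e.g. with Usagi).
    return None
```

Codes that fall through to `None` would be the ones to hand to a mapping tool or to a person.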
(In reply to the August 5 post by Chris_Knoll, http://forums.ohdsi.org/users/chris_knoll)

Friends:

There is a whole category of drugs which are not in good shape: Device Drugs, such as contrast material, pregnancy tests, glucose tests, insulin pens or drug-eluting stents. RxNorm doesn’t treat them as drugs. So, they fall somewhat between the cracks. It’s on the list of things.

53885024510 and 53885024450: Both are in there, but they’re glucose test strips, and therefore not mapped to anything. We can add mappings for the high-frequency ones by hand, no problem.

All those Optum NDCs we don’t have: Please please send them over, so we can take a look. I can’t see Optum. They are now the Evil Empire. :smile:

@Chris_Knoll - Ajit Londhe was looking at this problem a while back. You might want to ping him as well.

I’ve uploaded the file. It has the NDCs and the number of times they occur in Optum. I only selected the top 100, with counts ranging from 2.1M down to .023M (23k). I think I’d be happy just to include the top 10, but I sent you 100 in case you wanted it.

I didn’t resolve these NDCs to a name, however. That’ll be the main lift. But like I said, if you can do the top 10 manually, that’d be awesome. The top 10 range from 2.1M to .19M (190k), so I feel just the top 10 would cover the most common unmapped cases for us.

Top_100_NDC_Unmapped_OptumClinformatics.xlsx (11.1 KB)

Christian:
The above list was supposed to be NDC codes that don’t have a source concept. Some of those NDCs actually DO have a source_concept_id, and the main concern of my original post was the NDCs that don’t have any concepts. So, I’m going to make another XLS with 2 tabs: unmapped and no standard concept. The unmapped tab will be the top 100 NDCs from Optum that do not have a source concept ID. The other will be the top 100 that do have a source concept ID but no standard concept map.

Will take me a minute but I will follow up shortly.

Here’s the updated XLS. It has 2 tabs: NDCs with no source concept, and NDCs that do have a source concept but no mapping to any other concept. (I’m assuming the mapping we’d want to introduce for those is to a standard concept.)

For the NDCs without source concepts, the top 10 range from 149,771 to 54,771, dropping to 3,375 occurrences at the 100th NDC.

For NDCs without maps: we have 2.16 Mil to 450k for top 10, down to 18.7k at the 100th NDC.

While the standard concept mapping would be great, at least we have concept_ids in the CONCEPT table that I can ETL from the source NDCs into Vocabulary Concepts. And it appears all of these NDCs that do not have standard concept maps have the source concept domain as ‘Drug’, so I can also figure out where to put them in the CDM schema, and write the source_concept_id value to the drug_exposure table.

So things aren’t as bad as I thought they might be since I do have concept IDs for the big ones in our database. But if you want to look at how you’d map them to standard concepts, I’ve uploaded the new file.

-Chris
Top_100_NDC_Unmapped_OptumClinformatics.xlsx (16.7 KB)

I’ll take a look. How quickly do you need all this, Chris?

Domains of NDC Source Concepts: Yeah. The fact that it says “Drug” doesn’t make a Glucose stick a drug, really. NDCs got assigned Drug by default; it really doesn’t mean anything. There is no way to figure out what it really is unless we have a mapping to a Standard Concept. For those, the Domain assignment is much stronger.

Whenever you have time, Christian. I don’t think we do specific studies looking for these concepts yet, but once people get the idea about how awesome the vocabulary is for identifying things you can find in observational data, they’ll be wanting it, big time. But no rush.

Here’s the use case for ETLing: we get a source value in a column called ‘ndc’, so we’re going to try to look it up as a source_concept_id in vocabulary_id = ‘NDC’. Once we have the source concept, we’ll try to map it to a standard concept. If we don’t have a standard concept (as we see in the above NDC codes), then I still want to put it into our CDM because it does have a source concept ID (and OHDSI tools can query on the source concept ID field). So how do I know which CDM domain table this source concept should go into? I have 2 choices: I can say ‘anything that doesn’t have a standard concept map, put into the Drug Exposure table when it comes from the NDC column of the Optum native RX table’, or I can say ‘create a row in the domain table that the source concept is associated with’.

One problem with option 2 is that some source concepts have mixed domains (drug/device for example), so in that case I’m not sure what I would do. But those are the 2 options I’m mulling, and I’m beginning to wonder if option 1 is the way I should go (I’ve been trying to let the vocabulary drive the target CDM table in all cases, but maybe that’s not right).

-Chris
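The two routing options described above can be sketched side by side. This is only an illustration under stated assumptions: `source_concepts` and `maps_to` are hypothetical in-memory stand-ins for lookups against the CONCEPT table and the “Maps to” relationship, the function name `route_ndc` is made up, and only the Drug and Device domains are handled.

```python
from typing import Dict, NamedTuple, Tuple


class Concept(NamedTuple):
    concept_id: int
    domain_id: str


# Assumed domain-to-table routing; only two domains are shown here.
DOMAIN_TABLE = {"Drug": "drug_exposure", "Device": "device_exposure"}


def route_ndc(
    ndc: str,
    source_concepts: Dict[str, Concept],  # NDC code -> source concept
    maps_to: Dict[int, Concept],          # source concept_id -> standard concept
    use_source_domain: bool = False,      # False = option 1, True = option 2
    default_table: str = "drug_exposure",
) -> Tuple[str, int, int]:
    """Return (target_table, concept_id, source_concept_id) for one NDC."""
    source = source_concepts.get(ndc)
    if source is None:
        # No source concept at all: both concept fields are 0.
        return default_table, 0, 0

    standard = maps_to.get(source.concept_id)
    if standard is not None:
        # A standard concept exists, so its domain picks the table.
        table = DOMAIN_TABLE.get(standard.domain_id, default_table)
        return table, standard.concept_id, source.concept_id

    # Unmapped source concept: choose between the two options.
    if use_source_domain:
        table = DOMAIN_TABLE.get(source.domain_id, default_table)  # option 2
    else:
        table = default_table  # option 1
    return table, 0, source.concept_id
```

For unmapped NDC source concepts whose domain is ‘Drug’, both options land in drug_exposure; they only diverge when an unmapped source concept carries some other domain.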

I did re-run our data collector to obtain “missing” NDC codes from the Optum ClinFormatics dataset, and the results mostly match what @Chris_Knoll posted. Based largely on some sources online (Inland Empire Health Plan, BioPortal from the National Centers for Biomedical Computing), it appears that more than 60% of the non-bogus NDCs are multivitamins, prenatal vitamins, metabolic supplements, flu vaccines, or those glucose test strips.

I’m not sure how valid IEHP or BioPortal are as vocabulary sources, but I am going to run a full comparison of Optum ClinFormatics NDCs against those sites to get a broader sense of what drug types the OMOP Vocab may be missing.

@Chris_Knoll:

I totally hear you. So, right now the idea is to do option two. Source Concepts without a mapping should not have mixed domains, because that only happens if they do have mappings to Standard Concepts of more than one domain. If not, they should have only one default domain, which in the case of NDC is “Drug”.

One problem I have is this: If we wanted to fix the situation - what would the Standard Concept for a Glucose Stick be? I could imagine we promote the NDC Source Concept to Standard, but then we have the NDC mess where it changes every 3 weeks for the same product. I don’t have a solution. That’s a good thing to ask the NLM folks.

@Ajit_Londhe:

Thanks so much. That would be wonderful.

@Chris_Knoll, @Ajit_Londhe:

Could you be so kind and send again the list of NDC codes that are unmapped? But this time, can you group by ndc_code, ndc_description, year_of_record?

The reason is we are talking to the NLM about getting them added properly to RxNorm. This doesn’t prevent us from doing a temporary fix, but if they took them on that would be much better. However, they are claiming that most of these are obsolete and no longer in use, and sometimes they get re-used. So, without the time information it’s hard for them to bring them in. Can you run those counts on the various databases you have, and don’t cut at the top 100? I’ll do the same thing at IMS, and between us we should have a very good idea of what the problem is.

Ok, I’ll find the query and execute it on Optum.

I’m a little worried about NDC codes and how they can change meaning over time for the same code. I think I understand the reason based on discussions with @ericaVoss: NDCs are like IP addresses, issued to companies, and they can decide how to allocate them. So over time, an NDC can refer to one product, and then be reused if that product goes out of production. Do I have it right? If so, I’d like to vote against any coding system that works like this becoming a ‘standard concept’. Seems too fragile to standardize on.

-Chris

@Chris_Knoll:

Thanks.

Yes, that is a problem. It doesn’t happen that often, but it does. We also have the problem with DRGs and with vaccine procedure codes: The same code gets mapped to different concepts over time. As soon as we have NDC in order, we will make an official change to the mapping conventions and validity time of the “Maps to” relationship: In future it will designate a validity period, and you should check the time stamp of the record against that validity period.
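A date-aware lookup of the kind described above might look like the sketch below. It assumes each mapping row carries `valid_start_date` and `valid_end_date` values; the `Mapping` tuple, the function name `map_on_date`, and the in-memory list are hypothetical and only stand in for a query against the relationship table.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    source_code: str
    target_concept_id: int
    valid_start_date: date
    valid_end_date: date


def map_on_date(source_code: str, record_date: date,
                mappings: List[Mapping]) -> Optional[int]:
    """Return the target concept valid for this code on the record's date."""
    for m in mappings:
        if (m.source_code == source_code
                and m.valid_start_date <= record_date <= m.valid_end_date):
            return m.target_concept_id
    # No mapping was valid on that date.
    return None
```

With this shape, a reused code simply has two rows with non-overlapping validity periods, and the record's own date decides which one applies.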

@Chris_Knoll: Any progress? Trying to keep the communication with the NLM warm.

@Chris_Knoll: Actually, now that I think about it: What would be better than the yearly “group by” is the first and the last date of occurrence, including the description string of that first and last use. That way, they can check whether they are reused or not.
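The first/last-occurrence summary requested above could be computed along these lines. This is a rough sketch: the input is assumed to be an iterable of (ndc_code, ndc_description, record_date) tuples, and the function names `first_last_use` and `looks_reused` are hypothetical. Comparing the two description strings is only a crude hint of reuse, since descriptions can also differ for trivial reasons.

```python
from datetime import date
from typing import Dict, Iterable, Tuple

# (ndc_code, ndc_description, record_date); the column layout is an assumption.
Record = Tuple[str, str, date]


def first_last_use(records: Iterable[Record]) -> Dict[str, dict]:
    """Per NDC, keep the first and last date of use and their descriptions."""
    summary: Dict[str, dict] = {}
    for ndc, description, day in records:
        entry = summary.get(ndc)
        if entry is None:
            summary[ndc] = {
                "first_date": day, "first_description": description,
                "last_date": day, "last_description": description,
            }
            continue
        if day < entry["first_date"]:
            entry["first_date"] = day
            entry["first_description"] = description
        if day > entry["last_date"]:
            entry["last_date"] = day
            entry["last_description"] = description
    return summary


def looks_reused(entry: dict) -> bool:
    """Flag an NDC whose first and last descriptions differ."""
    return entry["first_description"] != entry["last_description"]
```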

Christian, sorry I took so long to get back to you (on vacation one week), but I created a list of NDCs that exist in the Optum database that do not have source concept mappings, and leveraged the Optum-provided NDC lookups to get some information (such as name, drug strength, is it a generic, etc.) as well as the effective from-to dates. Hopefully this is helpful. This XLS contains about 3,200 NDCs (one of which is UNK; it is frequently used, but obviously you can’t map that), as well as some others that look obviously fake, but they are there anyway. You be the judge.

After the first 200, it drops off rapidly, only accounting for < 1000 records in the database. So even if you manage to address the top 100, it would be a huge improvement.

NDC_Unmapped_OptumClinformatics.xlsx (389.7 KB)

@Christian_Reich, any problems with this file? Was it what you wanted?
