Drug concept navigation and source codes

Sigfried_Gold · December 23, 2016, 9:00pm

I was talking with Christian about this drug concept diagram. I thought maybe I should move the conversation to the forum because others might be interested.

@Christian_Reich is going to give me the PowerPoint version of this, which I’m hoping to be able to transform into SVG, and then maybe I’ll be able to use it in a navigation tool. If anyone else would like to play with the SVG (if I can make one), let me know.

One question I’m wondering about: RxNorm codes that are used for exposures will be marked as standard_concept=S in the concept table, and all the classification codes should be marked as C. Is there any way to identify source code concepts (or their vocabularies) without referring to the drug_exposure table? Would they just be all the concepts with domain_id=Drug and standard_concept=null?

Christian_Reich · December 24, 2016, 1:03pm

@Sigfried_Gold:

In email.

Yes. Concepts with standard_concept=null are either source Concepts, or they are ex-Standard Concepts which were deprecated.
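
A minimal sketch of how the two groups might be told apart, assuming that deprecated ex-Standard Concepts carry a non-null invalid_reason while source Concepts do not (that assumption is not confirmed in this thread). The table below is a toy stand-in for the concept table, loaded into an in-memory SQLite database; every row and concept_id is made up for illustration.

```python
import sqlite3

# Toy stand-in for the OMOP concept table; rows and concept_ids are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    " concept_id INTEGER, vocabulary_id TEXT, domain_id TEXT,"
    " standard_concept TEXT, invalid_reason TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RxNorm", "Drug", "S", None),        # standard
        (2, "ATC", "Drug", "C", None),           # classification
        (3, "NDC", "Drug", None, None),          # non-standard, still valid
        (4, "RxNorm", "Drug", None, "D"),        # non-standard, invalidated
        (5, "RxNorm", "Drug", None, "U"),        # non-standard, invalidated
        (6, "SNOMED", "Condition", None, None),  # other domain, filtered out
    ],
)


def non_standard_drug_counts(conn):
    """Count Drug concepts with a null standard_concept, split by validity."""
    return conn.execute(
        """
        SELECT vocabulary_id,
               CASE WHEN invalid_reason IS NULL
                    THEN 'valid' ELSE 'deprecated' END AS status,
               COUNT(*)
        FROM concept
        WHERE domain_id = 'Drug' AND standard_concept IS NULL
        GROUP BY vocabulary_id, status
        ORDER BY vocabulary_id, status
        """
    ).fetchall()


counts = non_standard_drug_counts(conn)
print(counts)
```

Under that assumption, the 'valid' rows would be the candidate source Concepts and the 'deprecated' rows the former Standard Concepts.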

Sigfried_Gold · December 24, 2016, 9:26pm

Thanks!

Sigfried_Gold · December 26, 2016, 2:30pm

Hey @Christian_Reich. As I explore the drug concept diagram (also here: http://www.ohdsi.org/web/wiki/doku.php?id=documentation:vocabulary:drug) and its relationship to the actual vocabulary tables, I’m finding a number of puzzling discrepancies…I think. I’ll use the vocabulary tables from the Symposium tutorials in order to have a common reference, though I find somewhat different issues in the CDM I’m actually working with.

-- Count valid Drug-domain concepts per vocabulary and standard_concept flag
select vocabulary_id, domain_id, standard_concept, count(*)
from concept
where domain_id = 'Drug' and invalid_reason is null
group by 1, 2, 3
order by 3, 2, 1;

  vocabulary_id   | domain_id | standard_concept | count
------------------+-----------+------------------+--------
 ATC              | Drug      | C                |   1257
 Cohort           | Drug      | C                |     12
 EphMRA ATC       | Drug      | C                |    895
 NDFRT            | Drug      | C                |  18202
 NFC              | Drug      | C                |    692
 RxNorm           | Drug      | C                |  37595
 SPL              | Drug      | C                | 152392
 VA Class         | Drug      | C                |    486
 DPD              | Drug      | S                | 131946
 HCPCS            | Drug      | S                |     31
 RxNorm           | Drug      | S                | 145929
 RxNorm Extension | Drug      | S                | 157720
 ATC              | Drug      |                  |   4751
 CIEL             | Drug      |                  |   7673
 DPD              | Drug      |                  |  35492
 GCN_SEQNO        | Drug      |                  |  28689
 Gemscript        | Drug      |                  | 224408
 HCPCS            | Drug      |                  |    790
 MeSH             | Drug      |                  |   3991
 Multum           | Drug      |                  |   9770
 NDC              | Drug      |                  | 428374
 NDFRT            | Drug      |                  |   7728
 OXMIS            | Drug      |                  |      3
 Read             | Drug      |                  |     20
 RxNorm           | Drug      |                  |  18630
 RxNorm Extension | Drug      |                  |   6555
 SNOMED           | Drug      |                  | 308214
 SPL              | Drug      |                  |  14707
 VA Product       | Drug      |                  |  17951

1. CVX, NDFRTInd, and FDBInd appear on the diagram but not as vocabularies in the data at all (checking in my vocab tables and http://athena.ohdsi.org/). All of the boxes on this diagram are vocabularies, right?
2. I don’t find ETC in my vocab tables, but it does show up in ATHENA as License Required, so that’s probably fine.
3. Four vocabularies show up in my query as having Drug/Classification concepts but are not on the diagram: Cohort, EphMRA ATC, NFC, RxNorm. Any explanation?
4. SNOMED appears in the classification section of the diagram, but in my query it shows up as having only non-standard (source) concepts in the Drug domain.
5. Based on the diagram (and other docs and discussion), I would expect the only vocabularies with Standard concepts in the Drug domain to be RxNorm and RxNorm Extension. In my query I also see DPD and HCPCS (in the CDM I’m actually working with, I also have CPT4 and don’t see RxNorm Extension at all).
6. RxNorm, RxNorm Extension, ATC, NDC, NDFRT, SNOMED and SPL also show up in the query, but not in the diagram, as having non-standard (source) concepts. Is that correct?

I also see some confusing relationships beyond what we already discussed here: Relationship.relationship_name != concept.concept_name, but I’ll post another message about those later.

Thanks!

Happy holidays!

Sigfried

Christian_Reich · December 26, 2016, 7:31pm

@Sigfried_Gold:

1. CVX: Need to push out. Still in QA.
   Indications: The situation is a little confusing. We could fix it if folks think it’s a good idea:
   NDF-RT Indications are vocabulary_id='NDFRT' and concept_class_id='Ind / CI'
   FDB Indications are vocabulary_id='Indication' and concept_class_id='Indication'

2. ETC: Yes, you have to cough up the money and hand it to FDB. We won’t make a dime on it.

3. EphMRA ATC, NFC: These have to be added to the diagram. Except you need to subscribe to the proprietary DA_Germany vocabulary to ever have a drug connected to them. It’s a temporary artefact. I don’t think they are useful, but we had to add them for legacy purposes.
   RxNorm: We added the new concept_class_id values Clinical Dose Group and Branded Dose Group. They are categorized as C, and therefore pop up. Charlie requested them. Need to add them to the diagram.

4. SNOMED: We could fix that. Should we?

5. DPD: Will be fixed in the next release. It came in before we decided to build RxNorm Extension.
   Standard HCPCS Drug concepts: These are the ones where the ingredient isn’t defined, but they are still drugs. For example, 2718656 “Unclassified biologics” and 2718681 “Hemophilia clotting factor, not otherwise classified”. We need to work on the proposal to allow drug classes in the data (turning them into Ss).

6. That’s all deprecated stuff.

Thanks for watching out for us.
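
The concept_class_id filters mentioned in this reply can be tried out with a sketch like the following. It assumes a toy concept table in an in-memory SQLite database; the rows, concept_ids, and standard_concept flags are placeholders rather than real vocabulary content, and the helper names are hypothetical.

```python
import sqlite3

# Toy concept table; every row, id, and flag below is a made-up placeholder.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    " concept_id INTEGER, vocabulary_id TEXT, concept_class_id TEXT,"
    " standard_concept TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?)",
    [
        (1, "RxNorm", "Clinical Dose Group", "C"),
        (2, "RxNorm", "Clinical Dose Group", "C"),
        (3, "RxNorm", "Branded Dose Group", "C"),
        (4, "RxNorm", "Clinical Drug", "S"),
        (5, "NDFRT", "Ind / CI", "C"),
        (6, "Indication", "Indication", "C"),
        (7, "NDFRT", "Some Other Class", "C"),
    ],
)


def classification_classes(conn, vocabulary_id):
    """List the concept classes flagged C within one vocabulary, with counts."""
    return conn.execute(
        """
        SELECT concept_class_id, COUNT(*)
        FROM concept
        WHERE vocabulary_id = ? AND standard_concept = 'C'
        GROUP BY concept_class_id
        ORDER BY concept_class_id
        """,
        (vocabulary_id,),
    ).fetchall()


def indication_concept_ids(conn):
    """Select indication concepts using the two filters given in the reply."""
    rows = conn.execute(
        """
        SELECT concept_id
        FROM concept
        WHERE (vocabulary_id = 'NDFRT' AND concept_class_id = 'Ind / CI')
           OR (vocabulary_id = 'Indication' AND concept_class_id = 'Indication')
        ORDER BY concept_id
        """
    ).fetchall()
    return [concept_id for (concept_id,) in rows]


rxnorm_classes = classification_classes(conn, "RxNorm")
indication_ids = indication_concept_ids(conn)
print(rxnorm_classes)
print(indication_ids)
```

Breaking the C rows down by concept_class_id is one way to see which classes (here the two dose groups) make a vocabulary show up in the classification part of the earlier query.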

Sigfried_Gold · December 27, 2016, 12:10pm

Thanks, @Christian_Reich. So, is the upshot pretty much that I can trust my query results to be a mostly accurate, up-to-date representation of the vocab structure and discrepancies are either due to problems soon-to-be-fixed in the vocabulary tables themselves or to things that should be fixed in the diagram?

If so, maybe I can make a version of the diagram that gets generated directly from the data, which would probably be helpful to people working on the vocabularies.
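
As a rough sketch of what generating the diagram's sections from the data could look like, the snippet below groups vocabularies by their standard_concept flag, using the same Drug-domain and invalid_reason filters as the query earlier in the thread. The section names, the `diagram_sections` helper, and the toy rows are assumptions for illustration, not part of any OHDSI tool.

```python
import sqlite3

# Hypothetical section labels for the three standard_concept values.
SECTION_NAMES = {"S": "Standard", "C": "Classification", None: "Source"}


def diagram_sections(conn):
    """Map each section name to the sorted vocabularies that belong in it."""
    rows = conn.execute(
        """
        SELECT DISTINCT standard_concept, vocabulary_id
        FROM concept
        WHERE domain_id = 'Drug' AND invalid_reason IS NULL
        ORDER BY vocabulary_id
        """
    ).fetchall()
    sections = {name: [] for name in SECTION_NAMES.values()}
    for flag, vocabulary_id in rows:
        # Treat an empty string the same as NULL, i.e. as a source concept.
        sections[SECTION_NAMES[flag or None]].append(vocabulary_id)
    return sections


# Toy concept table; rows and ids are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    " concept_id INTEGER, vocabulary_id TEXT, domain_id TEXT,"
    " standard_concept TEXT, invalid_reason TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RxNorm", "Drug", "S", None),
        (2, "RxNorm", "Drug", "C", None),
        (3, "RxNorm", "Drug", None, None),
        (4, "ATC", "Drug", "C", None),
        (5, "NDC", "Drug", None, None),
        (6, "DPD", "Drug", "S", None),
        (7, "Gemscript", "Drug", None, "D"),      # invalid, filtered out
        (8, "SNOMED", "Condition", None, None),   # other domain, filtered out
    ],
)

sections = diagram_sections(conn)
for name, vocabularies in sections.items():
    print(f"{name}: {', '.join(vocabularies)}")
```

A vocabulary can land in more than one section (RxNorm does in the toy data), which matches what the query results in this thread show.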