OHDSI Home | Forums | Wiki | Github

Proposal for a convention change/clarification in the DRUG_COST table

Currently, it is allowed to have more than one entry per DRUG_EXPOSURE record:
“Each Drug Exposure may have any number of corresponding records in the DRUG_COST table, but usually it is none (no cost data recorded) or one. They are linked directly through the drug_exposure_id field”

From my perspective, this would make sense if the DRUG_COST table holds something like a cost_type and a cost_value. In this case we would need multiple records for one DRUG_EXPOSURE record.
However, the DRUG_COST table provides a fixed number of pre-defined fields. If we allowed multiple records (each holds the same pre-defined fields) for one DRUG_EXPOSURE record, which record would be the right one?

If I don’t misunderstand the approach here the proposal is to allow not more than one DRUG_COST record per DRUG_EXPOSURE record.

Good thoughts! In the current version (version 5) of the DRUG_COST table, the only reason I could see one DRUG_EXPOSURE record having multiple DRUG_COST records is to record payments from multiple sources. The payer_plan_period_id allows the DRUG_COST record to be associated with a particular payer. If your data has payment information from multiple payers, then multiple payment (DRUG_COST) records could be assigned to one DRUG_EXPOSURE record (generally in the form of a patient having primary and secondary insurance). Of course this is not applicable to claims databases (since the claims usually come from one payer) but this is applicable to EHR or practice management systems. This also assumes that payer plan periods are stored by payer in your CDM.

In a future version of the CDM where costs will be consolidated into one table (please see proposal here), I have personally contemplated having a “cost_type” variable where the user could store other types of costs, such as WAC values, ASP values, Facility vs Provider values (specifically for AHRQ’s MEPS data). But, I may be in the minority for supporting this. If you have any suggestions for cost types, please let us know on the proposal site since we are discussing this soon.

@jenniferduryea:

Ahh, OK - that makes sense! However, I suggest that we explain this somewhere. At first glance this is not obvious – at least not in Europe.

Per CDM Builders meeting. Not sure why the request to change domain_id from text to numeric. If the domain_id is a foreign key to the vocabulary domain table the string will be validated by database constraints. And I will not have to mentally map the number to the string.

Well, the constraint is not relevant with respect to the vocabulary, but with respect to the table names. So, you would have to use whatever is in that field and put that into a query. SQL doesn’t let you do that, I believe. So, you’d have to do a big fat case-when-then-end block anyway. Not sure numbers would be any better or worse, here. But maybe I am wrong.

@DTorok:
In the IT world it is rather uncommon to represent different states or conditions with strings. Numerical representations are faster in comparisons and more space-saving than string representations. I just wonder why we don’t use a concept_id representing the domain.
However, I don’t consider this “text to numeric” thing as a major issue. We should rather use a string representation than getting bogged down in technical details.

Gents:

This is actually a larger problem, and I should add it to the list, if we had any chance of solving it. The problem is as following: We have the Domains in the Vocabulary, and they are supposed to map each concept to the CDM table it belongs to. So, a concept with the Domain=‘Procedure’ resides in the PROCEDURE_OCCURRENCE domain. That’s the idea. In practice, we have a bunch of problems:

  1. There is nowhere an official mapping of Domain name and CDM table. So, you have to know that ‘Procedure’ belongs to PROCEDURE_OCCURRENCE; and ‘Drug’ to DRUG_EXPOSURE. Ugly.
  2. There are cases where things are ambiguously defined. For example, a lab test is Domain ‘Measurement’, but in CDM4.5 it would go into the table OBSERVATION.
  3. There are cases where the Domain does not designate tables, but fields. For example, the Domain ‘Gender’ refers to the field gender_concept_id, and not to the table PERSON. Same is true for ‘Ethnicity’ and ‘Race’.

So, if anybody has a generic and clean solution I want to hear it.

We have problem 1 anyway, even if we use the string representation. We need somewhere code like
if cost_domain_id == “Drug_exposure” …
if cost_domain_id == “Visit_occurrence” …
(You may try to use cost_domain_id directly to generate the SQL string, but this has other downsides).

Problem 2 and 3 are not present with respect to the single-cost-proposal. So, the very pragmatic way is to use

concept_id     domain representation       Table Synonym
10             Procedure                   PROCEDURE_OCCURRENCE
13             Drug                        DRUG_EXPOSURE
17             Device                      DEVICE_EXPOSURE
19             Condition                   CONDITION_OCCURRENCE
21             Measurement                 MEASUREMENT
27             Observation                 OBSERVATION

However, I doubt it is worth spending too much time on this issue; we may stick with the string approach, if the numerical representation suggestion generates confusions.

My comments are from someone that spends most of their time ETLing data into CDM v4 and v5. Domains as defined are a problem. We have no problem with the fact that concepts can be anything, so I do not see why the definition of concept domain should be limited to representing the table that a concept belongs to. Should be sufficient to say concept domains partition concepts into categories with similar attributes.

There is an official mapping of domain names to CDM tables, just happens to be in the documentation. I don’t think in necessary that the database be self defining. It may be possible, but would just add a layer of abstraction that would probably have little practical use.

The fact that a Measurement goes into the Observation table in CDM4.5 is just a problem of backward compatibility which is something that cannot always be maintained. Again handled by documentation.

That concept domains are inconsistent because they sometime reference a table and other times a field is a result of trying to limit the definition of a concept domain. (see above)

t