Patient-Reported Drugs and Conditions

@clairblacketer, @Ajit_Londhe, @anthonysena, and I are looking for some thoughts.

We have 3 types of patient reported data:

  1. Health Risk Assessment Data
  2. Patient Reported EMR Data
  3. Survey Data

Currently, when we get data from these sources and it maps to a domain such as DRUG or CONDITION, we move the record there. However, we are struggling a bit with this.

1) Health Risk Assessment
This data comes from questionnaires completed by employees covered by a particular insurer. The data typically comes in bulk at the end of a calendar year, when insurance is being renewed. The occurrence date is the date on which the survey was filled out. This data typically accompanies claims data.

Example data:
Self Reported Asthma. Yes or No?

If Yes, we were writing a record to the CONDITION_OCCURRENCE table for 317009-Asthma.

2) Patient Reported Medications in EMR Data
At the point of seeking care, a patient will declare what medications they are on, and this is recorded in the medical record. Typically this data accompanies other medical-record-type data.

Example data:
Patient states they are taking 300 MG of Clindamycin and the nurse selects an NDC of 63304069301. We then write a record for 997899-Clindamycin 300 MG Oral Capsule to the DRUG_EXPOSURE table.

3) Survey Data
Specifically thinking of something like NHANES survey data where we have an entire data set comprised of questions asked of someone.

Example data:
Have you ever been told you have high blood pressure? Yes or No?

If yes, we record a CONDITION_OCCURRENCE record for 320128-Essential hypertension

QUESTIONS WE HAVE
What we couldn’t decide is if:

  1. Given these types of patient reported data, does it make sense to move the data to their proper domain, or just dump everything into the OBSERVATION table? The TYPE field can always tell you where a record came from, but we are not sure people will know to exclude certain types at analysis time.
  2. Do we treat all patient reported data the same way with regard to the movement, or should we treat them differently depending on source?

My gut right now is telling me that we should handle patient reported data differently depending on how we get it. I think the HRA data should go to the OBSERVATION table given the date is completely unreliable. EMR patient reported medication data may be somewhat reliable given that it is usually asked during the time the patient might be exposed. The Survey data should move to the proper domains given that the entire dataset is survey based and should be analyzed as such.
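
To make that gut feeling concrete, here is a minimal Python sketch of source-dependent routing. The source labels (`HRA`, `EMR_PATIENT_REPORTED`, `SURVEY`), the `DOMAIN_TABLES` mapping, and the function name `target_table` are all hypothetical names made up for illustration; this mirrors only the proposal in this post, not an agreed CDM convention.

```python
# Hypothetical mapping from a concept's domain to its CDM table.
DOMAIN_TABLES = {
    "Condition": "CONDITION_OCCURRENCE",
    "Drug": "DRUG_EXPOSURE",
}


def target_table(source_kind: str, domain: str) -> str:
    """Pick a CDM table for a patient-reported record (illustrative only)."""
    if source_kind == "HRA":
        # The HRA date is only the date the questionnaire was filled out,
        # so under this proposal the record stays in OBSERVATION.
        return "OBSERVATION"
    if source_kind in ("EMR_PATIENT_REPORTED", "SURVEY"):
        # Follow the concept's domain; anything unmapped falls back to OBSERVATION.
        return DOMAIN_TABLES.get(domain, "OBSERVATION")
    raise ValueError(f"unknown source kind: {source_kind}")


print(target_table("HRA", "Condition"))
print(target_table("EMR_PATIENT_REPORTED", "Drug"))
print(target_table("SURVEY", "Condition"))
```

The point of the sketch is only that the decision depends on two inputs, the kind of source and the domain of the mapped concept, rather than on the domain alone.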

It seems that your gut is right, but I have a few questions:

  1. Health Risk Assessment
    "Self Reported Asthma. Yes or No?"
    Does it mean that patients decide by themselves whether they have asthma?
    In that case they could have difficulty breathing for a thousand reasons (like hysterical hyperventilation or other psychogenic causes), so the answer is completely unreliable.

  2. Patient Reported Medications in EMR Data
    Is this medication being taken at the given moment, or might it be a history of medication use?

  3. Survey Data
    Have you ever been told you have high blood pressure? Yes or No?

According to your example, it’s about the history of a condition, but you put this in condition_occurrence as a current diagnosis. That’s OK for a chronic condition like “Essential hypertension”, but could there be examples of something that was present in the past and is gone now?

@Dymshyts,

  1. [Health Risk Assessment] I think for a health risk assessment the question usually reads, “have you been told you have asthma”, but at the end of the day the person can say what they want.

  2. [Patient Reported Medications in EMR Data] Usually when you enter the hospital the first thing the nurse does is ask you what medications you are taking at home. You may or may not continue them while in the hospital.

  3. [Survey Data] Yeah, it could have been present in the past and now be gone. I think the thing that makes it okay to move to the domain in this example is that ALL the data is like that, so you should know going in what you are dealing with. In the examples above, the patient reported data is mixed with more reliable data, so the concern is that people might not notice it when they don’t want to include it.

My opinion is that we stick to the CDM specification: let the Vocab route the concepts, but assign appropriate type_concept_ids to each to ensure the context of that concept is captured. Even if we’re handling drugs that are patient reported (and full of accuracy issues), we map them to drug_exposure but qualify these records with appropriate drug_type_concept_ids. I just worry that making data source-specific decisions to handle HRA/patient-reported/survey data could make running site-level or network-level analyses inconsistent and unreliable.

From a user perspective, that means being cognizant of type_concept_ids when designing cohorts, but I think that could be helped with some Atlas UI enhancements: perhaps requiring the selection or exclusion of type_concept_ids when adding domain criteria. Users would also need to be aware of this nuance when writing custom SQL.
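
For the custom SQL case, here is a minimal sketch of what "being aware of the type" could look like, run against an in-memory SQLite database from Python. The table is a cut-down, three-column stand-in for drug_exposure rather than the real CDM DDL, the helper name `persons_exposed` is hypothetical, and the type concept `0` on the second row is just a placeholder for "some non-patient-reported type". 44787730 (Patient Self-Reported Medication) is the type concept listed later in this thread.

```python
import sqlite3

PATIENT_SELF_REPORTED_MEDICATION = 44787730  # type concept named later in this thread

conn = sqlite3.connect(":memory:")
# Simplified stand-in for DRUG_EXPOSURE; the real table has many more columns.
conn.execute(
    "CREATE TABLE drug_exposure ("
    "person_id INTEGER, drug_concept_id INTEGER, drug_type_concept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO drug_exposure VALUES (?, ?, ?)",
    [
        (1, 997899, PATIENT_SELF_REPORTED_MEDICATION),
        (2, 997899, 0),  # 0 = placeholder for a non-patient-reported type
    ],
)


def persons_exposed(conn, drug_concept_id, excluded_type_ids=()):
    """Return person_ids exposed to a drug, skipping the given type concepts."""
    sql = "SELECT DISTINCT person_id FROM drug_exposure WHERE drug_concept_id = ?"
    params = [drug_concept_id]
    if excluded_type_ids:
        placeholders = ",".join("?" for _ in excluded_type_ids)
        sql += f" AND drug_type_concept_id NOT IN ({placeholders})"
        params.extend(excluded_type_ids)
    sql += " ORDER BY person_id"
    return [row[0] for row in conn.execute(sql, params)]


print(persons_exposed(conn, 997899))
print(persons_exposed(conn, 997899, [PATIENT_SELF_REPORTED_MEDICATION]))
```

Without the exclusion both persons come back; with it, only the person whose record is not patient reported does. That extra `NOT IN` clause is exactly the thing analysts would have to remember.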

I’m wondering if we should just wait to see what happens with the survey data proposal? I think this would be the best place to put items like this.

Is this a question for THEMIS, for the Survey group, or for the CDM group in general? I am not sure here.

@ColinOrr or @ColinOrr2006 , I’m hoping to take part of the conversation over here.

“Erica, yes, for some of this data I agree, it is the same class of problem. However, some of the data, as you have already pointed out, belongs in already defined domains such as drug exposure. The method of collection may be somewhat irrelevant, or at least secondary.”

Can you give me an example of where you think my examples above wouldn’t leverage the survey infrastructure?

I’m asking because what our team is struggling with is what constitutes a drug exposure or a condition. For example when you answer questions on an HRA questionnaire the answers most likely won’t be associated with a date that makes any sense for true exposure (e.g. I took drug X in April, and took the HRA survey in October). I’m nervous about mixing it in with all the drug dispensed/condition information - don’t know if people will remember to exclude by DRUG_TYPE_CONCEPT_ID.

ad 1: target table

I can see a good reason why all raw data should go to the OBSERVATION table. The “cleaned” data can then go to the appropriate table.

Consider CRFs from a clinical trial with the answer “Jan 31 2016” to the question “End date of second pregnancy”. We don’t know how the pregnancy ended (e.g., delivery or miscarriage). We don’t know the dates of the first pregnancy, etc. There are lots of problems with “cleaning/inference”.

ad 2: Do we treat all patient reported data in the same way with regard to the movement, or should we treat them differently depending on source?

I think differently, depending on the source. Different types of “inference” may require different ETL treatment.

Reviving this thread . . .

At the THEMIS F2F this was discussed and here was the recommendation:

RECOMMENDATION
Patient reported data should land in the appropriate domain table (e.g. if a patient reports they had lymphoma, it should land in the CONDITION_OCCURRENCE table). These data should be strongly typed so that it is easily known which records are patient reported.

ACTION
Ask the CDM WG to add this to the FAQ https://github.com/OHDSI/CommonDataModel/wiki/Frequently-Asked-Questions

@ericaVoss:

Where are we getting the Type Concepts from? Is anybody creating them?

Good point. I guess part of the recommendation requires us to recommend the types. I don’t know if anyone is doing that.

@Christian_Reich @ericaVoss
We’re working out the rules for converting Patient Generated Health Data into the CDM together with IT / PHR / lifestyle coaching companies.
The target records include diet, exercise, blood sugar, and insulin, so we need Type Concepts for these (the target tables in the CDM will be ‘observation’, ‘drug_exposure’ and ‘measurement’).

@SCYou, @ericaVoss :

Right now, we have the following Patient reported types:

Do we need anything else?

@Christian_Reich That’s all we need. Thank you!! :grinning:

I’ve updated the ticket under review to include these:

Thank you, @ericaVoss

I released a sample of Patient Generated Health Data, which was generated by one gentleman over the course of a year.

Together with Noom (@yipaulkim), LifeSemantics, and Samsung Medical Center, we’ll convert PGHD into the CDM and combine these data with the hospital’s CDM, with the patients’ approval.

Should these patient reported drugs or conditions be a part of eras?

I vaguely remember discussing this at the F2F, and we decided that if, as an ETL person, you feel confident that the patient reported events should be part of standard analytic queries, then do bring them into drug_exposure / condition_occurrence, and expect them to be part of drug or condition eras. But, if you have question marks around the validity of patient reported events, then bring them in as observations. Is this correct?

But @SCYou isn’t this data coming from a device? Something is counting steps?


Ugggggggggggggggggggggggggg . . . . my vote would be no, patient reported drugs should not be part of eras cause they probably don’t contain information about how much drug was consumed or when exactly it was . . . but originally I wanted to shove all patient reported items in OBSERVATION. :smile:

@bailey - I remember once you told me that you see the value in patient reported drugs, would you argue that they should be in ERAs?


All,

Could we also consider Health Risk Assessment data as patient reported?

  • 44786633 – numeric HRA values
  • 44786634 – categorical HRA values

@ericaVoss Patient Generated Health Data (PGHD) includes not only step counts (activity) and calorie intake (nutrition), but also patient reported conditions, measurements, and drugs.
Since the first target population is diabetic patients, we’ll start with glucose levels and insulin dosages. I think these data can be captured in the measurement and drug_exposure tables only when the time and the dosage or the result are specified.
Otherwise, I agree with @Ajit_Londhe’s opinion that somewhat ambiguous data should be stored in the observation table.
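
As a rough sketch of that rule in Python: a PGHD record goes to its domain table only when both the time and the value (result or dosage) are present, and to observation otherwise. The record kinds (`"measurement"`, `"drug"`) and the function name `pghd_target_table` are hypothetical, chosen only for this illustration.

```python
from typing import Optional

# Hypothetical record kinds mapped to the CDM tables mentioned above.
DOMAIN_TABLE_BY_KIND = {
    "measurement": "measurement",
    "drug": "drug_exposure",
}


def pghd_target_table(
    kind: str, timestamp: Optional[str], value: Optional[float]
) -> str:
    """Route a PGHD record; ambiguous records fall back to observation."""
    has_time = timestamp is not None
    has_value = value is not None  # result for a measurement, dosage for a drug
    if kind in DOMAIN_TABLE_BY_KIND and has_time and has_value:
        return DOMAIN_TABLE_BY_KIND[kind]
    return "observation"


# Glucose reading with a time and a result.
print(pghd_target_table("measurement", "2017-03-01T08:00", 110.0))
# Insulin entry with no dosage recorded.
print(pghd_target_table("drug", "2017-03-01T08:05", None))
```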

I’m developing ETL document for accessible PGHD now. I’ll release it soon. Then, we can discuss further!

Sorry, since we don’t have these data in our data network, I don’t know… I hope these data can also be integrated into the OMOP CDM.

@bailey or @karthik or @Christian_Reich

You have been randomly selected to help decide . . . should patient reported data affect DRUG_ERAs? I’m assuming yes, as in some datasets this is the bulk of the data.

However, data that falls outside of the OBSERVATION_PERIOD should not impact the DRUG_ERA, regardless of where it is from. (see this thread)
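
A minimal Python sketch of that observation-period rule, applied before any era building. It is a simplification: it checks only the exposure start date, the record layout (dicts with a `start` key) is hypothetical, and it does not build the eras themselves.

```python
from datetime import date


def in_observation_period(day, periods):
    """True if `day` falls inside any (start, end) period, bounds inclusive."""
    return any(start <= day <= end for start, end in periods)


def exposures_eligible_for_eras(exposures, periods):
    """Keep only exposures whose start date lies inside an observation period."""
    return [e for e in exposures if in_observation_period(e["start"], periods)]


periods = [(date(2016, 1, 1), date(2016, 12, 31))]
exposures = [
    {"drug_concept_id": 997899, "start": date(2016, 4, 10)},  # inside the period
    {"drug_concept_id": 997899, "start": date(2017, 2, 1)},   # outside the period
]

eligible = exposures_eligible_for_eras(exposures, periods)
print(len(eligible))
```

The filter is independent of where the record came from, so a patient reported exposure and a dispensing record are treated the same way at this step.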

Here is the current text, but I would like to add something about ERAs.

RECOMMENDATION
Patient reported data should land in the appropriate domain table (e.g. if a patient reports they had lymphoma, it should land in the CONDITION_OCCURRENCE table). These data should be strongly typed so that it is easily known which records are patient reported. Type concepts include:

  • 44814721 – Patient reported
  • 44818706 – Patient reported device
  • 44818704 – Patient reported value
  • 45905770 – Patient Self-Reported Condition
  • 44787730 – Patient Self-Reported Medication
  • 44786633 – numeric HRA values
  • 44786634 – categorical HRA values
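
To show what "strongly typed" buys you downstream, here is a small Python sketch that flags patient reported records by their type concept. The IDs and names are the ones listed in the recommendation; the dictionary name `PATIENT_REPORTED_TYPES` and the helper `is_patient_reported` are hypothetical.

```python
# Type concepts from the recommendation text, keyed by concept ID.
PATIENT_REPORTED_TYPES = {
    44814721: "Patient reported",
    44818706: "Patient reported device",
    44818704: "Patient reported value",
    45905770: "Patient Self-Reported Condition",
    44787730: "Patient Self-Reported Medication",
    44786633: "numeric HRA values",
    44786634: "categorical HRA values",
}


def is_patient_reported(type_concept_id: int) -> bool:
    """True if the type concept is one of the patient reported types above."""
    return type_concept_id in PATIENT_REPORTED_TYPES


print(is_patient_reported(45905770))
print(is_patient_reported(0))
```

Because the check depends only on the type concept, the same helper works for any domain table that carries a type_concept_id column.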

NEXT STEPS
Ask the CDM WG to add this to the FAQ https://github.com/OHDSI/CommonDataModel/wiki/Frequently-Asked-Questions