Q: Outcomes discordance EMR data vs Claims data?

mgkahn · March 31, 2018, 12:50am

See question from Cindy Girman, member of PCORI’s Methodology Committee. Anybody know of any studies that looked at this issue?

From: Cindy Girman
Date: Thursday, March 29, 2018 at 10:24 AM
Subject: Data Quality

Hi Michael,
Hope you are doing well. Do you know of any work in PCORnet or other large databases or studies that have looked specifically at the discordance between outcomes identified in EMR vs outcomes in claims and validation thereof? Just curious. There is very little in the literature on this. Thanks
Cindy

Mark_Danese · March 31, 2018, 6:31pm

There is a paper using CPRD data that looks at HES (hospital), CPRD (EMR), and MINAP (registry) data and the concordance among them for cardiovascular events. This is not exactly claims vs. EMR, but it gets at a similar point.

Completeness and diagnostic validity of recording acute myocardial infarction events in primary care, hospital care, disease registry, and national mortality records: cohort study

BMJ 2013;346:f2350 doi: 10.1136/bmj.f2350
Emily Herrett, Anoop Dinesh Shah, Rachael Boggon, Spiros Denaxas, Liam Smeeth, Tjeerd van Staa, Adam Timmis, Harry Hemingway

cgirman · April 1, 2018, 1:28am

Thanks Mark. More interested in US but this is relevant. Anyone know of anything in US with reported discordance rates?
Thanks
Cindy

SCYou · April 1, 2018, 6:02am

@mgkahn
There is also a Korean paper investigating the diagnostic accuracy of acute myocardial infarction in national claims data.

Mark_Danese · April 1, 2018, 2:55pm

It takes a lot of digging because there are lots of validation studies and one has to read the methods to figure out what source they used as the gold standard.

Here are a couple that might be useful (I didn’t review them fully). This is a very interesting topic to me because we store the sensitivity and specificity of algorithms for exposures and outcomes in our software. So, examples like this are very useful to us. And, of course, the effect of the misclassification on the risk estimate is important too!
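To make the two quantities mentioned above concrete, here is a minimal sketch in Python. It computes accuracy measures from a 2x2 validation table and then back-corrects an observed risk ratio for outcome misclassification. It assumes nondifferential misclassification (the same sensitivity and specificity in both exposure groups) and uses the standard Rogan-Gladen style correction. The function names and all numbers are hypothetical and are not taken from any of the papers or software mentioned in this thread.

```python
def validation_metrics(tp, fp, fn, tn):
    """Accuracy of an algorithm against a reference standard (2x2 counts)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def corrected_risk(observed_risk, sensitivity, specificity):
    """Back-correct an observed risk for outcome misclassification.

    Uses observed = se * true + (1 - sp) * (1 - true), solved for 'true'.
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    corrected = (observed_risk - (1 - specificity)) / denom
    # Clamp to [0, 1]; sampling error can push the estimate outside the range.
    return min(max(corrected, 0.0), 1.0)


def corrected_risk_ratio(obs_exposed, obs_unexposed, sensitivity, specificity):
    """Risk ratio after correcting both groups with the same se/sp (assumption)."""
    r1 = corrected_risk(obs_exposed, sensitivity, specificity)
    r0 = corrected_risk(obs_unexposed, sensitivity, specificity)
    if r0 == 0:
        raise ValueError("corrected risk in the unexposed group is zero")
    return r1 / r0


if __name__ == "__main__":
    # Hypothetical validation counts, for illustration only.
    print(validation_metrics(tp=80, fp=20, fn=20, tn=880))
    # Hypothetical observed risks in exposed and unexposed groups.
    print(round(0.125 / 0.0875, 2), round(corrected_risk_ratio(0.125, 0.0875, 0.8, 0.95), 2))
```

In this made-up example the true risks are 10% and 5% (a risk ratio of 2), but with 80% sensitivity and 95% specificity the observed ratio shrinks to about 1.43. That shows how imperfect specificity in particular can pull an estimate toward the null.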

Identification of Physician-Diagnosed Alzheimer’s Disease and Related Dementias in Population-Based Administrative Data: A Validation Study Using Family Physicians’ Electronic Medical Records.

RL Jaakkimainen, SE Bronskill, MC Tierney, N Herrmann, D Green, J Young, N Ivers, D Butt, J Widdifield and K Tu, Journal of Alzheimer's disease : JAD, 2016 10 08

Population-based surveillance of Alzheimer's and related dementias (AD-RD) incidence and prevalence is important for chronic disease management and health system capacity planning. Algorithms based on health administrative data have been successfully developed for many chronic conditions. The increasing use of electronic medical records (EMRs) by family physicians (FPs) provides a novel reference standard by which to evaluate these algorithms as FPs are the first point of contact and providers of ongoing medical care for persons with AD-RD. We used FP EMR data as the reference standard to evaluate the accuracy of population-based health administrative data in identifying older adults with AD-RD over time. This retrospective chart abstraction study used a random sample of EMRs for 3,404 adults over 65 years of age from 83 community-based FPs in Ontario, Canada. AD-RD patients identified in the EMR were used as the reference standard against which algorithms identifying cases of AD-RD in administrative databases were compared. The highest performing algorithm was "one hospitalization code OR (three physician claims codes at least 30 days apart in a two year period) OR a prescription filled for an AD-RD specific medication" with sensitivity 79.3% (confidence interval (CI) 72.9-85.8%), specificity 99.1% (CI 98.8-99.4%), positive predictive value 80.4% (CI 74.0-86.8%), and negative predictive value 99.0% (CI 98.7-99.4%). This resulted in an age- and sex-adjusted incidence of 18.1 per 1,000 persons and adjusted prevalence of 72.0 per 1,000 persons in 2010/11. Algorithms developed from health administrative data are sensitive and specific for identifying older adults with AD-RD.
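For readers less familiar with these rule-based case definitions, here is a rough Python sketch of a rule shaped like the one quoted in the abstract above. It is not the authors' implementation. In particular, it assumes that "three physician claims codes at least 30 days apart in a two year period" means three claim dates where each is at least 30 days after the previous one and the third falls within 730 days of the first. The function and parameter names are hypothetical.

```python
from datetime import date


def has_three_spaced_claims(claim_dates, min_gap_days=30, window_days=730):
    """True if three claims exist, each >= min_gap_days after the previous,
    with the third no more than window_days after the first (assumed reading)."""
    dates = sorted(set(claim_dates))
    for i, first in enumerate(dates):
        picked = [first]
        # Greedily take the earliest later date that respects the minimum gap;
        # this gives the earliest possible third claim for this starting date.
        for d in dates[i + 1:]:
            if (d - picked[-1]).days >= min_gap_days:
                picked.append(d)
                if len(picked) == 3:
                    break
        if len(picked) == 3 and (picked[-1] - first).days <= window_days:
            return True
    return False


def meets_case_definition(hospitalization_dates, physician_claim_dates, rx_fill_dates):
    """OR of the three criteria: any hospitalization code, the spaced
    physician claims rule, or any fill of a condition-specific drug."""
    return (
        len(hospitalization_dates) > 0
        or has_three_spaced_claims(physician_claim_dates)
        or len(rx_fill_dates) > 0
    )


if __name__ == "__main__":
    # Hypothetical patient with physician claims only.
    claims = [date(2010, 1, 1), date(2010, 2, 15), date(2010, 4, 1)]
    print(meets_case_definition([], claims, []))
```

Small changes to the gap or window parameters change who counts as a case, which is one reason reported sensitivity and PPV differ so much from study to study.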

Development and Validation of an Algorithm to Identify Patients with Multiple Myeloma Using Administrative Claims Data.

N Princic, C Gregory, T Willson, M Mahue, D Felici, W Werther, G Lenhart and KA Foley, Frontiers in oncology, 2016

The objective was to expand on prior work by developing and validating a new algorithm to identify multiple myeloma (MM) patients in administrative claims. Two files were constructed to select MM cases from MarketScan Oncology Electronic Medical Records (EMR) and controls from the MarketScan Primary Care EMR during January 1, 2000-March 31, 2014. Patients were linked to MarketScan claims databases, and files were merged. Eligible cases were age ≥18, had a diagnosis and visit for MM in the Oncology EMR, and were continuously enrolled in claims for ≥90 days preceding and ≥30 days after diagnosis. Controls were age ≥18, had ≥12 months of overlap in claims enrollment (observation period) in the Primary Care EMR and ≥1 claim with an ICD-9-CM diagnosis code of MM (203.0×) during that time. Controls were excluded if they had chemotherapy; stem cell transplant; or text documentation of MM in the EMR during the observation period. A split sample was used to develop and validate algorithms. A maximum of 180 days prior to and following each MM diagnosis was used to identify events in the diagnostic process. Of 20 algorithms explored, the baseline algorithm of 2 MM diagnoses and the 3 best performing were validated. Values for sensitivity, specificity, and positive predictive value (PPV) were calculated. Three claims-based algorithms were validated with ~10% improvement in PPV (87-94%) over prior work (81%) and the baseline algorithm (76%) and can be considered for future research. Consistent with prior work, it was found that MM diagnoses before and after tests were needed.

Missing clinical and behavioral health data in a large electronic health record (EHR) system.

JM Madden, MD Lakoma, D Rusinak, CY Lu and SB Soumerai, Journal of the American Medical Informatics Association : JAMIA, 2016 11

Recent massive investment in electronic health records (EHRs) was predicated on the assumption of improved patient safety, research capacity, and cost savings. However, most US health systems and health records are fragmented and do not share patient information. Our study compared information available in a typical EHR with more complete data from insurance claims, focusing on diagnoses, visits, and hospital care for depression and bipolar disorder. We included insurance plan members aged 12 and over, assigned throughout 2009 to a large multispecialty medical practice in Massachusetts, with diagnoses of depression (N = 5140) or bipolar disorder (N = 462). We extracted insurance claims and EHR data from the primary care site and compared diagnoses of interest, outpatient visits, and acute hospital events (overall and behavioral) between the 2 sources. Patients with depression and bipolar disorder, respectively, averaged 8.4 and 14.0 days of outpatient behavioral care per year; 60% and 54% of these, respectively, were missing from the EHR because they occurred offsite. Total outpatient care days were 20.5 for those with depression and 25.0 for those with bipolar disorder, with 45% and 46% missing, respectively, from the EHR. The EHR missed 89% of acute psychiatric services. Study diagnoses were missing from the EHR's structured event data for 27.3% and 27.7% of patients. EHRs inadequately capture mental health diagnoses, visits, specialty care, hospitalizations, and medications. Missing clinical information raises concerns about medical errors and research integrity. Given the fragmentation of health care and poor EHR interoperability, information exchange, and usability, priorities for further investment in health IT will need thoughtful reconsideration.

cgirman · April 2, 2018, 3:00pm

Therein lies the problem – takes a lot of digging and each validation study is very specific to outcome or population. Was hoping someone had looked at discordance more broadly and comprehensively. Good idea for a publication!
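As a starting point for a broader look, here is a minimal Python sketch of how discordance for one binary outcome could be summarized across linked patients, without treating either source as the gold standard. It assumes each source has already been reduced to a per-patient yes/no flag for the outcome. The function name, the input layout, and the choice of summary measures (raw discordance rate and Cohen's kappa) are illustrative assumptions, not something taken from PCORnet or any specific study.

```python
def discordance_summary(emr_flags, claims_flags):
    """Cross-tabulate a binary outcome between EMR and claims.

    Both arguments map patient_id -> bool. Only patients present in
    both sources (i.e. linked patients) are compared.
    """
    ids = emr_flags.keys() & claims_flags.keys()
    n = len(ids)
    if n == 0:
        raise ValueError("no linked patients to compare")

    both = sum(1 for i in ids if emr_flags[i] and claims_flags[i])
    emr_only = sum(1 for i in ids if emr_flags[i] and not claims_flags[i])
    claims_only = sum(1 for i in ids if not emr_flags[i] and claims_flags[i])
    neither = n - both - emr_only - claims_only

    # Observed agreement and agreement expected by chance (Cohen's kappa).
    p_observed = (both + neither) / n
    p_expected = (
        (both + emr_only) * (both + claims_only)
        + (claims_only + neither) * (emr_only + neither)
    ) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected < 1 else float("nan")

    return {
        "n_linked": n,
        "both": both,
        "emr_only": emr_only,
        "claims_only": claims_only,
        "neither": neither,
        "discordance_rate": (emr_only + claims_only) / n,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical flags for four linked patients.
    emr = {1: True, 2: True, 3: False, 4: False}
    claims = {1: True, 2: False, 3: True, 4: False}
    print(discordance_summary(emr, claims))
```

Reporting the EMR-only and claims-only counts separately matters because the two directions of disagreement have different causes, for example care delivered offsite versus billing codes without clinical confirmation.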

Mark_Danese · April 2, 2018, 6:00pm

I agree. Let me know if you need some help, or if you want to do some ad hoc searches. We don’t have a lot of resources, but it isn’t any trouble for us to pitch in. The accuracy of various “algorithms” for finding events is a critical issue for us.

rosa.gini · April 4, 2018, 3:16pm

A poster I presented at the OHDSI Europe Symposium may be of interest.

My PhD thesis was on this issue, so you may be interested in having a look at it. The main case study is on Italian data, but there are also more general discussions.

https://repub.eur.nl/pub/93461