DQD check thresholds

MPhilofsky · April 13, 2020, 3:10pm

Where does the DQD source the lab result threshold values? The ranges I’m reviewing seem very tight.

Example: A platelet count <100 is abnormally low, but plausible.

Description from DQD: “For the combination of CONCEPT_ID 3007461 (Platelets [#/volume] in Blood) and UNIT_CONCEPT_ID 8848 (thousand per microliter), the number and percent of records that have a value less than 100.000.”
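For readers who want to see what that description amounts to, here is a minimal sketch of the count-and-percent calculation. It is not the DQD's own code: the function name `count_below_threshold` and the sample values are hypothetical, and only the threshold of 100 comes from the description above.

```python
def count_below_threshold(values, threshold):
    """Return (number, percent) of values strictly below threshold."""
    total = len(values)
    if total == 0:
        return 0, 0.0
    violated = sum(1 for v in values if v < threshold)
    return violated, 100.0 * violated / total


# Made-up platelet counts in thousand per microliter (not real data).
platelet_counts = [250.0, 90.0, 150.0, 20.0, 310.0]

# 100.0 is the threshold quoted in the DQD description.
num_violated, pct_violated = count_below_threshold(platelet_counts, 100.0)
print(num_violated, pct_violated)
```

With a threshold this high, every low but clinically plausible platelet count is counted as a violation, which is why the range feels tight.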

clairblacketer · April 13, 2020, 6:06pm

Hi @MPhilofsky, great question! The DQD implausible MEASUREMENT values were determined by a team of physicians at Columbia. Your concerns are warranted, though, because I have seen many values in my claims databases that do not align well with the plausible values chosen. To investigate this further, @Vojtech_Huser has submitted an abstract to AMIA that looks at the distribution of values for certain measurement/unit pairs. I believe his hope is to inform the plausible values in the DQD based on this research.

MPhilofsky · April 13, 2020, 6:19pm

Thank you for the information, @clairblacketer!

Vojtech_Huser · April 13, 2020, 6:59pm

I am glad people find plausibility checking of lab data interesting!

Here is a good readme file (with results of the study; partial results were submitted to AMIA, and we are working on a follow-up): https://github.com/vojtechhuser/DataQuality/tree/master/extras/DqdResults

And the key file would be this one: https://github.com/vojtechhuser/DataQuality/blob/master/extras/DqdResults/S01-benchmark-kb-subset.csv

For that test (concept 3007461 with unit 8848), P01 (the 1st percentile) is 20. One check we envision is to see whether your data's P01 for that lab-unit pair falls in the same range (the width of that range is yet to be determined, maybe ±SD).

(For your lab test, the SD is too big, at 1480, I would say, to be used as a good indicator.)

I have to admit that reaching consensus among DQD developers on these things is challenging…

(We could simply use the mean P01 as the threshold and specify the share of records that we are OK with being off that mark; 1% would make some sense.)
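To make the two ideas in this post concrete, here is a rough sketch of both checks. It is an illustration under stated assumptions, not anything implemented in the DQD: the function names, the nearest-rank percentile method, the tolerance of 5, and the sample data are all hypothetical. Only the benchmark P01 of 20 and the 1% allowance come from the post.

```python
import math


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile (one of several common definitions)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def p01_within_range(values, benchmark_p01, tolerance):
    """Check 1: is the local P01 within +/- tolerance of the benchmark P01?"""
    local_p01 = percentile_nearest_rank(values, 1)
    return abs(local_p01 - benchmark_p01) <= tolerance, local_p01


def fraction_below(values, threshold):
    """Share of records (0..1) strictly below threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v < threshold) / len(values)


def within_allowance(values, benchmark_p01, allowed_fraction=0.01):
    """Check 2: use the benchmark P01 as the threshold, tolerate a small share below it."""
    return fraction_below(values, benchmark_p01) <= allowed_fraction


# Made-up local data: 200 platelet counts from 20 to 219 (not real data).
local_values = list(range(20, 220))

BENCHMARK_P01 = 20   # value quoted in the post for 3007461 / 8848
TOLERANCE = 5        # hypothetical; the post leaves the width undecided

print(p01_within_range(local_values, BENCHMARK_P01, TOLERANCE))
print(within_allowance(local_values, BENCHMARK_P01))
```

The tolerance is passed in explicitly because the width of the acceptable range is still open; an SD of 1480 would make the first check pass for nearly any dataset, which is the concern raised above.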