I am just saying that if the result comes as a range, there is no need to store the range itself (other than in the source value column). Most studies will do just as well storing one number from that range (the midpoint, or even a random draw from the range under a uniform distribution) as opposed to keeping track of the low and high numbers. It adds a little to the existing measurement error, but at least the values remain usable. Most researchers will forget to include logic to handle explicit result ranges appropriately.
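A minimal sketch of collapsing a range to one storable number, as suggested above (the original range string would stay in the source value column). The function name and `method` parameter are mine, just for illustration:

```python
import random

def collapse_range(low, high, method="midpoint"):
    """Collapse an explicit result range (e.g. 20-50 cells/HPF) to a single
    value for the numeric result column; keep "20-50" in the source value column."""
    if method == "midpoint":
        return (low + high) / 2
    if method == "uniform":
        # one random draw from the uniform distribution over the range
        return random.uniform(low, high)
    raise ValueError(f"unknown method: {method}")

print(collapse_range(20, 50))  # 35.0
```

Either choice adds a little noise relative to the (unknown) true value, but a downstream query can treat the column as an ordinary number.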
@hripcsa the metadata and annotations workgroup (@Ajit_Londhe) needs to make note of this. The need for an uncertainty concept has arisen in several contexts recently, including oncology - episodes of care (@rimma, @Christian_Reich), the chart review project (@jon_duke), and I think the NLP WG (@HuaXu). I suspect it will have relevance to DDLs being developed for PLE results and those that might be developed for PLP products and parameters (@schuemie, @Rijnbeek, @jennareps, @Patrick_Ryan). In general, I think uncertainty could be a very useful piece of metadata in many OHDSI efforts, and its representation deserves a good bit of input from the community. If others think so, maybe this could be moved to become the start of a separate thread on that topic.
@Christian_Reich: I am proposing two models, as follows:
Change the definition of the Range_low and Range_high columns to allow both the actual result and the normal range. Load the actual result and the normal range as two separate rows and then link them via the Fact_relationship table. The CDM tables will then look like below:
Concept 555666777 (Has normal range) needs to be created as a new Relationship_id concept for this design.
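A rough sketch of what the two linked rows from model 1 could look like. The measurement IDs are made up, concept 555666777 is the proposed (not yet vocabulary-approved) relationship, and 21 is assumed to be the Measurement domain concept:

```python
# Row 1: the actual result
actual_result = {
    "measurement_id": 1001,
    "value_as_number": 35.0,   # observed value
    "range_low": None,
    "range_high": None,
}
# Row 2: the normal range, stored as its own measurement row
normal_range = {
    "measurement_id": 1002,
    "value_as_number": None,
    "range_low": 20.0,
    "range_high": 50.0,
}
# Fact_relationship row tying the two together
fact_relationship = {
    "domain_concept_id_1": 21,             # assumed Measurement domain concept
    "fact_id_1": actual_result["measurement_id"],
    "domain_concept_id_2": 21,
    "fact_id_2": normal_range["measurement_id"],
    "relationship_concept_id": 555666777,  # proposed "Has normal range"
}
```

The cost of this design is that any consumer has to join through Fact_relationship to reassemble a single logical result.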
Load the lab test result range as one number (so 20-50 becomes 2050) into the Value_as_number column. Create a new Value_as_concept_id concept to indicate how to parse this number to find range low; the remaining digits are range high. The CDM will then look like below:
Concept 666777888 (First 2 digits as range low) needs to be created as a new Value_as_concept_id concept for this design. I think we need to create at most 5 new concepts (first 1 digit, first 2 digits, and so on up to first 5 digits).
The advantage of this approach is that everything is in one record. The disadvantage is that people may mistakenly use 2050 as the actual result rather than as a packed range. Also, the newly created Value_as_concept_id concept somewhat deviates from the original meaning of this field.
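The packing scheme from model 2 can be sketched as follows. Function names are mine; `low_digits` stands in for the proposed "First N digits as range low" concept. Note it assumes range_high has no leading zeros, which seems safe for typical lab ranges:

```python
def pack_range(low, high):
    """Pack a range like 20-50 into one number (2050); the digit count of
    `low` is what the proposed Value_as_concept_id concept would encode."""
    low_digits = len(str(low))
    packed = int(f"{low}{high}")
    return packed, low_digits

def unpack_range(packed, low_digits):
    """Split the packed number back into (range_low, range_high)."""
    s = str(packed)
    return int(s[:low_digits]), int(s[low_digits:])

packed, n = pack_range(20, 50)   # 2050, with "first 2 digits as range low"
print(unpack_range(packed, n))   # (20, 50)
```

The sketch also makes the disadvantage concrete: nothing about the number 2050 itself signals that it is not a real result, so correctness hinges on every consumer checking Value_as_concept_id first.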
Pretty cool ideas: “Overloading” the meaning of a field through a flag in another field, and cramming two values into one using a predefined mechanism also declared in another field.
But we should do neither one. Look: if we need fields, we can make fields. The problem is different: does fixing this rather obscure range problem not make the general use case unnecessarily harder and more cumbersome? I think so. Instead of having a straightforward value and checking it, now you have to do these acrobatics because sometimes there are measurements with an uncertainty. I am with @hripcsa: all values have an error bar; in clinical labs they are just rarely explicit.
So, I’d go with the average. Plus: who counts cells under the microscope (producing these high-power-field counts) these days anyway? That’s so 1950s.