
Certainty of a Condition, e.g. COVID: Where do we draw the line and put a concept into OBSERVATION?

(Christian Reich) #1


Here is a problem: in all domains except Measurement, Survey, and Observation, all concepts are by definition positive facts. So, e.g., SNOMED UK 1321101000000103 “COVID-19 excluded” is not a valid Condition concept. We need to put it into Observation, or, if there is a test, into Measurement. But where do we draw the line? At what point do we still believe the patient more likely than not has the disease (there is never 100% specificity)?

  • Suspected
  • Probable
  • Presumptive positive
  • Confirmed

I am sure the last one is in, but “presumptive positive”? Do you want this guy to show up if you are searching for “Covid-19 and all descendants”?
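The worry about descendant searches can be made concrete with a toy sketch. Everything below (the hierarchy, the concept names) is made up for illustration and is not real OMOP vocabulary data; it just mimics the kind of ancestor/descendant lookup OMOP's concept_ancestor table supports:

```python
# Toy illustration: why domain placement matters for descendant searches.
# The hierarchy below is invented for the example, not real vocabulary content.

hierarchy = {
    # parent concept -> child concepts
    "COVID-19": ["Suspected COVID-19", "Confirmed COVID-19", "COVID-19 excluded"],
    "Suspected COVID-19": [],
    "Confirmed COVID-19": [],
    "COVID-19 excluded": [],
}

def descendants(concept: str) -> set:
    """All descendants of a concept, including itself."""
    result = {concept}
    for child in hierarchy.get(concept, []):
        result |= descendants(child)
    return result

# If "COVID-19 excluded" were kept as a Condition descendant of COVID-19, a
# cohort defined as "COVID-19 and all descendants" would pick up patients in
# whom the disease was ruled out.
cohort_concepts = descendants("COVID-19")
print("COVID-19 excluded" in cohort_concepts)  # True: ruled-out patients included
```

This is exactly why exclusion concepts are pushed out of the Condition domain: a descendant-based cohort query cannot tell a positive fact from a negation if both live in the same hierarchy.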


(Peter Rijnbeek) #2

In The Netherlands all GPs are asked to distinguish

  • Fear of Corona
  • Suspected Corona (symptoms but no confirmed test)
  • Covid-19 infection (only after positive test)

We now see all these “codes” (really text added to the available ICPC codes) on a large scale in our GP databases.

(Alexander Davydov) #3

There is no choice here already: the ICD10CM Official Coding Guidelines say that presumptive positive COVID-19 test results should be coded as U07.1, which is confirmed COVID-19 by definition. So it is already in Condition.

This becomes uncertain when we’re looking into the various definitions of Probable case:

  1. WHO/ECDC says the following:

Probable case is a suspected case for whom testing for the virus causing COVID-19 is inconclusive (according to the test results reported by the laboratory) or for whom testing was positive on a pan-coronavirus assay.

This is still a suspicion according to ICD10CM:

If the provider documents “suspected,” “possible,” “probable,” or “inconclusive” COVID19, do not assign code U07.1. Assign a code(s) explaining the reason for encounter (such as fever) or Z20.828, Contact with and (suspected) exposure to other viral communicable diseases.

  2. CDC’s definition of Probable case refers to this guide, which stands for COVID-19 confirmed clinically AND/OR epidemiologically. This, I think, should be mapped to a Condition.

• Meets clinical criteria AND epidemiologic evidence with no confirmatory laboratory testing performed for COVID-19.
• Meets presumptive laboratory evidence AND either clinical criteria OR epidemiologic evidence.
• Meets vital records criteria with no confirmatory laboratory testing performed for COVID-19.
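The AND/OR structure of these criteria is itself part of the problem (as noted later in this thread, we cannot split such constructions in the vocabulary). A minimal sketch of the logic, with boolean flags standing in as simplifications of the actual clinical, epidemiologic, and laboratory criteria:

```python
# Hedged sketch of the CDC "probable case" logic quoted above. The flags are
# simplifications; each one hides a whole set of criteria in the real guide.

def probable_case(clinical: bool, epi: bool,
                  presumptive_lab: bool, confirmatory_lab_done: bool,
                  vital_records: bool) -> bool:
    return (
        # clinical criteria AND epidemiologic evidence, no confirmatory lab test
        (clinical and epi and not confirmatory_lab_done)
        # presumptive lab evidence AND either clinical OR epidemiologic evidence
        or (presumptive_lab and (clinical or epi))
        # vital records criteria, no confirmatory lab test
        or (vital_records and not confirmatory_lab_done)
    )

print(probable_case(clinical=True, epi=True,
                    presumptive_lab=False, confirmatory_lab_done=False,
                    vital_records=False))  # True
```

A single pre-coordinated concept cannot carry this branching structure, which is why the decision ends up being “which side of the Condition/Observation line does the whole definition fall on”.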

So the terminology is not just messy but potentially controversial, and we need to make different decisions for the various definitions of the same terms.

There might not be a clear borderline. Somebody might want to take only lab-confirmed cases, or also include:

  • lab inconclusive (or positive in an unspecific panel) with any of clinical or epidemiological evidence;
  • lab non-tested, but with both clinical and epidemiological criteria;
  • lab non-tested, but with any of clinical or epidemiological criteria;
  • vital records criteria.

Such abstractions should indeed be built on the analytical side, but there is no place to map “Clinically confirmed” or “2 out of 5 criteria confirmed” COVID right now. The main reason is that the criteria are still not defined and agreed upon. Also, we cannot split complex constructions such as OR/AND/OR. And maybe it is not the best approach to imply Measurement events from disease diagnosis criteria.

One of the possible solutions is:

  1. Explicitly stated Excluded and Suspected Conditions go to Observation Domain and live in respective hierarchies of Clinical finding absent and Disease suspected. They may form subhierarchies. Let’s say, “Probable case” will be a child of “Suspected case”; “COVID excluded using clinical diagnostic criteria” will be a child of “COVID excluded”.
  2. We draw the borderline a little closer to suspected than to confirmed, so that COVID diagnosed clinically, epidemiologically, or in some other way becomes a Condition. Everybody is doing the same now, including SNOMED and the ICDs.
  3. For data sources where lab/epidemiological/other useful data is insufficient, we support the analytics by introducing a “Maps to status” relationship. The level of certainty of the diagnosis will be mapped to various Condition statuses including, but not limited to: Probable, Presumptive positive, Disorder confirmed (somehow), Laboratory confirmed, Clinically confirmed, Epidemiologically confirmed, Clinically or epidemiologically confirmed, and Vital records confirmed.
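Proposal (3) can be sketched as an ETL step. Note that “Maps to status” is the proposed relationship from this post, not an existing vocabulary relationship, and all the concept names below are illustrative only:

```python
# Hedged sketch of proposal (3): a source term resolves both to a standard
# condition concept (regular "Maps to") and to a certainty status (via the
# proposed "Maps to status"). All names here are illustrative placeholders.

SOURCE_MAPPINGS = {
    # source term: (standard condition concept, condition status concept)
    "Probable COVID-19":             ("COVID-19", "Probable"),
    "Presumptive positive COVID-19": ("COVID-19", "Presumptive positive"),
    "Lab-confirmed COVID-19":        ("COVID-19", "Laboratory confirmed"),
    "Clinically confirmed COVID-19": ("COVID-19", "Clinically confirmed"),
}

def map_condition(source_term: str) -> dict:
    """Resolve a source term to a condition row carrying its certainty status."""
    condition, status = SOURCE_MAPPINGS[source_term]
    return {
        "condition_concept": condition,  # via "Maps to"
        "condition_status": status,      # via the proposed "Maps to status"
    }

row = map_condition("Presumptive positive COVID-19")
print(row["condition_concept"], "/", row["condition_status"])
```

The appeal of this design is that the cohort definition, not the vocabulary, decides which certainty levels count: an analyst can include or exclude statuses at query time instead of the line being drawn once for everybody.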

(Stan Letovsky) #4

Hi Christian, long time no see! What about a probability? -Stan

(Christian Reich) #5

Indeed! Glad to have you here, @sletovsky.

Probability: You mean a field indicating the probability? We don’t have the data for that. All we have are the words “suspected”, “probable”, “presumptive” and “confirmed”. “Confirmed” is 100%, but the rest?

The problem is deciding when we believe it is COVID. It is kind of like what the jury is asked in a lawsuit: when is it “beyond reasonable doubt”?

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #6

Hi, in the UK we have two SNOMED CT concepts identifying “confirmed” cases.

  1. COVID-19 confirmed by laboratory test
  2. COVID-19 confirmed using clinical diagnostic criteria
    The second one is to be used when the lab tests are negative but other diagnostic criteria confirm the patient’s positive status, e.g. a radiology test.

Apart from these two, anything less certain is a suspected case. For the probability, we are using the FHIR standard.



(Sebastiaan Van Sandijk) #7

Hi Leilei,
I take it you mean the condition.verificationStatus valueset when you refer to FHIR (see below). Is this status captured in the source data/EHR/CDR, or do you derive it somehow from the source data? And is this status also explicitly stored for the confirmed cases? Even more interestingly, do you know if anyone uses the ‘refuted’ status, e.g. in a native FHIR repository? I am not sure if we take such statuses (refuted and entered-in-error) into account at all when OMOPping…
best, Sebastiaan

verificationstatus valueset: unconfirmed | provisional | differential | confirmed | refuted | entered-in-error
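One way to picture Sebastiaan’s question is as a routing decision in the ETL. The sketch below applies the thread’s own rule of thumb (positive facts go to Condition, exclusions to Observation, errors are dropped); the routing itself is an illustrative assumption, not an agreed OHDSI convention:

```python
# Hedged sketch: routing FHIR Condition.verificationStatus codes when
# OMOPping. The destination table choices are an assumption for illustration.

from typing import Optional

# The FHIR condition-ver-status valueset quoted above.
VERIFICATION_STATUSES = {
    "unconfirmed", "provisional", "differential",
    "confirmed", "refuted", "entered-in-error",
}

def route(verification_status: str) -> Optional[str]:
    """Pick a CDM destination for a FHIR Condition, or None to drop it."""
    if verification_status not in VERIFICATION_STATUSES:
        raise ValueError(f"not in the FHIR valueset: {verification_status}")
    if verification_status == "entered-in-error":
        return None  # never happened clinically: drop it
    if verification_status == "confirmed":
        return "CONDITION_OCCURRENCE"  # an established positive fact
    # refuted / unconfirmed / provisional / differential: not a positive fact
    return "OBSERVATION"

print(route("refuted"))    # OBSERVATION
print(route("confirmed"))  # CONDITION_OCCURRENCE
```

Whether ‘refuted’ should really land in OBSERVATION (as a “Clinical finding absent” style concept) or be dropped is precisely the open question in this thread.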

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #8


No. We are talking about probability, so I am referring to risk-probability to define risk categories. http://hl7.org/fhir/2020Feb/valueset-risk-probability.html

We are not really talking about condition.verificationstatus, which is a different thing.



(Sebastiaan Van Sandijk) #9

Ah, okay. I was still reading this thread as discussing the certainty of a condition, which in clinical practice is often the status (level of certainty) that clinicians attribute to their ‘diagnoses’. Hence my assumption.
In my experience, verification-status can be assessed/derived rather unambiguously for a particular clinical setting (department, treatment center, group of medical specialists, i.e. related to a particular data source). Risk-probability is another dimension and does not fully overlap with clinical certainty (e.g. confirmed is not the same as 100% probability).
It would be good to know what we are actually discussing here :wink:
Best, Sebastiaan

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #10

Hi, I believe the original question is where to draw the line between “confirmed” COVID and “suspected” COVID. How probable is “probable”? That is a risk probability score rather than the disease verification status, and many factors contribute to it. This is how we reflect it in our EHR system.

Apologies if you feel my answer is not relevant.


(Sebastiaan Van Sandijk) #11

Hi, thanks, good to hear that such probabilities are captured in clinical practice. I have no opinion about relevant or irrelevant - but I am surprised: I have not often seen such risk assessments being recorded in an EHR.
best, Sebastiaan

(Christian Reich) #12


This discussion shows very well why this subject is difficult: in observational data we expect facts, and we are not in the business of establishing those facts. So all these fears, suspicions, and half-confirmations are only meaningful locally, as @Sebastiaan_van_Sandi pointed out, and not that useful for network analyses. I am not saying they are not necessary or useful; it is just that we cannot deal with that problem at the OHDSI level.

The other point is that EVERY fact in the data only has a probability, and it is very hard to know that probability. The PheValuator package tries to put an educated statistical guess around it, but it remains a problem.

I would propose we stop spending too much time on these half-baked cases. Hopefully the medical community will soon have high-quality tests, and this hassle will be over.