Certainty of a Condition, e.g. COVID: Where do we draw the line and put a concept into OBSERVATION?

(Christian Reich) #1


Here is a problem: In all domains except Measurement, Survey and Observation, all concepts are by definition positive facts. So, e.g., SNOMED UK 1321101000000103 “COVID-19 excluded” is not a valid Condition concept. We need to put it into Observation, or, if there is a test, into Measurement. But where do we draw the line? At what point do we still believe the patient more likely than not has the disease (there is never 100% specificity):

  • Suspected
  • Probable
  • Presumptive positive
  • Confirmed

I am sure the last one is in, but “presumptive positive”? Do you want this guy to show up if you are searching for “Covid-19 and all descendants”?


Suspected Diagnosis and its place in the OMOP CDM
(Peter Rijnbeek) #2

In The Netherlands all GPs are asked to distinguish

  • Fear of Corona
  • Suspected Corona (symptoms but no confirmed test)
  • Covid-19 infection (only after positive test)

We now see all these “codes” (each is really a text added to the available ICPC codes) in our GP databases on a large scale.

(Alexander Davydov) #3

There is no choice anymore. The ICD10CM Official Coding Guidelines say that presumptive positive COVID-19 test results should be coded as U07.1, which is confirmed COVID by definition. So it’s already there.

This becomes uncertain when we’re looking into the various definitions of Probable case:

  1. WHO/ECDC says the following:

Probable case is a suspected case for whom testing for virus causing COVID-19 is inconclusive (according to the test results reported by the laboratory) or for whom testing was positive on a pan-coronavirus assay.

This is still a suspicion according to ICD10CM:

If the provider documents “suspected,” “possible,” “probable,” or “inconclusive” COVID19, do not assign code U07.1. Assign a code(s) explaining the reason for encounter (such as fever) or Z20.828, Contact with and (suspected) exposure to other viral communicable diseases.

  2. CDC’s definition of Probable case refers to this guide, which describes COVID confirmed clinically AND/OR epidemiologically. And this should be mapped to the Condition, I think.

• Meets clinical criteria AND epidemiologic evidence with no confirmatory laboratory testing performed for COVID-19.
• Meets presumptive laboratory evidence AND either clinical criteria OR epidemiologic evidence.
• Meets vital records criteria with no confirmatory laboratory testing performed for COVID-19.
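
The three bullets quoted above are plain boolean logic, so they can be written down directly. A minimal sketch, assuming a hypothetical `Evidence` record: the field names are invented here for illustration and are not part of any CDC or OMOP specification, and the second bullet is read as ending in "epidemiologic evidence".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    # All field names are hypothetical, chosen only for this illustration.
    clinical_criteria: bool = False
    epidemiologic_evidence: bool = False
    presumptive_lab_evidence: bool = False
    vital_records_criteria: bool = False
    confirmatory_lab_test_performed: bool = False


def is_probable_case(e: Evidence) -> bool:
    """Return True if any of the three quoted 'probable case' bullets holds."""
    no_confirmatory_test = not e.confirmatory_lab_test_performed
    # Bullet 1: clinical criteria AND epidemiologic evidence, no confirmatory test.
    bullet_1 = e.clinical_criteria and e.epidemiologic_evidence and no_confirmatory_test
    # Bullet 2: presumptive lab evidence AND (clinical criteria OR epidemiologic evidence).
    bullet_2 = e.presumptive_lab_evidence and (
        e.clinical_criteria or e.epidemiologic_evidence
    )
    # Bullet 3: vital records criteria, no confirmatory test.
    bullet_3 = e.vital_records_criteria and no_confirmatory_test
    return bullet_1 or bullet_2 or bullet_3
```

Note how a single clinical finding on its own does not qualify, while the same finding combined with presumptive laboratory evidence does.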

So the terminology is not just messy but might be controversial, and we need to make different decisions for the various definitions of the same terms.

There might not be a clear borderline. Somebody might want to take just lab-confirmed or also include:

  • lab inconclusive (or positive in an unspecific panel) with any of clinical or epidemiological evidence;
  • lab non-tested, but with both clinical and epidemiological criteria;
  • lab non-tested, but with any of clinical or epidemiological criteria;
  • vital records criteria.
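
Which of these tiers count as a case is the analyst's choice, so one way to keep the decision on the analytical side is to tag each record with its evidence level and let the study pick the accepted set. A minimal sketch; the enum names and the `(person_id, level)` record shape are assumptions made up for this illustration:

```python
from enum import Enum
from typing import Iterable, List, Set, Tuple


class EvidenceLevel(Enum):
    # Hypothetical labels mirroring the tiers listed above.
    LAB_CONFIRMED = "lab confirmed"
    LAB_INCONCLUSIVE_WITH_CLINICAL_OR_EPI = "lab inconclusive + clinical or epi"
    UNTESTED_CLINICAL_AND_EPI = "untested, clinical and epi"
    UNTESTED_CLINICAL_OR_EPI = "untested, clinical or epi"
    VITAL_RECORDS = "vital records criteria"


def select_cases(
    records: Iterable[Tuple[int, EvidenceLevel]],
    accepted: Set[EvidenceLevel],
) -> List[int]:
    """Return person_ids whose evidence level is in the accepted set."""
    return [person_id for person_id, level in records if level in accepted]


# Hypothetical example data.
records = [
    (1, EvidenceLevel.LAB_CONFIRMED),
    (2, EvidenceLevel.VITAL_RECORDS),
    (3, EvidenceLevel.UNTESTED_CLINICAL_OR_EPI),
]
strict = select_cases(records, {EvidenceLevel.LAB_CONFIRMED})
broader = select_cases(
    records, {EvidenceLevel.LAB_CONFIRMED, EvidenceLevel.VITAL_RECORDS}
)
```

A set is used rather than an ordered scale because the tiers above are not obviously nested; each study can accept any combination.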

Such abstractions should indeed be built on the analytical side, but there is no place to map “Clinically confirmed” or “2 out of 5 criteria confirmed” COVID right now. The main reason is that the criteria are still not defined and agreed upon. Also, we cannot split complex constructions such as OR/AND/OR. And maybe it’s not the best approach to infer Measurement events from the disease diagnosis criteria.

One of the possible solutions is:

  1. Explicitly stated Excluded and Suspected Conditions go to Observation Domain and live in respective hierarchies of Clinical finding absent and Disease suspected. They may form subhierarchies. Let’s say, “Probable case” will be a child of “Suspected case”; “COVID excluded using clinical diagnostic criteria” will be a child of “COVID excluded”.
  2. We draw the borderline a little bit closer to suspected than to confirmed, so that COVID diagnosed clinically, epidemiologically or in some other way becomes a Condition. Everybody is doing the same now, including SNOMED and the ICDs.
  3. For data sources where lab, epidemiological or any other kind of useful data is not sufficient, we support the analytics by introducing a “Maps to status” relationship. The level of certainty of the Diagnosis will be mapped to various Condition Statuses including, but not limited to, Probable, Presumptive positive, Disorder confirmed (somehow), Laboratory confirmed, Clinically confirmed, Epidemiologically confirmed, Clinically or epidemiologically confirmed, Vital records confirmed.
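
The routing this proposal implies could be sketched as follows. This is only an illustration under stated assumptions: the qualifier strings, the `Routing` tuple and the `route` function are hypothetical, the status labels are taken from point 3 as plain text rather than real concept IDs, and “probable” is sent to Observation because point 1 treats “Probable case” as a child of “Suspected case”.

```python
from typing import NamedTuple, Optional


class Routing(NamedTuple):
    domain: str                      # target OMOP domain
    condition_status: Optional[str]  # label for a "Maps to status" target, if any


# Point 1: explicitly excluded or suspected conditions go to Observation.
OBSERVATION_QUALIFIERS = {"excluded", "suspected", "probable"}

# Points 2 and 3: anything confirmed in some way becomes a Condition,
# with the level of certainty carried as a condition status (labels only).
CONDITION_STATUS_BY_QUALIFIER = {
    "presumptive positive": "Presumptive positive",
    "laboratory confirmed": "Laboratory confirmed",
    "clinically confirmed": "Clinically confirmed",
    "epidemiologically confirmed": "Epidemiologically confirmed",
    "vital records confirmed": "Vital records confirmed",
}


def route(qualifier: str) -> Routing:
    """Map a source certainty qualifier to a domain and optional status."""
    key = qualifier.strip().lower()
    if key in OBSERVATION_QUALIFIERS:
        return Routing("Observation", None)
    if key in CONDITION_STATUS_BY_QUALIFIER:
        return Routing("Condition", CONDITION_STATUS_BY_QUALIFIER[key])
    raise ValueError(f"No routing rule for qualifier: {qualifier!r}")
```

Unknown qualifiers raise an error on purpose, so that a new source term forces an explicit decision instead of silently landing in one domain.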

(Stan Letovsky) #4

Hi Christian, long time no see! What about a probability? -Stan

(Christian Reich) #5

Indeed! Glad to have you here, @sletovsky.

Probability: You mean a field indicating the probability? We don’t have the data for that. All we have are the words “suspected”, “probable”, “presumptive” and “confirmed”. “Confirmed” is 100%, but the rest?

The problem is: when do we believe it is COVID? It’s kind of like what the jury is asked in a lawsuit: when is it “beyond reasonable doubt”?

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #6

Hi, in the UK we have two SNOMED CT concepts identifying “confirmed” cases.

  1. COVID-19 confirmed by laboratory test
  2. COVID-19 confirmed using clinical diagnostic criteria
    The second one is to be used when the lab tests are negative but other diagnostic criteria confirm the patient’s positive status, e.g. a radiology test.

Apart from these two, any other probability is a suspected case. For the probability, we are using the FHIR standard.



(Sebastiaan Van Sandijk) #7

Hi Leilei,
I take it you mean that you use the condition.verificationstatus valueset when you refer to FHIR (see below). Is this status captured in the source data/EHR/CDR, or do you derive it somehow from the source data? And is this status also explicitly stored for the confirmed cases? More interestingly even, do you know if anyone uses the ‘refuted’ status, e.g. in a native FHIR repository? I am not sure if we take such statuses (refuted and entered-in-error) into account at all when OMOPping…
best, Sebastiaan

verificationstatus valueset: unconfirmed | provisional | differential | confirmed | refuted | entered-in-error

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #8


No. We are talking about probability so I am referring to risk-probability to define risk categories. http://hl7.org/fhir/2020Feb/valueset-risk-probability.html

We are not really talking about condition.verificationstatus, which is a different thing.



(Sebastiaan Van Sandijk) #9

Ah, okay. I was still reading this thread as discussing the certainty of a condition - which in clinical practice is often the status (level of certainty) that clinicians attribute to their ‘diagnoses’. Hence, my assumption.
In my experience, verification-status can be assessed/derived rather unambiguously for a particular clinical setting (department, treatment center, group of medical specialists, i.e. related to a particular data source). Risk-probability is in another domain and does not fully overlap with clinical certainty (e.g. confirmed is not the same as 100% probability).
Would be good to know what we are actually discussing here :wink:
Best, Sebastiaan

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #10

Hi, I believe the original question is where to draw the line between “confirmed” COVID and “suspected” COVID: how probable is probable? That is a risk probability score rather than the disease verification status. Risk probability has many contributing factors. This is how we reflect it in our EHR system.

Apologies if you feel my answer is not relevant.


(Sebastiaan Van Sandijk) #11

Hi, thanks, good to hear that such probabilities are captured in clinical practice. I have no opinion about relevant or irrelevant - but I am surprised: I have not often seen such risk assessments being recorded in an EHR.
best, Sebastiaan

(Christian Reich) #12


This discussion shows very well why this subject is difficult: in observational data we expect facts, and we are not in the business of establishing these facts. So all these fears, suspicions and half-confirmations are only meaningful locally, as @Sebastiaan_van_Sandi pointed out, and not that useful for network analyses. I am not saying they are not necessary or useful, it’s just that we cannot deal with that problem at the OHDSI level.

The other point is that EVERY fact in the data only has a probability, and it is very hard to know that probability. The PheValuator package is trying to put some educated statistical guess around it, but it remains a problem.

I’d propose we stop spending too much time on these half-baked cases. Hopefully soon, the medical community will have high-quality tests, and this hassle will be over.

(Erica Voss) #13

Maybe the Vocabulary has improved since this conversation started, but here is how I’m thinking about mapping some COVID-19 values, and I am wondering what people think:

Suspicion of COVID-19 with severe symptoms. Refer to the hospital
mapped to
37311060-Suspected disease caused by severe acute respiratory coronavirus 2

COVID-19 confirmed severe
COVID-19 confirmed mild
mapped to
37311061-Disease caused by severe acute respiratory syndrome coronavirus 2

No symptoms of COVID-19 at present
mapped to
0-No matching concept

Tagging some friends: @Azza_Shoaibi, @Rijnbeek, @Sebastiaan_van_Sandi, @tduarte, @sergiofbertolin

(Polina Talapova) #14

Hi everyone!

@ericaVoss, from the medical terminology perspective, the values you shared have good mappings :slight_smile:

Suspicion of COVID-19 with severe symptoms. Refer to the hospital
mapped to
37311060-Suspected disease caused by severe acute respiratory coronavirus 2

The suspicion of an infectious disease cannot be considered as a disease proper until there is a laboratory confirmation (even though acute symptoms are present).

COVID-19 confirmed severe
COVID-19 confirmed mild
mapped to
37311061-Disease caused by severe acute respiratory syndrome coronavirus 2

Regarding the clinical course of COVID-19, there are currently no reliable clinical definitions of mild/moderate/severe/critical novel coronavirus infection. Although WHO has already provided healthcare professionals with guidance on patient categorization, this can take years to be finalized. That is probably why our Gold Standard ontologies have not yet provided us with equivalent concepts. Thus, mapping such source values to the generic concept seems technically more appropriate (at least for now).

No symptoms of COVID-19 at present
mapped to
0-No matching concept

The vocabulary team upholds the OMOP rule stipulating that negative information has no entry in the CONDITION_OCCURRENCE table (read more). But if you do need to store this data, I suggest you populate the OBSERVATION table with 4287774 Absence of signs and symptoms of infection as observation.concept_id and 37311061 Disease caused by 2019-nCoV as value_as_concept_id.
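
As an illustration of that suggestion, a row for the OBSERVATION table could be assembled like this. A minimal sketch: the two concept IDs are the ones quoted in the post, the column names follow the OMOP CDM OBSERVATION table, and the helper name and the `0` used for `observation_type_concept_id` are placeholders assumed here, to be replaced with whatever fits your source.

```python
from datetime import date

# Concept IDs as quoted in the post above.
ABSENCE_OF_SIGNS_AND_SYMPTOMS_OF_INFECTION = 4287774
COVID_19_DISEASE = 37311061


def no_symptoms_observation(person_id: int, observed_on: date, source_value: str) -> dict:
    """Build one OBSERVATION row for 'no symptoms of COVID-19' (hypothetical helper)."""
    return {
        "person_id": person_id,
        "observation_concept_id": ABSENCE_OF_SIGNS_AND_SYMPTOMS_OF_INFECTION,
        "observation_date": observed_on,
        # Placeholder: choose the type concept that describes your data provenance.
        "observation_type_concept_id": 0,
        "value_as_concept_id": COVID_19_DISEASE,
        "observation_source_value": source_value,
    }


row = no_symptoms_observation(
    person_id=1,  # hypothetical person
    observed_on=date(2020, 4, 1),
    source_value="No symptoms of COVID-19 at present",
)
```

The negative finding lives entirely in OBSERVATION this way, so a search of CONDITION_OCCURRENCE for the COVID-19 concept and its descendants does not pick it up.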

(Andrew S. Kanter, MD MPH FACMI FAMIA) #15

Afraid I have to disagree slightly with Polina. IMO has produced an open source COVID terminology mapped to SNOMED and ICD. For

Suspicion of COVID-19 with severe symptoms. Refer to the hospital
mapped to
37311060-Suspected disease caused by severe acute respiratory coronavirus 2

We have a term suspected severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection mapped to that same SNOMED (as an infection is a disease in our editorial policy).


COVID-19 confirmed severe
COVID-19 confirmed mild
mapped to
37311061-Disease caused by severe acute respiratory syndrome coronavirus 2

IMO has:
laboratory confirmed diagnosis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) disease
mapped to SNOMED 840539006 and SNOMED 395098000

We believe it is important to capture the severity classification although definitions are an issue. I believe that IMO would produce non-standard concepts for those mapped to a SNOMED expression and ICD-10 codes pending new code releases from SNOMED.

As for the negated “no symptoms present” this is ambiguous. Does it document asymptomatic infection or more likely that no screening symptoms were present (without laboratory testing)? So agree with observation.

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #16

We tried to standardise at the data input level. In our hospital, the clinical consensus is that we use three terms related to COVID-19.

1. COVID-19 confirmed by laboratory test
2. COVID-19 confirmed using clinical diagnostic criteria
3. Suspected COVID-19

Number 1 and 2 are two ways of clinically confirming patient COVID-19 status in the UK.

I feel the trickiest one is “presumably positive”. I really don’t know how this term came about, or whether we will be able to find out under what circumstances clinicians use it.

On another note, we define severity using the SNOMED CT COVID-19 severity assessment and the FHIR standard, discretely. We don’t use the pre-coordinated clinical terms for severity. There might be a time and place for such pre-coordinated terms, but maintaining consistency across a terminology system would then mean that its content could balloon, since in theory every disease/symptom can have different levels of severity.

(Alexander Davydov) #17

It’s the same in SNOMED:
840544004 Suspected disease caused by 2019-nCoV has associated finding 840539006 Disease caused by 2019-nCoV … is a 64572001 Disease (disorder).

We have pre-coordinated 704996 Patient meets COVID-19 laboratory diagnostic criteria and others that should be used together with the appropriate COVID Condition concept, look. This approach provides simple analytics (you look into the Observation only if you’re interested in certainty).
But I cannot see in the source any evidence of laboratory confirmation. It’s more about severe symptoms confirmed or COVID somehow confirmed?

SNOMED is Standard in OMOP and, unfortunately, such sort of post-coordination isn’t supported.

Isn’t it an obsolete hierarchical branch with only 4 conditions in it? I thought all disorders in SNOMED are confirmed by definition (no matter how: in the laboratory, in the CT room or clinically).

Why wouldn’t post-coordination with SNOMED 703444 COVID-19 severity score or 703443 COVID-19 severity scale work?

Everything is supported in OMOP, please look above.

Please don’t confuse it with ‘presumptive laboratory evidence’, which is different:

(Leilei Zhu (Clinical Data Standards Lead, UCLH, UK)) #18

Sorry for the confusion, I am simply commenting and I don’t have a question. I did read through the threads before replying. Thank you by the way.

We have a fully functioning EHR system that contains rich clinical data. We will only configure the system based on best practice guidance. We will not configure the system to satisfy anything other than clinical use. SNOMED CT is a rich knowledge base, and we will be able to get the information required for secondary uses, including the OMOP model. I have been working with colleagues on this, so don’t worry.