
New concepts for probabilities

We often need to record the probability of an event. This is most relevant for clinical trial datasets. 95% of these data need to reflect the probability of adverse events and other effects: side effects, efficacy and safety of treatment, and probabilistic assessment in relation to the prescription of the drug.
Another group that needs to reflect probability is etiology and poisoning, where a probable cause of the disease must be recorded.
The third group is complications and consequences. For example, necrosis of the skin of the forearm with a probable cause: insertion of a catheter.

There are already concepts in the real world that describe probabilities.
For example, the WHO-UMC causality assessment system (https://www.who.int/medicines/areas/quality_safety/safety_efficacy/WHOcausality_assessment.pdf), the Adverse Drug Reaction Probability Scale (Naranjo), the Maria & Victorino (M & V) System of Causality Assessment, and the Roussel Uclaf Causality Assessment Method (RUCAM). The last three are described for drug-induced liver injury in LiverTox (NCBI Bookshelf).

When we encounter this kind of data, we use custom concepts rather than standardized ones. Scales are often not specified, and they are source-specific. OMOP needs a way to describe this: link and use precoordinated concepts in the Measurement domain, with the cause placed in the value. So my suggestion is to create the following concepts:

Event Probability
1 Almost certain probability of an event caused by
2 Likely probability of an event caused by
3 Unlikely probability of an event caused by
4 Very likely probability of an event caused by
5 Very unlikely probability of an event caused by

In older CDM versions (up to and including 5.3), these precoordinated concepts can easily be connected to an event through the fact_relationship table; in later versions, direct links can be used. This approach also avoids creating a huge number of unnecessary links.

Example of use (tuberculosis associated with taking infliximab):
concept name: Almost certain probability of an event caused by

| measurement_id       | xxx    |
| value_as_concept_id  | 937368 | (infliximab, RxNorm)


| condition_occurrence_id | 434557 | (Tuberculosis, SNOMED)


| domain_concept_id_1       |  434557  |
| domain_concept_id_2       |  xxx     | 
| relationship_concept_id   |  4165382 | (Associated with)
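
The linkage above can be sketched in a few lines of Python using an in-memory SQLite database. This is only an illustration under stated assumptions: the tables are cut down to the columns needed here, the custom concept ID (2000000001), the record IDs, and the person_id are hypothetical, and the domain concept IDs for Condition (19) and Measurement (21) should be checked against your vocabulary. Unlike the abbreviated tables above, the sketch fills the fact_id_1 and fact_id_2 columns of FACT_RELATIONSHIP with the record IDs and uses the domain_concept_id columns for the domains.

```python
import sqlite3

# Assumptions (not from the post): the custom concept ID, the record IDs,
# the person_id, and the domain concept IDs for Condition and Measurement.
PROBABILITY_CONCEPT_ID = 2_000_000_001  # hypothetical "Almost certain probability of an event caused by"
CONDITION_DOMAIN = 19                   # assumed domain concept ID for Condition
MEASUREMENT_DOMAIN = 21                 # assumed domain concept ID for Measurement


def build_example() -> sqlite3.Connection:
    """Create cut-down CDM tables and load the tuberculosis/infliximab example."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER, condition_concept_id INTEGER);
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            person_id INTEGER, measurement_concept_id INTEGER,
            value_as_concept_id INTEGER);
        CREATE TABLE fact_relationship (
            domain_concept_id_1 INTEGER, fact_id_1 INTEGER,
            domain_concept_id_2 INTEGER, fact_id_2 INTEGER,
            relationship_concept_id INTEGER);
    """)
    # Tuberculosis (434557) recorded as condition occurrence 1 for person 100.
    con.execute("INSERT INTO condition_occurrence VALUES (1, 100, 434557)")
    # The probability concept as a measurement whose value is infliximab (937368).
    con.execute("INSERT INTO measurement VALUES (10, 100, ?, 937368)",
                (PROBABILITY_CONCEPT_ID,))
    # Link the condition record to the measurement record (4165382, as in the post).
    con.execute("INSERT INTO fact_relationship VALUES (?, 1, ?, 10, 4165382)",
                (CONDITION_DOMAIN, MEASUREMENT_DOMAIN))
    return con


def probable_causes(con: sqlite3.Connection, condition_occurrence_id: int):
    """Return (probability concept, cause concept) pairs linked to a condition record."""
    return con.execute("""
        SELECT m.measurement_concept_id, m.value_as_concept_id
        FROM fact_relationship fr
        JOIN measurement m ON m.measurement_id = fr.fact_id_2
        WHERE fr.domain_concept_id_1 = ? AND fr.fact_id_1 = ?
          AND fr.domain_concept_id_2 = ?
    """, (CONDITION_DOMAIN, condition_occurrence_id, MEASUREMENT_DOMAIN)).fetchall()
```

Calling `probable_causes(build_example(), 1)` returns the probability concept together with the infliximab concept for the tuberculosis record.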

Another way of using these concepts can also be proposed. Here the new concepts indicate the connection between the two events directly. In this case, they will belong to the Observation domain. The disadvantage of this idea is that it cannot be used in the latest versions. It will also require building reverse links (Almost certain cause of, Likely cause of, Unlikely cause of, Very likely cause of, Very unlikely cause of).


| condition_occurrence_id | 434557 | (Tuberculosis, SNOMED)


| drug_exposure_id | 937368 | (infliximab, RxNorm)


| domain_concept_id_1       |  434557  |
| domain_concept_id_2       |  937368  | 
| relationship_concept_id   |  xxx     | (Almost certain probability of event caused by)
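
For this second option, a minimal sketch of how a forward link and its reverse link could be generated together. Everything named here is an assumption for illustration: the two relationship concept IDs are hypothetical custom IDs, the fact IDs are made-up record IDs, and the domain concept IDs for Condition (19) and Drug (13) should be verified against your vocabulary.

```python
from typing import List, NamedTuple


class FactRelationship(NamedTuple):
    """One row of FACT_RELATIONSHIP."""
    domain_concept_id_1: int
    fact_id_1: int
    domain_concept_id_2: int
    fact_id_2: int
    relationship_concept_id: int


# Hypothetical custom relationship concepts (not in the OMOP vocabulary).
CAUSED_BY = 2_000_000_101  # "Almost certain probability of event caused by"
CAUSE_OF = 2_000_000_102   # "Almost certain cause of"
REVERSE = {CAUSED_BY: CAUSE_OF, CAUSE_OF: CAUSED_BY}

CONDITION_DOMAIN = 19  # assumed domain concept ID for Condition
DRUG_DOMAIN = 13       # assumed domain concept ID for Drug


def with_reverse(row: FactRelationship) -> List[FactRelationship]:
    """Return the row plus its mirror image using the reverse relationship."""
    mirrored = FactRelationship(
        row.domain_concept_id_2, row.fact_id_2,
        row.domain_concept_id_1, row.fact_id_1,
        REVERSE[row.relationship_concept_id],
    )
    return [row, mirrored]


# Condition record 1 (tuberculosis) linked to drug exposure record 20 (infliximab).
forward = FactRelationship(CONDITION_DOMAIN, 1, DRUG_DOMAIN, 20, CAUSED_BY)
rows = with_reverse(forward)
```

Each probability level would need its own pair of relationship concepts in the `REVERSE` mapping.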

@Vlad_Korsik @Alexdavv @Dymshyts @Philip_Solovyev @ElenaTrach @MPhilofsky

Your example of necrosis of the skin of the forearm involving insertion of a catheter is a device issue.

Device manufacturers almost always implement the ISO 14971:2019 process for medical device risk management. The manufacturer identifies hazards, which are potential sources of harm. A sequence of events leads to a hazardous situation. If the patient or user is exposed to the hazardous situation, then there is harm. Harm has a severity and a frequency of occurrence (probability). Typically, a risk matrix combines the severity and frequency to estimate the risk. The manufacturer evaluates the estimated risk against predefined acceptability criteria. If the evaluation shows that risk reduction is needed, the manufacturer implements measures to reduce the severity, the frequency, or both, and then reevaluates against the criteria to ensure the risk is at an acceptable level.

The manufacturer determines the severity levels, the frequency levels, and the acceptability criteria. There are no standards that specify these levels. In addition, different manufacturers of similar devices could have made very different decisions. The manufacturer’s decisions are not public information, but the FDA has access on a case by case basis.
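
To make the risk matrix idea concrete, here is a small sketch. Since no standard fixes the levels, everything in it is hypothetical: the five severity and frequency levels, their names, the choice to multiply them into a score, and the acceptability threshold are illustrative assumptions, not values taken from ISO 14971 or from any manufacturer.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels; each manufacturer defines its own."""
    NEGLIGIBLE = 1
    MINOR = 2
    SERIOUS = 3
    CRITICAL = 4
    CATASTROPHIC = 5


class Frequency(IntEnum):
    """Hypothetical frequency-of-occurrence levels."""
    IMPROBABLE = 1
    REMOTE = 2
    OCCASIONAL = 3
    PROBABLE = 4
    FREQUENT = 5


# Hypothetical acceptability criterion: scores above this need risk reduction.
ACCEPTABLE_MAX_SCORE = 6


def risk_score(severity: Severity, frequency: Frequency) -> int:
    """Combine severity and frequency; multiplication is one common convention."""
    return int(severity) * int(frequency)


def is_acceptable(severity: Severity, frequency: Frequency) -> bool:
    """Evaluate the estimated risk against the predefined criterion."""
    return risk_score(severity, frequency) <= ACCEPTABLE_MAX_SCORE


# Initial estimate, then reevaluation after a control measure lowers the frequency.
before = is_acceptable(Severity.SERIOUS, Frequency.OCCASIONAL)
after = is_acceptable(Severity.SERIOUS, Frequency.REMOTE)
```

In this sketch the initial estimate fails the criterion and the reevaluated one passes, which mirrors the evaluate, reduce, reevaluate loop described above.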

This is an interesting topic, thanks for bringing it up, and thanks for starting with the use cases, @Maria_Rogozhkina. However, all three use cases don’t characterize the probability of an event, but the probability of two events being causally linked. And actually, it’s not the probability, but the confidence that they are associated. The events themselves are 100% certain.

The general answer is this: Observational research of the kind we use the OMOP CDM for does not have the associations as input. Instead, the research estimates those associations/links/risks. So, it is the output. That is why we don’t have a mechanism to capture the belief of somebody (usually study investigator) about whether or not the events are linked.

Still, they are legitimate use cases, and for example the whole spontaneous reporting system banks on it. So, what’s the problem with the FACT_RELATIONSHIP connectors? You could create event linkage relationships 1-5 and connect things. Why would that not work?

Are you involved in that, @doleary? Do you have data like that?

100% works.
There are already 2 events recorded in a proper way.
No need to create another event and play with the values. It simply doesn’t exist.

Why so? Fact_relationship_id concepts = Relationship Domain.

Versions of cdm? Why? Fact_relationship table is there.

As always in the Relationship Domain. But we can make them bi-directional.

I’m an independent consultant and work with medical device companies. One area is implementation of ISO 14971:2019, risk management. Over the years, I’ve been involved in about 25 implementations. I also teach courses in medical device risk management.

The process is interesting because initially (during the design phase) there is little information on harms, their severity, and their frequency of occurrence (probability). The standard requires post-market information collection and analysis to update the risk estimates and the effectiveness of the risk control measures.

The work is company confidential and not shared across the industry. However, in the US the medical device reports (MAUDE database) are public information, but not in other regulatory jurisdictions such as the EU.

The MAUDE database can provide the numerator of the frequency of occurrence, but not the denominator, i.e., the number of procedures with no patient or user harm.