
Classifying surgical complications

Clavien grades will be observations, so there is no need to mess with domains. Then, as proposed, you can link them to the actual events through fact_relationship. This should work as a temporary solution. As a permanent one, these grades should be added to a vocabulary (which one?).


NCI vocabulary has these Clavien-Dindo Scores.
NCI is not OMOPed. And it will never be due to messy structure and lack of relationships to other vocabularies, right, @Christian_Reich?

Well, there are like 7 or so scores. We wouldn’t need NCI if we just wanted to add those.

@Christian_Reich, @aostropolets, @Dymshyts - I am working with @awrosen and this question of the Clavien-Dindo score came up. I see Anna’s proposal and think this makes sense. Let me know if your thoughts have changed since.

@clairblacketer or @rimma or @mgurley do you have any experience with Clavien-Dindo Grade in the CDM?

Let me try to parrot it back here:

  1. Create codes for the grades (something like this):
    2000000010 = Clavien-Dindo Grade I
    2000000020 = Clavien-Dindo Grade II
    2000000030 = Clavien-Dindo Grade IIIA
    2000000031 = Clavien-Dindo Grade IIIB
    2000000040 = Clavien-Dindo Grade IVA
    2000000041 = Clavien-Dindo Grade IVB
    2000000050 = Clavien-Dindo Grade V

  2. Store the Clavien-Dindo classification in the OBSERVATION table, with the grade in VALUE_AS_STRING (since it looks like the score can contain a letter)

  3. There can be a condition linked to the score (e.g. brain hemorrhage). The condition would be placed in the CONDITION_OCCURRENCE table and a FACT_RELATIONSHIP could be generated between the OBSERVATION record and the CONDITION_OCCURRENCE record.
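A minimal sketch of steps 2 and 3, using an in-memory SQLite database with heavily trimmed versions of the three tables. The domain concept_ids (27 for Observation, 19 for Condition) are assumptions to verify in Athena, the relationship and brain hemorrhage concept_ids are hypothetical placeholders, and using the custom grade code as observation_concept_id is just one possible reading of step 2.

```python
import sqlite3

# Assumed domain concept_ids; verify in Athena before relying on them.
OBSERVATION_DOMAIN = 27
CONDITION_DOMAIN = 19
# Hypothetical placeholders, not real concept_ids.
RELATIONSHIP_CONCEPT_ID = 0
BRAIN_HEMORRHAGE_CONCEPT_ID = 0

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    observation_concept_id INTEGER,
    value_as_string TEXT
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER
);
CREATE TABLE fact_relationship (
    domain_concept_id_1 INTEGER,
    fact_id_1 INTEGER,
    domain_concept_id_2 INTEGER,
    fact_id_2 INTEGER,
    relationship_concept_id INTEGER
);
""")

# Step 2: the grade as an OBSERVATION row (custom code 2000000030 = Grade IIIA).
conn.execute("INSERT INTO observation VALUES (?, ?, ?, ?)",
             (1, 100, 2000000030, "IIIA"))
# Step 3: the complication itself as a CONDITION_OCCURRENCE row ...
conn.execute("INSERT INTO condition_occurrence VALUES (?, ?, ?)",
             (7, 100, BRAIN_HEMORRHAGE_CONCEPT_ID))
# ... and a FACT_RELATIONSHIP row tying the two records together.
conn.execute("INSERT INTO fact_relationship VALUES (?, ?, ?, ?, ?)",
             (OBSERVATION_DOMAIN, 1, CONDITION_DOMAIN, 7, RELATIONSHIP_CONCEPT_ID))

# Follow the link from the grade to the graded condition.
linked = conn.execute("""
SELECT o.value_as_string, c.condition_occurrence_id
FROM observation o
JOIN fact_relationship fr
  ON fr.domain_concept_id_1 = ? AND fr.fact_id_1 = o.observation_id
JOIN condition_occurrence c
  ON fr.domain_concept_id_2 = ? AND fr.fact_id_2 = c.condition_occurrence_id
""", (OBSERVATION_DOMAIN, CONDITION_DOMAIN)).fetchall()
```

The join shows what any analysis would have to do to get from a grade to the condition it grades, which is part of why Step 3 needs a clear analytic use case.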

@awrosen - I’d need to understand better how the conditions are derived and how you might use it in analytics. I would say we need to do Step 1 and 2 but still need to be convinced on Step 3.

@ericaVoss,

You don’t need to create custom concept_ids. Use MEASUREMENT.concept_id = 37311607, Clavien-Dindo complication scale, and then put the Clavien-Dindo grade in MEASUREMENT.value_as_string. I don’t see any concept_ids for Grade IIIa, so you will have to use the value_as_string field.

To add to @aostropolets suggestion, since the concept_id comes from the SNOMED vocabulary, maybe petition them to add the grades?


It would be much better to use value_as_concept_id than value_as_string. To do that, please use Clavien-Dindo complication grade.
Then you can use simple values that are not specific to grades, e.g. II, IIIA.
Using LOINC should not be a concern, since it’s the only vocabulary that provides this set of values.

The only thing left to polish is moving the ‘Clavien-Dindo complication grade’ concept into the Measurement domain.


So when we have a score we map to
37311606-Clavien-Dindo complication grade
Instead of:
37311607-Clavien-Dindo classification of surgical complications

Right now, 37311606 maps us to OBSERVATION. @Alexdavv, will you put in a GitHub request to change the domain, or would you rather I do it?

Then in VALUE_AS_STRING we could put the grade as received but also map the grade to VALUE_AS_CONCEPT_ID (which I can do better with NAACCR over LOINC):
35919088 - I
35919571 - II
35919065 - III
35919029 - IIIA
35919093 - IIIB
35919154 - IVA
35919532 - IVB
36310342 - V (except this is LOINC)

This all seems like ontology hacking to me. We are swiping values to shove into value_as_concept_id from syntactically satisfying lists of values. But the semantics of these values are not an actual list of possible values for ‘Clavien-Dindo complication grade’. If the vocabulary is missing concepts, I don’t think it serves us well to just grab what seems kind of right. We should add the missing concepts.

Yeah, after posting I felt like that as well. Then the recommendation would be:

When we have a score we map to
37311606-Clavien-Dindo complication grade

Then in VALUE_AS_STRING we could put the grade as received, and also map the grade to VALUE_AS_CONCEPT_ID using these 2B concepts that we add:
2000000010 = Clavien-Dindo Grade I
2000000020 = Clavien-Dindo Grade II
2000000030 = Clavien-Dindo Grade IIIA
2000000031 = Clavien-Dindo Grade IIIB
2000000040 = Clavien-Dindo Grade IVA
2000000041 = Clavien-Dindo Grade IVB
2000000050 = Clavien-Dindo Grade V

There is supposed to be a condition spin off from these, but I’m not clear on that yet. Need to talk to @awrosen.

Other sources also have CD grades. Wouldn’t it make more sense to either create concepts for the grades as well or just use I/II/III etc. from LOINC? It may not be clean ontology-wise, but it is better than having disparate 2B concepts representing the same thing in different sources.
And if using existing LOINC concepts seems shady, then why not create scale-agnostic numbers? They could be used across multiple scales: drinking, smoking, sleep hours, whatever you may think of.

First I’d like to thank @Alexdavv for clearly explaining the difference between pre-coordination and post-coordination [1].

  • Pre-coordination term - a term that was coordinated and assigned a code before you needed it
    (e.g. “Clavien-Dindo complication Grade IIIB”)
  • Post-coordination - a term that you assembled from other terms at the point when you needed it
    (e.g. “Clavien-Dindo complication grade” & “Grade IIIB”)
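To make the distinction concrete, here is a small sketch of how the same clinical fact looks under each design. The `Fact` class and `is_grade_iiib` helper are hypothetical names for illustration; 37311606 and 2000000031 come from earlier in this thread, and the generic “IIIB” answer concept_id is a placeholder, not a real concept.

```python
from dataclasses import dataclass
from typing import Optional

GRADE_QUESTION = 37311606          # "Clavien-Dindo complication grade" (from the thread)
PRECOORD_GRADE_IIIB = 2000000031   # custom code "Clavien-Dindo Grade IIIB" (from the thread)
ANSWER_IIIB = 2000000000           # hypothetical placeholder for a generic "IIIB" value


@dataclass(frozen=True)
class Fact:
    """One clinical fact: a concept, optionally qualified by a value concept."""
    concept_id: int
    value_as_concept_id: Optional[int] = None


def is_grade_iiib(fact: Fact) -> bool:
    """Recognize a Clavien-Dindo IIIB fact under either design."""
    pre = fact.concept_id == PRECOORD_GRADE_IIIB
    post = (fact.concept_id == GRADE_QUESTION
            and fact.value_as_concept_id == ANSWER_IIIB)
    return pre or post


pre_fact = Fact(PRECOORD_GRADE_IIIB)            # pre-coordinated: one concept says it all
post_fact = Fact(GRADE_QUESTION, ANSWER_IIIB)   # post-coordinated: question + answer
```

If both designs exist in the data, every query has to check both branches, as `is_grade_iiib` does here.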

Based on our discussion, we have decided to use post-coordination for the Clavien-Dindo complication grade:

When we have a score we would write a record to the MEASUREMENT table with MEASUREMENT_CONCEPT_ID as:
37311606-Clavien-Dindo complication grade

Currently this code is of the OBSERVATION domain, however I’ve made a request to the Vocabulary team to move it to MEASUREMENT [2].

The grade can be stored in VALUE_AS_CONCEPT_ID. Following @aostropolets’ recommendation, we use LOINC concepts where they exist and get concepts created for the grades where they don’t:
45879260 = I
45883600 = II
36309815 = IIIA
2000000000 OR REQUEST ONE = IIIB [3]
2000000010 OR REQUEST ONE = IVA [3]
2000000020 OR REQUEST ONE = IVB [3]
36310342 = V
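As a rough ETL sketch of this post-coordinated design: the function name and the dictionary layout are made up for illustration, and only the grades that already have a concept_id in the list above are mapped. Grades still waiting for a concept (IIIB, IVA, IVB) fall back to 0, the usual OMOP value for “no matching concept”, while the source string is kept in value_source_value.

```python
GRADE_MEASUREMENT_CONCEPT_ID = 37311606  # Clavien-Dindo complication grade

# Only grades with an existing concept_id in the list above.
GRADE_TO_VALUE_CONCEPT = {
    "I": 45879260,
    "II": 45883600,
    "IIIA": 36309815,
    "V": 36310342,
}


def to_measurement(person_id, source_grade):
    """Build a (partial) MEASUREMENT record for one source grade string."""
    grade = source_grade.strip().upper()  # "IIIa" and "IIIA" are treated alike
    return {
        "person_id": person_id,
        "measurement_concept_id": GRADE_MEASUREMENT_CONCEPT_ID,
        # 0 = no matching concept yet (IIIB, IVA, IVB until concepts exist)
        "value_as_concept_id": GRADE_TO_VALUE_CONCEPT.get(grade, 0),
        "value_source_value": source_grade,
    }


record = to_measurement(100, "IIIa")
```

Once the requested concepts exist, they only need to be added to the dictionary; the records themselves keep the same shape.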


We also had a similar conversation about the American Society of Anesthesiologists (ASA) Physical Status Classification System.

When we have a score we would write a record to the MEASUREMENT table with MEASUREMENT_CONCEPT_ID as:
4159411 - American Society of Anesthesiologists physical status classification

Then the score would be stored in VALUE_AS_CONCEPT_ID using the following codes:
45879260 = I
45883600 = II
45883601 = III
45879261 = IV
36310342 = V
2000000000 OR REQUEST ONE = VI [3]


Tagging @sbicty, @Eldar, @Sebastiaan_van_Sandi, @gregk, @larsh, @Alexandra_Orlova

These LOINC codes actually represent the level of tumor invasion for melanoma. See here:

https://loinc.org/86667-3/
https://athena.ohdsi.org/search-terms/terms/45883601

So I am not sure reusing them for a grade of a totally different stripe makes good ontological sense. We need to start taking seriously that, when we bring in these external standardized vocabularies, just happening to find syntactically equivalent strings is not the same as semantic equivalence. We should introduce new concepts in the Meas Value domain, so that we have an ontology that actually represents the concepts (not just strings) and the list of possible values for a Measurement concept that are allowable as a Meas Value entry in MEASUREMENT.value_as_concept_id.

Sorry to be an ontological stickler.


It’s not so straightforward if we look into the LOINC files.
Indeed, when the LOINC LA15460-1 ‘IV’ concept is part of the LL4442-1 answer list, it has displaytext (the answer string) = ‘IV’ and SubsequentTextPrompt = ‘Melanoma invades reticular dermis’. According to the LOINC specs, SubsequentTextPrompt is the text associated with answers such as “Other” that indicates what extra information the user should enter, for example, “Please specify:”.

When we look at the same LOINC LA15460-1 ‘IV’ concept as part of the LL1685-8 answer list, it has displaytext = ‘IV’ and SubsequentTextPrompt = ‘NULL’. The only concept that uses this answer list is 67213-9 Stage only [PhenX], and it’s not in the area of tumor invasion.

The most telling case for the LOINC LA15460-1 ‘IV’ concept is the LL10401-8 answer list, where ‘IV’ means an intravenous route of administration.
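The three cases above can be put side by side in a small lookup, just to show that the answer code alone does not pin down a meaning. The dictionary layout and the `contexts_for` helper are hypothetical; the codes and notes are the ones quoted in this post, not a dump of the LOINC files.

```python
# (answer code, answer list) -> context note, as described in this post.
ANSWER_CONTEXT = {
    ("LA15460-1", "LL4442-1"): "Melanoma invades reticular dermis",
    ("LA15460-1", "LL1685-8"): None,  # SubsequentTextPrompt is NULL
    ("LA15460-1", "LL10401-8"): "intravenous route of administration",
}


def contexts_for(answer_code):
    """Return every answer list the code appears in, with its context note."""
    return {
        answer_list: note
        for (code, answer_list), note in ANSWER_CONTEXT.items()
        if code == answer_code
    }


iv_contexts = contexts_for("LA15460-1")
```

The same display text ‘IV’ comes back with three different contexts, so the meaning has to come from the answer list (the question side), not from the answer concept.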

This is how LOINC question/answer pairs and most post-coordinated content are built: the only things that determine the meaning of a concept are its description and its vertical hierarchical relationships. Once it matches the context, we can use it. LOINC doesn’t provide such relationships for answers (making them just syntactic string equivalents), but SNOMED does. E.g. the SNOMED 4125539 ‘IV’ concept is a Roman numeral, so it definitely should not be used for intravenous immunoglobulin.

The opposite case is SNOMED pre-coordinated concepts, e.g. 40481923 ‘pT1b category’, which implies the result of a tumor pathology finding measured using TNM. Such concepts are not used as values; they are sufficient by themselves.

This is definitely what should be introduced, but we’re just at the beginning of a long walk. You already see how LOINC is good in this. A lot of other issues are still to be addressed:

  • selection of Standard between LOINC/SNOMED/NAACCR;
  • replacement of LOINC concepts in answer lists by another Standard or not;
  • value completeness: a choice between ‘1b’, ‘T1b’, ‘pT1b’, ‘pT1b stage’, or ‘pT1b tumor stage’, and consistency with Measurements of differing levels of detail;
  • splitting of pre-coordinated concepts.

Well, the only vocabulary that provides all the options is NAACCR. But it has duplicates, it is still in the workshop, and we all understand the context that goes with it. Some of the SNOMED 4152511 Roman numeral concepts are good, but we can’t use 4152513 Upper case Roman letter here.

I really think that the above-mentioned mixture of LOINC and SNOMED is a good choice at this point, but we still have nothing but NAACCR for IIIB, IVA and IVB. Let’s hear from @Christian_Reich @Dymshyts @mik @aostropolets and @mgurley what exactly should be created.

Let me try to repeat this trick :blush:
@sbicty, @Eldar, @Sebastiaan_van_Sandi, @gregk, @larsh, @Alexandra_Orlova


You didn’t tag me, but let me still add 2 cents here.

I would strongly suggest pre-coordinating: it works better for analytical use cases and Atlas, it says all it has to say in a single concept, and post-coordination is a mess:

  • Post-coordinated facts are harder to query, Atlas needs to know about the coordination.
  • You either pre-define what concepts can be coordinated (through “has answer” relationships), but then you may as well pre-coordinate.
  • Or you don’t pre-define, but then you will get a lot of garbage (Clavien-Dindo complication grades “3” or “VII” or “100.3” or “high”).
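A minimal sketch of what pre-defining the allowed answers buys you, assuming the seven grades listed earlier in the thread are the complete answer list. The `split_valid` helper is a hypothetical name; in the vocabulary this check would live in “has answer” relationships rather than in code.

```python
# Assumed complete answer list: the seven grades listed earlier in the thread.
ALLOWED_GRADES = frozenset({"I", "II", "IIIA", "IIIB", "IVA", "IVB", "V"})


def split_valid(values):
    """Separate incoming grade strings into allowed answers and garbage."""
    valid, rejected = [], []
    for value in values:
        if value.strip().upper() in ALLOWED_GRADES:
            valid.append(value)
        else:
            rejected.append(value)
    return valid, rejected


valid, rejected = split_valid(["II", "IIIb", "3", "VII", "100.3", "high"])
```

Without such a list, everything in `rejected` would land in the data as a Clavien-Dindo grade.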

Also, you always run into the problem of deciding whether or not the value_as_concept_id or answer has full semantic identity. In other words, is “IIIB” only a Clavien-Dindo complication grade, or also a TNM Pathology Stage Group? Are all these “IIIB” one concept, even though they mean different things? Or are there many, in which case we have a ton of “IIIB” concepts and wouldn’t know the difference unless we follow the concept_relationship? Of course we could characterize the IIIB as a Clavien-Dindo IIIB or a Stage Group IIIB, but that would be botched pre-coordination!

Bottom line: Don’t do that. There are only two reasons why post-coordination is better, which really is only one reason:

  • There are too many answers to create all these pre-coordinations. Bad reason, because you got to have all the answers anyway.
  • You need to incorporate an infinite or unknowable amount of answers, like in truly numerical values (not the categorical ones I, II, III etc.). In that case you have no choice.

So, please pre-coordinate those poor grades.

Atlas supports this in cohort building. For the rest, custom covariates are a solution for now. But methods should follow the needs. And they will - we cannot pre-coordinate everything anyway.

It could look into both concept_relationship (in cohort definitions, to support it) and the data (in characterization, to generate the covariates from frequent post-coordinated combinations).

Agree, but there are always users who want ‘II or III’, ‘between II and III’, and many other things that we can’t even imagine but that they find useful.

And unless we put the new pre-coordinated OMOP Extension concepts and their SNOMED ancestor 37311607 Clavien-Dindo complication scale into the Condition domain (blocking the use of value_as_concept_id) and de-standardize its SNOMED counterpart, 37311606 Clavien-Dindo complication grade, people will use both designs for one clinical entity, which is an even bigger mess (just remember the COVID/Influenza testing cohorts). Can we do such forced standardization and leave just the pre-coordinated Conditions? Or do we have to deal with a pre-/post-coordinated mix among the same terms forever?

We’re playing around with Domains, but, by definition, it should be a Measurement (please don’t say to pre-coordinate there), maybe an Observation (which would still enforce the mess), but not a Condition.

… while post-coordinated mess might be resolved by proper phenotyping and validation.

We do. But making 10-20% pre-coordinated will not solve the whole issue.

When we resolve one specific problem, yes. Once it comes to the general approach for the model, we need to multiply answers by questions and it becomes a good reason.

Numerical values go to value_as_number, so there is no issue.
But what is the borderline and who will decide and when? A domain could be.
We still need a good solution for:

  • lab tests;
  • allergy to substance;
  • history/family history of;
  • disease suspected;
  • clinical finding absent/disorder excluded;
  • and others where people are forced to use pre- and post-coordination at once.

This would definitely work too.

@ericaVoss @awrosen @sbicty Sorry for the long discussion, but these are going to be the first OMOP Extension concepts after the COVID-related ones.


Dear @Alexdavv, that sounds amazing!

@ericaVoss, I’m sorry I didn’t notice you tagged me. Regarding how the Clavien-Dindo (CD) scores are derived and how they might be used: the score grades the severity of complications by how they are managed. If a patient has a post-operative complication, say a urinary tract infection, it could be graded as CD II if it is managed with antibiotics, or CD IVa if it leads to an ICU stay due to single-organ dysfunction. So the grades are derived from the management of the conditions. For analytics, the grade would probably mainly be an outcome: we would be interested in predicting it, or in assessing the effect of different interventions on the risk of getting, e.g., a complication of grade IIIa or higher.
However, the CD scores could also provide valuable insight for predicting the risk of readmission for patients admitted to a department or using it as an exposure for late-term outcomes such as the risk of recurrence after a surgery for cancer.
I hope this answered your question, if not please don’t hesitate to reach out.
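For the “grade IIIa or higher” kind of outcome, the grades have to be treated as an ordered scale rather than as plain strings. A minimal sketch, with hypothetical helper names and the seven grades from earlier in the thread in their clinical order:

```python
# Grades in increasing severity.
GRADE_ORDER = ["I", "II", "IIIA", "IIIB", "IVA", "IVB", "V"]
RANK = {grade: rank for rank, grade in enumerate(GRADE_ORDER)}


def _rank(grade):
    """Rank of a grade string, ignoring case and surrounding whitespace."""
    return RANK[grade.strip().upper()]


def at_least(grade, threshold):
    """True if `grade` is at least as severe as `threshold`."""
    return _rank(grade) >= _rank(threshold)


def worst_grade(grades):
    """Most severe grade among a patient's complications."""
    return max(grades, key=_rank)


severe = at_least("IVa", "IIIa")
```

A plain string comparison would not work here (for example, "V" sorts after "IVB" alphabetically only by accident, and "II" sorts before "IIIA" but "IVA" sorts before "V"), so whichever design is chosen, the ordering has to be stored or derived somewhere.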

@Alexdavv & @Christian_Reich, what are we settling on here: pre-coordination or post-coordination? If I got a vote, I think I would vote pre-coordination now because it is easier on the tool/analytics side, but I see how this is a nightmare on the Vocab side.

I’d agree with pre-coordination only if we come to the decision (for the whole community, in fact) to leave only one parent concept for each of these scales in the Condition domain and create/use the pre-coordinated Conditions placed below in hierarchy.
It means that people would not be technically able to post-coordinate using these concepts, and we’ll get rid of the ugly pre-/post-coordinated mixture, at least in these 2 scales. So we need to accept that things like “II or III”, “between II and III”, “nearly III” go to the trash here :blush:

And again, what if someone needs to use them as Condition modifiers? Then it should be in the Measurement/Observation domain.

I am not sure, but this plea for pre-coordination seems to apply to the OMOP-CDM as a whole, not just to the Clavien-Dindo complication grades. In that context, the other day I read a proposal by the WHO Working Group that works on the harmonization of the ICD, ICF and ICHI. Basically, they are working on a unifying ontology, or Content Model, for this Family of International Classifications. The thing is, this WG calls post-coordination ‘key’ and ‘critical’ to the Content Model. I am not fully versed in this domain, but would this pose a problem? Can the OMOP-CDM function on a preferred or fully pre-coordinated model when some of the main (standard) vocabularies choose a post-coordinated model? I can imagine that this complicates things, either for data custodians or on the side of the CDM.

Can you show what they are saying?
