Disentangling cerebral infarctions

Patrick_Ryan · August 14, 2018, 3:44pm

In the 06082018 vocabulary, I see several ICD9 and ICD10 source codes for concepts related to: “Persistent migraine aura with cerebral infarction”, which have two mappings to SNOMED standard concepts: 1) persistent migraine aura, and 2) cerebral infarction. On the surface, that seems reasonable to me based on the words alone. http://www.ohdsi.org/web/atlas/#/search/Persistent%20migraine%20aura%20with%20cerebral%20infarction

The issue I’m running into is that, for my analytical use case, I want to identify ischemic strokes, which would not include these migraine codes but would include other types of cerebral infarctions. Any suggestions for how to manage this situation (without resorting to source code selection)?

Eldar · August 14, 2018, 4:08pm

Hi @Patrick_Ryan,
I see no way to capture this in concept sets.
Because the source code maps to 2 concept_ids, there will be 2 records occurring on the same day (and, I assume, with the same visit_occurrence_id, but that is a question for the ETL process of your data sets).
Based on this, you may use an additional criterion: cerebral infarction with 0 occurrences of persistent migraine aura on the same day (or within the same visit).
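A minimal sketch of that additional criterion, written in Python over in-memory records rather than as an actual ATLAS cohort definition. The record layout loosely follows the condition_occurrence table. Treating 443454 as the 'cerebral infarction' concept is an assumption based on the concept_ids cited later in this thread, and MIGRAINE_AURA_ID is a made-up placeholder, not a real concept_id.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: 443454 stands for 'cerebral infarction' (cited later in the thread).
CEREBRAL_INFARCTION_ID = 443454
# Hypothetical placeholder for 'persistent migraine aura'; not a real concept_id.
MIGRAINE_AURA_ID = -1


@dataclass(frozen=True)
class ConditionOccurrence:
    person_id: int
    condition_concept_id: int
    condition_start_date: date


def infarctions_without_same_day_migraine(records):
    """Keep infarction records with 0 migraine aura records for the same person and day."""
    migraine_days = {
        (r.person_id, r.condition_start_date)
        for r in records
        if r.condition_concept_id == MIGRAINE_AURA_ID
    }
    return [
        r
        for r in records
        if r.condition_concept_id == CEREBRAL_INFARCTION_ID
        and (r.person_id, r.condition_start_date) not in migraine_days
    ]


# Made-up example data: person 2 has both records on the same day.
records = [
    ConditionOccurrence(1, CEREBRAL_INFARCTION_ID, date(2018, 1, 5)),
    ConditionOccurrence(2, CEREBRAL_INFARCTION_ID, date(2018, 2, 9)),
    ConditionOccurrence(2, MIGRAINE_AURA_ID, date(2018, 2, 9)),
]
kept = infarctions_without_same_day_migraine(records)
print([r.person_id for r in kept])
```

Matching on visit_occurrence_id instead of the date would follow the same pattern, provided the ETL assigns both records to the same visit.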

Chris_Knoll · August 14, 2018, 4:39pm

I think it’s a safe assumption that if one source code maps to two occurrence records, they will be using the same visit occurrence that the source record will be based on…but, it’s always worth confirming it with your ETL logic.

Christian_Reich · August 14, 2018, 10:33pm

Why don’t you want the migraine one, @Patrick_Ryan ? It’s an infarction.

Patrick_Ryan · August 15, 2018, 3:04am

It’s a fair question that I’m wrestling with. My intention is to find ischemic stroke. My clinical colleagues inform me that, while ‘persistent migraine with cerebral infarction’ may involve some vasoconstriction that could potentially lead to a stroke, it on its own is not a stroke in the more contemporary sense. The ‘other’ cerebral infarctions (including descendants) are consistent with clinical expectations and also in line with prior literature ‘validating’ the ischemic stroke definition. But to reconstruct the literature-based ICD9 definition, I would have to resort to a source-code-specific conceptset, because the migraine codes map to ‘cerebral infarction’ but I don’t want them for my particular use case.

SCYou · August 15, 2018, 7:04am

I cannot think of a solution to your problem, @Patrick_Ryan

Overall, the accuracy of identifying stroke in administrative data is not so good. But the accuracy can be improved by adding an ‘inpatient/ED diagnosis only’ criterion. How about adding this criterion?
( https://www.ncbi.nlm.nih.gov/pubmed/27426016
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3412674/ )

Usually, if the migraine is only related to vasoconstriction, not caused by an actual stroke, the patient would not be admitted.

Christian_Reich · August 15, 2018, 8:35am

And why is excluding the migraine not a solution, as @Eldar suggests?

SCYou · August 15, 2018, 8:56am

@Christian_Reich, excluding the migraine can be a solution.
But if one wants to write a paper, excluding migraine in stroke patients may seem a little bit weird to reviewers or other readers.

As I said, I don’t have a good solution for this… And this is why phenotyping is becoming more important in OHDSI.

mgkahn · August 15, 2018, 1:51pm

Patrick’s conundrum of using SNOMED mappings from ICD9/10 criteria is one that has been raised before. Many of our investigators come to us with a prior sentinel publication that has very specific ICD codes that they want to match EXACTLY so that their study can be linked to the published definitions. As we know, mappings between ICD and SNOMED are not one-to-one. That is, “round trip” mappings from ICD -> SNOMED -> ICD do not always return the original set of ICD codes. In the past, others (unnamed!) on this thread have argued that the additional ICD codes are often also relevant and accepted by investigators, or have very low counts, or really represent local coding practices rather than being clinically meaningful. But our investigators push back on the addition of ANY ICD codes that are not present in the sentinel publication. This is when we drop back to using source_concept_ids to assuage the investigator’s need to be able to say that their cohort exactly matches the previous publication.
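To make the “round trip” problem concrete, here is a small Python sketch. The mapping table is a toy illustration with descriptive labels, not actual OMOP vocabulary content; it only mirrors the situation described in this thread, where a migraine source code maps to two standard concepts, one of which is shared with a stroke code.

```python
# Toy ICD -> SNOMED mapping for illustration only (hypothetical labels, not real vocabulary rows).
ICD_TO_SNOMED = {
    "ICD9 stroke code": {"cerebral infarction"},
    "ICD9 migraine code": {"persistent migraine aura", "cerebral infarction"},
}


def round_trip(icd_codes, icd_to_snomed):
    """Map ICD codes to standard concepts, then return every ICD code mapping to any of them."""
    standard = set()
    for code in icd_codes:
        standard |= icd_to_snomed[code]
    return {code for code, targets in icd_to_snomed.items() if targets & standard}


published = {"ICD9 stroke code"}  # code list from a hypothetical sentinel publication
returned = round_trip(published, ICD_TO_SNOMED)
extra = returned - published
print(sorted(extra))
```

The codes in `extra` are the ones an investigator would see added to the published list when the cohort is defined on standard concepts alone.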

Patrick_Ryan · August 15, 2018, 7:43pm

Hahaha @mgkahn, yes, I certainly subscribe to using standards, and I recognize the consequences of that position. The legacy of using ICD9 codelists that were previously published as justification for future work is an inertia that we as a community will continue to have to push hard to overcome. My real question here is really more clinically motivated: if there are differences in types of cerebral infarctions (e.g. some are infarcts that are part of migraines, and some are infarcts that are full-blown strokes), then can or should we have a way to differentiate these types?

As a more specific illustration of the dilemma, I can currently select the descendants of ‘cerebral infarction’ and that finds me concepts with greater specificity (and in ICD9 world, the ‘stroke-related infarctions’ all map to these more specific concepts, so if all I cared about was replicating an ICD9-based publication, I can do it without issue). But it is possible that source data may have something that does not provide added specificity (e.g. in ICD10 world, there’s a nonbillable code of I63, which is just ‘cerebral infarction’), and depending on the circumstance, I may want these non-specific data to be included. Perhaps the answer is: ‘all non-specific cerebral infarctions should be considered equivalently’, in which case the mapping question becomes: ‘is the cerebral infarction listed in the migraine source codes REALLY a cerebral infarction?’

Gowtham_Rao · August 16, 2018, 1:10am

Transient ischemic attack (TIA) vs stroke. The latter has infarction; the former has ischemia.

Christian_Reich · November 13, 2018, 12:33pm

@qiongwang: Want to help?

SCYou · November 14, 2018, 2:56am

I realize how annoying this situation is.

I found that no one has persistent migraine aura with cerebral infarction (ICD10 G43.6x) in the Korean NHIS-NSC database.

@Patrick_Ryan, @Christian_Reich, Could you count how many people have this diagnosis code in your databases?
(ICD-9CM: 346.6x, ICD-10CM: G43.6x)

After reviewing a bunch of papers, I concluded that 433.x1 and 434.x1 in ICD-9-CM and I63x in ICD-10-CM are the most validated diagnosis codes in previous papers.
In OMOP, the descendants of concept_ids 443454 and 4043731 cover all the ‘maps to’ OMOP concept_ids for these ICD codes.
However, as @Patrick_Ryan said, this also includes unwanted ‘maps from’ ICD codes such as G43.6x, I97.81x, G46.5, G46.6, G46.7, I97.8x, 346.6x, and 997.02.
I hope to exclude ‘G43.6x’ and ‘346.6x’ because these codes have never been validated in the previous studies.

So we have three options:

1. We can validate concept_ids 443454 and 4043731 in OHDSI (recently, @jswerdel proposed PheValuator, and I can validate this using the Ajou University DB). This is the best option.
2. We can make a stroke cohort definition with exclusion criteria for same-day migraine.
3. We can count how many people actually have these diagnoses in multiple databases. If no database has this condition, then I would be relieved.
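For the counting option, a minimal Python sketch could look like the following. It assumes the source codes are available as (person_id, source_code) pairs, for example pulled from the source value column of condition_occurrence; the function name and the example rows are made up for illustration.

```python
def count_persons_with_prefix(rows, prefixes):
    """Count distinct persons having at least one source code starting with any given prefix."""
    wanted = tuple(prefixes)
    return len({person_id for person_id, code in rows if code.startswith(wanted)})


# Made-up (person_id, source_code) rows; person 1 has two qualifying records but counts once.
rows = [
    (1, "G43.601"),
    (1, "G43.609"),
    (2, "I63.9"),
    (3, "346.60"),
    (4, "434.91"),
]
n_migraine_infarction = count_persons_with_prefix(rows, ["346.6", "G43.6"])
print(n_migraine_infarction)
```

Running the same count in each database would show whether the 346.6x and G43.6x codes occur often enough to matter.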

Patrick_Ryan · November 14, 2018, 2:17pm

Hi @SCYou, just so it’s documented:

The options that are currently available to the OHDSI community for creating cohort definitions are:

1. Create a cohort definition that uses a conceptset that contains standard concepts. Pros: it will be a cohort definition that can run across the OHDSI network, independent of the source coding scheme. Cons: the standard concepts may have mappings from certain source codes that others may prefer not to see in their phenotype definition. In this example, when defining ‘cerebral infarction’, one group could reasonably argue that they should include ICD10 G43.6x ‘persistent migraine aura with cerebral infarction’ (because it explicitly suggests the presence of cerebral infarction), while other researchers could reasonably argue that this isn’t the same clinical construct of cerebral infarction that they are looking for.
2. Create a cohort definition that uses a conceptset that contains source codes. Pros: you can tailor your list of codes to whatever level of precision you want (e.g. cherry picking the ICD9 and ICD10 codes without regard to how they map up to SNOMED). Cons: this cohort definition will only be applicable to databases that use the same source codes.

My general preference is to support global research across the diverse community of researchers and databases throughout OHDSI, so I advocate for #1 whenever possible, but there are certainly instances when #2 is necessary and ‘good enough’ if you know you are only doing your study in your own data. In either case, both of these alternatives are fully supported in the OMOP CDM and also fully supported using ATLAS as OHDSI’s standard platform for defining and instantiating cohorts.

Now, to the specific question of ‘what did I do for stroke when we were designing the protocol for LEGEND?’, I did the same investigation that you did, @SCYou: I empirically evaluated how often the questionable ICD9/10 codes for ‘migraine with cerebral infarction’ actually arose across my databases. And it wasn’t never, but it was very uncommon…less than 1% of the total number of strokes across all the databases. The other thing I noticed is, in a good chunk of the cases, a person who had the ‘migraine with cerebral infarction’ code also had a ‘cerebral infarction’ code, meaning they’d be picked up in our phenotype definition whether or not we included that code. So, in the interest of enabling global research, I went with approach #1, keeping in this code that some would argue to keep in and some would argue to keep out, recognizing that it actually doesn’t have any practical impact one way or the other.
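A rough Python sketch of that kind of empirical check is below. It assumes two sets of person_ids have already been extracted: people with any other cerebral infarction code, and people with the ‘migraine with cerebral infarction’ code. The function name and the numbers are invented for illustration and are not results from any database.

```python
def migraine_only_share(infarction_persons, migraine_code_persons):
    """Share of all identified cases that are found only through the migraine code."""
    all_cases = infarction_persons | migraine_code_persons
    if not all_cases:
        return 0.0
    only_migraine = migraine_code_persons - infarction_persons
    return len(only_migraine) / len(all_cases)


# Invented example: 199 people with another infarction code, 2 with the migraine code,
# one of whom (person 199) also has another infarction code.
infarction_persons = set(range(1, 200))
migraine_code_persons = {199, 200}
share = migraine_only_share(infarction_persons, migraine_code_persons)
print(f"{share:.3%}")
```

A small share means that keeping or dropping the code changes the cohort for very few people.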

In general, I think the approach we should be taking is not to subjectively argue for or against a given code, but to use the data to determine whether that code makes a difference in the prevalence and composition of a given phenotype. And I agree we should be trying to validate our phenotypes more systematically. One of the compelling aspects of @jswerdel’s PheValuator approach is that it provides an objective basis to compare the operating characteristics of alternative definitions. So, from this example, if we were worried about the impact of inclusion/exclusion of the ‘migraine with cerebral infarction’ code, we could create 2 phenotypes and then run PheValuator to see the impact on sensitivity/specificity/positive predictive value. And since the independent prevalence of the code in question is so low, we’d see that it doesn’t make a difference and so is probably not where we should be spending our time wringing our hands.
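For reference, the three operating characteristics mentioned here can be computed from a 2x2 table of phenotype result against a reference standard. The Python sketch below shows only that arithmetic; it is not the PheValuator API, and the counts for the two alternative definitions are invented.

```python
def operating_characteristics(tp, fp, fn, tn):
    """Sensitivity, specificity and PPV from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases that the phenotype finds
        "specificity": tn / (tn + fp),  # non-cases that the phenotype leaves out
        "ppv": tp / (tp + fp),          # phenotype-positive people who are true cases
    }


# Invented counts for two alternative definitions evaluated on the same 1000 people.
with_code = operating_characteristics(tp=90, fp=10, fn=10, tn=890)
without_code = operating_characteristics(tp=89, fp=9, fn=11, tn=891)

for name in ("sensitivity", "specificity", "ppv"):
    print(name, round(with_code[name], 4), round(without_code[name], 4))
```

Comparing the two dictionaries side by side is the kind of evidence that could replace a subjective argument about a single code.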

hripcsa · November 14, 2018, 3:19pm

See also this paper on the effect of vocabulary mapping:

https://academic.oup.com/jamia/advance-article/doi/10.1093/jamia/ocy124/5159502

On 9 phenotypes, if you were thoughtful, the error in the cohort was only 0.0026% maximum due to the mapping. Part of the good performance was redundancy like Patrick points out (patients have several different codes). So it points to going for Patrick’s option #1.

The truth is that the other measurement error far outweighs the mapping errors, so the time should be spent using alternate coding sources to verify the diagnosis if that is important rather than perfecting the codes.

SCYou · November 14, 2018, 10:53pm

I appreciate your great contribution to validating the OHDSI system again, @Patrick_Ryan @hripcsa
I discussed this with @schuemie yesterday.

As @Patrick_Ryan suggested, I will generate various cardiovascular cohorts with diverse specifiers (+/- inpatient visit +/- primary diagnosis +/- imaging study +/- including or excluding certain conditions) and then evaluate the PPV by manual chart review.

I’d be happy if other institutions can join this work.

And then, the sensitivity and specificity can be validated again by @jswerdel’s PheValuator.

qiongwang · November 19, 2018, 3:24am

Here, I put my cohort in the public Atlas. Have a look at the concept definition of Ischemic Stroke, please.
Qiong-PLP-T(Ischemic Stroke+Warfarin)
http://www.ohdsi.org/web/atlas/#/cohortdefinition/1769613
I hope this is useful for you.

Some clinical thoughts that I hope will help:
Ischemic stroke can be divided into two main types: thrombotic and embolic.

A thrombotic stroke occurs when diseased or damaged cerebral arteries become blocked by the formation of a blood clot within the brain. Clinically referred to as cerebral thrombosis or cerebral infarction, this type of event is responsible for almost 50 percent of all strokes. Cerebral thrombosis can also be divided into an additional two categories that correlate to the location of the blockage within the brain: large-vessel thrombosis and small-vessel thrombosis. Large-vessel thrombosis is the term used when the blockage is in one of the brain’s larger blood-supplying arteries such as the carotid or middle cerebral, while small-vessel thrombosis involves one (or more) of the brain’s smaller, yet deeper, penetrating arteries. This latter type of stroke is also called a lacunar stroke.

An embolic stroke is also caused by a clot within an artery, but in this case the clot (or embolus) forms somewhere other than in the brain itself. Often originating in the heart, these emboli travel in the bloodstream until they become lodged and cannot travel any farther. This naturally restricts the flow of blood to the brain and results in near-immediate physical and neurological deficits.