
Phenotype Phebruary 2023 P8 Parkinson's disease

Thanks to everyone who joined and assisted with the logic of the PD cohorts.
We updated the concept set for “ingredients that cause parkinsonism, excluding those that are used in PD patients”.
We require more discussion about latest vs earliest index/entry date.

For now, we will continue to work with the “prevalent” cohorts based on a “latest event” entry date.
@Azza_Shoaibi is going to do a logic/quality check on the following cohorts and associated concept sets.
Then, they should be run on Cohort Diagnostics and I’ll review, comment and approve for a network study of prevalence within databases.

Unanimity cohort definitions
#1781748 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity
#1781760 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity with PD meds
#1781843 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity wo confounding meds
#1781844 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity w PD wo confounding meds

Tiered consensus without specialty:
#1781814 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus wo specialty 1yr (1 of 3)
#1781815 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus wo specialty 2yr (2 of 3)
#1781816 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus wo specialty 3yr (3 of 3)

Tiered consensus with specialty – this allows for precedence of neurologist-diagnosed PD (vs non-PD) using 6 different cohorts: 2 specialty groups (neuro vs non-neuro), each with 3 individual years of lookback in a progressive tier.
#1781732 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus w specialty neuro1year (1 of 6)
#1781809 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus w specialty neuro2year (2 of 6)
#1781810 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus w specialty neuro3year (3 of 6)
#1781811 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus w specialty noneuro1yr (4 of 6)
#1781812 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus w specialty noneuro2yr (5 of 6)
#1781813 [Pheb2023][ucepd] Persons with Parkinson’s disease tiered consensus w specialty noneuro3yr (6 of 6)

We discussed two other models (future discussion)

  1. Incidence estimates for a PD cohort:
    – this requires tracking the date of the earliest broad parkinsonism occurrence and using it as the incidence date of PD, even if the 3-year timeframe needed to meet a cohort definition for PD is only satisfied 5-10 years later. We do not wish to capture the date when the cohort definition is met. We also wish to look only at the last 3 years of observation time (because of concern about false-positive PD identified in one 3-year timeframe before a later 3-year timeframe identifies a non-PD neurodegenerative parkinsonism).

  2. Cohort analysis – assessing persons going in and out of a cohort.
    We would like to generate a cohort definition and model in which, over time, patients enter the cohort once they meet criteria (3-year timeframe) and can EXIT the cohort once they no longer meet criteria (3-year timeframe). We are very interested in the 5-10% of patients who start with not-PD (but meet entry criteria), then meet the PD cohort definition, then exit the definition when it becomes clear later that they do not actually meet a PD cohort definition.

  3. We discussed PheValuator and will defer xSpec and xSens cohorts until the timeframes needed for proper analysis are clearer.
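To make models 1 and 2 a bit more concrete, here is a minimal sketch in Python. It is only an illustration under several assumptions that are not settled in this thread: the 3-year timeframe is taken as 1095 days, "meeting criteria" is simplified to two strict PD codes at least 30 days apart with no competing diagnosis in the window, every listed event is treated as a broad parkinsonism code, and membership is re-evaluated once a year. The function names and the group labels ("a", "A", "b", "c") are hypothetical helpers, not part of Atlas or any OHDSI package.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=3 * 365)  # assumption: "3 year timeframe" = 1095 days
MIN_GAP = timedelta(days=30)      # assumption: two PD codes at least 30 days apart


def meets_pd_criteria(events, as_of):
    """events: list of (date, group); group "a" = strict PD code,
    "A" = other broad parkinsonism code, "b"/"c" = competing diagnoses.
    True if the window ending at as_of holds two "a" codes >= 30 days apart
    and no "b"/"c" code."""
    window = [(d, g) for d, g in events if as_of - WINDOW <= d <= as_of]
    pd_dates = sorted(d for d, g in window if g == "a")
    two_spaced = bool(pd_dates) and pd_dates[-1] - pd_dates[0] >= MIN_GAP
    has_competing = any(g in ("b", "c") for _, g in window)
    return two_spaced and not has_competing


def membership_timeline(events, eval_dates):
    """Model 2: in/out status of one person at each evaluation date."""
    return [(t, meets_pd_criteria(events, t)) for t in eval_dates]


def incidence_date(events, observation_end):
    """Model 1: if criteria are met in the LAST 3 years of observation,
    back-date incidence to the earliest parkinsonism code of any kind."""
    if not meets_pd_criteria(events, observation_end):
        return None
    return min(d for d, _ in events)


# Hypothetical patient: generic parkinsonism first, then PD codes,
# then a competing diagnosis in 2019.
events = [
    (date(2012, 3, 1), "A"),
    (date(2013, 2, 1), "a"),
    (date(2014, 6, 1), "a"),
    (date(2015, 5, 1), "a"),
    (date(2016, 5, 1), "a"),
    (date(2017, 5, 1), "a"),
    (date(2018, 5, 1), "a"),
    (date(2019, 4, 1), "b"),
]
year_ends = [date(y, 12, 31) for y in range(2013, 2020)]
timeline = membership_timeline(events, year_ends)
for when, is_member in timeline:
    print(when.year, "in cohort" if is_member else "not in cohort")
```

With these inputs the person enters once a second, sufficiently spaced PD code exists and exits once the competing diagnosis falls inside the window, which is the kind of trajectory described in model 2.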

Quick update.
@Gowtham_Rao is in the process of running the above cohorts (post 41) through Cohort Diagnostics.

The OHDSI team and UCE-PD team (UCLA-Calif PD Registry-EHR) will review Cohort Diagnostics output when ready.

We can run PheValuator on the above cohorts as well.
For now, because all cohorts are constructed as “latest event”, I have created xSpec and xSens also with “latest event”.
@Azza_Shoaibi and @Joel_Swerdel, I would be interested in learning more about how PheValuator handles “latest event” as an entry criterion.

I have created two flavors of xSpec. One of the criteria we are basing xSpec on is “at least one PD diagnosis by Neurologist”. Because specialty is not universally available across OMOP-CDM databases, I have created an xSpec with this criterion (1781899) and one without it (1781943).

In Atlas-demo:
1781900 [Pheb2023][ucepd] Persons with Parkinson’s disease xSens entry latest
1781943 [Pheb2023][ucepd] Persons with Parkinson’s disease xSpec entry latest
1781899 [Pheb2023][ucepd] Persons with Parkinson’s disease xSpec entry latest incl neuro

Dear all, here are my thoughts (peer review) about the P8 Parkinson’s Disease phenotype.

I reviewed the two most mature cohort definitions produced so far:

  • ...748 [Pheb2023][ucepd] Persons with Parkinson's disease unaminity
  • ...760 [Pheb2023][ucepd] Persons with Parkinson's disease unaminity with PD meds

CohortDiagnostics results are in https://data.ohdsi.org/PhenotypePhebruary2023_P8_ParkinsonsDisease/

These cohort definitions are adaptations of Szumski & Cheng, 2009 (link to paper). Here is my attempt at a visual representation:

The only difference between cohort definitions 760 and 748 is that 760 requires d; 748 does not use d. a' is allowed to be the same record as A because a ⊂ A (a is a subset of A).
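For readers without the figure, here is a rough sketch of how I read the logic, pieced together from statements elsewhere in this review (index on the latest A, a second a code at least 30 days apart, absence of b and c over a 3-year period, and d required only for 760). The exact windows, the choice to check d inside the same lookback, and the function name are my assumptions; the Atlas definitions remain the authoritative source.

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=3 * 365)  # assumption: 3-year period before index
MIN_GAP = timedelta(days=30)        # two strict PD codes at least 30 days apart


def unanimity_index_date(records, require_pd_meds=False):
    """records: list of (date, group) with group in {"A", "a", "b", "c", "d"}.
    "a" records also count as "A" because a is a subset of A.
    Returns the index date (latest A) if the person qualifies, else None.
    require_pd_meds=False mimics 748; True mimics 760 (hypothetical flag)."""
    broad_dates = [d for d, g in records if g in ("A", "a")]
    if not broad_dates:
        return None
    index = max(broad_dates)  # entry event: latest occurrence
    window = [(d, g) for d, g in records if index - LOOKBACK <= d <= index]

    pd_dates = sorted(d for d, g in window if g == "a")
    if not pd_dates or pd_dates[-1] - pd_dates[0] < MIN_GAP:
        return None  # no pair of strict PD codes far enough apart
    if any(g in ("b", "c") for _, g in window):
        return None  # competing diagnosis present
    if require_pd_meds and not any(g == "d" for _, g in window):
        return None  # 760 only: no PD medication found
    return index


# Hypothetical patients
with_meds = [(date(2020, 1, 10), "a"), (date(2020, 2, 1), "d"), (date(2020, 3, 15), "a")]
without_meds = [(date(2020, 1, 10), "a"), (date(2020, 3, 15), "a")]
with_competing = with_meds + [(date(2019, 6, 1), "b")]

print(unanimity_index_date(with_meds, require_pd_meds=True))
print(unanimity_index_date(without_meds, require_pd_meds=True))
```

Under this reading, every person in 760 is also in 748, since 760 only adds a requirement.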

Things to think about in the context of this phenotype

  1. The distinction between true Parkinson’s Disease and differential diagnoses (a. non-PD parkinsonism, b. secondary parkinsonism) is difficult even for specialists. Clinicians may prefer putting down generic parkinsonism codes at first, then changing to true PD codes in subsequent encounters after the disease has manifested itself more clearly. The reverse also happens – patients who were thought to have PD later might get reinterpreted as non-PD parkinsonism.

  2. Establishing the point in time of PD onset is very difficult, if not an arbitrary choice in practice. PD is a progressive disease of unknown etiology and the pathophysiology develops for years before any symptom occurs.

Concepts included, and orphan concepts (standard only)

The original codes were taken from Szumski & Cheng, 2009, then revised by the team. The cohort definitions try to find true PD codes (a) in the absence of competing diagnoses (b and c). The “Orphan Concepts” algorithm of CohortDiagnostics does not consider this nuance; therefore I would not add orphan concepts to either side. I fear that adding orphan concepts to a would harm specificity, while adding to b or c would harm sensitivity.

That being said, some codes that were identified in Orphan Concepts would meaningfully impact cohort sizes (>1k patients), particularly in the IQVIA and Optum datasets:

  • 46236407 Unified Parkinson’s Disease Rating Scale (UPDRS) panel (23k patients in cdm_iqvia_amb_emr)
  • 443602 Adverse reaction to antiparkinsonism drug (could be used as surrogate to PD medications?)
  • 4314734 Dementia associated with Parkinson’s Disease
  • 44782422 Dementia due to Parkinson’s disease

Cohort characterization

I was pleased by the results here because they seem consistent with existing medical knowledge about PD:

  • Age groups: almost all cases are in patients 50+ y/o, prevalence peaks between 70-79 y/o
  • Comorbidities with hypertension, heart disease, and other neurologic conditions.
  • Gender ratio of approximately 1.5:1 male to female (link: Statistics | Parkinson's Foundation).
    – Except in these datasets, where PD was more common in females than males: cdm_jmdc_v2325, cdm_truven_mdcd_v2321.

Cohort Counts

By far the largest drop in patient counts (about 30-50% drop) is caused by the requirement of a second a code that is at least 30 days apart.

The requirement of absence of b and c causes more modest drops, mostly < 5% in relation to preceding patient count.

Incidence Rates and Time Distribution

Almost all databases exhibit a surge in incidence at the end of the available time period (incidence increases of about 4-10x). Correspondingly, all databases exhibit much longer time periods (about 6-10x larger) before the index date than after it.

I interpret this to be due to the choice of entry event being configured as the last occurrence for each patient, instead of first, in the cohort definition. I suppose the phenotyped patients (old individuals with PD) keep having new medical visits, producing PD diagnosis codes throughout their medical history. The last one thereby is merely the last encounter available in the dataset, or last encounter before the patient dies.
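A toy calculation illustrates the effect. The dates below are made up, and the helper name is hypothetical; it simply measures observation time on either side of the index date under the two indexing choices.

```python
from datetime import date


def time_around_index(obs_start, obs_end, dx_dates, index_on="latest"):
    """Return (days observed before index, days observed after index)."""
    index = max(dx_dates) if index_on == "latest" else min(dx_dates)
    return (index - obs_start).days, (obs_end - index).days


# Hypothetical patient observed 2012-2022 with recurring diagnosis codes
obs_start, obs_end = date(2012, 1, 1), date(2022, 12, 31)
dx_dates = [date(2015, 6, 1), date(2018, 6, 1), date(2022, 6, 1)]

latest = time_around_index(obs_start, obs_end, dx_dates, "latest")
earliest = time_around_index(obs_start, obs_end, dx_dates, "earliest")
print("latest-event index:   before/after =", latest)
print("earliest-event index: before/after =", earliest)
```

For a patient who keeps receiving codes until the end of observation, indexing on the latest event puts nearly all observation time before the index date and places the index year close to the end of the data, which matches both patterns seen in the results.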

Index event breakdown

Virtually all patients in all databases have an a code (Parkinson’s Disease per se, strictly defined) on index date, despite the entry event requiring just an A code (parkinsonism, broadly defined; a is a subset of A). The lowest percentage was in the cdm_iqvia_amb_emr database, where “only” 96.9% of patients had an a code on the index event.

Visit context

No surprises here:

  • The overwhelming majority of cases happen during Outpatient Visit or Office Visit.
  • Many more visits before cohort start date than after. (agrees with the time distribution)

Agreement between cohort definitions 748 and 760?

This was highly dataset-specific. Agreement between the cohorts ranged from 21.6% (cdm_truven_mdcd) to 92.8% (cdm_cprd) of identified patients.

Summary

Q: Do I think these phenotypes are good to use?
A: Yes, as long as the timing of disease onset is not considered.

Q: Would I prefer cohort definition 760 (PD with meds) over 748 (no meds required)?
A: I would prefer 748, because the impact of requiring d (PD medications) on the cohort counts was heavily dependent on the dataset, and 748 can already be regarded as prioritizing specificity over sensitivity.

Positives:

  1. Descends from a phenotype (Szumski & Cheng, 2009) that was validated by chart review.
  2. Demographics and comorbidities of phenotyped patients demonstrate agreement with existing medical knowledge about PD.

Negatives:

  1. Timing of disease onset is unreliable for at least two reasons: a) it is hard to establish in clinical practice to begin with; b) the cohort definitions are indexing on the last event, not the first.
  2. It is conceivable that messy terminology mappings as well as billing practices could be causing other parkinsonism-related codes to become Parkinson’s Disease per se. a codes are very popular on the entry event, even though any A code would be accepted by the cohort definitions.

Potentials for improvement

  1. Reconsider whether indexing on the last occurrence of A is necessary, or if it could be replaced by indexing on first occurrence + requiring washout period.
  2. I interpret the required absence of any competing diagnoses (groups b and c) in a 3-year period to be a strict requirement, given the blurry lines separating these conditions.

Kind regards!

Cross-posting from last week's post in the OHDSI Phenotype Development and Evaluation workgroup

Click here to join the meeting

Thank you to the OHDSI team and to @fabkury for the peer review!
We are committed to continuing this work to cohort “done” for PD.
We had a touch base meeting yesterday to define next steps.

Review of the existing cohorts posted to
https://data.ohdsi.org/PhenotypePhebruary2023_P8_ParkinsonsDisease/
was discussed. This reflects cohorts from 2/23/2023.

Observations:

  • Source codes show variation in representation of “PD” and related concept codes across databases. All databases have “PD” as the predominant code used. Some databases have detailed conditions/syndromes (EHR/EMR datasets); some have fewer conditions but do represent key non-PD neurodegenerative/secondary parkinsonism; some are PD with a scattered few other conditions; each database varies in type/tendency of secondary parkinsonisms noted. No unexpected findings.
  • No orphan codes of concern found.
  • Incidence is 0.5% (overall across all databases) - lower end of expected for PD for the unanimity cohort.
  • We found the criterion of 2 PD condition occurrences to be met only 29-66% of the time. This was somewhat unexpected, as we thought 29% was particularly low.
    • discussion notes two possibilities:
      • patients are simply not seen by anyone that codes for PD
      • in some databases, the source data pull condition occurrences from the Problem List (person-level data) rather than from encounters (as claims data tend to do). When they come from the Problem List, we will rarely find 2 PD condition occurrences; this makes a medication/treatment criterion more important.
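One simple way to probe the second possibility is to check, per database, what share of persons with any PD code have PD codes on dates at least 30 days apart. The sketch below is an assumption-laden illustration: the input is a plain list of (person_id, condition_start_date) pairs already filtered to PD concepts, and the function name and toy data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def share_with_repeat_pd_codes(condition_rows, min_gap_days=30):
    """condition_rows: iterable of (person_id, condition_start_date) for PD codes.
    Returns the fraction of persons whose first and last PD code dates are
    at least min_gap_days apart."""
    dates_by_person = defaultdict(set)
    for person_id, start_date in condition_rows:
        dates_by_person[person_id].add(start_date)
    if not dates_by_person:
        return 0.0
    repeaters = sum(
        1
        for dates in dates_by_person.values()
        if (max(dates) - min(dates)).days >= min_gap_days
    )
    return repeaters / len(dates_by_person)


# Toy data: an encounter-like source versus a problem-list-like source
encounter_like = [
    (1, date(2020, 1, 5)), (1, date(2020, 4, 5)),
    (2, date(2021, 2, 1)), (2, date(2021, 9, 1)),
    (3, date(2019, 7, 7)),
]
problem_list_like = [
    (1, date(2020, 1, 5)),
    (2, date(2021, 2, 1)),
    (3, date(2019, 7, 7)),
]

print(share_with_repeat_pd_codes(encounter_like))
print(share_with_repeat_pd_codes(problem_list_like))
```

A share near zero would hint that conditions are recorded once per person, in which case the 2-occurrence criterion cannot be expected to perform as it does in claims data.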

Medication criterion for PD meds as a treatment seemed unusually low

  • 15-70% of databases met this criterion
  • perhaps this is due to gaps in the database source
  • essentially, over time, all patients with PD will receive treatment (near 100% eventually). So a low % meeting the medication criterion seems to imply one of:
    – high proportion of non-PD conditions (this seems to be the case in some databases, like JMDC as previously discussed)
    – gaps in medication data in the data sources
    – lots of patients in a population database who are not treated?

A few databases have a very high rate of secondary parkinsonism – this raises the risk of false positives, because some secondary parkinsonisms are coded as PD and, for a proportion of patients, that code ends up staying in the EHR.

Cohort characteristics

  • Patients appear much older because indexing on the most recent event pushes the index date later; this is not a true incidence
  • including the confounding med exclusion does improve the expected male > female distribution in all but 1 database

Implications:

  1. there are variations in source data which WILL impact performance of cohorts
    (we expect about 80-85% of parkinsonisms in general to be PD; and we estimate about 5% will be non-PD neurodegenerative parkinsonism, so roughly 10-15% are expected to be secondary parkinsonism); some databases record PD as person-level occurrences and others as encounter-level occurrences, which will affect definitions.
    • there are some tools that can be used to assess databases further (detect person-level vs enc-level tendencies) to validate applicability of a given cohort.
  2. indexing on the most recent event challenges current efforts to estimate incidence and complicates how to interpret PheValuator and other tools
  3. it remains uncertain whether the med and condition criteria dropoffs are driven by the nature of the data sources and databases or by true (appropriate) false negatives.

Interesting review of tiers
Despite concerns about how the tiers were designed, they show promise!
Aggregating the tiered cohorts, we find:

  • tiered consensus without specialty (combine 3 cohorts) - increases sensitivity (?specificity)
  • tiered consensus with specialty (combine 6 cohorts) - sensitivity is better than unanimity (as anticipated) and less than tiered consensus without specialty (as anticipated), suggesting that the tiered consensus with specialty COULD improve sensitivity and specificity, but the latter has still to be demonstrated.

Note the cohorts that use neuro specialty show that 5 of the 12 databases tested by J&J do have specialties coded and detect neurology visits!

Next steps:

  1. Atlas at Northwestern is almost online. We will be able to do some deeper dive into the nature of these criteria with actual patient chart reviews (I am on the strategic chart review validation side).
  2. We now want to reindex the 4 core unanimity cohorts to the earliest event. The 3-year lookback is arbitrary and will not be needed.
    • allows much better analysis of incidence
    • would like to run these 4 new cohorts to compare with the most recent (latest-event) ones
    • anticipate some false positives, which are very interesting to examine closer
    • I have done this in atlas-demo and aligned all concept sets to match the cohorts we have been analyzing:
    #1781774 [Pheb2023][ucepd] Persons with Parkinson’s Disease unaminity reindexed earliest
    #1782247 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity with PD meds reindex earliest
    #1782248 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity wo confounding meds reindex earliest
    #1782249 [Pheb2023][ucepd] Persons with Parkinson’s disease unaminity w PD wo confounding meds reidx earliest
  3. If available, OHDSI team, can we run the above 4 cohorts through Cohort Diagnostics?

  4. We will do a deeper analysis of cohort criteria as soon as our local Atlas instance is up and running

  5. We would like to run PheValuator on the above 4 earliest-indexed cohorts.
    We have created and realigned xSpec and xSens cohorts to facilitate this:
    #1781859: [Pheb2023][ucepd] Persons with Parkinson’s disease xSens entry earliest
    #1781749: [Pheb2023][ucepd] Persons with Parkinson’s disease xSpec entry earliest

  6. I will work on consideration of how tiered consensus cohorts can be reindexed.
    – It will be a wonderful advance and publication to see if PheVal can be used to compare unanimity vs tiered consensus w/wo specialty.
    – so the immediate goal is to get to “Done” the unanimity cohort with appropriate conditions/commentary;
    – I see the next goal is to truly compare relative to each other unanimity vs tiered consensus using OMOP-CDM tools.
