
Phenotype Phebruary 2023 P8 Parkinson's disease

@allanwu as a follow-up to the discussion referenced above

Here is an executable R package that can be run as part of an OHDSI-studies network study: https://github.com/ohdsi-studies/PhenotypeParkinsonsDisease

I have executed this package on a few data sources, and the output is available below. I think the next step is to make some initial inferences about its performance characteristics. You may also use the output to check for logical errors in the cohort definition (i.e., if something looks unexpected, there was probably an error somewhere in our Atlas definition design).

https://data.ohdsi.org/PhenotypePhebruary2023_P8_ParkinsonsDisease/

As you described above, this only has the unanimity-algorithm-based cohort definitions. Tiered cohort definitions still need to be developed. I understand the main operating characteristics of interest are specificity and positive predictive value. So I think it's reasonable to look at the population-level summary characteristics in the tool and check for anything that looks unexpected and would be evidence that there are persons in the cohort who don't have the phenotype of interest. As we learn, it is reasonable to iterate and improve the definition, if needed.

This is an interesting design, and it is different from almost all definitions I have worked on before. The reason: it is indexed on the last encounter (index date) and applies rules that improve specificity to the time period prior to the index date.

Looking forward to learning with you.


Thanks @allanwu for driving this conversation, and thanks also @Gowtham_Rao for implementing the discussion and providing summary data for the community to review and learn from together.

One thing that I was wondering about in the cohort definition implementation: it appears these cohort definitions are indexing off the latest of the parkinsonism concepts and then looking back to see if persons have 2+ PD codes and no secondary or neurodegenerative codes. But the consequence of this definition is that a person's cohort_start_date will be the latest point they were observed. Wouldn't we consider a person to have Parkinson's at the first date of diagnosis, so long as they satisfy the same criteria (2+ codes >30d apart and no secondary or neurodegenerative codes before or after)? Or is it specifically the case that once a secondary or neurodegenerative diagnosis is recorded, a person is reclassified out of the PD cohort (which means you'd have to 'look forward' from the initial PD diagnosis date to rule out future reclassification)?

I created a new cohort definition (JSON attached here) that tries to 'reindex' based on when the disease was first diagnosed. Interestingly, I couldn't get the exact same patient count as the original cohort (it was very close, but not exactly the same), and it's because there's a small proportion (<0.1%) who have a secondary or neurodegenerative concept AFTER the latest PD code.
PD reindex.txt (14.7 KB)

Addressing @Patrick_Ryan's post: the issue is that PD is not diagnosed at a single point, either early or late. Patients are diagnosed with various parkinsonisms nonspecifically over several visits b/c the presentation is too subtle or the provider remains unsure of the diagnosis. Later, it is often reasonably coded/diagnosed as PD (even though it is actually another neurodegenerative parkinsonism). Then later visits most accurately diagnose the person with either PD (if the diagnosis does not change for many years) or with a PSP/CBD/MSA parkinsonism. So the most accurate diagnosis is usually considered the most recent "set" of most consistent diagnoses (usually from neurologists), b/c primary care docs often just keep copying forward the erroneous PD diagnosis code even after the neurologist has diagnosed something different.

This is supported by the note that the <0.1% proportion have a secondary/neurodegenerative concept AFTER the last PD code.

We are intensely interested in the earliest entry/incidence date in our cohort, but classifying whether the person actually has PD or not requires looking at the last several visits (and we suspect specialty matters, but that's to be tested). The literature suggests that once you figure out someone has PD later, the earliest "broad parkinsonism" concept counts as the start of the cohort, which is how Atlas seems to work. I do lose this in the way @Gowtham_Rao and I constructed the unanimity algorithm.

I tried to copy/paste the JSON in the post into Atlas-demo and I didn't see a difference in the indexing of cohort start. See cohort 1781774:

Thank you @Gowtham_Rao for sharing the initial characteristics across 12 databases. In my first pass, I have a few observations. I hope the team here does not mind my feedback on use of the tool mixed in here.

  1. Comparing the cohorts without and with med criteria allows us to see how much variability there is between databases in defining PD with or without a med criterion. I computed the % dropoff in counts from PD without a med criterion vs PD with a med criterion.
    These vary from 8% to 79% across the 12 databases, likely reflecting how robustly (or not) each database represents medications. The overall dropoff is 33% (from 839,007 counts w/o meds to 561,482 w/ meds). Interestingly, our literature paper showed a dropoff from 387 to 368 (4.9%) - lower than any of our OMOP databases.
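For reference, the dropoff percentage here is simple arithmetic; a one-line R helper (my own function name) reproduces the numbers quoted above:

```r
# Percent dropoff in cohort counts when the med criterion is added.
dropoff_pct <- function(n_without_meds, n_with_meds) {
  round(100 * (1 - n_with_meds / n_without_meds), 1)
}

dropoff_pct(839007, 561482)  # overall OMOP dropoff, ~33%
dropoff_pct(387, 368)        # literature paper, ~4.9%
```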

The orphan concept check picked up the need to ADD:
concept 4103534 (progressive supranuclear ophthalmoplegia) to the 'parkinsonism (non-PD, neurodegenerative) conditions' concept set.
@Gowtham_Rao can you make this change in the two unanimity packages?
I’ve updated it in Atlas-demo already.
Nice feature!

Couldn’t run incidence check: "Error: there is no package called ‘ggh4x’ "

In general, we have to be careful reading the help text, especially since these cohort definitions are based on the most recent observation rather than the earliest. For example: how do we interpret cohort start and end when the entry point is the "most recent" event?

Visit context is useful – most databases were almost entirely outpatient.
The claims and pharmacy ones were much more heterogeneous, which likely accounts for some variation.
(It would be nice to see the specialties associated with visits - but I realize this is not yet standard in OMOP-CDM.)

The cohort overlap function did not create its table (good graphs, though).

Most cohorts had expected characteristics – male predominance.

  • the ages appear high, but likely b/c our index event is "most recent"
  • associated events make sense - depression and dementia make expected appearances; sleep disorders interestingly did not show up much at all - REM Behavior Disorder is something I was looking for, and it is a prodromal syndrome in PD/parkinsonisms
  • it would be nice in the pretty version to be able to filter in/out classes of concepts. I was looking to focus on neuropsychiatric symptom/syndrome conditions and exclude some of the general medical geriatric conditions (hypertension, renal dz, etc.)

Two databases show unusual characteristics: younger and F>M

  • these seem to have higher rates of secondary parkinsonism concepts than the others (the truven_mdcd and jmdc databases)
  • the fact that these show up in our unanimity cohort suggests that either the source EHR/ETL maps 'vague parkinsonism' to ICD10 G20 → which becomes PD, OR we are missing a secondary parkinsonism in our exclusion criteria somehow
  • to figure this out, I would want to look more closely at the associated condition/concept characteristics of the younger, female cohorts in these 2 databases

Proposed Atlas PD tiered consensus algorithm.
@Gowtham_Rao if time, can we have this logic reviewed?
Will be a great comparison to the unanimity algorithm.
Atlas-demo did not have ability for me to populate Specialty – so those are blank but intended for “neurology” specialty.

I think I have worked out the logic for adapting the tiered consensus algorithm to Atlas.
Basically, because Atlas and OMOP cohorts work by successively applying criteria that narrow counts, they struggle with logic that involves multiple sequential criteria (i.e., specialty and most recent dx).
So I set up 6 Atlas cohorts, each of which is designed to be a non-overlapping, distinct subcohort. By assembling the output from all 6, we achieve the counts needed for this algorithm.

Identify non-overlapping buckets. Each will be its own Atlas cohort.
All have two PD conditions in the last 3 years (no restrictions by visit type/specialty).
There look to be 6 buckets - each a subset of persons who would fulfill the criteria:

  • neurologist/PD in last year
  • neurologist/PD 1-2 years ago (not last year)
  • neurologist/PD 2-3 years ago (not last 2 years)
  • non-neuro/PD dx in last year (never saw neurologist in 3 years)
  • non-neuro/PD dx 1-2 years ago (not last year)
  • non-neuro/PD dx 2-3 years ago (not last 2 years)

neuroPD1year cohort - 1781732
Has seen neurologist in last 3 years (redundant do not implement)
Has PD condition by neurologist in last 3 years (redundant do not implement)
Has PD condition by neurologist in the last year
Has No exclusion conditions by neurologist in last year

neuroPD2years: cohort 1781779
Has seen neurologist in last 3 years (redundant)
Has PD condition by neurologist in last 3 years (redundant)
Has PD condition by neurologist in 1-2 years ago and NOT in last year
Has No exclusion conditions by neurologist visits in last 2 years

neuroPD3years: cohort 1781780
Has seen neurologist in last 3 years (redundant)
Has PD condition by neurologist in last 3 years (redundant)
Has PD condition by neurologist in 2-3 years ago and NOT in last 2 years
Has No exclusion conditions by neurologist visits in last 3 years

neuro-noPD (not a cohort - here for logic completion)
Has seen neurologist in last 3 years
Has NO PD condition by neurologist in last 3 years
Does not fulfill criteria for cohort
Can infer counts for this based on above 3 cohort exploration

nonneuroPD1year: cohort 4 of 6: 1781781
Has not seen neurologist in last 3 years
Has PD condition by visit in the last 1 year
Has No exclusion conditions in last year

nonneuroPD2year: cohort 5 of 6: 1781782
Has not seen neurologist in last 3 years
Has PD condition by visit 1-2 years ago and NOT in the last 1 year
Has No exclusion conditions in last 2 years

nonneuroPD3year: cohort 6 of 6, cohort 1781783
Has not seen neurologist in last 3 years
Has PD condition by visit 2-3 years ago and NOT in the last 2 years
Has No exclusion conditions in last 3 years

nonneuro-noPD (not a cohort; here for logic completion)
Has not seen neurologist in last 3 years
Has no PD condition with a visit in last 3 years
Can infer counts from above 3 cohorts
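To check that the six buckets above are mutually exclusive and exhaustive, the assignment logic can be sketched in R (illustrative only; the function, arguments, and year coding are all my own, not part of any OHDSI package):

```r
# Illustrative sketch: assign a person to exactly one of the six buckets,
# or to a "no cohort" group. Years are coded 1 = last year, 2 = 1-2 years
# ago, 3 = 2-3 years ago; each *_pd_years vector lists the years in which
# the person had a qualifying PD dx.
assign_bucket <- function(saw_neurologist, neuro_pd_years, nonneuro_pd_years) {
  if (saw_neurologist) {
    if (length(neuro_pd_years) == 0) return("neuro-noPD (not a cohort)")
    # the most recent year with a neurologist PD dx decides the tier
    paste0("neuroPD", min(neuro_pd_years), "year")
  } else {
    if (length(nonneuro_pd_years) == 0) return("nonneuro-noPD (not a cohort)")
    paste0("nonneuroPD", min(nonneuro_pd_years), "year")
  }
}
```

Because each person lands in exactly one branch, the six cohorts cannot overlap, and summing their counts gives the total needed for the algorithm.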

Great conversation.

1. Regarding re-indexing: @Patrick_Ryan - we discussed this in the video recording (see link). @allanwu made the point that the operating characteristics he is looking for in the current iteration are specificity and PPV; index date misspecification error is acceptable at this time. Also, the intent was to replicate a published algorithm, which indexed on the latest event. In the group discussion, it was decided to explore reindexing after achieving these goals.

2. Regarding the reindexed cohort definition developed by @Patrick_Ryan: btw @allanwu, I fixed this

3. Regarding comparing the tiered definition with the unanimity algorithm:

This is purely technical - we cannot execute a cohort definition when the cohort definition is incomplete (i.e., missing provider specialty). In other words, I cannot rebuild the cohort definitions without this issue being fixed.

So I suggest we focus on this topic first.

4. Regarding bugs in the software at data.ohdsi.org/PhenotypeLibrary: the issue appears to be in the server infrastructure running the OHDSI tool. The error in the incidence rate plot (tagging @lee_evans and @jpegilbert) has been previously reported, but I think there were some challenges in fixing it.

@fabkury - I can't seem to find your post now, which I saw this morning, where you added a Phea SQL tweak to a placeholder in Atlas-demo for the tiered consensus.

Two points to make on the logic as I recall it. You are counting the # of PD and non-PD (neurodegenerative) parkinsonisms in each visit and only considering visits where the # of PD conditions is greater than the other parkinsonism exclusions.

  1. The tiered consensus criterion takes all visits in the 3-year time frame and counts all PD conditions and (non-PD parkinsonism exclusions) in total across all visits. Then the tiered consensus algorithm/function outputs "PD" if the PD conditions outnumber the (non-PD parkinsonism exclusions).
  2. The (non-PD parkinsonism exclusions) should be the combination of (non-PD neurodegenerative parkinsonisms) and (secondary parkinsonisms) concept sets.
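To make points 1 and 2 concrete, here is a tiny R sketch (illustrative only; the function and group names are mine, not from the Phea package):

```r
# Illustrative sketch of the tiered-consensus comparison: pool ALL
# condition occurrences in the 3-year window, across visits (point 1),
# then output "PD" only if PD codes outnumber the combined exclusion
# groups (point 2). `conditions` has one element per condition
# occurrence: "pd", "nonpd_neurodegen", or "secondary".
tiered_consensus <- function(conditions) {
  n_pd   <- sum(conditions == "pd")
  n_excl <- sum(conditions %in% c("nonpd_neurodegen", "secondary"))
  if (n_pd > n_excl) "PD" else "not PD"
}
```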

@allanwu, yes on both items 1 and 2. Thanks for revising my message.

Here is a summary of the "Phea approach" to the tiered consensus algorithm for Parkinson's Disease (adapted from Szumski and Cheng 2009; see @allanwu's message above).

What is the tiered PD phenotype, and what’s the matter with it?

The tiered consensus algorithm phenotype requires PD to be coded more frequently than competing diagnoses, within the past 3 years. The competing diagnoses are non-PD parkinsonism and secondary parkinsonism.

The problem is that Atlas can’t do such a “most frequent diagnosis among 2 groups in the past 3 years” calculation. In inclusion 4 above, @allanwu tried to approach that logic in Atlas’s Cohort Definitions tool by scanning the prior 3 years one year at a time.


Goal: create a phenotype that correctly applies the tiered logic ("most frequent among 2 groups" logic). Do that by injecting Phea-generated SQL into an OHDSI-compatible cohort definition (cohortDefinitionSet).

Whether the visit is a neurology visit or not was not considered. This aspect of the logic was ignored because we thought it would be too difficult and unreliable to map "neurology visit" across sites in a network study.

Logic to be computed

Tiered diagnosis criteria: Parkinson’s Disease is more frequent than competing diagnoses (non-PD and secondary parkinsonism).

Query logic:

  • At every visit occurrence, look back and count the number of occurrences of the two groups of conditions, PD and non-PD.
  • Eliminate the visits where the non-PD count is bigger than the PD count.
  • The patient meets the criteria if they have at least one of those "special" visits that remain.
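The query logic can be sketched in R on toy data frames (illustrative only; the real implementation is Phea-generated SQL with window calculations, and the column names here are my own):

```r
# Illustrative sketch of the per-visit look-back.
# visits: person_id, visit_date (integer days); conds: person_id,
# cond_date, group, with group in c("pd", "nonpd"). A person qualifies
# if at least one visit's 3-year look-back keeps PD count >= non-PD
# count. The n_pd > 0 guard is my own addition, since the surrounding
# cohort definition already requires PD codes before index.
meets_criteria <- function(person_id, visits, conds) {
  dates <- visits$visit_date[visits$person_id == person_id]
  pc <- conds[conds$person_id == person_id, ]
  any(vapply(dates, function(d) {
    in_window <- pc$cond_date > d - 3 * 365 & pc$cond_date <= d
    n_pd    <- sum(in_window & pc$group == "pd")
    n_nonpd <- sum(in_window & pc$group == "nonpd")
    n_pd > 0 && n_nonpd <= n_pd   # keep visit unless non-PD outnumbers PD
  }, logical(1)))
}
```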

Alternative logic (not used): Use most recent diagnosis, instead of most popular.

The hard part isn't producing a SQL query that captures the above logic (and you have Phea to write that query for you). The hard part is correctly plugging that "special SQL" code into the OHDSI ecosystem, i.e. into an OHDSI study package.


  1. Phea SQL needs to be compatible with the local SQL flavor. (not yet addressed, Postgres was used)
    Possible solution A: have the study package generate Phea SQL locally.

  2. Phea SQL needs to be compatible with CohortGenerator / OHDSI-SQL.
    Possible solution A: replace a placeholder criterion with Phea SQL. (this was the approach taken)
    Possible solution B: have Phea insert rows into @target_database_schema.@target_cohort_table directly, without going through CohortGenerator.

  3. I can't test the code post-Phea, because neither Synthea nor Eunomia has any Parkinson's disease diagnosis code.
    Potential solution: For testing purposes, surrogate conditions could be used. (this was not done)

How the “SQL replacement approach” works

  1. Manually copy the cohort definition created by the group (https://data.ohdsi.org/PhenotypePhebruary2023_P8_ParkinsonsDisease/) into a new one (ATLAS).

  2. Manually add an extra criterion in that copy: has visit where PD Dx is the most frequent.

(subsequent steps are done by R code, see TXT file attached)

  1. Download the cohort definition using ROhdsiWebApi::exportCohortDefinitionSet().

  2. Read the SQL file that was downloaded. Replace the extra criterion with the Phea SQL.
    a. Before this substitution, adapt Phea’s code to be compatible with OHDSI-SQL:
    i. Replace schema references with OHDSI aliases (e.g. @cdm_database_schema).
    ii. Retrieve code sets from the temporary table #Codesets.

  3. Generate a new study package using the modified SQL file. (did I do this correctly? I am not sure)
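Step 2 of the replacement (swapping the extra criterion for the Phea SQL, after adapting it to OHDSI-SQL) is essentially a string substitution on the exported SQL. A minimal R sketch under assumptions: the placeholder marker, function name, and schema-rewrite rule are hypothetical stand-ins, not the actual code in the attached TXT file:

```r
# Illustrative sketch of the "SQL replacement" step. The marker string
# and the cdm-schema rewrite are hypothetical stand-ins for the real
# placeholder criterion and the Phea-generated query.
inject_phea_sql <- function(cohort_sql, phea_sql,
                            marker = "-- PHEA_PLACEHOLDER") {
  if (!grepl(marker, cohort_sql, fixed = TRUE)) {
    stop("placeholder criterion not found in cohort SQL")
  }
  # Adapt Phea's SQL to OHDSI-SQL conventions before substitution:
  # schema references become @cdm_database_schema (code sets would
  # similarly be retargeted to the temporary #Codesets table).
  phea_sql <- gsub("cdm.", "@cdm_database_schema.", phea_sql, fixed = TRUE)
  sub(marker, phea_sql, cohort_sql, fixed = TRUE)
}
```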

The modified study package is in MS Teams:

Comments and takeaways

  1. Local generation of Phea SQL (at data holder’s computer) is complicated by:
    a. Phea needs to read the column names to build the SQL query. But Phea doesn’t know the details about the local CDM like SqlRender does. Therefore, OHDSI-SQL aliases need to be resolved into real table names and provided to Phea.

  2. Probably can’t edit the cohort definition after the SQL replacement. It is likely that you will lose Phea’s SQL code when the cohort generation SQL gets re-generated.

  3. Alternative approach: Use Phea to directly insert rows into @target_database_schema.@target_cohort_table, with or without going through CohortGenerator.

  4. Long term best solution: Develop Phea all the way into a HADES-integrated R package.

Files in MS Teams’ Phenotyping Development Workgroup team

  1. Edited cohort definition: cohort 1781786 Phea.zip
  2. Edited study package: PhenotypeParkinsonsDisease-phea.zip

R code that generates the Phea SQL and performs the “SQL replacement” pipeline

parkinsons.txt (8.2 KB)

Final words

If anyone trusts me enough to download PhenotypeParkinsonsDisease-phea.zip from MS Teams and run the study on your local CDM instance, that would be nice! Take a look at file 310.sql if you want to see the edits that were applied (search for keyword “phea” in the file). But I admit the chance that it will work is very small. That is because, while I could test the query logic per se in my local Postgres server, I don’t have a fully-featured local OHDSI environment for running the study package. Moreover, the code was generated for PostgreSQL – maybe it happens to work in other SQL flavors as well, maybe not.


Update on P8 Parkinson’s Disease.
@Gowtham_Rao what are next steps for this Phenotype?
Here's what I think the current status and next steps on this roadmap are:

  1. Two proposed flavors of phenotypes: unanimity and tiered consensus.
  • last week, we created two initial unanimity phenotypes (without and with meds)
  • more about tiered consensus below
  2. Unanimity - proof of concept run results:
  • initial results posted above by @Gowtham_Rao
  • I have not seen the Atlas attrition tables for 309 and 310
  • review of Cohort Diagnostics suggests the following:
  • need to add the concept Progressive Supranuclear Ophthalmoplegia to the "parkinsonism (neurodegenerative, nonPD) conditions" concept sets. I have updated this in Atlas-demo concept set 1872601 (2/13/23); not yet updated in atlas-phenotype.ohdsi.org
  • more details about how the phenotype behaves in two datasets require attention: jmdc and truven_mdcd, in that the phenotype does not result in the expected male, older-predominance demographics; there seems to be more secondary parkinsonism in both datasets overall, but we still have to figure out why. From a phenotype point of view, I would like to identify whether I am missing possible secondary parkinsonism concepts that should be excluded in the last 3 years; one way to partially address this is to create a unanimity cohort that excludes ALL secondary parkinsonisms for all time (not just 3 years) and see if that results in the expected demographics in these datasets. Another option worth exploring is whether we should limit visit types to office visits (as the original phenotype paper did).
  3. Would benefit from guidance on next steps: how to address the above issues, and what additional analyses or cohorts to consider to help refine this phenotype.

For tiered consensus criteria

  1. I agree that specialty seems to be challenging for P8 for now.
  • Movement disorder specialty is not practical (once we have our own OMOP-CDM instance, we can post how much difference this might make)
  • there remains value in including neurologist visits if possible. Again, in published algorithms for detecting PD with specificity, the tiered consensus performs with higher sensitivity, specificity and PPV compared to the unanimity algorithm.
  2. I believe there is still value in assessing a tiered consensus criterion
  • even without a visit specialty criterion, a tiered criterion will allow a comparison between a full 3-year exclusion (unanimity) vs a tiered non-specialty consensus criterion.
    There are likely 3 approaches to take to the tiered consensus criteria:
  • (1) validate/review the 6 atlas-demo cohort definitions as proposed by @allanwu on 2/8/23 and run those
  • (2) consider adapting those 6 atlas-demo cohort definitions to remove the specialty criteria and just focus on the effect of the tiers year-by-year. Suggested options for these cohorts: "tiered consensus wo meds" (current), "tiered consensus w meds", "tiered non-specialty consensus wo meds", "tiered non-specialty consensus w meds" (only the first is available in atlas-demo currently).
  • (3) I remain unsure how scalable the Phea approach is for a network study

@fabkury I reviewed the Phea package and I think I see what you are trying to do. At each time point (essentially each visit), you identify whether that particular visit is included in the cohort or not. I would add that if a particular visit is NOT included in the cohort, then at that point the person would be considered to drop out of the cohort. This raises very interesting possibilities for tracking the evolution of patients.
However, for the basic P8 implementation of the tiered consensus criteria (understanding you did not implement specialty yet), one would merely need to apply the SQL/Phea approach to the entry point (most recent) visit, look back 3 years, and either include that person in the cohort or not.

And for any of the above cohorts, I remain interested in how, in the future, we would be able to infer incidence (earliest entry point). I didn't quite follow @Patrick_Ryan's post on this.

Thank you @allanwu

atlas-demo.ohdsi.org has the following definitions - I have taken a snapshot of it.

Regarding unanimity definitions:
Regarding output and shiny: I will update the shiny app output to reference the updated definitions in atlas-demo.ohdsi.org; this should have the attrition output. I.e., let's continue the iteration on atlas-demo.ohdsi.org and only move the mature definitions to atlas-phenotype.ohdsi.org at a future stage. This strategy will allow us to iterate and update quickly.

Regarding observations in the tiered definitions and how to address them: I suggest making a list of observations and framing them as sensitivity or specificity errors. You may also consider saying why you think they may be errors. This would help those of us who are not Parkinson's experts provide ideas and collaborate on solving them. Examples are:

  • variability in male and older preponderance by data source (e.g., possible sensitivity errors: younger individuals are less likely to have Parkinson's disease, and mean age is lower than expected)
  • higher secondary parkinsonism (source of specificity errors?)

Regarding Tiered definitions: To pursue the tiered definitions, we need some time and it may well be atlas only, atlas + Phea or Phea only. But there are many outstanding issues in the atlas based definitions. Of note:

  • Blanks for provider specialty: @Chris_Knoll shared with me that it is still a valid cohort definition and it will execute - but will make it more explicit
  • There may be some logical errors in some of the definitions.

For next steps - for the tiered definitions specifically - I think we should do a targeted re-review of the implemented cohort definitions to make sure they are complete and have correctly transformed the logic.

I am attempting to get an output on https://data.ohdsi.org/PhenotypePhebruary2023_P8_ParkinsonsDisease/ later today with updated output for at least the unanimity definitions (it will have output for the tiered definitions, but I think we should ignore those for now because of possible errors). I will update this thread when ready.


‘valid’ has mixed meanings here. For purposes of SQL generation, it’s ‘valid’ to leave it blank and we’ll ignore it and generate a query. But is a blank provider ‘valid’ for this phenotype?

This design is very promising, @fabkury. There may be some potential errors in the current tiered Atlas cohort definition (not limited to the missing provider specialty information). Once those are addressed, I can attempt to use the Phea SQL.


@Gowtham_Rao and group,
I have made no changes in the unanimity phenotypes in Atlas-demo.
Since last post, I have updated Atlas-demo definitions for tiered consensus (2/13/23 10-11pm CST).
I was able to import the desired neurology specialties into the specialty criteria, so I am comfortable with the content of the specialty neurology criteria (within CDM standard vocabularies). I realize not all OMOP-CDM instances have similar implementations of specialty, so I have created two flavors of the tiered consensus criteria below.
I believe these are ready for logic/technical review.

I have updated 6 phenotypes of the tiered consensus algorithm that include the neurology criteria; these are tagged with [Pheb2023][ucepd]:
Persons with PD tiered consensus w specialty [no]neuro#year (6 definitions)

  • 3 definitions are for persons who have neurologist visits in last 3 years (for each of the 3 years)
  • 3 definitions are for persons who do not have neurologist visits in last 3 years

I have created 3 further tiered consensus definitions - ignoring the specialty criteria
Persons with PD tiered consensus wo specialty #year (3 definitions)

  • 3 definitions that ignore the specialty criteria but do evaluate PD definitions vs exclusions for each of the last 3 years.

The rationale is to evaluate the increase in PPV, sensitivity and specificity for detecting PD (as opposed to similar parkinsonisms and not-PD) in 3 sets of phenotypes:

  • unanimity definition w and wo meds - good PPV/specificity; lowest sensitivity of all
  • tiered consensus wo specialty - better PPV, higher sensitivity; specificity may be better or worse given the allowance for some exclusions 2 or 3 years prior to the current year
  • tiered consensus w specialty - anticipate better PPV, sensitivity and specificity - testing the assumption that weighting neuro specialty diagnoses provides added value in these EHR cohorts. But this likely works only for OMOP-CDM instances that have reliable specialty coding.

atlas-demo phenotypes:
1781748 unanimity wo meds
1781760 unanimity w meds
1781814 tiered consensus wo specialty last 1 year (1 of 3)
1781815 tiered consensus wo specialty 2 years ago (2 of 3)
1781816 tiered consensus wo specialty 3 years ago (3 of 3)
1781732 tiered consensus w specialty PD neuro last 1 year (1 of 6)
1781809 tiered consensus w specialty PD neuro last 2 years ago (2 of 6)
1781810 tiered consensus w specialty PD neuro last 3 years ago (3 of 6)
1781811 tiered consensus w specialty PD no-neuro last 1 year (4 of 6)
1781812 tiered consensus w specialty PD no-neuro last 2 years ago (5 of 6)
1781813 tiered consensus w specialty PD no-neuro last 3 years ago (6 of 6)

@aostropolets for vocabulary landscape assessment. I have struggled with provider specialty vocabulary.

@allanwu do you know how 'movement specialists' are documented in your data sources, e.g. is it designated with a specialist code in your health care system? I do not think it is an American Board of Medical Specialties certified specialty (or is it?). Is this a physician specialty or a non-physician specialty?

As an FYI, OMOP vocabulary requests are handled here, I believe: Issues · OHDSI/Vocabulary-v5.0 · GitHub. If you show them there is an authoritative vocabulary that is generally accepted, and licensure allows them to import it into OMOP, they usually do.

Current physician specialty list is here and more general provider specialty is here

As @Patrick_Ryan said here, the community does not have much experience using Provider Specialty, mostly because most data sources do not have this information in a reliable way - but if and when available, both the CDM and the tools can support it.

Also discussed here

Hi @allanwu - the shiny application should have the most recent definitions here https://data.ohdsi.org/PhenotypePhebruary2023_P8_ParkinsonsDisease/

Some suggested topics to discuss on today's call at 12pm EST (meeting invite - everyone welcome to attend):

  • status of local omop instance
  • unanimity definition to completion (is it complete? e.g. concepts complete, rules accurate, orphans included, do we need unanimity-specific subgroups e.g. the jmdc and truven_mdcd issue, review attrition, evaluation using the framing of sensitivity, specificity, and index date misspecification errors)
  • tiered consensus definition using atlas only (review the 9 definitions' output. Note: I extracted all definitions in atlas-demo, so it has the ones not part of these 9)

Here are some notes from the evaluation we did today to the unanimity definition (Persons with Parkinson’s disease unanimity- c1781748 ):

A. Attrition/ Effect of inclusion criteria:

  • "has at least 1 PD specific code" criterion has no effect - suggesting that all the data is loaded on the "specific code" and the broader codes are not utilized.
  • "has 2 encounters with PD code that are at least 30 days apart" has no effect - that was due to a bug/error in the cohort definition, which is now corrected.
  • "has no secondary parkinsonism conditions" and "has no (non-PD) neurodegenerative parkinsonism conditions" criteria lead to a loss of 0.4-7% each, which is in line with expectations - it is estimated that 5% of Parkinson's patients have secondary parkinsonism or (non-PD) neurodegenerative parkinsonism conditions.

B. Index event breakdown:

  • In US data sources, "Parkinson's disease-G20" accounts for 34% (in MDCD) to 83% (in Optum EHR) of persons on index, and Paralysis agitans accounts for 19-70%. In JMDC, Germany, and France, 100% have "Parkinson disease" on index and no other code.

C. Time distribution:

  • on average, data sources had a median time of 3 years before index. This is reassuring, since all inclusion criteria are based on the 3 years prior to index.

D. cohort characterization:

  • In most data sources, patients were of older age and more likely to be male - which is in line with the known trends of Parkinson's. However, JMDC and MDCD had higher proportions of younger age groups and more females compared to other data sources and compared to expectations.
  • The age and gender distribution in JMDC and MDCD may be an indication of a specificity error. Specifically, Parkinson's disease among the younger/female group is unlikely to be idiopathic Parkinson's Disease but may be related to drug-induced parkinsonism.
  • In JMDC around 50% of the cohort have schizophrenia compared to only 2.1% in Optum DOD. Also, a much higher proportion of the cohort in JMDC are on olanzapine (15%) and risperidone (20%) compared to Optum DOD (1.2% and 4.0% respectively). Antipsychotic drugs such as risperidone and olanzapine are known to be associated with parkinsonism (drug-induced parkinsonism). The high prevalence of schizophrenia and its treatments in JMDC may suggest that a proportion of the cases are drug-induced parkinsonism, representing false positive cases.

Thank you for the great notes, @Azza_Shoaibi

The current goal is to confirm the technical and clinical logic of the unanimity PD cohort definitions.
Once the initial evaluation of the logic is done (12 databases were available in the discussion above), we can run this as a network study and fully evaluate the definition, with potential sources of error acceptable or understood.

Revision of logic of unanimity PD cohort definitions based on discussion.
All work is in atlas-demo site.

  1. New "Parkinson's Disease [ucepd][Pheb2023]" concept set that includes the PD condition and its children, with appropriate exclusions, to make sure all standard conditions that would include PD are captured. ID 1869664.

  2. Created "Medications associated with parkinsonism [ucepd][Pheb2023]" concept set. It includes both Classification and Ingredient standard concepts. It also includes several exclusions that are acceptable in a cohort definition of PD (as described in the clinical description). ID 1872667.

  3. Updated/created 4 versions of unanimity PD cohort definition.

  • fixed the "2 PD conditions separated by 30 days" logic using nested criteria and "30 days before and 1 day before index start date" (not 0 days)
  • used the new PD concept set for defining PD conditions
  • all tagged with [ucepd][Pheb2023] for findability
    1781748 Persons with PD unanimity (no med criteria)
    1781760 Persons with PD unanimity w PD med criteria (include PD meds)
    1781843 Persons with PD unanimity wo confounding meds (meds that cause parkinsonism)
    1781844 Persons with PD unanimity w PD med and wo confounding meds

The 1781843 is designed to address limitations in the JMDC and MDCD datasets; we add an exclusion if the person was exposed, in the last 3 years, to a drug that causes drug-induced parkinsonism. This would exclude those who are coded with PD (but are suspected to be more likely to have drug-induced parkinsonism) if they are treated with meds that treat schizophrenia (and can cause parkinsonism). This should increase specificity overall when including JMDC and MDCD, at some acceptable loss of sensitivity/PPV.
1781844 then adds further specificity by including both med criteria (PD meds in support, and parkinsonism-inducing meds to exclude), with further loss of some sensitivity/PPV.
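The 1781843 exclusion can be sketched in R on toy data (illustrative only; the column names are mine, and the real definition is implemented as Atlas criteria, not R):

```r
# Illustrative sketch of the confounding-med exclusion: drop persons with
# any exposure to a parkinsonism-inducing drug in the 3 years (~1095 days)
# before their index date.
exclude_confounded <- function(cohort, exposures, inducing_drug_ids) {
  keep <- vapply(seq_len(nrow(cohort)), function(i) {
    p <- cohort$person_id[i]
    idx <- cohort$index_date[i]
    e <- exposures[exposures$person_id == p &
                   exposures$drug_id %in% inducing_drug_ids, ]
    # keep the person only if no qualifying exposure falls in the window
    !any(e$exposure_date >= idx - 1095 & e$exposure_date <= idx)
  }, logical(1))
  cohort[keep, ]
}
```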

I believe these 4 unanimity cohorts are ok from clinical logic.
@Gowtham_Rao I believe these can be reviewed for technical logic.

I will continue to review/refine the tiered consensus cohort definitions w/wo specialty criteria and post when I feel those are also aligned with current comments.


See video recording here Meeting in General-20230215_120115-Meeting Recording.mp4

@CraigSachson could you please post it to the Phenotype Phebruary 2023 homepage when possible: Phenotype Phebruary 2023 – OHDSI

@allanwu The shiny app should be updated based on the changes related to today's discussion