
Requirements Development for the OHDSI Gold Standard Phenotype Library

Glad to see the topic heating up! @schuemie, due to holidays, @apotvien and I will not be presenting at the community meeting until January. But this topic is clearly a high priority. I like @pavgra’s thinking around the digital assets more broadly, but I would agree with @schuemie that I’m not totally sure whether Athena should be home for the actual assets.

For the moment, @apotvien and I have posted a few demonstration entries on the OHDSI Wiki for people to review and provide feedback on what contents should be included. Find the links to the main page and a sample page here. The idea is that the actual content lives on GitHub or Atlas, while the wiki contains the metadata.

The larger issue is that we’ve got to get all of these efforts pulled together! @pavgra, are you available to join our WG call at 10am ET this Weds? It would be great to have anyone interested in this topic join the discussion, if possible.

Thanks!

Jon

I have updated the invite on the page, thanks for the heads up @schuemie. Would also be great if @Ajit_Londhe @SCYou, @Juan_Banda, @jswerdel, @Nataly_Patino and others who are interested in the topic are able to join.

@jon_duke, after speaking with members of our working group, we decided to cancel the Atlas/WebAPI working group for 11/14 so that we can attend this discussion as well. Looking forward to it!

Is the call today, November 13th, or Wednesday, November 14th? I am a bit confused about this. Thanks!

Updated, sorry. Weds 11/14.

Just adding to the frenzy of discussions on this topic, here are my thoughts (that nobody asked for :wink:) on the requirements for the Phenotype Library. This is mainly inspired by @SCYou’s recent work on creating standard phenotypes for cardiovascular disease:

Definitions
Types of phenotype definitions to support:

  • Rule-based phenotypes
  • Computational (probabilistic) phenotypes

We should be able to have multiple definitions per phenotype (e.g. ‘stroke broad’, ‘stroke narrow’).

Phenotype definitions should be clearly versioned.

Operating characteristics
For each definition we need to know its operating characteristics (sensitivity, specificity, PPV, NPV).

There are multiple ways to compute these characteristics, e.g.:

  • Manual chart review
  • Joel’s algorithm

Note: there’s even a need to compute operating characteristics per subgroup (e.g. within exposure groups) to quantify differential misclassification. Joel’s algorithm can do that (we tried), but I’m not sure if and where we need to store it in the Phenotype Library.
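To make the four operating characteristics concrete, here is a minimal Python sketch that derives them from confusion-matrix counts, overall and per subgroup. The names `ConfusionCounts` and `operating_characteristics` and all counts are hypothetical and chosen for illustration only; this is not Joel’s algorithm and not any existing OHDSI API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a definition's calls against a reference standard (hypothetical type)."""
    tp: int  # definition positive, truly a case
    fp: int  # definition positive, not a case
    fn: int  # definition negative, truly a case
    tn: int  # definition negative, not a case


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of dividing by zero when a margin is empty.
    return numerator / denominator if denominator else None


def operating_characteristics(c: ConfusionCounts) -> dict:
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


# Illustrative counts only, not results from any study.
overall = ConfusionCounts(tp=80, fp=20, fn=10, tn=890)
by_subgroup = {
    "exposed": ConfusionCounts(tp=50, fp=5, fn=5, tn=440),
    "unexposed": ConfusionCounts(tp=30, fp=15, fn=5, tn=450),
}

overall_metrics = operating_characteristics(overall)
subgroup_metrics = {
    group: operating_characteristics(counts) for group, counts in by_subgroup.items()
}
```

Comparing the entries of `subgroup_metrics` across exposure groups is one simple way to see whether misclassification is differential.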

Meta data
Each definition should have meta data, such as literature references, study references (which definition was used in which study), rationale of the definition, copy-paste descriptions of the definitions to use in the protocol and paper.

Maintenance
Easy to add and maintain definitions, and request evaluations across the OHDSI network.

Persistence + security: Some way to make sure definitions aren’t changed without notice by unauthorized persons.
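One possible way to detect silent changes, sketched in Python under the assumption that a definition is stored as JSON: record a SHA-256 fingerprint of a canonical serialization when the definition is published and compare against it later. The function names and the example fields are hypothetical. A fingerprint only detects changes; restricting who may publish would still need signing or access control.

```python
import hashlib
import json


def definition_fingerprint(definition: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order and spacing do not matter."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(definition: dict, recorded_fingerprint: str) -> bool:
    return definition_fingerprint(definition) == recorded_fingerprint


# Hypothetical definition content, for illustration only.
published = {"name": "stroke narrow", "version": "1.0.0", "conceptIds": [443454]}
recorded = definition_fingerprint(published)

# A later copy with an extra concept no longer matches the recorded fingerprint.
tampered = {"name": "stroke narrow", "version": "1.0.0", "conceptIds": [443454, 376713]}
print(is_unchanged(published, recorded), is_unchanged(tampered, recorded))
```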

I would like an API for getting things out. Ideally, I would want to be able to plug one or more definitions directly into my study package, so it can be executed at each study site.
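As a rough sketch of what such an API could look like from the study-package side, the Python below models a library that holds several versioned definitions per phenotype together with their metadata. Everything here (`PhenotypeDefinition`, `PhenotypeLibrary`, the field names) is hypothetical and only illustrates the requirements listed above; no such interface exists yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeDefinition:
    phenotype: str          # e.g. "stroke"
    name: str               # e.g. "stroke narrow"
    version: str            # e.g. "1.0.0"
    kind: str               # "rule-based" or "probabilistic"
    rationale: str = ""
    references: tuple = ()  # literature or study references


class PhenotypeLibrary:
    """In-memory stand-in for the library; entries are never overwritten."""

    def __init__(self):
        self._entries = {}

    def add(self, definition: PhenotypeDefinition) -> None:
        key = (definition.phenotype, definition.name, definition.version)
        if key in self._entries:
            # A published version is immutable; changes need a new version.
            raise ValueError(f"{key} already exists; publish a new version instead")
        self._entries[key] = definition

    def get(self, phenotype: str, name: str, version: str) -> PhenotypeDefinition:
        return self._entries[(phenotype, name, version)]

    def names_for(self, phenotype: str) -> list:
        # Distinct definition names for one phenotype, in insertion order.
        seen = []
        for (p, n, _version) in self._entries:
            if p == phenotype and n not in seen:
                seen.append(n)
        return seen


library = PhenotypeLibrary()
library.add(PhenotypeDefinition("stroke", "stroke broad", "1.0.0", "rule-based"))
library.add(PhenotypeDefinition("stroke", "stroke narrow", "1.0.0", "rule-based"))
print(library.names_for("stroke"))
```

Pinning a study package to an exact (phenotype, name, version) triple is what would let each site execute the same definition.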


My $.02. Digital signing, encryption options, integration with ORCID.

Does that exist somewhere, @jswerdel?

Right now it only exists locally. We’re working on putting the package together.


I’ve made a first draft of an ischemic stroke cohort in the public ATLAS.
I used the ancestor concept IDs 4043731 (infarction-precerebral) and 443454 (Cerebral infarction) to define stroke, and added specifiers: inpatient, primary or secondary diagnosis, or primary diagnosis only.

You can see the result of this cohort in our database.

I didn’t add ‘excluding migraine on the same day’ or ‘imaging study of brain’, because excluding migraine on the same day looks weird and contrived, and I don’t think the procedure vocabulary is fully standardized across the OHDSI network.

Do you have any comments on this, @schuemie @Rijnbeek @Christian_Reich @Patrick_Ryan?

I’ll start validating the PPV for each level of specifier using discharge summaries from the EHR. Could you join in validating this cohort in the ICD-9 system, @hripcsa @rchen?

@jswerdel, can you validate this cohort using PheValuator?


Meeting to continue the discussion on the gold standard library today at 10am ET.

Invite Here

Minutes from last meeting (11/14)

Trouble for some dialing in today. We are switching to WebEx.

OHDSI Gold Standard Library
Hosted by Jon Duke

Wednesday 10:00 am | 1 hour | (UTC-05:00) Eastern Time (US & Canada)
Occurs every 2 week(s) on Wednesday effective 11/28/2018 from 10:00 AM to 11:00 AM, (UTC-05:00) Eastern Time (US & Canada)
Meeting number: 739 830 648

https://gtriconf.webex.com/gtriconf/j.php?MTID=mb32e552e5b0fcffb74e2205d746cc63f

Join by video system
Dial 739830648@gtriconf.webex.com
You can also dial 173.243.2.68 and enter your meeting number.

Join by phone
1-240-454-0879 USA Toll
Access code: 739 830 648

I just saw the post that we are having meetings on this topic. I have added the WebEx invite to my Outlook calendar and will join next time.

When I look at concept 443454 (Cerebral Infarction) in Athena, the following shows up in the mapping:

When I select OMOP Stroke 1 I find that it also defines hemorrhagic stroke as follows:

I’m not sure whether this falls under that cerebral infarction definition or not (I’m still learning about the ATLAS mapping), but I would be concerned about using this definition, since ischemic stroke differs from hemorrhagic stroke. @SCYou, let me know if I’m understanding the mapping correctly. I am very interested in learning as much as I can.

@Nataly_Patino Fortunately, cerebral hemorrhage (376713) and subarachnoid hemorrhage (432923) are not included in the descendant concept IDs of cerebral infarction (443454).
I don’t know the exact meaning of ‘HOI contains SNOMED’. Could you help us @Christian_Reich @Dymshyts?
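For anyone who wants to repeat the descendant check, here is a minimal Python sketch that queries an OMOP-style `concept_ancestor` table through the standard-library `sqlite3` module. The in-memory table holds toy rows only: 9000001 is a made-up placeholder concept ID, and the helper `is_descendant` is hypothetical. Against a real CDM the same query would run on the actual vocabulary tables.

```python
import sqlite3


def is_descendant(conn, ancestor_id: int, concept_id: int) -> bool:
    """True if concept_id is listed under ancestor_id in concept_ancestor."""
    row = conn.execute(
        "SELECT 1 FROM concept_ancestor "
        "WHERE ancestor_concept_id = ? AND descendant_concept_id = ? LIMIT 1",
        (ancestor_id, concept_id),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept_ancestor ("
    "ancestor_concept_id INTEGER, descendant_concept_id INTEGER)"
)
# Toy rows for illustration: a concept is its own ancestor, plus one
# placeholder descendant (9000001 is NOT a real concept ID).
conn.executemany(
    "INSERT INTO concept_ancestor VALUES (?, ?)",
    [(443454, 443454), (443454, 9000001)],
)

for candidate in (376713, 432923, 9000001):
    print(candidate, is_descendant(conn, 443454, candidate))
```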

OMG. Forget the HOIs. These are old concepts from the OMOP experiment. Old as dirt. Will remove them from the hierarchy.

Hey @jon_duke, when is the next Phenotype meeting taking place?

@Nataly_Patino and I would like to attend.


Thanks for the love @Patrick_Ryan. It would normally be 12/28 but we are closed next week. So the following meeting is scheduled for 1/16, same bat time, same bat WebEx. Link


I released the process and results of the ischemic stroke phenotype validation to the Phenotype Library GitHub.

The cohort definitions are available in JSON and in SQL. I used OhdsiRTools to insert the cohort definitions from ATLAS.

The condition concept identifiers, included concept IDs, source codes in ICD-9-CM and ICD-10, and additional constraints (such as ‘first event only’ or ‘all events’, visit_concept_id, and condition_type_concept_id) are also summarized for each cohort here.

I made a very simple and primitive Shiny app to display the discharge note of each subject in the cohort and check its validity by manual chart review.

The positive predictive value (PPV) for ischemic stroke (inpatient or ED) with primary condition was 0.69. The PPV for ischemic stroke (inpatient or ED) with primary condition and first event was 0.81, which is summarized here. In conclusion, I suggest using the last cohort definition for ischemic stroke (inpatient or ED, primary condition, first event).

I hope this helps the community.

I’ll validate the hemorrhagic stroke cohort, too.


@SCYou:

Happy New Year. Couple questions:

  1. Nice Shiny app, but hasn’t Atlas built a validation module?
  2. Have you noticed in your validation cases where both ischemic and hemorrhagic strokes are coded simultaneously at a frequency that is too high for what would be expected?