Just adding to the frenzy of discussions on this topic, here are my thoughts (that nobody asked for) on the requirements for the Phenotype Library. This is mainly inspired by @SCYou’s recent work on creating standard phenotypes for cardiovascular disease:
Definitions
Types of phenotype definitions to support:
- Rule-based phenotypes
- Computational (probabilistic) phenotypes
We should be able to have multiple definitions per phenotype (e.g. ‘stroke broad’, ‘stroke narrow’).
Phenotype definitions should be clearly versioned.
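To make the points above concrete, here is a rough sketch of how definitions could be keyed by phenotype, definition name, and version. All class and field names are my own invention for illustration, not an existing OHDSI structure, and the `logic` field is only a placeholder for whatever the rule set or model turns out to be.

```python
from dataclasses import dataclass
from enum import Enum


class DefinitionType(Enum):
    RULE_BASED = "rule-based"
    PROBABILISTIC = "probabilistic"


@dataclass(frozen=True)
class PhenotypeDefinition:
    """One versioned definition of a phenotype (hypothetical structure)."""
    phenotype: str                   # e.g. "stroke"
    name: str                        # e.g. "stroke broad"
    version: str                     # hypothetical version label, e.g. "1.0.0"
    definition_type: DefinitionType
    logic: str                       # placeholder for the rules or model reference


class PhenotypeLibrary:
    """In-memory store keyed by (phenotype, name, version)."""

    def __init__(self):
        self._definitions = {}

    def add(self, definition):
        key = (definition.phenotype, definition.name, definition.version)
        if key in self._definitions:
            # A published version is never overwritten; changes need a new version.
            raise ValueError(f"{key} already exists; publish a new version instead")
        self._definitions[key] = definition

    def definitions_for(self, phenotype):
        """Return every definition (all names, all versions) of one phenotype."""
        return [d for d in self._definitions.values() if d.phenotype == phenotype]


library = PhenotypeLibrary()
library.add(PhenotypeDefinition(
    "stroke", "stroke broad", "1.0.0", DefinitionType.RULE_BASED, "placeholder"))
library.add(PhenotypeDefinition(
    "stroke", "stroke narrow", "1.0.0", DefinitionType.RULE_BASED, "placeholder"))
```

Refusing to overwrite an existing key is one simple way to keep versions meaningful: a study that cites ‘stroke narrow’ 1.0.0 always gets the same logic.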
Operating characteristics
For each definition we need to know its operating characteristics (sensitivity, specificity, PPV, NPV).
There are multiple ways to compute these characteristics, e.g.:
- Manual chart review
- Joel’s algorithm
Note: there’s even a need to compute operating characteristics per subgroup (e.g. within exposure groups) to quantify differential misclassification. Joel’s algorithm can do that (we tried), but I’m not sure whether and where we need to store these subgroup results in the Phenotype Library.
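For reference, this is how the four characteristics follow from a 2x2 table once some reference standard (e.g. chart review) is available, overall or per subgroup. It is only a sketch of the standard formulas, not Joel’s algorithm, and the function names and record layout are assumptions of mine.

```python
from collections import defaultdict


def operating_characteristics(tp, fp, fn, tn):
    """Standard 2x2-table measures; returns NaN when a denominator is zero."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def by_subgroup(records):
    """records: iterable of (subgroup, flagged_by_definition, truly_has_outcome).

    Returns the operating characteristics within each subgroup, so that
    differences between subgroups (differential misclassification) show up.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for subgroup, flagged, truth in records:
        if flagged and truth:
            cell = "tp"
        elif flagged and not truth:
            cell = "fp"
        elif not flagged and truth:
            cell = "fn"
        else:
            cell = "tn"
        counts[subgroup][cell] += 1
    return {group: operating_characteristics(**c) for group, c in counts.items()}
```

If we do decide to store subgroup results, the output of something like `by_subgroup` (one set of four numbers per definition, database, and subgroup) is roughly the shape the library would have to hold.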
Metadata
Each definition should have metadata, such as literature references, study references (which definition was used in which study), the rationale for the definition, and copy-paste descriptions of the definition to use in the protocol and paper.
Maintenance
It should be easy to add and maintain definitions, and to request evaluations across the OHDSI network.
Persistence + security: We need some way to make sure definitions aren’t changed without notice by unauthorized persons.
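One possible way to at least detect silent changes is to record a content hash when a definition is published and compare it later. The sketch below assumes a definition can be serialized to JSON; the `fingerprint` helper and the example fields are hypothetical. A hash only detects a change if the recorded value is kept somewhere trusted, and it does nothing about access control.

```python
import hashlib
import json


def fingerprint(definition):
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical definition content, for illustration only.
published = {"name": "stroke narrow", "version": "1.0.0", "logic": "placeholder"}
recorded_hash = fingerprint(published)

# Later: recompute and compare before using the definition in a study.
retrieved = {"version": "1.0.0", "logic": "placeholder", "name": "stroke narrow"}
unchanged = fingerprint(retrieved) == recorded_hash
```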
I would like an API for getting things out. Ideally, I would want to be able to plug one or more definitions directly into my study package, so it can be executed at each study site.