Hi OHDSI community,
I’ve been thinking about a gap in phenotype sharing and would love your input.
The Problem I’m Seeing
When I review published studies using OMOP data, I often find:
- Paper describes: “AKI defined as creatinine increase ≥0.3 mg/dL within 48 hours”
- Actual implementation: Buried in 200 lines of SQL or Python, with implicit assumptions about baseline, measurement timing, etc.
- Result: Two teams implementing “the same” phenotype get different cohorts
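To make the divergence concrete, here is a minimal Python sketch with made-up data (the function names, the series, and both baseline choices are illustrative assumptions, not taken from any published study). Both functions apply the 0.3 mg/dL rule over 48 hours; they differ only in what counts as the baseline.

```python
from datetime import datetime, timedelta

# Hypothetical creatinine series for one patient: (timestamp, value in mg/dL),
# already sorted by time.
t0 = datetime(2024, 1, 1, 8, 0)
series = [
    (t0, 1.2),
    (t0 + timedelta(hours=12), 0.9),
    (t0 + timedelta(hours=36), 1.25),
]

WINDOW = timedelta(hours=48)
THRESHOLD = 0.3  # mg/dL


def values_in_window(series):
    """Values measured within WINDOW before (and including) the latest measurement."""
    latest_time = series[-1][0]
    return [v for t, v in series if latest_time - t <= WINDOW]


def aki_first_value_baseline(series):
    """Team A (assumed): baseline is the earliest value in the window."""
    window_values = values_in_window(series)
    return window_values[-1] - window_values[0] >= THRESHOLD


def aki_min_value_baseline(series):
    """Team B (assumed): baseline is the lowest value in the window."""
    window_values = values_in_window(series)
    return window_values[-1] - min(window_values) >= THRESHOLD


team_a = aki_first_value_baseline(series)  # rise of 0.05 from the first value
team_b = aki_min_value_baseline(series)    # rise of 0.35 from the lowest value
print(team_a, team_b)  # prints: False True
```

Same patient, same sentence in the paper, and the patient is in one cohort but not the other.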
ATLAS cohort definitions work great within OHDSI, but I’ve struggled with:
- Sharing with non-OHDSI collaborators - JSON exports require ATLAS to interpret
- Reviewing published logic - Hard to diff two phenotype definitions
- Temporal patterns - Expressing “rising trend over 6 hours” vs “single threshold”
What I’ve Been Experimenting With
I’ve been working on an open-source project called PSDL (Patient Scenario Definition Language), a YAML-based format for clinical scenario definitions.
Example (AKI detection):
scenario: AKI_Detection
version: "0.3.0"
audit:
  intent: "Detect early acute kidney injury"
  rationale: "KDIGO criteria for AKI staging"
  provenance: "KDIGO Clinical Practice Guideline (2012)"
signals:
  Cr:
    ref: creatinine
    concept_id: 3016723 # OMOP concept
    unit: mg/dL
trends:
  cr_delta_48h:
    expr: delta(Cr, 48h)
    description: "Creatinine change over 48 hours"
logic:
  aki_stage1:
    when: cr_delta_48h >= 0.3
    severity: medium
    description: "AKI Stage 1 - creatinine rise ≥0.3 mg/dL"
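For readers who want to see how a rule like this could be evaluated, here is a rough Python sketch. It assumes that `delta(Cr, 48h)` means "latest value minus the earliest value inside the trailing 48 hours"; that reading is my assumption for illustration, and the actual PSDL semantics may differ. The helper names and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def delta(series, window):
    """Latest value minus the earliest value inside the trailing window.

    Assumed semantics for illustration only. Returns None for an empty series.
    """
    if not series:
        return None
    ordered = sorted(series)  # sort by timestamp
    latest_time, latest_value = ordered[-1]
    in_window = [(t, v) for t, v in ordered if latest_time - t <= window]
    return latest_value - in_window[0][1]


def aki_stage1(cr_series):
    """Mirror of the `aki_stage1` rule: cr_delta_48h >= 0.3."""
    cr_delta_48h = delta(cr_series, timedelta(hours=48))
    return cr_delta_48h is not None and cr_delta_48h >= 0.3


t0 = datetime(2024, 1, 1, 8, 0)

# Rise of 0.4 mg/dL, all measurements inside 48 hours.
patient_a = [
    (t0, 1.0),
    (t0 + timedelta(hours=24), 1.1),
    (t0 + timedelta(hours=40), 1.4),
]

# Same rise, but the earlier value is 60 hours old and falls outside the window.
patient_b = [
    (t0, 1.0),
    (t0 + timedelta(hours=60), 1.4),
]

print(aki_stage1(patient_a))  # prints: True
print(aki_stage1(patient_b))  # prints: False
```

The second patient shows why the window definition needs to be explicit: the same 0.4 mg/dL rise does not fire once the earlier measurement is older than 48 hours.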
Key features:
- Human-readable - Clinicians can review without SQL knowledge
- OMOP-native - Uses concept_ids directly
- Temporal operators - delta, slope, ema, min, max over time windows
- Audit metadata - Built-in intent, rationale, provenance fields
- Portable - Same file works across different OMOP instances with site-specific mappings
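As an illustration of why a trend operator differs from a single threshold, here is a small Python sketch of a `slope` over a time window. The least-squares definition, the function signature, the 1.3 mg/dL cutoff, and the sample values are all assumptions made for this example, not a description of how PSDL implements it.

```python
from datetime import datetime, timedelta


def slope(series, window):
    """Least-squares slope (units per hour) over the trailing window.

    Returns None when fewer than two points fall inside the window
    or when all points share the same timestamp.
    """
    ordered = sorted(series)
    latest_time = ordered[-1][0]
    # x is hours relative to the latest measurement, y is the measured value
    pts = [
        ((t - latest_time).total_seconds() / 3600.0, v)
        for t, v in ordered
        if latest_time - t <= window
    ]
    if len(pts) < 2:
        return None
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


t0 = datetime(2024, 1, 1, 8, 0)
cr = [
    (t0, 1.0),
    (t0 + timedelta(hours=3), 1.1),
    (t0 + timedelta(hours=6), 1.2),
]

cr_slope_6h = slope(cr, timedelta(hours=6))
rising = cr_slope_6h is not None and cr_slope_6h > 0
above_threshold = cr[-1][1] >= 1.3  # hypothetical single-threshold check

print(rising, above_threshold)  # prints: True False
```

Here the creatinine is steadily rising at roughly 0.033 mg/dL per hour, yet a single-threshold check on the latest value would not flag the patient.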
GitHub: Chesterguan/PSDL (Patient Scenario Definition Language)
Questions for the Community
- Is this a real problem? Or do existing tools (ATLAS, CohortDiagnostics) already solve this well enough?
- Temporal logic - How do you currently express patterns like “creatinine rising over 6 hours” in ATLAS? Is there a standard approach?
- Cross-network sharing - When collaborating with sites outside OHDSI, how do you share phenotype definitions?
- What’s missing? - Looking at the format above, what would make it more useful for your work?
I’m genuinely trying to understand if this fills a gap or duplicates existing efforts. Happy to contribute to OHDSI tools if there’s a better path forward.
Thanks for any feedback!
Links:
- GitHub: Chesterguan/PSDL (Patient Scenario Definition Language)
- Example scenarios: AKI, Sepsis, ICU Deterioration included
- Whitepaper: Explains the design rationale
This is an open-source side project - feedback and criticism welcome!