Mind Meets Machines - OHDSI Symposium 2025 - Phenotype Development and Evaluation Work Group

REMOTE

  1. Rita Silva, IPO-Porto, Portugal
  2. Phenotypes: Ovarian Cancer; Sepsis

While I’m unable to attend, the use of imaging raises device questions. The description says Optical Coherence Tomography (OCT) is the gold standard and Fluorescein Angiography (FA) may be used adjunctively. Will the study identify the devices used for diagnosis and, in particular, identify them by UDI? This is an opportunity to determine whether devices from different manufacturers, and different versions or models from the same manufacturer, differ in their ability to assist with the diagnosis.

Thank you @doleary. The content you pointed out is just for context or pre-read. The study is just to build code lists (e.g., the list of SNOMED codes used to identify DME, as done by humans vs. AI). What I am trying to say is that we won’t be addressing the device question in this effort.

Thank you for the clarification about the scope and about the devices that may be involved.

That won’t work for ovarian cancer. In the claims data, you have no stages and barely any metastases. The Veradigm EHR is for ambulatory patients, where cancers are not diagnosed. And the Optum hospital EHR is not designed for cancer patients, so the information you are looking for is sporadic at best. To get reliable cancer data you need to go to the tumor registries and EHRs from cancer centers.


C01 Systemic Lupus Erythematosus

Part 1: Vignette
You are an epidemiologist at a biopharma company. Your immediate objective is to deliver a high-PPV concept set for Systemic Lupus Erythematosus (SLE) — Systemic, to be embedded within a validated prevalent, active, moderate-to-severe SLE phenotype algorithm used for characterization, comparative effectiveness, and HTA-relevant steroid-sparing analyses.

Data Environment: You will work across large administrative claims and EHR datasets standardized to the OMOP Common Data Model (CDM). Concept sets must be authored in the Condition domain only (OMOP standard preferred). (Laboratory, procedures, drugs, visits, and specialty are handled by separate concept sets and the phenotype algorithm, not this concept set.)

Key Epidemiological Parameters (from the study design that will use this concept set)

Target population: Patients with prevalent, systemic SLE capable of having active, moderate-to-severe disease. Age inclusive (adult & pediatric). Pregnancy status not a restriction for cohort entry.
Index anchoring: Diagnosis-anchored; in the phenotype, confirmation may occur via repeat diagnosis, rheumatology proximity, SLE-directed therapy, or serology.
Required lookback: ≄365 days prior observation (for downstream baseline covariates).
Nature of condition for this use case: Prevalent SLE (not incident).
Core exclusions handled at concept-set level: cutaneous-only lupus, drug-induced lupus, neonatal lupus, “history of”/“screening for”/“rule-out” lupus, isolated antiphospholipid/lupus anticoagulant without SLE.
Confounding & severity: Managed in phenotype & study design via steroid dose strata, flare proxies, organ involvement flags, treatment history—not by broadening/complicating the diagnosis code list.
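The 365-day lookback listed above is applied by the phenotype algorithm, not by the concept set. As a rough illustration only, the check could be sketched in Python as follows. The function name, constant, and example dates are hypothetical and are not taken from the study package.

```python
from datetime import date

# From the study design above: at least 365 days of prior observation.
MIN_PRIOR_OBSERVATION_DAYS = 365


def has_required_lookback(observation_start: date, index_date: date,
                          min_days: int = MIN_PRIOR_OBSERVATION_DAYS) -> bool:
    """Return True when the patient was observed for at least min_days before index."""
    return (index_date - observation_start).days >= min_days


# Hypothetical patients (observation period start, index date).
print(has_required_lookback(date(2022, 1, 1), date(2023, 1, 1)))  # exactly 365 days
print(has_required_lookback(date(2022, 6, 1), date(2023, 1, 1)))  # shorter lookback
```

In an OMOP CDM database the observation start would normally come from the observation period that contains the index date; that lookup is omitted here.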

Structured Research Question (OHDSI Madlibs style, exemplar for downstream analyses) “Among patients with prevalent systemic SLE in OMOP-standardized claims/EHR data, what is the 12-month pre-index distribution of average daily oral glucocorticoid dose and the 12-month post-initiation risk of inadequate response after starting anifrolumab or belimumab, measured during a study-specific time-at-risk?”

Concept Set Challenge (singular, unambiguous)

Build the Concept Set: “Systemic Lupus Erythematosus — Systemic (Condition domain)”
Scope = diagnosis codes that explicitly denote systemic SLE (including “SLE with organ involvement”). Explicitly exclude: cutaneous-only lupus, drug-induced lupus, neonatal lupus, “history of”/“screening”/“rule-out”/“suspected” lupus, isolated antiphospholipid/lupus anticoagulant without SLE. Do not include labs, drugs, visits, specialty, or procedures here (they are separate concept sets).

Part 2: Structured Clinical Specification
(Guidance for code inclusion/exclusion; informs but does not encode phenotype logic.)
1. Core Clinical Definition
Case Definition (clinical meaning of the target concept) A chronic, systemic autoimmune disease characterized by multisystem involvement and immunologic abnormalities consistent with SLE (e.g., anti-dsDNA and/or anti-Sm antibodies, hypocomplementemia). In routine care, SLE is managed primarily by rheumatology and treated with antimalarials, immunosuppressants, biologics, and judicious glucocorticoids.

Diagnostic Criteria (for context; not encoded in this concept set)

Serology: ANA (entry), anti-dsDNA and/or anti-Sm positivity; low C3/C4.
Organ involvement: renal (proteinuria, biopsy-proven LN), hematologic, mucocutaneous, musculoskeletal, neuropsychiatric, serositis.
Classification frameworks (e.g., 2019 EULAR/ACR) inform measurement concept sets but are not used to gate diagnosis codes here.

Presentation & Course Relapsing-remitting with flares and remissions; severity ranges from mild mucocutaneous/arthralgia to life-threatening organ disease (e.g., nephritis). Long-term morbidity strongly influenced by cumulative glucocorticoid exposure.

Synonyms & Closely Related Terms (for search mapping) Systemic lupus erythematosus; SLE; systemic lupus; disseminated lupus erythematosus (historic); lupus with organ involvement (e.g., “SLE with nephritis”). Do not conflate with cutaneous lupus or drug-induced lupus.

Differential Diagnoses (Conditions to be distinguished and not considered inclusive) Cutaneous lupus erythematosus (CLE), drug-induced lupus (DIL), neonatal lupus, undifferentiated connective tissue disease, mixed connective tissue disease, systemic sclerosis, dermatomyositis/polymyositis, Sjögren’s, rheumatoid arthritis, antiphospholipid syndrome (APS) without SLE.

Common Treatments/Management (signals of disease, handled elsewhere) Hydroxychloroquine, azathioprine, mycophenolate, methotrexate, cyclophosphamide; calcineurin inhibitors (e.g., tacrolimus; voclosporin for LN); biologics (belimumab, anifrolumab); off-label rituximab; systemic glucocorticoids (oral and pulse). These support phenotype confirmation but are not part of this diagnosis concept set.

2. Scope Boundaries and Exclusions (deterministic)
IN SCOPE (include):

Diagnosis concepts that explicitly reference systemic SLE, including general SLE and “SLE with organ involvement” (e.g., renal, hematologic, neurologic).
Pediatric/juvenile SLE terms when clearly systemic.
Source codes that map (via OMOP) to standardized SNOMED CT “Systemic lupus erythematosus (disorder)” and descendants representing systemic disease (not cutaneous/drug-induced/neonatal).

OUT OF SCOPE (exclude):

Cutaneous-only lupus (all forms, including discoid, subacute cutaneous).
Drug-induced lupus (all forms).
Neonatal lupus.
“History of”, “personal history of”, “screening for”, “suspected”, “rule-out” lupus and other non-current/problem-list or administrative qualifiers.
Isolated antiphospholipid syndrome/lupus anticoagulant without an SLE diagnosis.
Non-SLE entities containing the word “lupus” (e.g., lupus vulgaris—cutaneous TB).

Edge-Case Resolutions (apply consistently):

“Lupus nephritis”:

Include when the code/text explicitly links nephritis to SLE (e.g., “SLE with nephritis”).
Exclude renal codes without an SLE reference (use a separate LN concept set for organ-severity analyses).

Overlap syndromes (e.g., SLE + systemic sclerosis): Include the SLE diagnosis if the code is an SLE code; exclusions for competing CTDs are handled in the phenotype algorithm.

Remission modifiers: If the code is a problem-status indicating history/remission rather than a current SLE diagnosis, exclude.
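One way to picture the include/exclude rules above is as a concept set expression resolved against a concept hierarchy: each included concept brings in its descendants, and each excluded concept removes its own. The Python sketch below assumes a toy hierarchy with made-up integer IDs and labels. They are not real OMOP concept_ids and do not reproduce the actual SNOMED CT hierarchy.

```python
# Toy parent -> children map. IDs and labels are hypothetical placeholders,
# not real OMOP concept_ids and not the actual SNOMED CT hierarchy.
CHILDREN = {
    1: [2, 3, 4],  # 1 = "Systemic lupus erythematosus"
    2: [5],        # 2 = "SLE with organ involvement"
    3: [],         # 3 = "Drug-induced SLE"
    4: [],         # 4 = "Juvenile SLE"
    5: [],         # 5 = "SLE with nephritis"
}


def descendants(concept_id, children):
    """Return the concept itself plus all of its descendants."""
    seen = set()
    stack = [concept_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(children.get(current, []))
    return seen


def resolve(expression, children):
    """Resolve an expression: included concepts minus excluded concepts."""
    included, excluded = set(), set()
    for item in expression:
        if item["include_descendants"]:
            ids = descendants(item["concept_id"], children)
        else:
            ids = {item["concept_id"]}
        target = excluded if item["is_excluded"] else included
        target.update(ids)
    return included - excluded


EXPRESSION = [
    {"concept_id": 1, "include_descendants": True, "is_excluded": False},
    {"concept_id": 3, "include_descendants": True, "is_excluded": True},
]

print(sorted(resolve(EXPRESSION, CHILDREN)))  # [1, 2, 4, 5]
```

Against a real vocabulary, the descendant lookup would typically use the OMOP concept_ancestor table rather than a hand-built map.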

3. Temporal Context (crucial)
Temporality requirement for this concept set: Prevalent SLE.

Impact on code selection:

Exclude “history of”, “screening”, “suspected/rule-out” constructs.
Do not attempt to encode incident status, confirmation intervals, or activity here—those are phenotype logic elements using separate concept sets (labs, drugs, specialty, visits).

4. Clinical Granularity & Use-Case Requirements
Severity/Acuity: The diagnosis concept set is severity-agnostic by design. Severity (“active, moderate-to-severe”) will be operationalized in the phenotype via steroid dose strata, flare proxies, and organ involvement flags.

Etiology: Restrict to primary systemic SLE; exclude drug-induced lupus.

Sensitivity/Specificity Trade-off: For downstream comparative effectiveness & HTA, precision (PPV) is prioritized.

Prefer explicit systemic SLE terms.
Avoid broad/ambiguous “lupus” terms that do not specify systemic disease.
Keep cutaneously-scoped and drug-induced constructs out to protect PPV.
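The PPV-first preferences above can be approximated as a first-pass screen over candidate concept names before manual review. The sketch below is an assumption-laden illustration: the patterns are drawn from the inclusion and exclusion wording in this specification, the function name and labels are hypothetical, and a name-based screen is no substitute for reviewing the vocabulary hierarchy.

```python
import re

# Patterns taken from the exclusion and synonym wording in this specification.
EXCLUDE_PATTERNS = [
    r"\bcutaneous\b", r"\bdiscoid\b", r"\bdrug[- ]induced\b", r"\bneonatal\b",
    r"\bhistory of\b", r"\bscreening\b", r"\bsuspected\b", r"\brule[- ]out\b",
    r"\blupus vulgaris\b", r"\blupus anticoagulant\b",
]
INCLUDE_PATTERNS = [
    r"\bsystemic lupus\b", r"\bsle\b", r"\bdisseminated lupus erythematosus\b",
]


def screen(concept_name: str) -> str:
    """Label a candidate name as 'exclude', 'candidate', or 'review'."""
    text = concept_name.lower()
    # Exclusions are checked first so that "drug-induced systemic lupus" is rejected.
    if any(re.search(pattern, text) for pattern in EXCLUDE_PATTERNS):
        return "exclude"
    if any(re.search(pattern, text) for pattern in INCLUDE_PATTERNS):
        return "candidate"
    # Ambiguous "lupus" terms fall through to manual review.
    return "review"


for name in [
    "Systemic lupus erythematosus with nephritis",
    "Subacute cutaneous lupus erythematosus",
    "Drug-induced systemic lupus erythematosus",
    "Lupus erythematosus",
]:
    print(name, "->", screen(name))
```

Checking exclusions before inclusions is the design choice that protects precision: a term that mentions both an excluded qualifier and SLE is dropped rather than kept.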

5. Population & Data Context
Population subgroups (handled in study/phenotype, not in this concept set): pediatric vs adult; renal involvement; baseline glucocorticoid dose strata; prior biologic exposure; refractory trajectory flags.

Do not include Measurement, Drug, Visit/Provider-specialty, or Procedure concepts here. Those belong to separate concept sets:

Serologies (ANA, anti-dsDNA, anti-Sm, C3/C4)
Glucocorticoids (oral/pulse), antimalarials, ISDs, biologics
Rheumatology specialty visits; ED/inpatient visits
Renal biopsy; urine protein/proteinuria measures

Title: SLE — Systemic (diagnosis, Condition domain; excludes cutaneous/drug-induced/neonatal; excludes history/screening/suspected; includes “SLE with organ involvement”)

Intended Use: Built for prevalent SLE phenotyping in CE/HTA workflows with PPV priority; severity/activity handled outside the diagnosis concept set.

C01 Systemic Lupus Erythematosus

Build a concept set for Systemic Lupus Erythematosus — systemic form that captures current, clinically active disease for use in a phenotype for clinically active, prevalent SLE supporting comparative-effectiveness work. The concept set must reflect explicit systemic SLE diagnoses (including “SLE with organ involvement”). The motivating question is generalized as: among patients with active systemic SLE, what are 12-month outcomes (e.g., steroid burden, inadequate response) after initiating Drug A versus Drug B?

Clinical case definition: A chronic, systemic autoimmune disease characterized by multisystem involvement and immunologic abnormalities consistent with SLE (e.g., anti-dsDNA and/or anti-Sm antibodies, hypocomplementemia). In routine care, SLE is managed primarily by rheumatology and treated with antimalarials, immunosuppressants, biologics, and judicious glucocorticoids.

Diagnostic Criteria (for context; not encoded in this concept set) Serology: ANA (entry), anti-dsDNA and/or anti-Sm positivity; low C3/C4. Organ involvement: renal (proteinuria, biopsy-proven LN), hematologic, mucocutaneous, musculoskeletal, neuropsychiatric, serositis. Classification frameworks (e.g., 2019 EULAR/ACR) inform measurement concept sets but are not used to gate diagnosis codes here.

Presentation & Course Relapsing-remitting with flares and remissions; severity ranges from mild mucocutaneous/arthralgia to life-threatening organ disease (e.g., nephritis). Long-term morbidity strongly influenced by cumulative glucocorticoid exposure.

Common Treatments/Management Hydroxychloroquine, azathioprine, mycophenolate, methotrexate, cyclophosphamide; calcineurin inhibitors (e.g., tacrolimus; voclosporin for LN); biologics (belimumab, anifrolumab); off-label rituximab; systemic glucocorticoids (oral and pulse). These support phenotype confirmation but are not part of this diagnosis concept set.

Clinical Scope and Granularity:

  • Disease entity: Chronic, systemic autoimmune disease with relapsing–remitting course; typical features include mucocutaneous, musculoskeletal, hematologic, renal, neuropsychiatric, and serosal involvement, with immunologic abnormalities consistent with SLE.
  • Temporality: The phenotype should identify patients with currently active systemic SLE. Both newly diagnosed (incident) and existing (prevalent) cases are in scope, provided there is evidence of disease activity. Historical disease in remission is out of scope.
  ‱ Severity & acuity: All severities, from mild to life-threatening organ disease, are in scope when the diagnosis explicitly denotes active systemic SLE.
  • Manifestations: organ/system involvement that is linked to SLE
  ‱ Etiology: Primary systemic SLE only; other etiologies (for example, drug-induced) are not within scope.
  ‱ Population: Adults (18 and above).

Related, differential or comorbid conditions that are not sufficient for inclusion.

  • Cutaneous lupus erythematosus (discoid, subacute cutaneous) without systemic SLE.
  • Drug-induced lupus; neonatal lupus.
  • Antiphospholipid syndrome without SLE.
  • Undifferentiated or mixed connective tissue disease; systemic sclerosis; dermatomyositis/polymyositis; Sjögren’s syndrome; rheumatoid arthritis.
  • Organ-specific diagnoses (e.g., nephritis, serositis, cytopenias, CNS vasculitis) without explicit SLE linkage.
  ‱ Non-SLE “lupus” (such as lupus vulgaris).
  • Fibromyalgia

Synonyms

  • Systemic lupus erythematosus
  • SLE
  • Systemic lupus
  • Disseminated lupus erythematosus (historic)
  • SLE with organ involvement (e.g., “SLE with nephritis”)

C02 Rheumatoid Arthritis

Build a diagnosis concept set for Rheumatoid Arthritis (RA)—adult, systemic form that captures prevalent, established disease with high clinical validity. Include RA diagnoses and RA-linked extra-articular manifestations. This concept set will support an observational study that aims to answer the following research question:

Among patients who are diagnosed with [Rheumatoid Arthritis], what are the patients’ characteristics from their medical history (including demographics, comorbidities, HCRU, and total costs)?

The concept set will be used in the phenotype of patients with Rheumatoid Arthritis (RA) as the target cohort in this research question.

Clinical case definition: A chronic, systemic autoimmune and inflammatory disease primarily characterized by persistent synovitis of diarthrodial joints, often symmetrical. The disease trajectory involves inflammation of the synovial membrane, leading to potential cartilage destruction, bone erosions, and joint deformity. Systemic features include the production of autoantibodies (Rheumatoid Factor (RF) and anti-citrullinated peptide antibodies (ACPA)).

Diagnostic Criteria: Diagnosis is based on established clinical criteria (e.g., ACR/EULAR 2010 criteria), which consider the pattern of joint involvement, serology (RF/ACPA), acute phase reactants (ESR/CRP), and duration of symptoms. In RWD, we rely on the clinician’s recorded diagnosis based on these criteria.

Presentation and Course: Presentation typically involves insidious onset of pain, stiffness (especially morning stiffness), and swelling in multiple joints (polyarthritis), commonly affecting the small joints of the hands and feet. The course is typically chronic and progressive without adequate treatment, characterized by flares and potential remission.

Differential Diagnoses: (Conditions to be distinguished and not considered inclusive)

  • Psoriatic Arthritis (PsA)
  • Ankylosing Spondylitis (AS) and other spondyloarthropathies
  • Systemic Lupus Erythematosus (SLE)
  • Gout and Pseudogout (Crystal arthropathies)
  • Polymyalgia Rheumatica (PMR)
  • Osteoarthritis (OA)
  • Juvenile Idiopathic Arthritis (JIA)
  ‱ Viral arthritis (e.g., from Parvovirus B19, Hepatitis C)
  ‱ Undifferentiated inflammatory arthritis

Common Treatments/Management: Conventional synthetic DMARDs (e.g., Methotrexate), biologic DMARDs (e.g., TNF inhibitors), targeted synthetic DMARDs (e.g., JAK inhibitors), systemic glucocorticoids.

Clinical Scope and Granularity

  • Disease entity: Chronic, systemic autoimmune inflammatory polyarthritis characterized by persistent, often symmetrical synovitis of diarthrodial joints with risk of cartilage loss, bone erosions, and deformity; commonly presents with insidious polyarticular pain, swelling, and morning stiffness; may include autoantibodies (RF/ACPA) and flares/remission over a chronic course.
  • Temporality: Prevalent/established RA. “History of RA” is in scope only when it reflects an ongoing established diagnosis, not a resolved remote event.
  • Severity & acuity: Include all severities (mild to severe) and states (active disease or remission).
  • Manifestations: Extra-articular disease that is explicitly linked to RA (e.g., rheumatoid lung disease, rheumatoid vasculitis, Felty’s syndrome) is RA.
  ‱ Etiology: Autoimmune inflammatory arthritis, irrespective of seropositivity. Non-inflammatory arthropathies are not in scope.
  • Population: Adults; pediatric entities (JIA, juvenile RA, Still’s disease) are out of scope.

Related, differential conditions or comorbidities that are not sufficient for inclusion

  • Psoriatic arthritis; ankylosing spondylitis and other spondyloarthropathies; reactive and enteropathic arthritis.
  • Systemic lupus erythematosus; systemic sclerosis; Sjögren’s; dermatomyositis/polymyositis; undifferentiated/mixed connective tissue disease.
  • Gout; calcium pyrophosphate disease (pseudogout/CPPD).
  • Polymyalgia rheumatica.
  • Osteoarthritis.
  • Any other arthralgia or unspecified arthritis.

Synonyms

  • Rheumatoid arthritis
  • Rheumatoid polyarthritis
  • Seropositive rheumatoid arthritis
  • Seronegative rheumatoid arthritis
  • Felty’s syndrome (RA with splenomegaly and neutropenia)

C03 Diabetic Macular Edema (DME)

Construct a diagnosis concept set for Diabetic Macular Edema (DME)—macular thickening and/or intra-/sub-retinal fluid explicitly attributable to diabetes mellitus—to support a treatment-anchored phenotype. The set must capture current, clinically active DME.

This concept set will support an observational study that aims to answer the following research question:

Among adult patients with diabetes on anti-diabetic drug A, what is the incidence rate of Diabetic Macular Edema (DME)?

The concept set will be used in the phenotype of the outcome Diabetic Macular Edema (DME).
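Since the research question asks for an incidence rate, it may help to recall how that quantity is computed: the number of new outcome events divided by the person-time at risk, usually scaled to a convenient unit. A minimal Python sketch follows. The counts are made up for illustration, and the function name and the per-1,000-person-years scaling are assumptions.

```python
def incidence_rate(events: int, person_years: float, per: float = 1000.0) -> float:
    """Incidence rate expressed per `per` person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events * per / person_years


# Hypothetical numbers: 12 new DME cases over 4,800 person-years at risk.
rate = incidence_rate(events=12, person_years=4800.0)
print(f"{rate:.1f} per 1,000 person-years")
```

The numerator depends directly on this concept set: a broader DME code list inflates the event count, and a narrower one deflates it.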

Clinical case definition: Retinal thickening and/or intra/sub-retinal fluid in the macula due to diabetes mellitus (type 1 or 2). Diagnosis is confirmed by OCT evidence of macular thickening and/or intra/sub-retinal fluid.

Presentation & Course: Blurred vision, metamorphopsia; may be asymptomatic early. Often chronic, managed with anti-VEGF, steroids, and/or focal/grid laser; treat-and-extend or PRN patterns are common.

Differential Diagnoses (Conditions to be distinguished and not considered inclusive)

  • Neovascular age-related macular degeneration (nAMD)
  • Retinal vein occlusions (CRVO/BRVO) with macular edema
  • Polypoidal choroidal vasculopathy; myopic CNV
  • Post-operative cystoid macular edema (Irvine–Gass)
  • Uveitic cystoid macular edema
  • Unspecified macular edema without diabetic attribution

Common Treatments/Management: Intravitreal anti-VEGF; intravitreal/periocular corticosteroids (dexamethasone, fluocinolone implants); focal/grid macular laser; serial OCT monitoring.

Clinical Scope and Granularity

  • Disease entity: Diabetes-attributable macular edema characterized by retinal thickening and/or intra/sub-retinal fluid at the macula; typical presentation includes blurred vision and metamorphopsia. Clinical confirmation commonly relies on OCT; management includes intravitreal anti-VEGF, corticosteroid implants, and/or focal/grid laser.
  • Temporality: Incidence and current disease; historical or resolved disease alone is out of scope.
  • Severity & acuity: All severities (center-involved or non-center-involved; unilateral or bilateral) are in scope when DME is explicitly diagnosed.
  ‱ Manifestations: Diabetic retinopathy that is explicitly linked to macular edema. Any laterality, and both center-involved and non-center-involved disease, are within scope.
  ‱ Etiology: Restrict to DME due to type 1 or type 2 diabetes; macular edema from other causes (post-operative, uveitic, retinal vein occlusion, neovascular AMD, myopic/PCV CNV) is not within scope.
  • Population: Adults with diabetes; pediatric use is not targeted by this concept set.

Related, differential conditions or comorbidities that are not sufficient for inclusion

  • Neovascular age-related macular degeneration.
  • Retinal vein occlusions (CRVO/BRVO) with macular edema.
  • Polypoidal choroidal vasculopathy; myopic choroidal neovascularization.
  • Post-operative cystoid macular edema (Irvine–Gass); uveitic cystoid macular edema.
  • Unspecified macular edema without diabetic linkage.
  • Diabetic retinopathy without macular edema.

Synonyms

  • Diabetic macular edema (DME)
  • Diabetic retinopathy with macular edema
  ‱ Diabetic maculopathy with macular edema (including codes carrying a “with macular edema” modifier)
  • Center-involved DME (CI-DME)
  • Clinically significant macular edema (CSME) — when explicitly diabetic

C04 Deep-Vein Thrombosis

Develop a diagnosis concept set for Acute Proximal Lower-Extremity Deep-Vein Thrombosis (LE-DVT)—an incident, clinically acute thrombus in popliteal or more proximal deep veins—to support a cancer-associated thrombosis phenotype. The set must represent current, incident proximal LE-DVT (including incidental events).

This concept set will support an observational study that aims to answer the following research question: Among adults with active malignancy receiving therapeutic anticoagulation, what is the comparative effectiveness of drug A in preventing Lower-Extremity Deep-Vein Thrombosis?

The concept set will be used in the phenotype of the outcome Lower-Extremity Deep-Vein Thrombosis.

Clinical case definition: An acute thrombus in the proximal deep veins of the lower extremity—popliteal, femoral (common/superficial), deep femoral, iliac, or inferior vena cava segments—presenting symptomatically (e.g., unilateral leg swelling/pain) or incidentally on imaging in a patient with active malignancy.

Diagnostic criteria (objective confirmation).

  • Compression duplex ultrasound documenting non-compressibility/intraluminal thrombus in proximal segments; or
  • CT/MR venography confirming proximal LE thrombus. (Imaging evidence is captured by procedure codes in the phenotype but not embedded inside this condition concept set.)

Presentation & course (typical). Acute onset limb swelling, pain, warmth; occasionally asymptomatic if discovered during cancer imaging. Risk of extension/embolization without treatment.

Differential diagnoses (Conditions to be distinguished and not considered inclusive). Isolated distal calf DVT (peroneal/posterior/anterior tibial, muscular veins), superficial thrombophlebitis, chronic post-thrombotic changes, lymphedema/venous insufficiency, cellulitis, Baker cyst.

Common treatments/management (positive proxies). Therapeutic-intensity DOACs (apixaban/rivaroxaban/edoxaban) or LMWH, less commonly VKA; in select cases thrombectomy/thrombolysis/IVC filter. (Treatment will be enforced by phenotype logic; do not add drug codes to this condition concept set.)

Clinical Scope and Granularity

  • Disease entity: Acute thrombus in proximal lower-extremity deep veins—popliteal, femoral (common/superficial), deep femoral, iliac, or IVC—typically presenting with unilateral leg swelling, pain, warmth; may be incidentally detected on imaging in oncology care. Objective confirmation usually by compression duplex ultrasound or CT/MR venography.
  • Temporality: Incident/current event only; remote or resolved events are out of scope.
  • Severity & acuity: All acute severities are in scope, including iliofemoral extension and IVC involvement when proximal LE origin is explicit.
  • Etiology: Any cause (including cancer-associated or postoperative) provided location is proximal LE; pregnancy-related VTE is not within scope.
  ‱ Anatomy: Deep and proximal veins only (e.g., thrombophlebitis is not in scope).
  ‱ Population: Adults and pediatric patients.
  • Ambiguity tolerance: Prefer site-specific proximal terms; avoid broad/unspecified venous thrombosis labels unless clearly proximal LE.
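The anatomy and ambiguity rules above can be sketched as a simple site lookup used to triage candidate terms. This is an illustrative assumption, not the study's method: the vein lists come from this description (with soleal and gastrocnemius added as examples of muscular calf veins), and the function name and labels are hypothetical. Note that the "superficial femoral vein" is a deep vein despite its name, so the sketch keys superficial status on "saphenous" rather than on the word "superficial".

```python
# Site keywords drawn from the clinical description above.
SUPERFICIAL_SITES = ["saphenous"]
DISTAL_SITES = ["peroneal", "posterior tibial", "anterior tibial", "soleal", "gastrocnemius"]
PROXIMAL_SITES = ["popliteal", "femoral", "iliac", "inferior vena cava"]


def classify_site(term: str) -> str:
    """Return 'superficial', 'distal', 'proximal', or 'unspecified' for a diagnosis term."""
    text = term.lower()
    if any(site in text for site in SUPERFICIAL_SITES):
        return "superficial"
    if any(site in text for site in DISTAL_SITES):
        return "distal"
    if any(site in text for site in PROXIMAL_SITES):
        return "proximal"
    # Broad labels without a named vein need manual review, per the ambiguity rule.
    return "unspecified"


for term in [
    "Acute deep vein thrombosis of superficial femoral vein",
    "Thrombosis of posterior tibial vein",
    "Thrombophlebitis of great saphenous vein",
    "Deep vein thrombosis of lower extremity",
]:
    print(term, "->", classify_site(term))
```

Only terms labeled "proximal" would be candidates for this concept set; "unspecified" terms are the ones the ambiguity rule asks reviewers to avoid unless proximal location is clear.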

Related, differential conditions or comorbidities that are not sufficient for inclusion

  • Distal calf DVT (peroneal/posterior/anterior tibial, muscular veins).
  • Superficial vein thrombosis (e.g., great saphenous).
  • Chronic post-thrombotic changes; lymphedema; venous insufficiency; cellulitis; Baker cyst.
  • Upper-extremity DVT; pulmonary embolism.

Synonyms

  • Proximal lower-extremity DVT
  • Iliofemoral DVT
  • Femoral vein thrombosis
  • Popliteal vein thrombosis
  • Iliac vein thrombosis
  • Lower-limb DVT—proximal

C05 Ovarian Cancer

Our objective is to define a concept set representing active, primary epithelial carcinoma of the ovary, fallopian tube, or peritoneum. This set must capture a confirmed diagnosis of malignant disease, not a history of it or a borderline condition. It will serve as the foundational disease definition for subsequent phenotype algorithms used in comparative effectiveness and health outcomes research focused on women with relapsed or refractory disease, ensuring we begin with a high-fidelity patient population.

Clinical case definition: Primary epithelial malignancy originating in the ovary, fallopian tube, or primary peritoneum. Predominant histology is high-grade serous carcinoma; other epithelial subtypes include low-grade serous, endometrioid, clear cell, mucinous, malignant Brenner (transitional), squamous, and carcinosarcoma (MMMT) when primary to these anatomic sites.

Diagnostic Criteria

  • Tissue diagnosis from cytoreductive surgery or biopsy confirming epithelial carcinoma of ovary/tube/peritoneum.
  • Imaging supportive of pelvic/adnexal mass and peritoneal disease; CA-125 commonly elevated but not specific.
  • In routine data, diagnosis is represented via oncology malignant neoplasm codes mapped to SNOMED; histology specificity varies by site.

Presentation & Course (typical).

  • Often presents with abdominal distension, ascites, or pelvic mass; frequent peritoneal spread.
  ‱ Relapse is common after platinum-taxane; subsequent studies apply phenotype logic (PFI ≀6 months) to classify PROC.
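The PFI rule mentioned above belongs to downstream phenotype logic, not to this concept set. For orientation only, a sketch of such a rule is shown below. It assumes PFI means the platinum-free interval from the last platinum dose to progression, that PROC refers to platinum-resistant ovarian cancer, and that "6 months" is approximated as 183 days. All names and dates are hypothetical.

```python
from datetime import date

# Assumption: "6 months" approximated as 183 days.
PFI_THRESHOLD_DAYS = 183


def platinum_free_interval_days(last_platinum_dose: date, progression_date: date) -> int:
    """Days from the last platinum dose to documented progression or relapse."""
    return (progression_date - last_platinum_dose).days


def is_platinum_resistant(last_platinum_dose: date, progression_date: date,
                          threshold_days: int = PFI_THRESHOLD_DAYS) -> bool:
    """True when the platinum-free interval is at or below the threshold."""
    pfi = platinum_free_interval_days(last_platinum_dose, progression_date)
    return pfi <= threshold_days


# Hypothetical patients.
print(is_platinum_resistant(date(2023, 1, 15), date(2023, 5, 1)))   # short interval
print(is_platinum_resistant(date(2023, 1, 15), date(2024, 1, 15)))  # long interval
```

In observational data, progression is rarely recorded directly, so a real phenotype would need a proxy such as the start of a subsequent line of therapy.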

Differential Diagnoses

  • Non-epithelial ovarian tumors: germ-cell (e.g., dysgerminoma, yolk sac), sex-cord stromal (e.g., granulosa cell, Sertoli-Leydig).
  • Borderline (low malignant potential) epithelial tumors.
  • Metastatic carcinoma to ovary (e.g., Krukenberg from GI) or secondary peritoneal carcinomatosis from non-gynecologic primaries.
  • History of ovarian cancer without current active disease.

Common Treatments/Management (signals the disease exists; not for concept set logic). Primary cytoreductive surgery; 1L platinum-taxane ± bevacizumab; maintenance PARP in appropriate patients; subsequent non-platinum regimens ± bevacizumab; FRα-targeted ADC for FRα-high; IO/targeted regimens in trials/practice.

Clinical Scope

  • Disease entity: The scope is primary epithelial malignancy originating in the ovary, fallopian tube, or peritoneum. This includes all major epithelial histologies: high-grade serous, low-grade serous, endometrioid, clear cell, mucinous, malignant Brenner tumors, and carcinosarcoma (MMMT) when primary to these sites. Diagnosis is typically confirmed via tissue pathology.
  • Temporality: The scope is restricted to current, active malignant disease.
  • Severity & acuity: All stages and grades of invasive epithelial carcinoma are included.
  • Etiology: Only primary malignancies of the ovary, fallopian tube, or peritoneum are in scope.

Related, differential conditions or comorbidities that are not sufficient for inclusion

  • Non-epithelial ovarian tumors, such as germ-cell tumors (e.g., dysgerminoma) or sex-cord stromal tumors (e.g., granulosa cell).
  • Borderline or low malignant potential epithelial tumors of the ovary.
  • Secondary malignant neoplasms metastatic to the ovary or peritoneum from other primary sites, such as gastrointestinal primaries (Krukenberg tumors).

Synonyms

  • Epithelial ovarian cancer
  • Ovarian carcinoma
  • Fallopian tube carcinoma
  • Primary peritoneal carcinoma
  • High-grade serous ovarian carcinoma (HGSOC)
  • Ovarian carcinosarcoma
  ‱ Malignant Brenner tumor (ovary)

C07 Systemic Sclerosis

This concept set will identify Systemic Sclerosis (SSc; systemic scleroderma)—a systemic, not localized, autoimmune fibrosing vasculopathy. The set must represent current, active systemic disease. This concept set will support an observational study that aims to answer the following research objective:

The study aims to explore treatment utilization among patients newly diagnosed with systemic sclerosis in real-world data.

The concept set will be used to phenotype patients with systemic sclerosis as the target cohort in this research question.

Clinical case definition: Systemic Sclerosis (SSc), also known as systemic scleroderma, is a complex, chronic autoimmune disorder characterized by three hallmark features: immune dysregulation (autoimmunity and inflammation), microvasculopathy (vascular injury and remodeling), and progressive fibrosis (excessive collagen deposition) affecting the skin and internal organs (e.g., lungs, gastrointestinal tract, heart, kidneys).

Diagnostic Criteria: Diagnosis is clinical, often informed by the 2013 ACR/EULAR classification criteria (score ≄9). Key elements include skin thickening proximal to the MCP joints (sufficient criterion), Raynaud’s phenomenon, digital tip lesions, telangiectasias, abnormal nailfold capillaries, pulmonary arterial hypertension (PAH) and/or interstitial lung disease (ILD), and SSc-specific autoantibodies (Anti-Scl-70/Topoisomerase I, Anti-centromere, Anti-RNA polymerase III).
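To make the "score ≄9" threshold concrete, the sketch below adds up item weights in the way the 2013 ACR/EULAR criteria are commonly summarized. The weights are reproduced from memory of the published criteria and should be verified against the publication before any real use. The finding names are hypothetical labels, and none of this is encoded in the concept set itself.

```python
from typing import Iterable

# Sufficient criterion: skin thickening of the fingers extending proximal to the MCP joints.
SUFFICIENT = "skin_thickening_proximal_to_mcp"

# Within each group only the highest-weighted finding counts.
# Weights as recalled from the published criteria; verify before use.
GROUPS = [
    {"sclerodactyly": 4, "puffy_fingers": 2},
    {"fingertip_pitting_scars": 3, "digital_tip_ulcers": 2},
    {"telangiectasia": 2},
    {"abnormal_nailfold_capillaries": 2},
    {"pah": 2, "ild": 2},
    {"raynaud_phenomenon": 3},
    {"anti_centromere": 3, "anti_scl_70": 3, "anti_rna_polymerase_iii": 3},
]


def classification_score(findings: Iterable[str]) -> int:
    """Sum the highest weight per group; the sufficient criterion alone scores 9."""
    present = set(findings)
    total = 9 if SUFFICIENT in present else 0
    for group in GROUPS:
        total += max((weight for name, weight in group.items() if name in present), default=0)
    return total


def meets_threshold(findings: Iterable[str], threshold: int = 9) -> bool:
    """True when the total score reaches the classification threshold."""
    return classification_score(findings) >= threshold


# Hypothetical patient: Raynaud's, anticentromere antibody, telangiectasia, abnormal capillaries.
patient = ["raynaud_phenomenon", "anti_centromere", "telangiectasia", "abnormal_nailfold_capillaries"]
print(classification_score(patient), meets_threshold(patient))
```

In real-world data these findings are mostly unavailable in structured form, which is why the concept set relies on the recorded diagnosis instead.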

Presentation and Course: The presentation is heterogeneous and the course is chronic and lifelong. It is categorized based on skin involvement:

  • Diffuse Cutaneous SSc (dcSSc): Extensive skin thickening, higher risk of early ILD and renal crisis.
  • Limited Cutaneous SSc (lcSSc): Skin thickening restricted distally. Includes CREST syndrome.
  • SSc sine scleroderma: Internal organ involvement without skin thickening.

Differential Diagnoses (Conditions to be distinguished and not considered inclusive): Undifferentiated Connective Tissue Disease (UCTD), Mixed Connective Tissue Disease (MCTD), Isolated Raynaud’s phenomenon, and various scleroderma mimics (see Exclusions).

Common Treatments/Management: Management is typically overseen by a Rheumatologist. Treatments include immunosuppression (e.g., Mycophenolate Mofetil, Cyclophosphamide) and targeted therapies for ILD (Nintedanib, Tocilizumab, Rituximab).

Clinical Scope

  • Disease entity: Chronic autoimmune disorder with immune dysregulation, microvasculopathy, and progressive fibrosis affecting skin and internal organs (lungs, GI tract, heart, kidneys). Diagnosis is clinical; 2013 ACR/EULAR elements include proximal skin thickening (sufficient criterion), Raynaud’s phenomenon, digital ischemic lesions, telangiectasias, abnormal nailfold capillaries, PAH and/or ILD, and SSc-specific autoantibodies (anti–Scl-70, anticentromere, anti–RNA polymerase III).
  • Subtypes: Diffuse cutaneous (dcSSc), limited cutaneous (lcSSc, includes CREST), and SSc sine scleroderma (internal organ involvement without skin thickening) all are within scope.
  ‱ Temporality: Capture prevalent, established, and currently active disease.
  • Severity & acuity: Include full spectrum from mild to life-threatening, acute manifestations and chronic progression.
  • Manifestations: Multisystem involvement is in scope when explicitly linked to SSc (e.g., SSc-ILD, SSc-PAH).
  • Etiology: Autoimmune/idiopathic SSc only.
  • Population: all patients.

Related, differential conditions or comorbidities that are not sufficient for inclusion

  • Localized scleroderma (critical exclusion): morphea (generalized/plaque/guttate), linear scleroderma, en coup de sabre.
  • Mimics/fibrosing conditions: eosinophilic fasciitis, scleredema, scleromyxedema, nephrogenic systemic fibrosis.
  • Induced scleroderma: drug-induced (e.g., bleomycin, taxanes) or environmental/occupational (silica, vinyl chloride, toxic oil).
  • Other/overlap: GVHD with sclerodermatous features; MCTD or UCTD unless “systemic sclerosis” is explicitly stated; isolated Raynaud’s, acrosclerosis, or sclerodactyly.

Synonyms

  • Systemic scleroderma; Progressive systemic sclerosis (PSS); CREST syndrome (limited cutaneous SSc: Calcinosis, Raynaud’s phenomenon, Esophageal dysmotility, Sclerodactyly, and Telangiectasias).

Hi Gowtham - so a total of 6 conditions - C01-C07, skipping C06, right?

I’m looking at your clinical descriptions in this post. - Don’t see C06

It jumps from C05 to C07

Gotchya, I made a copy paste error (lack of sleep). The GitHub repository is the source of truth. GitHub - ohdsi-studies/MindMeetsMachines: The "Minds Meet Machines" Challenge. A concept set development study by the OHDSI Phenotype development and evaluation workgroup.

Here is the 6th one @jswerdel - thank you for pointing it out and helping me fix it.

C06 Posterior Uveitis

Develop a diagnosis concept set for active, non-infectious posterior-segment uveitis—encompassing intermediate uveitis, posterior uveitis, and panuveitis—to support a phenotype used in comparative effectiveness work (e.g., Drug A vs Drug B). The set must capture current, clinically active disease at or immediately preceding initiation of systemic therapy. The research aim is to compare time to treatment failure and steroid dependence after treatment start for patients with posterior uveitis.

This concept set will support an observational study that aims to answer the following research question: Among adults with non-infectious posterior-segment uveitis, what is the comparative effectiveness of drug A versus drug B in preventing steroid dependence?

The concept set will be used in the phenotype of patients with non-infectious posterior-segment uveitis as the target cohort in this research question.

Clinical case definition: Inflammation of the uveal tract with posterior-segment involvement (intermediate uveitis, posterior uveitis, or panuveitis) that is non-infectious (autoimmune/autoinflammatory; idiopathic or associated with systemic disease) and clinically active.

Diagnostic criteria:

  • Ophthalmic exam consistent with posterior-segment inflammation: vitreous cells/haze, “snowballs/snowbanking” (intermediate), chorioretinitis/retinitis, retinal vasculitis, optic disc edema; activity graded per SUN where available.
  • Imaging support when present: OCT (CME), FFA (leakage/vasculitis), ± ICGA/ultrawidefield angiography.
  • Exclusion of infection and masquerade by appropriate work-up.

Presentation and course: Chronic or recurrent disease with active flares; may be sight-threatening; often steroid-responsive but steroid-dependent without additional IMT.

Differential diagnoses (Conditions to be distinguished and not considered inclusive): Infectious uveitis (e.g., toxoplasma, HSV/VZV/CMV retinitis, TB, syphilis), intraocular lymphoma/other masquerades, isolated anterior uveitis, scleritis/episcleritis, non-inflammatory mimics.

Common treatments/management: High-dose systemic corticosteroids; steroid-sparing IMT (methotrexate, mycophenolate, azathioprine, cyclosporine, tacrolimus); biologics (adalimumab). Therapies are not part of this concept set.
Clinical Scope and Granularity:

  • Disease entity: immune-mediated, non-infectious inflammation of the uveal tract with posterior-segment involvement (intermediate, posterior, panuveitis); may be chronic/recurrent and sight-threatening.
  • Typical presentation: vitreous cells/haze; “snowballs/snowbanking” (intermediate); chorioretinitis/retinitis; retinal vasculitis; optic disc edema. Imaging may show CME on OCT and leakage/vasculitis on angiography.
  • Temporality: prevalent/current active episodes at or immediately before the start of systemic treatment.
  • Severity/acuity: all severities in scope, including steroid-dependent or refractory disease.
  • Manifestations: posterior-segment findings are in scope when they are linked to uveitis.
  • Etiology: non-infectious (idiopathic or associated with systemic autoimmune/autoinflammatory disease).
  • Population: All patients.

Related, differential conditions or comorbidities that are not sufficient for inclusion

  • Infectious posterior uveitides (toxoplasma; HSV/VZV/CMV; tuberculosis; syphilis).
  • Intraocular lymphoma and other masquerade entities.
  • Anterior uveitis, episcleritis/scleritis, non-inflammatory mimics.
  • Organ-specific findings such as cystoid macular edema or retinal vasculitis without an explicit uveitis diagnosis.

Synonyms

  • Intermediate uveitis; pars planitis; posterior uveitis; panuveitis; posterior cyclitis.
  • Retinochoroiditis/choroiditis (non-infectious context).
  • Named non-infectious posterior/panuveitis entities: birdshot chorioretinopathy; Vogt–Koyanagi–Harada–associated uveitis; sympathetic ophthalmia; sarcoid-associated uveitis; Behçet-associated posterior uveitis.

My name is Sima Mohammadi, a medical doctor and PhD candidate at the University Medical Center Utrecht (UMCU). My thesis focuses on semantic harmonization, and over the past three years, I have been involved in projects related to developing a semi-automated application for mapping multiple UMLS and non-UMLS vocabularies with the ontologies team. I am very interested in joining this workshop to learn from your experience and from other participants about how this process is handled within OHDSI, and to compare it with AI-driven phenotype generation approaches.

Welcome @Sima - it is very nice to have an informatics-trained PhD medical doctor in the group. You are welcome to join. The link is at the top.

Mind Meets Machine YouTube Video

Part 1: https://youtu.be/igTQC4PkiCA

Part 2: https://youtu.be/7Ek3vF3Pu_E

Part 3: https://youtu.be/24FnC9FbaQU

Title: Mind Meets Machine Workshop Recap: A Scientific Evaluation of AI vs. Human-Led Concept Set Generation

The Phenotype Development and Evaluation Work Group convened a workshop, “Mind Meets Machine,” during the OHDSI 2025 Symposium. The session, co-led by @Azza_Shoaibi and @Gowtham_Rao, executed an informal exercise designed to address a critical question facing the OHDSI community: How do emerging Generative AI/LLM approaches for concept set generation compare to established human-led workflows?

This workshop directly supports the working group’s mission “to improve the quality and the reliability of the evidence we generate from observational data by advancing the science of phenotype development.” The goal was to scientifically evaluate the accuracy, completeness, and precision of these new tools before they are adopted into standard observational research processes.

The session began with a moving tribute to the late Jamie Weaver, honoring his significant contributions to the science of phenotyping and measurement error, setting a mission-driven tone for the day’s activities.

The Experiment: Design and Execution

The primary objective of the experiment, operating under an OHDSI QI project, was to compare the performance of Gen AI workflows against rigorous, consensus-based human workflows. The primary metric for evaluation will be the prevalence-weighted F-score.

The experiment involved several key phases:

  1. Human Workflow: Over 20 participants were randomized to different clinical ideas (e.g., SLE, DME, DVT, RA). They were given a strict 30-minute time limit to generate concept sets within a dedicated Atlas instance, based on provided clinical descriptions.
  2. AI Workflow: Four distinct Gen AI-driven methodologies, developed by community researchers, were submitted for the comparison.
  3. Gold Standard Creation: Recognizing that neither human consensus nor AI output constitutes a definitive ground truth, a “dynamically created gold standard” was established. Clinical experts for each disease area adjudicated codes where the human teams and AI workflows disagreed. The final gold standard was defined as the intersection of all human and AI codes, plus any disagreed codes validated by the adjudicators.
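
The gold-standard rule in step 3 can be sketched with plain set operations. This is only an illustration of the rule as described, not the study's analysis code: the workflow names, concept IDs, and the `adjudicated_valid` set are invented for the example.

```python
def build_gold_standard(code_sets, adjudicated_valid):
    """Return (gold_standard, disputed_codes) for one clinical idea.

    code_sets: dict mapping a workflow name to the set of concept IDs it produced.
    adjudicated_valid: disputed concept IDs that the adjudicators accepted.
    """
    sets = [set(s) for s in code_sets.values()]
    agreed = set.intersection(*sets)   # codes that every workflow included
    proposed = set.union(*sets)        # codes that at least one workflow included
    disputed = proposed - agreed       # codes that go to adjudication
    # Only disputed codes can be added back through adjudication.
    gold = agreed | (disputed & set(adjudicated_valid))
    return gold, disputed


# Hypothetical concept IDs, for illustration only.
code_sets = {
    "human_team_1": {101, 102, 103},
    "human_team_2": {101, 102, 104},
    "ai_workflow_1": {101, 102, 103, 105},
}
adjudicated_valid = {103, 104}

gold, disputed = build_gold_standard(code_sets, adjudicated_valid)
print(sorted(disputed))  # [103, 104, 105]
print(sorted(gold))      # [101, 102, 103, 104]
```

In this toy case, code 105 was proposed by one workflow, rejected in adjudication, and therefore left out of the gold standard.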

Showcase: Diverse AI Methodologies

A key component of the workshop was a showcase of three distinct AI approaches submitted for the evaluation, demonstrating the diversity of methodologies being explored in the community:

  • EPAM Systems (Presented by @darya.zhukova ): This approach utilizes a containerized architecture with a vector store for semantic searching (using AWS Bedrock). A key feature is user-controlled precision, allowing users to adjust the breadth of the search from exact matches to broad vector similarities, offering flexibility in balancing sensitivity and specificity.
  • King’s College (Presented by @Niko_Moller-Grell ): This research-focused “agentic workflow” breaks down the complex process of concept reasoning into smaller sub-tasks. It employs a hybrid approach, combining semantic similarity (vector search) with ontological reasoning by traversing the OMOP vocabulary relationships (knowledge graph) for finer-grained decision-making and sanity checking.
  • JNJ (Presented by Joel Swerdel): This open-source tool utilizes a unique two-stage process. First, it leverages the PHOEBE recommender system (along with descendants of a starting concept) to generate a broad list of candidates. Second, an LLM adjudicates each candidate concept using specific “proportional logic” (e.g., asking if >95% of patients with the candidate concept also have the target condition).
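
The two-stage pattern in the last bullet (broad candidate generation, then per-candidate adjudication against a proportion threshold) can be sketched as below. This is a rough illustration and not the JNJ tool itself: `recommend`, `descendants`, and `estimate_share` are hypothetical stand-ins, the LLM's judgement is stubbed with fixed numbers, the concept labels are invented, and including the starting concept among the candidates is an assumption.

```python
from typing import Callable, Iterable, List, Set


def two_stage_concept_set(
    start: str,
    recommend: Callable[[str], Iterable[str]],
    descendants: Callable[[str], Iterable[str]],
    estimate_share: Callable[[str], float],
    threshold: float = 0.95,
) -> List[str]:
    """Stage 1: gather candidates. Stage 2: keep those judged above the threshold."""
    # Stage 1: starting concept (assumed), recommended concepts, and descendants.
    candidates: Set[str] = {start, *recommend(start), *descendants(start)}
    # Stage 2: keep a candidate only if the estimated share of its patients
    # who also have the target condition exceeds the threshold.
    return sorted(c for c in candidates if estimate_share(c) > threshold)


# Stubbed inputs, for illustration only.
shares = {"target": 1.00, "child_1": 0.99, "related_1": 0.40, "related_2": 0.97}


def stub_recommend(concept: str) -> List[str]:
    return ["related_1", "related_2"]


def stub_descendants(concept: str) -> List[str]:
    return ["child_1"]


def stub_estimate(concept: str) -> float:
    # Stands in for the LLM's answer to the proportional question.
    return shares.get(concept, 0.0)


selected = two_stage_concept_set("target", stub_recommend, stub_descendants, stub_estimate)
print(selected)  # ['child_1', 'related_2', 'target']
```

Raising or lowering `threshold` trades precision against sensitivity, which is the same tension the first bullet describes as user-controlled precision.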

Key Reflections and Scientific Challenges

Following the concept set generation and adjudication activities, the working group reflected on the process and preliminary observations, highlighting several fundamental challenges in the science of phenotyping:

1. Significant Variability and the Challenge of Specificity
Initial findings revealed stark differences depending on the clinical idea. For Rheumatoid Arthritis (RA), there was a surprising 94% consistency between human and machine-generated codes. In contrast, Deep Vein Thrombosis (DVT) showed less than 24% overlap.

Workshop attendees noted that the specificity of the study question dramatically impacted the task. Defining “acute proximal DVT” proved far more difficult than chronic RA, as available clinical codes often lack the necessary granularity. This creates a tension between adhering to a narrow clinical specification and the reality that significant record counts often reside on more general, ambiguous codes.
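
The post does not say how the consistency and overlap percentages were calculated. As one plausible reading, the Jaccard index (shared codes divided by all codes proposed by either side) gives a single overlap figure for two concept sets. The sketch below assumes that measure, and the concept IDs are invented.

```python
def jaccard_overlap(a, b):
    """Share of codes common to both sets, out of all codes in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical concept IDs, for illustration only.
human_codes = {1, 2, 3, 4}
machine_codes = {3, 4, 5, 6, 7}

overlap = jaccard_overlap(human_codes, machine_codes)
print(f"{overlap:.0%}")  # 29%
```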

2. The “Source Problem”: Clinical Practice vs. Research Needs
A major theme of the discussion was the fundamental disconnect between how data is generated in clinical practice and the needs of observational research. Clinicians emphasized that coding in practice is driven primarily by billing and clinical operations, not research precision. This systemic gap means the raw material for phenotyping (diagnosis codes) is often not generated with research-grade precision, complicating all downstream efforts.

3. The Importance of Context and Data
The group emphasized that concept set creation cannot be divorced from the broader study design and the underlying data. Workshop attendees expressed the need to see record counts to determine the relevance of esoteric codes and noted that strategies for inclusion/exclusion depend heavily on the intended cohort logic (e.g., looking for DME in an already defined diabetic population allowed for a more inclusive approach).

Next Steps and Future Vision

The immediate next step is the formal analysis of the exercise, calculating the F-scores for each human team and AI workflow against the adjudicated gold standard. The results will be shared with the community.
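
The exact definition of the prevalence-weighted F-score belongs to the study's analysis plan and is not given in this post. The sketch below assumes one common reading: each concept is weighted by its record count, so missing a rarely used code costs less than missing a heavily used one. The function name, record counts, and concept IDs are all hypothetical.

```python
def prevalence_weighted_f1(candidate, gold, record_counts):
    """F1 where every concept counts in proportion to its record count (assumed weighting)."""
    candidate, gold = set(candidate), set(gold)

    def weight(codes):
        # Concepts without a known record count contribute nothing.
        return sum(record_counts.get(c, 0) for c in codes)

    true_positive = weight(candidate & gold)
    candidate_weight = weight(candidate)
    gold_weight = weight(gold)
    precision = true_positive / candidate_weight if candidate_weight else 0.0
    recall = true_positive / gold_weight if gold_weight else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical record counts and concept sets, for illustration only.
record_counts = {101: 9000, 102: 900, 103: 90, 104: 10}
gold = {101, 102, 103}
candidate = {101, 104}

score = prevalence_weighted_f1(candidate, gold, record_counts)
print(round(score, 3))  # 0.947
```

In this toy example the candidate set finds only one of three gold codes, so an unweighted F1 would be 0.4, but because that one code carries most of the records the weighted score is about 0.947.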

Looking ahead, the working group stressed that this exercise is a stepping stone. The community must move beyond evaluating concept sets in isolation. As @Azza_Shoaibi and @JudyRac (Dr. Judith Racoosin) noted, OHDSI does not view phenotypes merely as code lists. Conceptual debates without data are often unproductive, and the ultimate validation requires running different design choices in the data to see if they affect the final patient cohort.

The goal for Phenotype Phebruary 2026 is to advance this work by having both humans and AI build complex, data-driven cohort definitions for meaningful clinical ideas, which can then be robustly evaluated across the OHDSI data network.


We extend our gratitude to the organizing team, the AI development teams, Will Kelly (JHU, Johns Hopkins University) for logistical support, the clinical adjudicators, and all workshop participants for their contributions to this vital research. The clinical adjudicators were Dr. @Evan_Minty (DVT), Dr. @Christopher_Mecoli (Lupus/Systemic Sclerosis, Johns Hopkins University), Dr. @cindyxcai (DME, Johns Hopkins University), Dr. @briantoy (Posterior Uveitis, University of Southern California USC Schaeffer Institute), and Dr. @Liz_Park (Rheumatoid Arthritis, Columbia University).

Minds Meet Machines: A Comprehensive Report on the OHDSI Phenotyping Challenge

The “Minds Meet Machines” event saw high engagement, bringing together over 50 in-person participants and approximately 50 online attendees, including clinical experts, researchers, and informaticians.

The “Minds Meet Machines” challenge is now moving into the formal analysis and dissemination phase.

Current Status (October 24th 2025):

Validation Plan:
The analysis code is undergoing a rigorous, multi-stage validation process before the final results are generated. See: Add Full MMM Challenge Analysis Pipeline (Phase 1–2) Implementing SAP by ghatesudi · Pull Request #7 · ohdsi-studies/MindMeetsMachines · GitHub

Timeline and Future Goals:

  • The study team is targeting the end of October 2025 to produce a draft of the scientific paper detailing the study’s methodology, comprehensive findings, and implications for the OHDSI community.
  • The findings from this challenge will be used to inform the design of a more advanced challenge for Phenotype Phebruary 2026. This future initiative may expand the scope of the evaluation from generating concept sets to developing full, executable cohort definitions.