Seeking input on services that the OHDSI Study Agent will provide

Hello!

My name is Rich Boyce and I am helping to lead/coordinate the new OHDSI Study Design Assistant project. The project is part of the OHDSI Generative AI workgroup and has the goal of developing an AI Study Agent that supports OHDSI researchers in more rapidly, accurately, and reproducibly determining research feasibility and designing research studies using OHDSI tools, including HADES and Atlas/WebAPI.

The Study Agent will be service-architected, providing AI-informed services to other tools used by researchers through standardized API calls. This will enable integration of the Study Agent into a variety of tools used by OHDSI researchers. See here for more information, including the architecture and an initial proof-of-concept.

In this post, I am seeking input on the set of services that we should target for the Study Agent. I have started a list of services that I think will help a researcher go from a study idea to a well-specified research question, and then from a research question to a computable study specification, complete with computable outcome and exposure phenotypes and parameters for executing the study. If this topic is of interest to you, would you please comment, edit, or make suggestions in reply, or post them on the project GitHub as issue tickets?

Below is the first draft of Study Agent services based on what I am calling “study intent” (a narrative description of the research question):

NOTE: at no time, for any of these services, would an LLM see row-level data. This can be accomplished through careful use of protocols (MCP for tooling, Agent Client Protocol for OHDSI tool ↔ LLM communication) and a security layer.
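As a rough illustration of that security-layer idea (all names below are hypothetical, not part of any OHDSI API), the tool layer could enforce small-cell suppression so that only policy-compliant aggregates are ever serialized back to the model:

```python
# Hypothetical sketch: tools return only aggregate statistics, never
# row-level records. The threshold and function names are illustrative.

MIN_CELL_COUNT = 5  # assumed small-cell suppression policy

def summarize_cohort_counts(raw_counts: dict) -> dict:
    """Return per-stratum counts with small cells suppressed, so the LLM
    only ever sees aggregates that pass the privacy policy."""
    return {
        stratum: (count if count >= MIN_CELL_COUNT else f"<{MIN_CELL_COUNT}")
        for stratum, count in raw_counts.items()
    }

# Only this suppressed aggregate dict would cross the MCP boundary.
safe = summarize_cohort_counts(
    {"age_18_44": 1203, "age_45_64": 3, "age_65_plus": 877}
)
```

The row-level query itself would run entirely on the OHDSI-tool side of the protocol boundary; the agent only ever receives the suppressed summary.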

  • phenotype_recommendations: Suggest relevant phenotypes from the thousands of phenotype definitions available from various credible sources (OHDSI Phenotype Library, VA CIPHER, a user’s own Atlas cohort definitions) for the study intent. Write cohort definition artifacts for any phenotype definitions the user accepts as relevant.
  • phenotype_improvements: Review selected phenotypes for improvements against the study intent. If the user accepts, write the new artifacts (JSON cohort definitions or Atlas cohort records).
  • phenotype_characterize: Generate R code that the user will run, or request the user’s permission to run Atlas services, to characterize the population of individuals that match a selected phenotype (i.e., same as a cohort characterization)
  • phenotype_data_quality_review: Check for likely issues and propose mitigations based on information from the Data Quality Dashboard, Achilles Heel data quality checks, and Achilles data source characterizations across the one or more data sources that a user intends to use within the study. For issues that the user acknowledges, patch the artifacts (JSON cohort definitions or Atlas cohort records).
  • phenotype_validation_review: Generate Keeper code for the user to run that will enable them to review case samples from the population of patients meeting a selected phenotype definition. The agent will write the code to draw the sample so that the user can compare performance characteristics from their sample with those known for the phenotype from other sources where it was tested.
  • cohort_definition_build: Write the Capr code for a user to define a phenotype or covariate relevant to the study intent for which a cohort definition has not yet been defined.
  • cohort_definition_lint: Review cohort JSON for general design issues (washout/time-at-risk, inverted windows, empty or conflicting criteria) and for execution efficiency (unnecessary criterion nesting, sub-optimal logical ordering of criteria) and write the proposed patches (new JSON or new cohort definitions in Atlas)
  • concept_set_recommendations: Based on a phenotype or covariate relevant to the study intent for which a cohort definition has not been defined, suggest relevant concept sets from sources available to the user (concept set JSON, Atlas) to use in a new cohort definition. If the user accepts, create the concept set artifacts.
  • propose_concept_set_diff: Review a concept set for gaps and inconsistencies given the study intent. If the user accepts, patch the concept set artifacts.
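To make the service-architected idea concrete, here is one possible request/response shape for phenotype_recommendations. This is purely a sketch; the field names and values are my assumptions, not a proposed specification:

```python
# Hypothetical request/response payloads for the phenotype_recommendations
# service. All keys and values are illustrative.

request = {
    "service": "phenotype_recommendations",
    "study_intent": "Incidence of acute pancreatitis among new users "
                    "of GLP-1 receptor agonists",
    "sources": ["ohdsi_phenotype_library", "va_cipher", "user_atlas"],
}

response = {
    "candidates": [
        {
            "name": "Acute pancreatitis",
            "source": "ohdsi_phenotype_library",
            "role": "outcome",
            "rationale": "Matches the outcome named in the study intent",
        }
    ],
}

def accepted_artifacts(resp: dict) -> list:
    """Names of candidates the user accepted (all of them, in this sketch),
    for which cohort definition artifacts would then be written."""
    return [c["name"] for c in resp["candidates"]]
```

A real service would presumably also carry provenance (phenotype library IDs, version hashes) so the written artifacts remain traceable.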

Thank you @rkboyce for leading this!

How about:

  • propose_negative_control_outcomes: Given a target (and optionally a comparator) recommend outcomes that are unlikely to be caused by the target (nor by the comparator)
  • review_negative_control: Given a target and an outcome, judge whether they are unlikely to be causally related.
  • propose_comparator: Given a target, propose a comparator. This could leverage the OHDSI Comparator Selector tool.

@rkboyce a lot of phenotype_ services have already been posted!

How about:

  • protocol_generator: understand the PICO/TAR and write a templated protocol
  • background_writer: based on the PICO/TAR and hypothesis, do (systematic) research and write a document justifying the study.
  • protocol_critiquer: given a protocol, critique it for required components and for internal consistency.
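A minimal sketch of what protocol_generator's templating step might look like; the template text and field names are invented for illustration only:

```python
# Hypothetical protocol_generator core: fill a protocol template from
# PICO/TAR elements. Template and keys are illustrative assumptions.
from string import Template

PROTOCOL_TEMPLATE = Template(
    "Objective: Estimate the effect of $target versus $comparator "
    "on $outcome\n"
    "Population: $population\n"
    "Time-at-risk: $tar"
)

def generate_protocol(pico_tar: dict) -> str:
    """Render a first-draft protocol section from PICO/TAR elements."""
    return PROTOCOL_TEMPLATE.substitute(pico_tar)

draft = generate_protocol({
    "population": "Adults with type 2 diabetes",
    "target": "GLP-1 receptor agonists",
    "comparator": "DPP-4 inhibitors",
    "outcome": "acute pancreatitis",
    "tar": "1 day after index to 365 days after index",
})
```

In practice an LLM would fill and elaborate the template rather than a fixed substitution, but the agent's validation step could still check that every required PICO/TAR slot was populated.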
  • phenotype_dataset_profiler: execute phenotype on multiple datasets and write an elevator pitch summary that compares which phenotype definition elements cause the biggest differences in cohort size (think CohortDiagnostics)
  • strategus_*: compose/compare/edit/critique/debug study specification (formulated using Strategus .json)
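For the strategus_compare part of that idea, one plausible primitive is a recursive diff over two study specification documents. The JSON structure shown here is generic, not the actual Strategus schema:

```python
# Sketch of a strategus_compare building block: report paths whose values
# differ between two nested study-specification dicts. Structure is
# illustrative, not the real Strategus .json layout.

def spec_diff(a: dict, b: dict, path: str = "") -> list:
    """Return the paths at which two nested specification dicts disagree."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        p = f"{path}/{key}"
        if key not in a or key not in b:
            diffs.append(p)  # present in only one specification
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            diffs.extend(spec_diff(a[key], b[key], p))
        elif a[key] != b[key]:
            diffs.append(p)
    return diffs

old = {"cohorts": {"target": 101, "outcome": 202},
       "tar": {"start": 1, "end": 365}}
new = {"cohorts": {"target": 101, "outcome": 203},
       "tar": {"start": 1, "end": 365}}
```

An agent could then explain each differing path in plain language ("the outcome cohort ID changed") rather than showing raw JSON to the user.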

Adding to Gowtham's suggestions: some “phenotype fit” checking of whether the phenotypes chosen make sense for the protocol (separate from what I interpreted the current phenotype recommendations to be, which have more to do with the specific components and implementation of an already given phenotype definition).

This agent sounds so cool! For the protocol_generator, it might be beneficial if it could provide rationales for the choices of important components of the protocol, for example, why a specific active comparator was chosen. I also anticipate it would allow revision of the protocol through a simple function call and a few hints from users.

Also, what about a design_diagram function, which generates a figure illustrating the study design?

Thank you @XiaotongLi, @Gowtham_Rao, @schuemie , and @patrick.alba for the suggestions.

We had a productive discussion on the GenAI forum which led to some follow ups I will work on including:

  1. clarifying that the concept is better described as a study design assistant that helps users leverage OHDSI tools to produce rigorous and reproducible OHDSI study designs (characterization, estimation, or prediction).

  2. I will refactor the service suggestions to mention the inputs, outputs, and validation approach.

  3. I will focus the next couple weeks of work on phenotype recommendations for target and outcome cohorts for incidence rate analyses.

    • On this point, can anyone point to interesting prior incidence rate analyses that would be good test cases for the next iteration of the tool?

Btw: the AMIA fall symposium submission deadline is March 10th — any interest among this group in helping put together a submission about this work?

This area is moving fast; projects that provide MCP servers over MIMIC and other datasets show some directions this area of development has been heading.

@Vojtech_Huser had the good idea of inviting the M3 folks to our WG. If we want to do that, I could try to reach them. Another approach might be to put together a panel discussion for the fall AMIA symposium or the OHDSI symposium. Thoughts?