Databricks Genie Spaces for OMOP

We have clinical teams who would like to use OHDSI/OMOP to answer operational questions (often sizing and characterizing clinical cohorts) without having to become Atlas experts.

I’d like to compare notes with others who are experimenting with using Databricks AI to work with the OMOP data model to support that need. Databricks Genie Spaces are getting more robust, letting users use internal (HIPAA/BAA-covered) Anthropic Opus agents to plan and execute analyses. I’m seeing good success with smaller data models.

Databricks lets people create a set of benchmark questions, each with ground-truth SQL that answers it, and then uses them to test the LLM-generated SQL and its query results against the ground-truth SQL and its results. I’d like to hear about others’ experiences curating a set of benchmark questions. I expect a Genie Space might hold up to a certain point, but there will be exceptions. I’m trying to understand where it works well and what guardrails we should consider.
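To make the benchmark idea concrete, here is a minimal sketch of a result-level comparison between a ground-truth query and a candidate query. It is not how Databricks implements its benchmarks; it only illustrates the general technique. It assumes an in-memory SQLite database with a tiny, simplified subset of the OMOP `person` and `condition_occurrence` tables, and the helper `results_match`, the sample rows, and the concept IDs are all hypothetical placeholders.

```python
import sqlite3
from collections import Counter


def results_match(conn, ground_truth_sql, candidate_sql):
    """Return True if both queries yield the same multiset of rows.

    Row order and column names are ignored; only the values are compared.
    (Hypothetical helper for illustration, not a Databricks API.)
    """
    truth = Counter(conn.execute(ground_truth_sql).fetchall())
    candidate = Counter(conn.execute(candidate_sql).fetchall())
    return truth == candidate


# Toy data: a simplified subset of two OMOP tables (illustrative values only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
    CREATE TABLE condition_occurrence (
        person_id INTEGER,
        condition_concept_id INTEGER
    );
    INSERT INTO person VALUES (1, 1950), (2, 1980), (3, 1990);
    -- Person 1 has two records for the same condition concept.
    INSERT INTO condition_occurrence VALUES
        (1, 201826), (1, 201826), (2, 201826), (3, 316866);
    """
)

# Benchmark question: "How many patients have condition concept 201826?"
ground_truth = """
    SELECT COUNT(DISTINCT person_id)
    FROM condition_occurrence
    WHERE condition_concept_id = 201826
"""

# A differently written query that counts distinct patients.
candidate_ok = """
    SELECT COUNT(*) FROM (
        SELECT DISTINCT person_id
        FROM condition_occurrence
        WHERE condition_concept_id = 201826
    )
"""

# A plausible mistake: counting condition records instead of patients.
candidate_bad = """
    SELECT COUNT(*)
    FROM condition_occurrence
    WHERE condition_concept_id = 201826
"""

ok_matches = results_match(conn, ground_truth, candidate_ok)
bad_matches = results_match(conn, ground_truth, candidate_bad)
print("candidate_ok matches:", ok_matches)
print("candidate_bad matches:", bad_matches)
```

Comparing results rather than SQL text lets differently written but equivalent queries pass, while the record-versus-patient counting mistake fails. A real benchmark set would probably need cases like this one, where the wrong answer looks reasonable at a glance.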

Such a set of standard benchmark questions might also nicely complement DQD to help others avoid reinventing the wheel.
