Hi everyone,
I am currently working on a project focused on improving the explainability of AI-generated SQL queries. One approach we are exploring is the development of a sanitiser/validator that reviews LLM-generated SQL and checks it against a set of rules to ensure the query complies with OMOP conventions.
For example, the sanitiser might validate that:
- tables referenced belong to the OMOP CDM schema.
- clinical concepts are identified by concept_id rather than by string matching on concept names.
- standard concepts are used, or that non-standard concepts are appropriately mapped to standard concepts.
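To make the first two rules concrete, here is a rough sketch in Python of the kind of check I have in mind. It is only an illustration under several assumptions: the table list is a small hand-picked subset of the OMOP CDM rather than the full schema, the names `OMOP_TABLES`, `TABLE_REF`, `STRING_MATCH` and `check_query` are hypothetical, and regular expressions stand in for a real SQL parser (so CTE names, `EXTRACT(... FROM ...)` and similar constructs would produce false positives). The third rule is not covered, since checking standard versus non-standard concepts would need a lookup against the vocabulary tables.

```python
import re

# Assumption: an illustrative subset of OMOP CDM table names, not the full schema.
OMOP_TABLES = {
    "person", "observation_period", "visit_occurrence", "condition_occurrence",
    "drug_exposure", "procedure_occurrence", "measurement", "observation",
    "concept", "concept_relationship", "concept_ancestor",
}

# Captures the identifier that follows FROM or JOIN (optionally schema-qualified).
TABLE_REF = re.compile(r"\b(?:from|join)\s+([a-z_][\w.]*)", re.IGNORECASE)

# Captures text-like columns (concept_name, *_source_value) compared to a string literal.
STRING_MATCH = re.compile(
    r"\b([\w.]*(?:concept_name|_source_value))\s*(?:=|like|ilike|in)\s*\(?\s*'",
    re.IGNORECASE,
)


def check_query(sql: str) -> list[str]:
    """Return a list of human-readable findings for one SQL string."""
    findings = []
    # Rule 1: every referenced table should be a known OMOP CDM table.
    for ref in TABLE_REF.findall(sql):
        table = ref.split(".")[-1].lower()  # drop any schema prefix
        if table not in OMOP_TABLES:
            findings.append(f"unknown table: {table}")
    # Rule 2: concepts should not be selected by matching on text columns.
    for col in STRING_MATCH.findall(sql):
        findings.append(f"string match on: {col.lower()}")
    return findings


bad = (
    "SELECT * FROM cdm.condition_occurrence co "
    "JOIN cdm.concept c ON c.concept_id = co.condition_concept_id "
    "JOIN staging.diagnoses d ON d.id = co.person_id "
    "WHERE c.concept_name LIKE '%diabetes%'"
)
# 201826 is used here purely as an illustrative concept_id value.
good = "SELECT person_id FROM condition_occurrence WHERE condition_concept_id = 201826"

print(check_query(bad))   # ['unknown table: diagnoses', 'string match on: c.concept_name']
print(check_query(good))  # []
```

Returning a list of findings rather than a pass/fail flag seems useful for the explainability goal, because each finding can be shown to the user alongside the generated query.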
I am looking for documentation or resources that describe what makes a SQL query compliant with OMOP standards, particularly anything that formalises best practice or defines rules that could be used in an automated validator.
Does anyone know of existing resources or have suggestions for systematically defining compliance rules for OMOP SQL queries?
Many thanks in advance.
Shihao