Accelerating OMOP Conversion with a Free LLM-Powered Tool – Looking for Test Users

Hi everyone,

I’m developing a new tool designed to dramatically accelerate the conversion of source databases to the OMOP CDM, leveraging the power of large language models (LLMs). The goal is to minimize the time from gaining access to a source database to generating the first OMOPified result—typically within 30 minutes, including an initial Data Quality Dashboard (DQD) run.

Key Features:

  1. LLM-First Approach:
    We use an LLM trained to understand our custom syntax for structural mapping. This allows users to describe mappings using natural-language prompts.
  2. Mapping-as-Code Approach:
    We developed a special-purpose declarative language optimized for structural mapping (similar to Terraform) to get all the benefits of code: collaboration, version control, code as documentation, tests, and the ability to use LLMs.
  3. Declarative Mapping Syntax:
    Each OMOP field is defined with an elementary SQL expression for data extraction. Fields can also be flagged for semantic mapping.
  4. Automated Semantic Mapping:
    The system performs initial vocabulary (concept) mapping automatically. Users can then review and fine-tune the mappings as needed.
  5. Security by Design:
    Only statistical information is sent to the backend (like a WhiteRabbit scan report). The tool can also work with a synthetic version of the database and produce an ETL that is then applied to the real data.
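
The tool's actual mapping syntax is not shown in this post, so the following is only a rough Python sketch of the idea behind features 2 and 3: each OMOP field is declared with one elementary SQL expression, and fields can be flagged for semantic mapping. All names here (`FieldMapping`, `TableMapping`, `render_select`, the `patients` source table and its columns) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    cdm_field: str          # target OMOP CDM column
    sql_expr: str           # elementary SQL expression over the source table
    semantic: bool = False  # True = value still needs vocabulary (concept) mapping


@dataclass(frozen=True)
class TableMapping:
    cdm_table: str      # target OMOP CDM table
    source_table: str   # hypothetical source table name
    fields: tuple       # tuple of FieldMapping


def render_select(mapping: TableMapping) -> str:
    """Turn the declarative mapping into a plain SELECT statement."""
    cols = ",\n".join(f"    {f.sql_expr} AS {f.cdm_field}" for f in mapping.fields)
    return f"SELECT\n{cols}\nFROM {mapping.source_table}"


def semantic_fields(mapping: TableMapping) -> list:
    """List the CDM fields flagged for later semantic (concept) mapping."""
    return [f.cdm_field for f in mapping.fields if f.semantic]


# Hypothetical mapping of a source "patients" table to the CDM person table.
person = TableMapping(
    cdm_table="person",
    source_table="patients",
    fields=(
        FieldMapping("person_id", "patient_id"),
        FieldMapping("year_of_birth", "EXTRACT(YEAR FROM birth_date)"),
        FieldMapping("gender_source_value", "sex", semantic=True),
    ),
)

if __name__ == "__main__":
    print(render_select(person))
    print("Needs semantic mapping:", semantic_fields(person))
```

Because the mapping is plain data, it can be diffed, reviewed, and tested like any other code, which is the benefit the mapping-as-code approach is after.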

Outcome:

  • Rapid onboarding: From database connection to initial OMOP CDM + DQD results in ~30 minutes.
  • Iterative refinement: Quickly adjust mappings and improve output over time.

I’m currently looking for test users who are willing to try the product for free and provide feedback. If you’re working on an OMOP conversion project and interested in reducing the time and effort required, I’d love to hear from you!

5 Likes

Hi Artem, you may count on me! Ping me on LinkedIn: https://www.linkedin.com/in/4iurchenko/

1 Like

I am interested as well

I am interested as well. Please could you message me on https://www.linkedin.com/in/mangeshswagle/

2 Likes

done

1 Like

messaged

Looks very interesting, how good is the AI mapping? Does it map everything end to end?

Yes, it maps end to end. The full process, from scanning the source database through structural mapping of a few tables and semantic mapping to running DQD, typically takes 30 minutes. Each subsequent iteration then adds more tables until the conversion is complete.

1 Like

Just wanted to add, we had a fantastic meeting with Artem. During it, I explored the functionality of the tool. From what I learned, the platform provides automated schema mapping between datasets, with a supervisor mode for manual review and corrections where needed. This can reduce the manual overhead involved in harmonizing heterogeneous biomedical datasets. AI tools work great under the hood; sometimes, they need some course correction, but they help with 90+% of the work, allowing us to focus on the course correction instead of starting from scratch. It was amazing.

Looking forward to testing the new release.
@ents great work!

3 Likes

My experience using ChatGPT 4.0 to map ICD codes to SNOMED was really a dead end. Most of the recommended mappings were wrong, sometimes ridiculously so, as I recall. I gave up after a few tries. Granted, this was not schema mapping as described here, but I would check semantic mapping carefully if you are using an LLM to help with OMOP conversion.

2 Likes

Sounds cool! Happy to test when the new release is out. Thanks.

We have a complex semantic mapping algorithm that combines an LLM with classic search (like Usagi or Athena).
I’m happy to show you our suggester, and we can check its accuracy on your examples.
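
As a rough illustration of what "checking accuracy on your examples" could look like, here is a minimal Python sketch. It is not the tool's algorithm: it ranks candidates with plain string similarity from the standard library (`difflib`) as a stand-in for classic search, leaves out the LLM step entirely, and scores top-1 accuracy against a small hand-verified set. The vocabulary ids are placeholders, not real OMOP concept_ids, and all function names are hypothetical.

```python
from difflib import SequenceMatcher

# Toy vocabulary. The ids are placeholders, NOT real OMOP concept_ids.
TOY_VOCAB = {
    1: "Type 2 diabetes mellitus",
    2: "Essential hypertension",
    3: "Asthma",
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest(source_term: str, vocab: dict, top_k: int = 3) -> list:
    """Return up to top_k (id, name, score) candidates, best first."""
    scored = [(cid, name, similarity(source_term, name)) for cid, name in vocab.items()]
    scored.sort(key=lambda row: row[2], reverse=True)
    return scored[:top_k]


def top1_accuracy(gold: dict, vocab: dict) -> float:
    """Share of source terms whose best suggestion matches the verified id."""
    hits = sum(
        1 for term, expected in gold.items()
        if suggest(term, vocab, top_k=1)[0][0] == expected
    )
    return hits / len(gold)


# Hand-verified examples: source term -> expected placeholder id.
gold = {
    "asthma": 3,
    "essential hypertension": 2,
    "diabetes mellitus type 2": 1,
}

if __name__ == "__main__":
    for term in gold:
        print(term, "->", suggest(term, TOY_VOCAB, top_k=1)[0])
    print("top-1 accuracy:", top1_accuracy(gold, TOY_VOCAB))
```

The same harness works with any suggester that returns ranked candidates, so a set of manually reviewed source terms can be reused to compare approaches.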

We also use the LLM not only for semantic mapping but for structural mapping as well, and this is novel. The LLM analyzes the source tables and fields and relates them to the CDM tables and fields.

It was a pleasure to talk with you and to read your product feedback. BTW, we have already added tasks to the current sprint based on your feedback.

Quite interesting. Happy to have a look and provide feedback: https://www.linkedin.com/in/albertolabarga/ We have several ongoing projects to transform data to the OMOP CDM in federated settings.

2 Likes

Hi Artem,
I would like to volunteer as a test user.
Please ping me: https://www.linkedin.com/in/chinju-john-a95876b4/

1 Like

Hi! I could also take a look at your implementation and test it out.
Kindly send me the information on the next steps.

linkedin: https://www.linkedin.com/in/mithilesh-prakash/

BR,
Mithilesh
UEF, Finland

Are you still looking for test users? My LinkedIn: Adam Johnson