OMOP Concept Retriever - Multilingual Semantic Search for OMOP Concepts

Hello OHDSI Community,

I’m excited to share a new open-source tool that might be useful for researchers and developers working with OMOP CDM: the OMOP Concept Retriever with OpenAI and ChromaDB.

What is it?

This tool searches and retrieves OMOP concepts from natural language queries. It’s particularly useful for:

  • Finding relevant OMOP concepts using everyday medical terminology
  • Supporting multilingual queries (English, Chinese, Japanese, etc.)
  • Performing semantic similarity searches across the entire OMOP vocabulary
  • Filtering by domain (Condition, Drug, Procedure, etc.)

Key Features

  • Natural Language Understanding: Finds relevant concepts even when the exact terminology doesn’t match
  • Multilingual Support: Query in your preferred language
  • Fast Retrieval: Optimized for quick concept lookups
  • Flexible Deployment: Can be used as a command-line tool or integrated into applications via API

How It Works

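The tool uses OpenAI’s embedding models to create semantic representations of OMOP concepts, then performs vector similarity searches over them using ChromaDB. Because matching happens on meaning rather than on characters, a query can find the right concept even when its wording differs from the concept name, which traditional exact or fuzzy string matching cannot do.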
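To make the retrieval idea concrete, here is a minimal, self-contained sketch of a vector similarity search with an optional domain filter. It is not the project's code: `toy_embed`, `cosine`, `search`, and the sample `CONCEPTS` rows (including their placeholder IDs) are all hypothetical names made up for illustration. The character-trigram "embedding" is only a stand-in so the example runs without an API key; it captures spelling overlap, not meaning, so it cannot reproduce the semantic or multilingual behaviour that real OpenAI embeddings stored in ChromaDB would give.

```python
import math
from collections import Counter

# Hypothetical sample rows. The concept_id values are placeholders,
# not real OMOP concept IDs.
CONCEPTS = [
    {"concept_id": 1, "name": "Type 2 diabetes mellitus", "domain": "Condition"},
    {"concept_id": 2, "name": "Essential hypertension", "domain": "Condition"},
    {"concept_id": 3, "name": "Metformin", "domain": "Drug"},
    {"concept_id": 4, "name": "Appendectomy", "domain": "Procedure"},
]


def toy_embed(text):
    """Stand-in for an embedding model: sparse character-trigram counts."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, domain=None, top_k=3):
    """Return up to top_k (score, concept) pairs, best match first."""
    query_vec = toy_embed(query)
    # Restrict candidates to one domain when a filter is given.
    candidates = [c for c in CONCEPTS if domain is None or c["domain"] == domain]
    scored = [(cosine(query_vec, toy_embed(c["name"])), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for score, concept in search("diabetes type 2", domain="Condition"):
        print(f"{score:.3f}  {concept['concept_id']}  {concept['name']}")
```

In a real setup the vectors would be computed once, stored in a vector database, and queried there instead of being recomputed and scanned linearly on every search as this sketch does.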

Getting Started

You can find the project on GitHub: lanesky/omoprag

The README includes detailed setup instructions, usage examples, and API documentation.

Example Use Cases

  1. Finding relevant concepts when the exact terminology is unknown
  2. Mapping local codes to standard OMOP concepts
  3. Exploring related concepts for research or analysis
  4. Building more intuitive interfaces for concept selection

Best regards,