OMOP licensing information?

Hello! I am exploring the OHDSI CDM because it may become relevant to my organization, which may fall under the ‘vendor’ category. What licensing information do we need to be aware of? Thank you!

You need nothing, @yulingjiang. All OMOP and OHDSI artifacts are Open Source and free. The only exceptions are proprietary vocabularies. On the Athena website, you will see which ones they are.

@Christian_Reich Thank you!!

When visiting the Athena portal, a message pops up listing the SNOMED and HemOnc license agreements. Does this mean that in order to use Athena, one needs to accept both the SNOMED and HemOnc license agreements, even if one does not want to use anything related to HemOnc?

Is there an authoritative document stating the terms of usage of Athena and/or OMOP Vocab and Mapping files?


SNOMED and HemOnc are so-called click-through licenses. By using Athena you are essentially subject to their stipulations. You cannot avoid SNOMED, but you can choose not to download and use HemOnc. So if you want to ignore HemOnc completely, you can do so.

Thank you for your clarification!

Hey @Christian_Reich, can you provide us with a written statement of what exactly people without a SNOMED license (in a country that does not have a SNOMED license) are allowed to do with the SNOMED-OMOP download, and what they are not allowed to do?

@matentzn This is a pretty interesting question. I would really love to know its answer.

Here it is:

5.1. SNOMED International grants OHDSI access to SNOMED CT during the Term for the sole purpose of, and solely to the extent necessary for use only in OHDSI products that are available on the GitHub OHDSI organization (OHDSI products), such as OMOP CDM, Athena, Atlas, etc. Use of SNOMED CT extracted from OHDSI products and used locally by OHDSI users are not included in the Agreement and users should be directed to contact SNOMED International for license for use.

Bottom line: You can use it for ETLing OMOP CDM instances, and it can show up in OHDSI tools. This is the legal description for “you can do observational research in OHDSI”. What you cannot do is to extract it for other purposes or redistribute it further. What they really are allergic to is use in any patient care products (EHR, imaging systems, lab systems, pathology systems etc.).

Makes sense?

Thank you @Christian_Reich for the explanation. It is a bit disappointing - OMOP should really try to use open vocabularies for its standard concepts. I don’t really mind that I can’t use SNOMED, but what I find difficult is that users get their data aligned to OMOP (often at quite an investment), only to learn later that most of their data was aligned with SNOMED-sourced OMOP standard codes, while their country does not have a SNOMED license. Maybe I am also still misunderstanding something. I work with a few communities that seek to align their data with OMOP. Everyone watches these (pretty awesome) talks by Kristin, learning about how they can group their data using OMOP - but it seems that, for the most part, they are not really allowed to do many of these awesome things. Even something simple like building a Knowledge Graph with OMOP semantic relations is now not possible, and strictly speaking, I can’t even publish a Jupyter notebook that counts how many cardiovascular diseases have been reported in my cohort.

Hm. Why not? You can do anything with that OMOP CDM instance. Go wild. Anywhere in the world.

You can’t take the SNOMED codes out and use them for something else that is not related to our research. Or distribute them further.

Thank you @Christian_Reich for your patience to answer. Highly appreciated!

What I want to do: Take my internal OMOP patient data, merge it with the CONCEPT_RELATIONSHIP and CONCEPT_ANCESTOR tables downloaded from OMOP (i.e. Athena), and stick this into a Knowledge Graph. Now maybe I am misunderstanding where these relationships (or sub-concept relations) come from (the nice rich ones you use in your tutorials), but aren’t these coming from SNOMED (or other vocabs, most of which have license restrictions)? How do I get all the relationships between OMOP Standard concepts if not by selecting a specific vocabulary?
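To make the question concrete, here is a minimal sketch of how one might check which vocabularies the edges between standard concepts come from. It assumes the tab-delimited CONCEPT and CONCEPT_RELATIONSHIP files from an Athena download and uses their OMOP column names; the rows, concept IDs, and helper function names are made up for illustration.

```python
import csv
import io

# Toy stand-ins for Athena's tab-delimited CONCEPT.csv and
# CONCEPT_RELATIONSHIP.csv. Column names follow the OMOP CDM;
# the concept IDs and rows are invented for this example.
CONCEPT_TSV = (
    "concept_id\tconcept_name\tvocabulary_id\tstandard_concept\n"
    "1\tCardiorenal disease\tSNOMED\tS\n"
    "2\tHeart structure\tSNOMED\tS\n"
    "3\tSome source code\tICD10CM\t\n"
)
CONCEPT_RELATIONSHIP_TSV = (
    "concept_id_1\tconcept_id_2\trelationship_id\n"
    "1\t2\tHas finding site\n"
    "3\t1\tMaps to\n"
)


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def standard_edges(concepts, relationships):
    """Keep only relationships whose two endpoints are both standard concepts."""
    standard = {c["concept_id"] for c in concepts if c["standard_concept"] == "S"}
    return [
        (r["concept_id_1"], r["relationship_id"], r["concept_id_2"])
        for r in relationships
        if r["concept_id_1"] in standard and r["concept_id_2"] in standard
    ]


def vocabularies_used(concepts, edges):
    """Collect the vocabulary_id of every concept that appears in an edge."""
    by_id = {c["concept_id"]: c["vocabulary_id"] for c in concepts}
    return {by_id[node] for s, _, o in edges for node in (s, o)}


concepts = read_tsv(CONCEPT_TSV)
edges = standard_edges(concepts, read_tsv(CONCEPT_RELATIONSHIP_TSV))
print(edges)
print(vocabularies_used(concepts, edges))
```

On a real download, the second print would show which source vocabularies (and therefore which licenses) the standard-to-standard edges depend on.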

The relationship graph of SNOMED in the OHDSI vocabulary is different from the one found in UMLS/SNOMED stand-alone. If you are looking to do stuff with knowledge graphs, I recommend https://github.com/callahantiff/OMOP2OBO

It is, @Juan_Banda? How so?

That’s it. You are all set. Go do it. Make a graph. Analyze it. Tell us how it works and what you find.

(How else should I say “you are covered by the license”? :) )

Semantic types, and some of the term types, are not in the OHDSI vocabulary (see “UMLS Metathesaurus - SNOMEDCT_US (SNOMED CT, US Edition) - Statistics”), and these are useful for most knowledge graph-related tasks. But I don’t advocate for them to be inside the vocabulary, as it is a different use case.

Got it. True. I was thinking you meant the relationships (edges of a graph), because they should be there. Let’s see what @matentzn comes up with.

@Juan_Banda I am a bit involved in OMOP2OBO with Tiffany, but this is at the heart of what I want to do: create a knowledge graph with all OBO ontologies on the one side and all OMOP standard concepts and all their interrelations on the other, and try to determine the difference in terms of query recall and expressivity between the two. It is still unclear when and how one or the other should be used as the “semantic layer” in a knowledge graph (given a good mapping between them).

@Christian_Reich what you say sounds great, but I am not yet sure I am entirely convinced :) Because of my anxiety regarding licenses, I will tell you exactly what the plan is:

  1. Download all relationships between standard concepts, including is-a parentage. For example: “cardiovascular renal disease” (Athena) is a “Cardiorenal disease” (Athena), and “Cardiorenal disease” --[has finding site]--> “heart”. As far as I understand, I must select the “SNOMED” box in the OMOP Vocabulary list (Athena) in order to get these relationships, right?
  2. I am building a LinkML (https://linkml.io/) model version of the OMOP data model (this can be done mostly automatically).
  3. I will convert the edge set in (1) to RDF using (2)
  4. I will generate synthetic patient data randomly according to the OMOP CDM and convert it to RDF using (2)
  5. I will stand up a public triple store that demonstrates the utility of OMOP for knowledge graph integration by loading (3) and (4) and then providing SPARQL queries like “find all patients that have been observed with a condition that affects the heart”.
  6. I will tweak the representation (knowledge graph structure) with my collaborators until it’s great and use the semantic layer (without the synthetic data) for my local private integration problem on real data.
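Steps 1 and 3 to 5 can be sketched in a few lines. This is only an illustration under stated assumptions: plain Python tuples stand in for RDF triples, a small function stands in for the SPARQL query, and the edge set is hand-written rather than downloaded. The “Asthma” and “Lung” rows, the `has condition` predicate, and all function names are invented for the example.

```python
import random

# Stand-in for step 1: a tiny hand-written edge set of
# (subject, predicate, object) triples instead of an Athena download.
EDGES = [
    ("Cardiovascular renal disease", "Is a", "Cardiorenal disease"),
    ("Cardiorenal disease", "Has finding site", "Heart"),
    ("Asthma", "Has finding site", "Lung"),  # illustrative row
]
CONDITIONS = ["Cardiovascular renal disease", "Cardiorenal disease", "Asthma"]


def synthetic_patients(n, seed=0):
    """Step 4 stand-in: assign each synthetic patient one random condition."""
    rng = random.Random(seed)
    return [
        (f"patient/{i}", "has condition", rng.choice(CONDITIONS))
        for i in range(n)
    ]


def ancestors(concept, triples):
    """Return the concept plus everything reachable from it via 'Is a' edges."""
    seen, stack = {concept}, [concept]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s == node and p == "Is a" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def patients_with_site(site, triples):
    """Step 5 stand-in: patients whose condition, or one of its ancestors,
    has the given finding site."""
    at_site = {s for s, p, o in triples if p == "Has finding site" and o == site}
    return sorted(
        s
        for s, p, o in triples
        if p == "has condition" and ancestors(o, triples) & at_site
    )


graph = EDGES + synthetic_patients(5)
print(patients_with_site("Heart", graph))
```

A patient recorded with “Cardiovascular renal disease” is found only because the query walks the is-a edge up to “Cardiorenal disease”, which is exactly the part of the graph that comes from the vocabulary download.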

So my worry stems from the fact that:

A. In order to do (1), I have to select the SNOMED vocabulary in the Athena vocab list to download
B. The triple store will have a public endpoint and may therefore constitute a case of re-distribution

Thanks for working through this with me :)

Just to clarify one thing (in case it was not clear enough): the public triple store contains only synthetic data (randomly generated).

Ah, synthetic data. You did say that!

Usually we don’t get into this kind of trouble, where we de facto distribute the vocabularies as part of the data. The reason is that the data are usually well hidden and protected behind firewalls. With synthetic data, though, you can expose the “semantic layer” to the fresh air.

So, this is an edge case very few people will face, and the agreement is designed for the standard OHDSI research cases. Clearly, your research is not intended to do any of the things the license is trying to ban, but to make sure, let me find out what SNOMED thinks about it.