We’re preparing to instantiate an OMOP CDM v5.3.1 on SQL Server and load it from our Epic Caboodle database. I want to ensure we’re getting off on the right foot, especially in regards to “future proofing”. I have some SQL Server data type questions that are probably best posed to @clairblacketer, @Christian_Reich, and @Patrick_Ryan as the authors of the SQL Server DDL.
Why VARCHAR(1) for Boolean flags instead of TINYINT or even CHAR(1)? (I can understand hesitancy over using SQL Server’s BIT data type.)
Did OHDSI make a conscious choice to use VARCHAR instead of NVARCHAR? (Epic uses NVARCHAR in Caboodle.)
Like many academic organizations, we will use our OMOP instance both for our internal researchers and for contributing data to various research networks. I do not want to make any technical decisions that would get us into trouble later for either of these use cases. Thanks so much!
One other point that I found is to make the collation = SQL_Latin1_General_CP1_CS_AS for the underlying SQL Server database. This caused problems when I started out with the default as SQL_Latin1_General_CP1_CI_AS
The changes you have identified will not create any problems in creating your ETL and should not be a problem when running the Data Quality Dashboard, Achilles or Atlas. But you will be the ones testing these changes, so please update this post as you start running the QC and analytic programs.
I can confirm that the data type changes I enumerated above have not caused us any problems over the past 2 years with the OHDSI tools that @DTorok mentioned:
ATLAS
ACHILLES (including HEEL)
Data Quality Dashboard (DQD)
For the record, we’re using Microsoft SQL Server 2019 Enterprise Edition (version 15.0.4153.1).
@quinnt
I am the head of IT at the Cliniques Universitaires Saint-Luc Hospital in Brussels, Belgium.
We are at the beginning of the OMOP implementation project with our EMR EPIC and you seem to have already done this exact same project. Can we contact you for feedback and some guidance?