We’re preparing to instantiate an OMOP CDM v5.3.1 on SQL Server and load it from our Epic Caboodle database. I want to ensure we’re getting off on the right foot, especially in regards to “future proofing”. I have some SQL Server data type questions that are probably best posed to @clairblacketer, @Christian_Reich, and @Patrick_Ryan as the authors of the SQL Server DDL.
- Why VARCHAR(1) for Boolean flags instead of TINYINT or even CHAR(1)? (I can understand hesitancy over using SQL Server’s BIT data type.)
- Did OHDSI make a conscious choice to use VARCHAR instead of NVARCHAR? (Epic uses NVARCHAR in Caboodle.)
- Why FLOAT in the COST table instead of NUMERIC? Any concerns with aggregating monetary amounts?
- Why FLOAT in the LOCATION table instead of NUMERIC(10,8) for latitude (+/-90 degrees) and NUMERIC(11,8) for longitude (+/-180 degrees)?
- I imagine there is no harm in future proofing our CDM v5.3.1 by using BIGINT now for the non-concept ID primary keys?
- Confirm that we are free to increase the length of any VARCHAR column, if we need to do so?
Like many academic organizations, we will use our OMOP instance both for our internal researchers and for contributing data to various research networks. I do not want to make any technical decisions that would get us into trouble later for either of these use cases. Thanks so much!