
Location.Address maxlen mismatch for ETL

Hi folks,

I’m working on designing an ETL process into OMOP CDM.
I discovered that the “Location.Street” field has maxlen=220 in the original data model. In OMOP CDM, “Location.Address1” and “Location.Address2” together give only maxlen=100, which is still too short.

Has anyone else run into such a problem? What would be the appropriate way of solving it?

Thanks,
Sofia

Hello @slis,

According to the CDM Data Model Conventions documentation, it is permissible to increase the length of any VARCHAR (i.e., string) columns:

The OMOP CDM is platform-independent. Data types are defined generically using ANSI SQL data types (VARCHAR, INTEGER, FLOAT, DATE, DATETIME, CLOB). Precision is provided only for VARCHAR. It reflects the minimal required string length and can be expanded within a CDM instantiation [@quinnt - emphasis mine]. The CDM does not prescribe the date and datetime format. Standard queries against CDM may vary for local instantiations and date/datetime configurations.
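
To make the "can be expanded" point concrete, here is a minimal sketch that builds the ALTER statements for widening the two address columns to 220 characters. It assumes the CDM v5 column names `address_1` and `address_2` on the `location` table; the helper `widen_varchar` and the `ALTER_TEMPLATES` mapping are hypothetical names made up for illustration, and only PostgreSQL and SQL Server syntax are covered. Check the generated statements against your own platform before running them.

```python
# Hypothetical helper (not part of any OHDSI tool): builds DDL that widens
# a VARCHAR column in a local CDM instance. ALTER syntax differs by
# platform, so only two dialects are sketched here.
ALTER_TEMPLATES = {
    "postgresql": "ALTER TABLE {table} ALTER COLUMN {column} TYPE VARCHAR({length});",
    "sqlserver": "ALTER TABLE {table} ALTER COLUMN {column} VARCHAR({length});",
}


def widen_varchar(table, column, length, dialect="postgresql"):
    """Return an ALTER statement that sets a VARCHAR column to `length`.

    Table and column names are interpolated directly, so pass only
    trusted, hard-coded identifiers.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if dialect not in ALTER_TEMPLATES:
        raise ValueError(f"unsupported dialect: {dialect}")
    return ALTER_TEMPLATES[dialect].format(
        table=table, column=column, length=length
    )


if __name__ == "__main__":
    # Assumed CDM v5 column names for the location table.
    for col in ("address_1", "address_2"):
        print(widen_varchar("location", col, 220))
```

Generating the statements rather than hand-editing the DDL keeps a record of every column that differs from the standard CDM definition, which may help when the schema is upgraded later.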

1 Like

Hi Tim, thank you so much for the quick reply!

Just to make sure I understood: so we are free to change the string MAXLEN values in our OMOP schema (e.g., from varchar(50) to varchar(220)) as we see fit, is that correct?

@slis,

Yes, that is my understanding. We are doing this for our OMOP database (in addition to switching the VARCHAR data types to NVARCHAR for our particular database).

I have wondered what happens when we share our OMOP data with other organizations (like clinical research networks). My guess is that these fields get truncated by the receiving organization, unless they have increased the length of their own VARCHAR columns.

Perhaps @clairblacketer can confirm?
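
Until that is confirmed, one way to gauge the truncation risk before sharing is to count how many values exceed the standard widths. The sketch below assumes the standard CDM v5 widths of 50 characters each for `address_1` and `address_2`, and that rows have already been fetched as dictionaries; `find_truncation_risks` and `STANDARD_WIDTHS` are hypothetical names for illustration. Python's `len` counts characters, which may differ from how a given database measures VARCHAR length.

```python
# Widths assumed from the standard CDM v5 definition of the location table.
STANDARD_WIDTHS = {"address_1": 50, "address_2": 50}


def find_truncation_risks(rows, widths=STANDARD_WIDTHS):
    """Return (row_index, column, length) for each value longer than its
    standard width, i.e. each value a receiving site using the default
    column sizes could truncate or reject."""
    risks = []
    for i, row in enumerate(rows):
        for column, max_len in widths.items():
            value = row.get(column)
            # NULLs (None) and missing columns cannot be truncated.
            if value is not None and len(value) > max_len:
                risks.append((i, column, len(value)))
    return risks


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rows = [
        {"address_1": "12 Main St", "address_2": None},
        {"address_1": "x" * 120, "address_2": "Unit 4"},
    ]
    print(find_truncation_risks(rows))
```

An empty result means the widened columns hold nothing that the default schema could not also store.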

t