OHDSI Home | Forums | Wiki | Github

Use of Schemas in WebAPI and in parameterized SQL

All,
I’ve run into some issues regarding how we refer to a ‘schema’ in translate sql and what some of these third party libraries assume we mean when we talk about ‘schemas’.

So, first a bit of background:
At J&J, we have multiple SQL servers, and within each sql server, there is multiple CDM databases. Example: we have a server XX07, contains CDM_TRUVE_CCAE, and all tables within are attached to the dbo schema. There is also a CDM_TRUVEN_MDCR database, and it’'s tables are attached to the dbo schema within it’s own database.

When we do analysis, we don’t create tables in the CDM database, we have a separate database that we write to (we’ll call this Results) and we can write queries in that reference the CDM tables by doing CDM_Truven_CCAE.dbo.CONDITION_OCCURRENCE. We can even join across database instances by saying somethign like

RESULTS.dbo.MyLookupTable L JOIN CDM_Truven_CCAE.dbo.CONCEPT C on L.concept_id = C.concept_id.

I should also point out: the reason we do this is that between CDM builds, we start with a brand new database to store the CDM. If we had Results inside the CDM database, it would be blown away between CDM builds, and we don’t want that.

So, here’s the thing: the schema for both those db instances are ‘dbo’. So, what we do in our template sql is somethign like @ResultSchema.MyLookupTable and @CDM_schema.CONCEPT, and we set the @ResultSchema to RESULTS.dbo and our CDM_schema to CDM_TRUVEN_CCAE.dbo. This falls down when talking to database libraries such as hibernate and flyway where we tell it that the schema is ‘CDM_TRUVEN_CCAE.dbo’ when we actually mean that the schema is dbo in the databaseName=CDM_TRUVEN_CCAE.

I think the problem is that in the MSSql world, we’re treating the databases as a type of ‘schema’ and while that works for our dynamic sql generation across different platforms (just put @schema in front of your tables and it works), calling it a schema is not exactly correct.

My question to the community: how are schema handled in your environment (postgres and oracle for example?) How do you intend to implement the OHDSI Result schema in your implementation? Part of the same CDM database or somehow separate it by a ‘database instance’? If you want to join between tables across different instances (or in a SQL statement, selected from one database/schema and insert into another database/schema) how does that syntax look?

@schuemie, @alfranke, @Frank, @cahilton: these answers will have direct impact on how we’re doing business so I am interested in what the implications are for you.

-Chris

t