OHDSI Home | Forums | Wiki | Github

Postgres connection for Olympus

Update: I just created the empty database with the latest CDMBuilder (v0.0.2). This creates 39 tables in the public schema. Public schema also contains the 3 cohort tables. Should they be in this schema?

I’m not sure exactly which tables CDMBuilder creates. If it is just the CDM schema, then none of the Olympus tables are created. There is a table called ‘cohort’ which exists in the CDM schema and the OHDSI schema, but it is differentiated by the schema you specify for your CDM tables vs. the schema you specify for your OHDSI tables. This means that these schemas cannot be the same.

Re: What is Flyway?
Flyway is a database migration tool which ensures that before the OHDSI applications start up, the database has the appropriate schema and tables created.You can read about it at http://flywaydb.org

I think the core of the problem you are facing is that our database table creation scripts (DDL) do not have tables fully qualified with the ohdsi schema. @lee_evans, would you care to take a crack at modifying the postgres DDL found under this folder:

You can find examples of replacing DDLs with ‘placeholders’ at this url:

And to specify flyway placeholders in maven config:

I think what needs to happen to get this working properly is the migration scripts in db/migration/postgresql needs to be altered as:
CREATE TABLE ${ohdsi_schema}.SOMETABLE AS …

In POM.xml, we need to add the following placeholder:

<project>
    ...
    <properties>
        <!-- Properties are prefixed with flyway. -->
        <flyway.user>myUser</flyway.user>
        <flyway.password>mySecretPwd</flyway.password>

        <!-- List are defined as comma-separated values -->
        <flyway.schemas>${datasource.ohdsi.schema}</flyway.schemas>

        <!-- Individual placeholders are prefixed by flyway.placeholders. -->
        <flyway.placeholders.ohdsi_schema>${datasource.ohdsi.schema}</flyway.placeholders.ohdsi_schema>
    </properties>
    ...
</project>

@lee_evans, The reason why I’m tapping you for this is that you have access to a postgres serer and will probalby need to run a few tests on a blank database instance, where you have an existng CDM schema specified in 1 schema, and you’ll want to kick off a WebAPI instance pointing to this new database, but configured to create the tables (ohdsi_schema) into another schema, to test if the migration scripts work with the placeholder schema. Do you think you could try this?

-Chris

Not sure I get the direction to use “flyway placeholders”. WebAPI should already use the Flyway dataSource and the flyway config already has the schema information. Why do you think flyway is not executing the scripts in the appropriate schema? flyway.schemas=${flyway.schemas}.
I just looked at the 5-29 log file @mdewilde attached an it looked correct, cdm schema ‘public’, ohdsi schema ‘results’.
The following log line confirms that Flyway at least thinks it’s up-to-date.
2015-05-29 17:02:11.860 INFO qtp921045298-23 org.flywaydb.core.internal.command.DbMigrate - - Schema "results" is up to date. No migration necessary.

@mdewilde, what was the problem after the 5-29 log file? The comment about Circe being “empty” within Olympus? If so, this will be the case when you don’t have a local WebAPI running. Configuration page, ‘launch’ button.

I noticed in the 5-29 log file that you did have WebAPI running. I revisited the thought about internet connection. Most of the apps do NOT seem to work w/o internet, so likely this is why Circe was empty. Although I do not have a database locally to test with WebAPI running, I tried Olympus running w/o internet and have the following feedback. Looks like most of the apps try to retrieve resources from CDNs, like I originally suspected.

W/o internet:
Achilles - Seems to display
Circe - no display
Calypso - no display
Hermes - Welcome to Hermes message but similar js errors
Athena - no display
Job Status - should display

Partial output from Circe:

require.js:34 GET http://cdn.rawgit.com/jquery/jquery/1.11.2/dist/jquery.min.js net::ERR_NAME_NOT_RESOLVED
require.js:8 Uncaught Error: Script error for: jquery
require.js:34 GET http://cdn.rawgit.com/knockout/knockout/v3.3.0/dist/knockout.js net::ERR_NAME_NOT_RESOLVED
require.js:8 Uncaught Error: Script error for: knockout

The schemas that are in flyway.schemas property are just the schemas that flyway ensures exist before executing the migration scripts because if you try to create table X in schema Y, it will fail if schema Y does not exist. That’s why you can put schemas A, B, C in the flyway.schemas property, and this doesn’t mean that each CREATE TABLE is put in A B and C, it means that schema’s A B and C will be created so that any migration scripts that create tables into schemas A B or C will exist.

I am a little confused that it says that the schema ‘results’ is up to date, but I ran a test:

2015-06-04 14:54:47.792 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbValidate -  - Validated 9 migrations (execution time 00:00.597s)
2015-06-04 14:54:47.810 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbSchemas -  - Creating schema [schema1] ...
2015-06-04 14:54:47.813 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbSchemas -  - Creating schema [schema2] ...
2015-06-04 14:54:47.816 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbSchemas -  - Creating schema [schema3] ...
2015-06-04 14:54:47.842 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.metadatatable.MetaDataTableImpl -  - Creating Metadata table: [schema1].[schema_version]
2015-06-04 14:54:48.472 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbBaseline -  - Schema baselined with version: 1
2015-06-04 14:54:48.480 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.dbsupport.sqlserver.SQLServerDbSupport -  - SQLServer does not support setting the schema for the current session. Default schema NOT changed to [schema1]
2015-06-04 14:54:48.558 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Current version of schema [schema1]: 1
2015-06-04 14:54:48.558 WARN org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - outOfOrder mode is active. Migration of schema [schema1] may not be reproducible.
2015-06-04 14:54:48.558 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.1
2015-06-04 14:54:48.755 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.2
2015-06-04 14:54:48.850 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.2.1
2015-06-04 14:54:49.331 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.3
2015-06-04 14:54:49.475 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.3.1
2015-06-04 14:54:49.657 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.4
2015-06-04 14:54:50.077 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.4.1
2015-06-04 14:54:50.159 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.5
2015-06-04 14:54:50.423 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Migrating schema [schema1] to version 1.0.0.6
2015-06-04 14:54:50.580 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.command.DbMigrate -  - Successfully applied 9 migrations to schema [schema1] (execution time 00:02.092s).
2015-06-04 14:54:50.583 INFO org.ohdsi.webapi.WebApi.main() org.flywaydb.core.internal.dbsupport.sqlserver.SQLServerDbSupport -  - SQLServer does not support setting the schema for the current session. Default schema NOT changed to [dbo]

You’ll note here, I pointed my local webapi to a blank database, set 3 schemas in my config file (schema1, schema2, schema3). It only created one schema_version table to track the migrations in the first schema. But the other 2 schemas were created so that the other schemas were created. The second thing is that in my environment, using the default WebAPI migration scripts, ALL of the tables get created in ‘dbo’! This is because some of the DDL do not specify a schema (such as the V1.0.0.1__schema-create_spring_batch.sql) and others do specify a schema but they hard code it to ‘dbo’ (such as: V1.0.0.3__cohort_definition_persistence.sql). Btw: both these files are the ones form db.migration.sqlserver.

So, my point is, flyway manages the schemas, but we’re not telling the migration scripts to put the tables in the schemas that flyway manages. Sometimes we don’t specify a schema at all, and in PostgreSQL case, it puts it into public! In MSSQL case, it will be the default schema for the user. We need to be explicit with which schema our tables get created in, but the schemas that the webapi use are defined in a config file. The only way to pass along that config information is to use these placeholders.

-Chris

Ok, I see what you are saying. I assumed flyway was “altering session, setting current schema”. A quick search resulted in the following page which talks about flyway doing exactly that (setting current schema).

http://flywaydb.org/documentation/maven/migrate.html

Case-sensitive list of schemas managed by Flyway.
The first schema in the list will be automatically set as the default one during the migration. It will also be the one containing the metadata table.

This also supports why you saw the metadata table only in your first schema.

So, I don’t think “flyway placeholders” are necessary yet. I think @mdewilde was pretty close once he established a schema for ohdis/results (cdm=public, results=ohdsi). The other task is changing apps not to use CDN for resources (i.e. require internet connection), so that Marcel can use OHDSI apps w/o internet.

Hi,
I’m going to do something on the MSSQL side because from my test (see logs above) it clearly says that the default schema can not be set. And, it created all tables in the ‘dbo’ schema when the only schemas in the flyway configuration was ‘schema1, schema2, schema3’. In addition, although I don’t have the environment to test this, but this is what I’ve read about schemas on the postgres site:

In the previous sections we created tables without specifying any schema names. By default such tables (and other objects) are automatically put into a schema named “public”. Every new database contains such a schema. Thus, the following are equivalent:

CREATE TABLE products ( … );
and:

CREATE TABLE public.products ( … );

Now, on another page about CREATE TABLE documentation, it says creating a table without schema puts it into the ‘current’ schema. Even if flyway sets the first schema as the ‘current’ schema (which doesn’t work on MSSQL), if you want to put a table into a second schema (like put all spring batch related tables in a schema of it’s own) and this other schema isn’t listed first, then the spring batch tables will either go in the public schema (wrong) or the schema specified first (also wrong).

-Chris

Hi all,

I never expected that a simple connection question can lead to such a big thing involving more and more people. Many thanks to everyone involved now!!!

Reading all these messages I cannot completely follow all the details flying around. But I have a feeling that Olympus can take a cdm database as a starting point and automatically adds a lot of extra tables it needs for its differnt modules. And this automatic thing fails now in postgres because the tables are ending up in wrong schemas. Is this correct?

If thats the case: is it an option to give me the all create scripts needed for Olympus per schema and that I run this manually? Then Olympus starts on a database that already contains the tables in their correct schemas. Hopelfully Olympus will recognise these tables and will not create any new ones and just skips the whole flyway stuff. Or is this flyway thing doing complex things besides creating tables?

By the way together with Erica we figured out last week how to set the active schema:

Set schema 'cdm';

Yesterday I left the office in the following state: Database containing 2 schemas: public and cdm. The cdm schema is now created and filled using Martijns CDMBuilder. Schema public is empty. Just before leaving yesterday I started a backup to make sure I have a “clean” restorepoint if anything goes wrong. Today is my standard daddy-day at home (only work 4 days a week). Since the server is not on the internet I cannot do tests. Monday when I come back I will try the suggestion of Chris to try it in this setup described above. If this is not working I can restore the database and maybe rename the schemas or something like that.

Given the current software I think this is the only thing I can do? Or do I have to wait for an update before continuing my adventure?

Have a nice weekend,
Marcel

Yes, I feel that is what is happening.

I think with the CDM schema in it’s own location, and public is now empty, you can go ahead with the olympus install (during installation set your cdm schema to ‘cdm’, results schema to ‘public’ and Olympus will add additional tables that are required to allow it to function. You should around 10-15 new tables in public after it starts up and everything is successful. I’m sorry that we didn’t fully set up the flyway migration scripts properly, but with the cdm i its schema and if Olympus can put its tables in public, you shouldn’t have any more porblems with setup.

The remaining issue you have is that some of the web applications in Olympus pull content from the internet. I’ll see if I can put together a new push that can have a ‘fallback’ to a local copy of the javascript libraries so that if you are not able to pull them from the internet hosted provider (known as a CDN or ‘content delivery network’) it will try to load it from the local webserver. I think i can configure it to work this way, I just haven’t done i tyet.

-Chris

I just pushed an update to CIRCE/master which adds fallbacks to the CDN sourced libraries. I tested this locally, and it seems to work. I’m not sure of Olympus is auto-updated on these checkins, @jon_duke, is it automatic?

Olympus is manual right now. I’m not sure if we have a regular scheduled update, right @alfranke?

@Chris_Knoll - i don’t think we need to worry about supporting multiple “flyway.schemas”. If WebAPI introduces the need to manage more than 1 schema, from the flyway perspective, maybe we cross that bridge at that time (and add schema prefixes/placeholders for additional schema scripts only, preventing the need to wipe out / reset DBs, etc.).
Not sure why sqlserver isn’t working as expected. I might search for others that experience this…
For postgres, I believe it is working exactly right. Marcel originally had defined the “ohdsi/results” schema as IPCI_Dopey_CDM/public, which created a brand new schema, named IPCI_Dopey_CDM/public and created the tables in that schema.
If Marcel wants the cdm schema to be “public” and the ohdsi to be “results”, I think that makes sense to me. I don’t see a requirement to have the ohdsi/results schema as “public”. He could wipe out the results schema, align Olympus settings, and I think we’ll be good.

Updating Olympus is manual.

Unrelated, anyone not get discourse notifications, randomly? I didn’t see any notifications about Chris’ latest posts…

Hi all,

back again. Startpoint: complete new database build. All tables created with CDMBuilder are now in a schema named “cdm” and are filled. Schema public still there but completely empty. As a potential rollback I created a backup of this phase before starting Olympus.

Here is what I did:

  • Before starting Olympus I removed the temp directory in the AppData folder. I want to start as fresh as possible.

  • Start Olympus and logged in.

  • Started the configuration with the following parameters:

    SQL Dialect: POSTGRESSQL
    Host Server: localhost
    Host port: 5432
    SID/Service Name: IPCI_Dopey_CDM
    Username: postgres
    Password: test
    CDM Database Name / Schema: cdm
    Results Database Name / Schema: public
    Cohort Database Name / Schema: cdm

  • Saved the result. Takes a minute (Resolving host…) but without errors.

  • Lauching also tabes a minutes and automatically return to the main window with the message “There are remote WebAPIs configured. Applications taht support toggling between WebAPI’s will allow you to use therse via the gear/settings”. So far so good I think.

  • Started Heracles and typed the name of our cohort. No results…

  • Also tried toggling the gears that shows 2 options : local and public. No results.

  • Checked the database schema cdm: still has 39 tables so nothing seems to be changed here. Ok.

  • Checked the database schema public: now filled with 26 tables (expected) also including a couple of cohort tables (unexpected??).

  • Then I also noticed that the cohort tables are very different:

  1. In table cohort type attribute subject_id has bigint vs integer (public vs cdm schema)
  2. cdm schema contains table cohort_attributes and this is not available in the public schema
  3. Table cohort_definition is very different in both schemas.
  4. Two extra tables in public schema that are not in the cdm schema: cohort_definition_details and cohort_generation_info.

So… although the configuration was setup to look in the cdm schema for the cohorts it looks like it is using the public schema for that. Also the cohort tables are not compatible between both schemas.

A couple of questions:

Am I on the right track here? Do you need more info?

We have our cohort defined the the cdm cohort tables created with the “official” CDMBuilder. As far as I know the 3 tables (cohort, cohort_attributes and cohort_definition) are compatible with the v5 documentation. Are these tables compatible with Heracles? Or is this another postgres issue we bump into?

If I’m on the right track… Is it possible to fill the new cohort tables in the public schema with our own created cohort? And will Heracles then work with our own cohort? At the end this is what we want: to use Heracles on-top of our own defined cohort.

Thanks everyone,
Marcel

Hi all,

I have a MSWord document with all my steps, some screenshots and logfile. But I’m not able to post it on this forum since its not possible to post it on this forum (due to size and file formats).

If someone is interested: send me a message with you email address and I can send it via mail.

Thanks,
Marcel

I’m interested at lisa.schilling@ucdenver.edu. thanks!

Marcel,

Can you post as a link to Dropbox or as a via something like Senduit?

If not, please send to me at jonduke (at) regnenstrief.org

Thanks!

@mdewilde,

In which schema is your cohort_definition table populated with data? Can you provide the fields of this table and one example row of data?

Hi Jon,

the filled cohort_definition table is in the cdm schema and it contains the following fields and values (build using the CDMBuilder of Martijn):

 cohort_definition_id = 1
 cohort_definition_name = "IPCI Time"
 cohort_definition_description = "Standard research period"
 definition_type_concept_id = 56
 cohort_definition_syntax= NULL
 subject_concept_id = 0
 cohort_initiation_date = 2015-06-01

Table cohort is filled as follows (just one patient as example):

 cohort_definition_id = 1
 subject_id = 1
 cohort_start_date = 1996-06-01
 cohort_end_date = 2006-07-01

Table cohort_attributes is empty.

Just to prevent people are going to to the same thing: I just talked with @ericaVoss on the phone. She proposed to try and see if it is possible to give me a script to copy the content from the filled tables in the cdm schema to the cohort tables in the public schema. I will let you know if that works. After this the expectation is that the cohort will pop-up in Heracles…

For the demo I need the results (without any patientdata) of Heracles on a laptop. I assume it is enough to backup the tables from the public schema and restore it to the laptop to get the Heracles viewer started. Is this correct or do I need other tables as well? Is it correct that the public schema does not contain any data on a patient level?

Many thanks,
Marcel

@mdewilde,

I was looking through your results and was puzzled by the difference between your cdm and public formats for cohort_definition. It seemed initially to me that your cdm version was correct:

CREATE TABLE cohort_definition (
cohort_definition_id INTEGER NOT NULL,
cohort_definition_name VARCHAR(255) NOT NULL,
cohort_definition_description CLOB NULL,
definition_type_concept_id INTEGER NOT NULL,
cohort_definition_syntax CLOB NULL,
subject_concept_id INTEGER NOT NULL,
cohort_initiation_date DATE NULL
);

and your public schema one was incorrect:

CREATE TABLE cohort_definition(
id Number(10) NOT NULL ,
name Varchar2(255) NOT NULL,
description Varchar2(1000) NULL,
expression_type Varchar2(50) NULL,
created_by Varchar2(255) NULL,
created_date Timestamp(3) NULL,
modified_by Varchar2(255) NULL,
modified_date Timestamp(3) NULL,
CONSTRAINT PK_cohort_definition PRIMARY KEY (id)
);

But when I took a look at the latest WebAPI, it does appear that some confusion has emerged.

  1. @Chris_Knoll-- you’ve created an entirely new cohort_definition table structure in v1.0.0.3 and it is enforced by flyway.

  2. @cahilton’s Heracles does not use this new table structure to getCohortDefinitions

  3. @alfranke-- was Olympus supposed to create cohort and cohort_definition tables in the public schema when @mdewilde specified cdm for his cohort location?

Chris, Alex, Charity-- can you guys provide insight? Chris and Charity, have you guys had a chance to sync up on this new table structure?

All I can say is that the apps (e.g. Heracles) will use what WebAPI defines as the cohort_definition table (as you indicated Jon, this was introduced by Chris and is enforced by Flyway). I don’t know about an old cohort_definition table structure but yes, it appears that WebAPI “correctly” created the necessary (for WebAPI and therefore apps) cohort tables in Marcel’s public schema (ohdsi/results schema).

@cahilton will have to confirm whether Heracles is still using the ‘getCohortDefinitions’ sql that you found… I’m not sure about how that is being used (or if it is being used). I would have said that Heracles is using the “new” table structure (i.e. the one created by WebAPI in the public schema). Heracles is consulting WebAPI for the list of cohorts. This would explain why Marcel does not see any options in the Heracles search bar. Maybe @cahilton can provide more info about Heracles and the getCohortDefinitions.sql file.

t