Hi, Don,
Re:Question 1: Yes, we have the same issue in our own internal Large Data Appliance where it doesn’t support identity columns, etc. That’s why we extended webapi to execute the analysis queries (such as cohort creation) against any datasource, as long as the analysis query doesn’t leverage anything like identity columns. So, you should be able to have the webAPI (via the flyway configuration parameters in the maven’s settings.xml file) configured to point to postgresql, and the individual CDMs defined in the source table can be hosted on redshift, and the JDBC url in the SOURCE table will point to your redshift server…
The second part, the SOURCE_DAIMON table, contains the ‘table qualifier’ (or database schema) that should be appended to table names when performing the queries (we don’t use anythign like a ‘default schema’ in our queries, all tables are explicity qualified’. So if your cdm schema is called ‘cdm’, you’d insert a row into the SOURCE_DAIMON table (source ID 1 is assumed to be the connection settings to your redshift) like so:
INSERT INTO ohdsi.source (source_id, source_name, source_key, source_connection, source_dialect) VALUES (1, 'My Cdm', 'MY_CDM', '{your redshift jdbc connection string', 'redshift');
INSERT INTO ohdsi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority) VALUES (1,1,0, 'cdm', 0);
INSERT INTO ohdsi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority) VALUES (2,3,0, 'results', 0);
(This assumes that you have a schema on your postgresql database called ‘ohdsi’) This will configure the daimons to look for the CDM tables in the cdm schema (the table_qualifier was 'cdm) and the results tables are in the results schema, both these schemas defined on y our redshift database.
When you start up the tools, you will have the choice of executing a cohort defiintion query against one of your configured SOURCEs. Using the information in the DAIMON table, it will know which schema to look for the CDM and RESULT tables.
Re: Question 2: we need to create documentation about which tables need to be created in the external SOURCE database to support each daimon. The CDM daimon is simply the CDMV5 tables, the the RESULTS is something that needs to be documented. But I can tell you here, you can create the DDL for the following tables in the postgresql WEBAPI database and apply it over to your redshift database (these tables should not have any redshift restricted column defs)
cohort (*) (for CIRCE)
feas_study_index_stats (for CALYPSO)
feas_study_result (for CALYPSO)
feas_study_inclusion (for CALYPSO)
heracles_reults (for HERACLES)
heracles_results_dist (for HERACLES)
(Note: all of the above tables belong in the RESULTS schema)
I say you need to create a cohort table in the ‘results’ schema because the WebAPI tools write the data to the results schema, and the cohort definition results (the people found by the cohort defintion) are written to {results schema}.cohort. The CDM’s cohort table is not used by WebAPI.
-Chris