How to populate WebAPI tables?

question · October 13, 2020, 4:35am

I just installed WebAPI and have questions about how to populate data into the tables:

I assume the default data, like the vocabulary, comes from the Athena download at https://athena.ohdsi.org/vocabulary/list. Is this correct?
Some older documents mentioned CDM V5 data. Is CDM v5 data the same thing as Athena?
I am looking for PostgreSQL statements to load Athena data into WebAPI tables. The closest thing I can find is https://github.com/OHDSI/CommonDataModel/tree/master/PostgreSQL, but it seems outdated. It still talks about creating tables such as cohort and cohort_definition, while all the tables were already created during WebAPI installation. I am not sure about the ALTER statements though. Is there any recent document on loading data into WebAPI tables? I feel the WebAPI documentation doesn’t cover this part very well.

Chris_Knoll · October 13, 2020, 4:57am

WebAPI’s database is completely independent from CDM sources (the databases that contain the CDM/Vocabulary tables and the results schema for WebAPI analyses). When you say ‘WebAPI tables’, it sounds (to me) like you’re referring to tables that are in the WebAPI database. Those tables are managed automatically, so you do not need to run any create operations on any WebAPI table. You do, however, need to insert records into the webapi.SOURCE and SOURCE_DAIMON tables to register CDMs that you have set up.

CDM tables need to be set up manually, which involves fetching a vocabulary from Athena, converting your own patient-level data into the CDM format, and loading it. You would use the DDL from the CommonDataModel repository to create the tables.
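As a rough illustration of the vocabulary-loading step, the sketch below prints psql `\copy` commands for the files in an Athena download. It assumes the bundle contains the usual tab-delimited vocabulary files, that the CDM tables already exist in a schema named `cdm`, and that the files sit under `/data/athena`; the schema name, folder, and helper name are all hypothetical, and the COPY options follow the pattern commonly used for Athena files rather than anything stated in this thread.

```python
from typing import List

# Files usually present in an Athena vocabulary bundle (assumption: your
# download may contain a different set depending on what you selected).
ATHENA_FILES = [
    "CONCEPT",
    "VOCABULARY",
    "DOMAIN",
    "CONCEPT_CLASS",
    "RELATIONSHIP",
    "CONCEPT_RELATIONSHIP",
    "CONCEPT_SYNONYM",
    "CONCEPT_ANCESTOR",
    "DRUG_STRENGTH",
]


def copy_commands(schema: str, folder: str) -> List[str]:
    """Build one psql \\copy command per Athena file (hypothetical helper)."""
    base = folder.rstrip("/")
    commands = []
    for name in ATHENA_FILES:
        path = base + "/" + name + ".csv"
        # Athena files are tab-delimited; QUOTE E'\b' effectively disables
        # quote handling so stray double quotes in concept names load as-is.
        commands.append(
            "\\copy " + schema + "." + name.lower() + " FROM '" + path + "' "
            "WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b'"
        )
    return commands


if __name__ == "__main__":
    for command in copy_commands("cdm", "/data/athena"):
        print(command)
```

The printed lines can be pasted into a psql session connected to the CDM database. Loading before adding constraints and indexes is usually faster, but check the CommonDataModel repository for the order it recommends.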

Within the CDM database, you will need to create the WebAPI results schema. Instructions on fetching the DDL from WebAPI and the subsequent INSERTs into WebAPI to register a CDM source for use in WebAPI can be found here.
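For orientation, here is a minimal sketch of building the URL that asks WebAPI for the results-schema DDL. It assumes the `/ddl/results` endpoint and the parameter names `dialect`, `schema`, `vocabSchema`, `tempSchema`, and `initConceptHierarchy` as I recall them from the WebAPI CDM configuration guide; verify them against your WebAPI version. The base URL and schema names are placeholders.

```python
from urllib.parse import urlencode


def results_ddl_url(base_url: str, dialect: str, results_schema: str,
                    vocab_schema: str, temp_schema: str) -> str:
    """Build the results-schema DDL URL (endpoint and parameter names are assumptions)."""
    params = {
        "dialect": dialect,
        "schema": results_schema,
        "vocabSchema": vocab_schema,
        "tempSchema": temp_schema,
        "initConceptHierarchy": "true",
    }
    return base_url.rstrip("/") + "/ddl/results?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder values; open the printed URL in a browser and run the
    # returned SQL against the CDM database.
    print(results_ddl_url("http://localhost:8080/WebAPI", "postgresql",
                          "results", "cdm", "temp"))
```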

question · October 13, 2020, 5:24am

Thank you Chris. I was totally confused by the WebAPI database and the CDM database. So I need to create a new CDM database and fetch Athena data into it. I thought Athena data goes into the WebAPI database!

I am really new to this. My original goal is to set up a WebAPI server so I can make API calls to it. I found this link: http://webapidoc.ohdsi.org/index.html. It gives me some hints, but I really don’t know how to get the API working on my localhost. When I tested http://localhost:8080/WebAPI/vocabulary/1PCT/search, it returned the following error:

{"payload":{"cause":null,"stackTrace":[],"message":"An exception ocurred: javax.ws.rs.NotAllowedException","localizedMessage":"An exception ocurred: javax.ws.rs.NotAllowedException","suppressed":[]},"headers":{"id":"…","timestamp":…}}

Can you help me make the vocabulary call work?

Chris_Knoll · October 13, 2020, 1:36pm

I just checked in my env, and the URL:
/WebAPI/vocabulary/VOCABULARY_20200525/search?query=psoriasis
(my source key was VOCABULARY_20200525, while yours is 1PCT)…

It worked for me, I’m not sure why you are getting that exception…
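A small sketch of building the same search URL for any source key, in case it helps to compare requests. One thing that may be worth ruling out: javax.ws.rs.NotAllowedException is the JAX-RS exception for an HTTP 405 (method not allowed) response, and the URL tested earlier has no `query` parameter while the working one does. Whether that is the actual cause here is a guess. The base URL and helper name are hypothetical.

```python
from urllib.parse import quote, urlencode


def vocabulary_search_url(base_url: str, source_key: str, query: str) -> str:
    """Build a GET URL for the vocabulary search shown above (hypothetical helper)."""
    key = quote(source_key, safe="")          # escape the source key for the path
    query_string = urlencode({"query": query})  # escape the search term
    return base_url.rstrip("/") + "/vocabulary/" + key + "/search?" + query_string


if __name__ == "__main__":
    print(vocabulary_search_url("http://localhost:8080/WebAPI", "1PCT", "psoriasis"))
```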

Could you go to these endpoints:
/WebAPI/info : to get information about your version
/WebAPI/source/sources : to get information about your installed CDM sources. 1PCT should be one of them.

Let me know what these are and I’ll have more information about what may be wrong.
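The two checks above can also be scripted. The sketch below builds both URLs and pulls the source keys out of the `/source/sources` response. It assumes that response is a JSON array whose entries carry a `sourceKey` field, which should be confirmed against your WebAPI version; the base URL and function names are hypothetical.

```python
import json
from typing import Dict, List
from urllib.request import urlopen


def diagnostic_urls(base_url: str) -> Dict[str, str]:
    """Return the two diagnostic endpoints mentioned above."""
    base = base_url.rstrip("/")
    return {"info": base + "/info", "sources": base + "/source/sources"}


def source_keys(sources_json: str) -> List[str]:
    """Extract source keys, assuming a JSON array of objects with 'sourceKey'."""
    return [entry["sourceKey"] for entry in json.loads(sources_json)]


def fetch(url: str) -> str:
    """Fetch a URL and return the body as text (needs a running WebAPI)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    urls = diagnostic_urls("http://localhost:8080/WebAPI")
    print(fetch(urls["info"]))
    print(source_keys(fetch(urls["sources"])))
```

If the list of keys comes back empty, no CDM source has been registered yet, and a call that names 1PCT cannot resolve.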

question · October 13, 2020, 2:12pm

Hi Chris, I haven’t set up a CDM database or sources yet. I only have a ‘webapi’ schema in my OHDSI database. Shall I create a new schema called ‘vocabulary’ in the OHDSI database and then load Athena?

I am testing ‘/WebAPI/vocabulary/1PCT/search’ because it is from the source code of Criteria2Query. I guess the error is because I don’t have a vocabulary schema? Could you articulate the steps you did to install CDM sources and create your source key VOCABULARY_20200525? Thanks!

Chris_Knoll · October 13, 2020, 2:13pm

That information should be covered here:

I would also suggest that you don’t go with the option to separate the vocabulary schema from the CDM: the vocabulary tables are part of the CDM schema and you should have the associated vocabulary tables aligned with the patient level data that was ETL’d. So, set up your CDM database, create a cdm schema, load the vocabulary and patient level data into it, and then add the SOURCE and SOURCE_DAIMON records to WebAPI so that you can query against it via WebAPI.
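To make the last step concrete, here is a hedged sketch that generates the INSERT statements for registering one CDM source. The column names and the daimon type codes (0 for CDM, 1 for Vocabulary, 2 for Results) are given as I remember them from the WebAPI configuration guide and should be checked against your WebAPI version. The IDs, source name, JDBC URL, and schema names are placeholders. Both the CDM and Vocabulary daimons point at the same schema, matching the advice above.

```python
from typing import List

# Daimon type codes (assumption: taken from memory of the WebAPI docs).
DAIMON_CDM, DAIMON_VOCABULARY, DAIMON_RESULTS = 0, 1, 2


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def register_source_sql(source_id: int, name: str, key: str, jdbc_url: str,
                        dialect: str, cdm_schema: str,
                        results_schema: str) -> List[str]:
    """Build INSERTs for webapi.source and webapi.source_daimon (a sketch)."""
    statements = [
        "INSERT INTO webapi.source (source_id, source_name, source_key, "
        "source_connection, source_dialect) VALUES ("
        + ", ".join([str(source_id), sql_literal(name), sql_literal(key),
                     sql_literal(jdbc_url), sql_literal(dialect)])
        + ");"
    ]
    daimons = [
        (DAIMON_CDM, cdm_schema, 0),
        (DAIMON_VOCABULARY, cdm_schema, 1),  # vocabulary lives in the CDM schema
        (DAIMON_RESULTS, results_schema, 1),
    ]
    # Placeholder IDs starting at 1; adjust if rows already exist.
    for daimon_id, (daimon_type, qualifier, priority) in enumerate(daimons, start=1):
        statements.append(
            "INSERT INTO webapi.source_daimon (source_daimon_id, source_id, "
            "daimon_type, table_qualifier, priority) VALUES ("
            + ", ".join([str(daimon_id), str(source_id), str(daimon_type),
                         sql_literal(qualifier), str(priority)])
            + ");"
        )
    return statements


if __name__ == "__main__":
    for statement in register_source_sql(
            1, "My CDM", "MY_CDM",
            "jdbc:postgresql://localhost:5432/cdm?user=USER&password=PASSWORD",
            "postgresql", "cdm", "results"):
        print(statement)
```

WebAPI typically needs a restart or a source refresh before a newly inserted source shows up under /WebAPI/source/sources.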

question · October 13, 2020, 2:28pm

Is the CDM database a separate database from the OHDSI database? During WebAPI installation, the OHDSI database was already created. I am using PostgreSQL. The schemas under OHDSI include webapi, public, pg_catalog, and information_schema. Is it OK to create a new schema under OHDSI? Or do I need a separate database for the CDM (vocabulary, patient data)?

Chris_Knoll · October 13, 2020, 3:31pm

Sorry for the confusion, I used the term ‘WebAPI database’ earlier, and I would have said ‘OHDSI’ database if I had known that’s the label you’re putting on the database that WebAPI references in its configuration.

The CDM database is a separate database from the OHDSI database. You can have many CDM databases configured and referenced by a single WebAPI instance (stored in the OHDSI database). In the OHDSI database, the tables that WebAPI uses (and automatically manages) are found in the webapi schema.

Technically, you CAN put your individual CDM databases into separate schemas within your OHDSI database.

The separation I’m suggesting, tho, allows you to set up different failover and redundancy configurations for your WebAPI database (which is heavily transactional) vs your CDM databases (which are lightly transactional, and only in the ‘result’ schema where the output of an analysis, such as cohort generation or characterization, is stored).

The accounts used to access the WebAPI database also have elevated permissions to alter WebAPI tables between version updates. Those elevated permissions are never needed against any of your CDM schemas, so keeping everything in one database may be a security concern.

Typically, CDMs require far more space and have different operating characteristics than the WebAPI database. WebAPI may have several thousand rows of data in it, while CDMs may have billions. So your hardware configuration may be different to support WebAPI workloads compared to CDM workloads.

You may want to take certain CDMs offline independently of the WebAPI database.

So there are many reasons to separate those, and I’d just get comfortable with that configuration now instead of trying to separate ‘internal schemas’ from WebAPI later, when you wish you’d set it up as separate databases.