OHDSI Home | Forums | Wiki | Github

Supporting DBMS platforms other than SQL Server, PostgreSQL, Oracle and RedShift


(Martijn Schuemie) #1

There’s been a request for adding support for our applications for Netezza (see issues in SqlRender and DatabaseConnector. Before we decide whether this is something we can and want to do, here’s my impression of what would be needed to add any DBMS to our list:

  1. Adding the JDBC driver to DatabaseConnector. In the case of Netezza, this should be straightforward since a JDBC driver seems to exist, but it is not publicly available (I can’t find it). We’d need to see how to get it, and what the licensing conditions are.

  2. Add translation rules to SqlRender. Just like RedShift, Netezza seems to claim to use similar SQL syntax as PostgreSQL, but at least for RedShift there are still some major differences. We can start by copying all the rules we have now for PostgreSQL, and see where it breaks.

  3. Create a testing environment for Netezza. I’m currently testing all our code against local SqlServer, Postgres, and Oracle databases before a release, and @mkhayter is our unofficial RedShift guinea pig. (sorry Mark!). Luckily, @lee_evans is building a testing environment that will be available to everyone, although I’m pretty sure for now it won’t support RedShift or Netezza.

  4. Every software we create needs to be debugged using this new testing environment. Just as an example of what you might encounter: I found that RedShift has no way to implement the equivalent of “DELETE TABLE IF EXISTS”, and I therefore had to create a workaround in DatabaseConnector specifically for RedShift for some applications.

In short, adding support for another DBMS would, in my humble opinion, be non-trivial and require additional resources. If someone is willing to make those available, there should be no problem.


Grouping by complex expressions
OMOP DDL for IBM DB2
(Aaron Browne) #2

Hi!

This is Aaron Browne, technical lead for the PEDSnet Data Coordinating Center implementation and source of the request for Netezza support. The folks we are working with at Seattle Children’s on PEDSnet use a Netezza database and we had hoped to make work for them. My thoughts in reply to Martijn are:

  1. I’ve asked Nigel, my contact at Seattle, to join this discussion, and hopefully he can shed light on the licensing of the jdbc driver.
  2. I have no way to predict where Netezza will break using PostgreSQL compliant code, but maybe Nigel can take a look at the existing translation rules and spot any incompatibilities.
  3. I obviously can’t help with your testing environment, but maybe Nigel and the Seattle team would be willing to act as your Netezza guinea pig?
  4. See above responses to 2 and 3.

I greatly appreciate your dedication to testing code before releasing it, but I’m also trying to get Achilles working for Seattle pretty quickly, what would you think about a PEDSnet fork of DatabaseConnector and SqlRender as a middle ground (this might also get around any driver licensing issues since the driver wouldn’t be so public)?

Thanks for your time and consideration on this.
Aaron


(Martijn Schuemie) #3

Hi Aaron,

If the Seattle team is willing to be guinea pigs, then that would give Netezza the same status as RedShift (meaning: you can run it, but we can’t guarantee it will work).

I would suggest you just create Netezza branches in both DatabaseConnector and SqlRender. That way, if things work OK we can merge these branches into the masters, and its easy to keep the branches up to date if needed.

In DatabaseConnector you’d need to put the JDBC driver JAR file in the inst/java folder, and make sure to add the necessary code to the connect and createConnectionDetails functions.

In SqlRender, in the csv file I would suggest you copy all the rules for postgresql and make the target netezza.

Let me know if you have any problems.


(Christian Reich) #4

Aaron:

I can serve as a guinea pig and get some testing done on a Netezza box as well, if Nigel runs out of steam. I am 90% JDBC comes with Netezza, it’s already expensive enough, charging a $10k for an essential piece makes no sense for a $1M piece of iron.

C

From: Aaron Browne [mailto:notifications@mail132-6.atl131.mandrillapp.com] On Behalf Of Aaron Browne

Sent: Sunday, December 07, 2014 3:09 PM

To: reich@ohdsi.org

Subject: [OHDSI Forums] [Developers] Supporting DBMS platforms other than SQL Server, PostgreSQL, Oracle and RedShift

aaron0browne

December 7

Hi!

This is Aaron Browne, technical lead for the PEDSnet Data Coordinating Center implementation and source of the request for Netezza support. The folks we are working with at Seattle Children's on PEDSnet use a Netezza database and we had hoped to make work for them. My thoughts in reply to Martijn are:

  1. I've asked Nigel, my contact at Seattle, to join this discussion, and hopefully he can shed light on the licensing of the jdbc driver.
  2. I have no way to predict where Netezza will break using PostgreSQL compliant code, but maybe Nigel can take a look at the existing translation rules and spot any incompatibilities.
  3. I obviously can't help with your testing environment, but maybe Nigel and the Seattle team would be willing to act as your Netezza guinea pig?
  4. See above responses to 2 and 3.

I greatly appreciate your dedication to testing code before releasing it, but I'm also trying to get Achilles working for Seattle pretty quickly, what would you think about a PEDSnet fork of DatabaseConnector and SqlRender as a middle ground (this might also get around any driver licensing issues since the driver wouldn't be so public)?

Thanks for your time and consideration on this.

Aaron


To respond, reply to this email or visit http://forums.ohdsi.org/t/supporting-dbms-platforms-other-than-sql-server-postgresql-oracle-and-redshift/138/2 in your browser.

To unsubscribe from these emails, visit your user preferences.


(Aaron Browne) #5

I don’t have write permissions on the OHDSI repos, so I’ve forked to my personal account and created branches for netezza support. If you want to make netezza support branches in the OHDSI repos, I’m happy to submit PRs, or testers can get the beta code from my repos. The relevant links are:

Aaron


(Jon Duke) #6

Aaron,

You can submit a Pull request to the OHDSI repo. Probably best way to keep in one place. Martijn can of course also add you to the repo directly.


(Aaron Browne) #7

Right, the two pathways are:

  1. Submit a PR to the master branch of the relevant repos, then
    • testers will get code from my repos to test
    • once sufficiently tested, the PRs will be merged
  2. Someone with OHDSI repo write perms creates netezza-support branches, then
    • I submit my PRs there and they get merged (relatively) quickly
    • testers can get code from OHDSI repo netezza-support branches (and I can send additional PRs with updates)
    • once sufficiently tested, the OHDSI netezza-support branches can be merged into OHDSI masters

I just wasn’t sure which pathway you guys preferred. Let me know.
Aaron


(Jon Duke) #8

Will await Martijn’s input on this question specifically. @Frank, @lee_evans, I guess we’ve got to get those Developer Guidelines cooking… :shaved_ice:


(Martijn Schuemie) #9

I would prefer developing in branches in the OHDSI repo’s. @aaron0browne: I’ve invited you to become a member of the developers team, so you can push to the repositories directly.

I do request that merging into the masters be left to me, so I can run my full suite of tests to make sure nothing breaks somewhere else (unlikely, but still).


(Aaron Browne) #10

New branches created and PRs submitted. The PR is not intended to indicate the code is ready for merging, just to allow everyone to easily review the code changes. I’ll continue to update the branches as comments and test results come back.

Aaron


(Nigelhartell) #11

Hi all,

This is Nigel Hartell with Seattle Children’s who Aaron mentions in the above thread. I’m pretty sure licensing should be a non-issue with the jdbc driver, but will follow up with IBM to confirm.

Seattle can be the guinea pig Netezza environment if that is required.

I can also take a look at the code to see if there are non-compliant coding issues as we port to Netezza from PostgreSQL.

Thanks,

Nigel


(Martijn Schuemie) #12

I merged the Netezza DatabaseConnector branch with master, because I didn’t want them to run too far out of sync and the code looked fine. Feel free to branch again if you want to make further changes.

Has any testing been done on Netezza? I’m expecting more changes will be needed on the SqlRender Netezza branch, so will keep that one separate.


(Hajarhomayouni) #13

Hi all,

I need to add support for Google BigQuery for OHDSI tools. Based on your comments, I know that I should change both SqlRender and DatabaseConnector. Any idea if it is possible at all, and where I should start?

Thank you,
Hajar


(Martijn Schuemie) #14

Hi Hajar,

Note that adding another DBMS to the list of platforms supported by OHDSI will be a major undertaking. My first guess is that just to get the OHDSI R packages working you’ll need a half person-year of an efficient Java + R + SQL programmer. Because there’s no JDBC driver for GBQ, adding GBQ to the WebApi will probably take an equal amount of effort. Make sure you are willing to commit the resources to do so before you proceed.

If you do want to go ahead, here are the steps:

First, support needs to be added to DatabaseConnector. This package relies on the JDBC architecture, but unfortunately there does not appear to be a (viable) JDBC driver for GBQ. This is not the end of the world, but it means that all functions of DatabaseConnector will have to be reimplemented for Google’s specific API, including these functions: connect, executeSql, querySql, querySql.ffdf, and insertTable.

Second, support needs to be added to SqlRender. This means rules need to be added to the translation table. Most importantly, these rules need to be tested with all the OHDSI packages.


(Hajarhomayouni) #15

Thank you Martijn, but what about starschema JDBC, isn’t it a viable JDBC driver for GBQ?


(Martijn Schuemie) #16

Oh, I hadn’t seen that one. Yes, that should make life a lot easier. First step would be to add that to the DatabaseConnector package.


(Hajarhomayouni) #17

Thank you, I have one more question, it seems that GBQ does not support some queries (I could not find any for insert and update!!). Is this a big issue? Is it resolvable?

Many Thanks,
Hajar


(Patrick Ryan) #18

Yes, that’ll be a big issue. Across the OHDSI tools, we make extensive use
of various ANSI SQL functions, including CREATE, INSERT, SELECT, DELETE
operations, various functions including date comparisons and
partitioning/windowing functions, COUNT/COUNT_BIG/SUM, etc. If GBQ only
provides functionality for a subset of ANSI SQL, you will very likely run
into instances where your platform will not support the OHDSI analysis
capabilities.


(Hajarhomayouni) #19

I found a JDBC for GBQ that supports other queries. I added required code in DatabaseConnector.R and tested it executing select queries and it works correctly. However, when I try to insert a table using:

==============================================================
data <- data.frame(a = c(1, 2, 3), b = c(“a”, “b”, “c”), stringsAsFactors = FALSE)
insertTable(connection = conn,
tableName = “temp”,
data = data,
createTable = TRUE,
tempTable = TRUE)

I get the following error, am I missing something?

DBMS:
googlebigquery
Error:
execute JDBC update query failed in dbSendUpdate (executeUpdate does not accept SELECT statements. Use executeQuery instead.)
SQL:
IF OBJECT_ID(‘tempdb…#temp’, ‘U’) IS NOT NULL DROP TABLE #temp


(Vojtech Huser) #20

I am glad to see interest in more dialects being supported.

I have not seen UPDATE sql command in many studies or tools. Mostly a table is dropped and re-created.

For INSERT - Achilles is using plain INSERT INTO VALUES.
But I have not seen studies using INSERT VALUES code.

As for the R function insertTable() - I also think it is not heavily used.

If you can suport CREATE table, SELECT, DELETE and temp tables in GBQ - the bulk use may be covered.

Trying to run Achilles on GBQ is a good test of functionality. Look at SQL files here to get an idea: https://github.com/OHDSI/Achilles/blob/master/inst/sql/sql_server/Achilles_v5.sql


t