Adopting SQLRender to Netezza

Christian_Reich · February 6, 2015, 11:07pm

Martijn:

Couple of questions. Since all the functionality is hidden in some java program and I am trying to stay away from it: Is it enough to edit the file “replacementPatters.csv” and all will be good?

If so, if I want to get rid of a command entirely, I have to state it in the Pattern field with all parameters, and put an empty string in the Replacement column, correct? Concretely, there is no “CREATE INDEX” or any of it’s variants in Netezza. So, I replace “CREATE INDEX @name ON @table (@variable)” with nothing?

Why do some of the Patterns have a semicolon at the end, and others have not?

Does the order make any difference?

schuemie · February 8, 2015, 11:11pm

Yes, you should be able to do everything you want to do by just modifying the rules in the replacementPatterns.csv file. If for some reason that really doesn’t work and you need additional functionality, please let me know.
Yes, you can make commands go away by setting the Replacement value to empty
A pattern cannot have a wildcard value (e.g. ‘@a’) at the beginning or the end, so USE @schema will throw and error, and USE @schema; will not. Remember that these wildcards can stand for anything, from a single word to en entire code block (although never spanning multiple SQL statements), which is why SqlRender needs to know when to stop.
The order is the order in which the replacements are executed (top to bottom). I’ve used this fact on several occasions, as you can tell by some of the weird patterns already in the file. A simple example is the rules to fix the fact that Oracle always needs a FROM clause. The first rule (line 36) just adds FROM DUAL everywhere. The second rule (line 37) removes FROM DUAL everywhere where you now have FROM twice in the same SELECT. (Do not ask me why Oracle breaks when there is no FROM clause, but is happy when there is a FROM DUAL clause. Oracle moves in mysterious ways)

Christian_Reich · February 9, 2015, 6:40pm

Makes a lot of sense.

Next question: why do you use par_open and par_closed in some places, but most everywhere else you just use ( and )?
In Netezza, we cannot do the if exists clause for dropping tables. The workaround is a stored procedure. That one can be created or replaced, though. However, that would require that the SQLRender must be used in context with the DatabaseConnector, otherwise it won’t have the procedure initialized. Is that the case?

saradempster · February 10, 2015, 3:30am

Hi Martijn,

Thanks for your input and offer of additional funtionality if needed.
Christian has created a modifed replacementPatterns.csv file, and we are working to have it fed into the code rather than the default file that was provided. I found this comment in the java source file SqlTranslate.java

 * @param pathToReplacementPatterns
 *            The absolute path of the csv file containing the replacement patterns. If null, the csv file inside the jar is used.

However, based on our efforts (see details below), it appears that even if we define and pass a path to the new csv file designed to work with Netezza, the code still reads the one inside the jar. It would be great to turn on the functionality to pass this file? Or maybe we’re doing something wrong? Could you help trouble shoot?

First we modified the line of code in SqlRender.R
changed:
pathToReplacementPatterns <- system.file(“csv”,“replacementPatterns.csv”, package=“SqlRender”)
To:
pathToReplacementPatterns <-"/gpfs/users/sara/R_libs/replacementPatterns.csv"
sourced SqlRender.R and no difference in the output.

To test this further, we tried the following direct calls to the java TranslateSql.jar in an R session.

pathToReplacementPatterns <-"/gpfs/users/sara/R_libs/pathToReplacementPatterns.csv"
translatedSql <- rJava::J(“org.ohdsi.sql.SqlTranslate”)$translateSql(“select getdate() from SARA;”, “sql server”, “oracle”, pathToReplacementPatterns)
translatedSql
[1] “select SYSDATE from SARA;”
translatedSql <- rJava::J(“org.ohdsi.sql.SqlTranslate”)$translateSql(“select getdate() from SARA;”, “sql server”, “netezza”, pathToReplacementPatterns)
translatedSql
[1] “select getdate() from SARA;”

Whereas the expected answer for Netezza should be
“select current_date, from SARA;”

Let us know what you think.

Thanks,
Sara

saradempster · February 10, 2015, 4:28am

PS- @Christian_Reich - could you upload our .csv file. When I attempted it, I received a message that new users are not allowed to upload files.

Christian_Reich · February 10, 2015, 8:07pm

replacementPatterns.xlsx (14.1 KB) Here you are. It is csv, I just saved it as xlsx so the Wiki would take it.

schuemie · February 10, 2015, 10:23pm

Your code works fine when I try it. One thing to remember is that the SqlRender Java component only loads the replacement patterns once, the first time when you call it. After that it has the patterns in memory, and will ignore the path to replacement patterns parameters.

(Also, you may have a typo in the first line of your code. I don’t know if your file was really called ‘pathToReplacementPatterns.csv’)

schuemie · February 10, 2015, 10:30pm

The fact that Netezza doesn’t support DROP TABLE IF EXISTS is pretty bad. I don’t know exactly how these stored procedures work. Do you already have the procedure stored on your server, and can we just assume it exists?

Else we can use the hack in DatabaseConnector I was previously using for RedShift (seems Amazon finally implemented the functionality in a recent update). People that do not use DatabaseConnector would be out of luck, but fortunately most of them would never drop a table (since they’re calling from web services which typically are not given write privileges).

Christian_Reich · February 11, 2015, 4:15am

If it only loads the replacement patterns once, how can I test and debug? How do I force it to load it?

Yes, the file was called ReplacementPatterns. Sara must have got that wrong when reporting here. On the other hand, the code isn’t exactly verbose about what it is doing.

Yeah, I will initialize the stored procedure when using DatabaseConnector. Plus, unless I drop the stored procedure it will just be there. So, it’s not ideal, but it will work.

What about the <par_open> and <par_close> variables, or the <start_of_insert>?

Vojtech_Huser · February 11, 2015, 12:53pm

I like very much the idea of adding Netezza (or even more official supporting RedShift).
The CSV pattern file may need some updating over time.

And perhaps even documentation of best practices of writing source TSQL. (known limitations)
Maybe one day we may even have more than one source dialect… (OK, maybe not )

I also posted a separate problem with the patterns (hours difference)

saradempster · February 11, 2015, 4:05pm

Thanks Martijn. Yes - that was just a typo in translating to the wiki.

So I tried starting a fresh R session, loading the SqlRender package and passing the file path on the first call to SqlTranslate which did the trick! Now we have:

translatedSql <- rJava::J(“org.ohdsi.sql.SqlTranslate”)$translateSql(“select getdate() from SARA;”, “sql server”, “netezza”, pathToReplacementPatterns)
translatedSql
[1] “select CURRENT_DATE from SARA;”

To @Christian_Reich’s point, if we’re actively changing the csv file and debugging, we have to clear the SqlRender package and reload each time? Is there any workaround so that each call to translateSql will pick up the new csv file each time?

Vojtech_Huser · September 22, 2016, 7:40pm

I spoke today to someone who has their OMOP CDM data in Netezza. @saradempster, did you finish your goal of extending the translation file?

I don’t see netezza listed in this file https://github.com/OHDSI/SqlRender/blob/master/inst/csv/replacementPatterns.csv

If the decision was to give up, what was the set of reasons not to implement Netezza in the end? (just the drop if exists?)

mgkahn · September 25, 2016, 11:14pm

A bit off-topic but Colorado gave up bringing Google BigQuery and Google CloudSQL into the SqlRender fold because of missing functionalties that we couldn’t figure out (or didn’t spend enough time creating a hack). With Google BigQuery, there are rapid releases of new SQL syntax functionality so we may be able to bring that environment into the SqlRender world in the future. CloudSQL, which is based on MySQL, may be a lost cause…

Robert_Carroll · January 12, 2017, 2:37pm

@saradempster @Christian_Reich

Hi everyone, bumping an old thread here. We are in the process of implementing the CDM in a netezza environment at Vanderbilt, and we are wondering if there’s been any progress on the translate file. I’d be glad for any follow up information, though we can start from what was posted earlier in the thread if there’s nothing new.

Thank you for the help!

Christian_Reich · January 12, 2017, 8:29pm

@Robert_Carroll:

@saradempster needs to weigh in, but as far as I know nothing more happened. I left AZ (and the Netezza box I used to love), and then there was nobody to pick it up. Sara is a scientist. So, sorry. However, if you run into trouble I might help with advice. Maybe.

Robert_Carroll · January 12, 2017, 8:44pm

Not a problem, thank you for the response. We’ll start with the above posted translate file. Were there any big “to-do’s” on your list? Either way this is a helpful start!

lee_evans · January 13, 2017, 3:39pm

Hi @Robert_Carroll I used to manage the AstraZeneca Netezza box that Christian mentioned.

My company http://ltscomputingllc.com has been looking at how to add Netezza support to SQLRender.

Send me an instant message on this forum with your email address and let’s talk.

Vojtech_Huser · January 13, 2017, 4:39pm

Two other netezza sites (Marshfield Clinic and NIH IRP) would also be interested in netezza support. The SQL patterns is one problem I can help with (and a rather small problem, I would say) but the driver files (their proper inclusion in the SQL Render package) is something that requires technical expertise (e.g., @lee_evans)

lee_evans · January 13, 2017, 5:02pm

Hi @Vojtech_Huser do you have contact info for the interested folks at those two Netezza sites? I’d be happy to speak with them if you’d like to instant message me.

Gowtham_Rao · January 13, 2017, 6:59pm

@Vojtech_Huser @lee_evans @Robert_Carroll We are in the process of deploying OMOP on Netezza and this would be a good opportunity to collaborate. Happy to facilitate a phone call. Please PM me.