OHDSI Home | Forums | Wiki | Github

DatabaseConnector new function call request

I want to be able to document ETL process for OMOP database using R Notebooks. DatabaseConnector is built on top of RJDBC. I would like to add a function to DatabaseConnector to return a reference to the JDBCConnection object. Then I could use that reference in SQL code chunks in R Notebooks. Currently I am making a separate connection using RJDBC, it would be more convenient if Database connector which has already established a JDBC connection, that I use in R code chunks could provide this reference so I don’t have to make two separate connections in order to accommodate R Notebooks.

Happy to do it myself if I can get a little guidance, but again, I would need help in getting code change approved and tests approved in order to have this added to the main branch. Do not want to be forking off my own branch for this function.


Bradley Slavik

Hi @BradleySlavikAstella . DatabaseConnector is not built on top of RJDBC. It used to be, a long time ago, but for various reasons it no longer is.

A DatabaseConnector connection should be fully DBI compliant, so should just work in your R Notebooks out of the box?

That is great news. I regret that I present you with a tiny setback. It is ALMOST compatible with R Studio Notebooks. When I use the DatabaseConnector in the SQL code chunk like this:

rsConnectionDetails <- DatabaseConnector::createConnectionDetails(
                                   dbms = dbms,
                                   server = server,
                                   user = paste0(user),
                                   password = pw,
                                   pathToDriver = "~/redShift-jars",
                                   port = port)
# Connection to RedShift database
rsConnection <- DatabaseConnector::connect(connectionDetails = rsConnectionDetails)
```{sql connection=rsConnection}
SELECT pat_id, rec_count FROM sandbox_xyz.bs_enrollees_top LIMIT 1000

I get this error every time:

Error in rJava::.jcall(res@content, "[Ljava/lang/String;", "getString", : java.lang.ClassCastException: class [J cannot be cast to class [Ljava.lang.String; ([J and [Ljava.lang.String; are in module java.base of loader 'bootstrap') Failed to execute SQL chunk

So it appears that the class is not quite correct. It is looking for a LIST of String, but instead is getting J (Java object?) if I am reading it correctly.

Perhaps you understand what might be happening better than I do.

Bradley Slavik

Could you provide a reproducible example? I’m not familiar with R Studio Notebooks. Did you use specific R commands?

Here is an example that works using RJDBC.

R code chunk:

redshift_driver_class_name <- 'com.amazon.redshift.jdbc42.Driver'
drv <- JDBC(redshift_driver_class_name,
jdbcConnection <- dbConnect(drv, paste0("jdbc:redshift://", server, ':', port, '/sandbox_xyz'),  paste0(user), pw)

SQL code chunk

```{sql connection=jdbcConnection}
SELECT pat_id, rec_count FROM sandbox_xyz.bs_enrollees_top LIMIT 1000

The only thing special going on is that you need to pass to the SQL chunk a reference to the DBI connection object.

Here is a nice web page which goes over how to use SQL code chunks in R Notebooks.

Using SQL in RStudio

Of course, here is the official document on SQL code chunks in R Studio notebooks.

2.7 Other language engines | R Markdown: The Definitive Guide

Your code shows how you connect using RJDBC. Could you provide the code using DatabaseConnector that causes the issue?

Please look again at POST 3.

This looks very similar to this dbFetch issue; the code in the top post is being executed by database connector but then the call to java is attempting to convert the result to a string. The way the code currently works is to treat anything that isn’t numeric as a string (though the numeric classes aren’t properly implemented).

I have written a fix for this but will need to test this example. @BradleySlavikAstella Can you get a stack trace from the example in post 3?

I’m not sure what the return type referenced here, J would map to.

I have confirmed that this a bug that should be fixed by the dbFetch patch. However, I’m not certain that the fix will process the data type in the example you provided. What are the data types of the columns in the sql below?

SELECT pat_id, rec_count FROM sandbox_xyz.bs_enrollees_top LIMIT 1000

pat_id is a VARCHAR(3000) type, and rec_count is an INT.

Ok, I can confirm that these should be handled by the patch to Database Connector. You can install this code but it is not recommended until the package is updated.

An alternative while waiting for the fix would be to use an R code block instead of SQL, the example above:

sql <- "SELECT pat_id, rec_count FROM sandbox_xyz.bs_enrollees_top LIMIT 1000"
DatabaseConnector::renderTranslateQuerySql(rsConnection, sql)

It was never a question of getting R code blocks to work. They work perfectly. SQL code blocks are better for LARGE blocks of SQL because I can still read the code and don’t have to paste together rows with quotes around them or drop large text blocks in the middle of R code. Thank you for looking into this problem. If I can assist in creating tests or any other activity to get this fix ready for merging into the main branch, please message me.

@BradleySlavikAstella I can confirm that the fix was merged into the develop branch of DatabaseConnector earlier today. If you would like to try out your code above again install the develop branch with:

devtools::install_github("ohdsi/DatabaseConnector", ref="develop")

Letting us know of any issues you encounter would be very helpful.