OHDSI Home | Forums | Wiki | Github

Database Connector cannot connect to Amazon Redshift, even though RPostgreSQL can

Hi,

I am attempting to connect the DatabaseConnector package for R to an Amazon Redshift instance that contains data in CDM v5. I am currently trying to do this as part of running the CohortMethod package, but I’ve had the same problem with other OHDSI packages that use DatabaseConnector.

The error I receive is:

Connecting using Redshift driver
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  : 
org.postgresql.util.PSQLException: Connection refused. Check that the hostname and
port are correct and that the postmaster is accepting TCP/IP connections.

The code (details obscured) I run to receive this error is:

library(DatabaseConnector)
connection <- DatabaseConnector::connect(
  DatabaseConnector::createConnectionDetails(
    dbms = "redshift",
    server = "aaaa.us-east-1.redshift.amazonaws.com/bbbb",
    user = "cccc",
    password = "dddd"
  )
)

However, I am able to connect using the same connection details with the RPostgreSQL library from CRAN:

library (RPostgreSQL)
connection <- DBI::dbConnect(
  DBI::dbDriver("PostgreSQL"),
  host = "aaaa.us-east-1.redshift.amazonaws.com",
  port = "5439",
  dbname = "bbbb",
  user = "cccc",
  password = "dddd"
)

This works. I have taken advantage of this fact to run other OHDSI packages such as StudyProtocols by re-writing functions such as extractAndWriteToFile to use RPostgreSQL, but I’d prefer a less hacky solution!

Can anyone suggest what I might be doing wrong?

Thanks

Hi Gwyn,

I’m going to do a similar test on my side using RedShift to see how it works in our environment. At first glance, I notice that your RedShift connection details does not contain a setting for the port number - you may want to add this and see if that helps. Also, in reviewing our RedShift connection information, we require SSL to connect which I’m assuming is also required on your connection? I think this too may be causing a problem.

I’ll circle back once I have an opportunity to test this out here.

Thanks,
Anthony

Thanks for the suggestions. The port is the default so I don’t think that’s the problem, but SSL sounds more likely. How would I use SSL here?

So I was looking through the DatabaseConnector source code and the default port is “5432” for RedShift and not “5439” so I’d recommend you explicitly set this parameter in your connection parameters. Let me know if this gets you a bit further.

I need to look into using SSL settings as part of the connection as well.

Right, I tried specifying the port like this:

library(DatabaseConnector)
connection <- DatabaseConnector::connect(
  DatabaseConnector::createConnectionDetails(
    dbms = "redshift",
    server = "aaaa.us-east-1.redshift.amazonaws.com/bbbb",
    port = "5439",
    user = "cccc",
    password = "dddd"
  )
)

…but received the same error. I was wondering if the SSL connection could be made as per the following posts:

and

However, these approaches seem to require giving the dbname as a separate parameter, whereas DatabaseConnector::connect seems to require a combined server/dbname parameter, so it doesn’t accept adding sslmode=require on the end.

Reading the JDBC PostgreSQL documentation, I thought the following might work, but it does not:

connection <- DatabaseConnector::connect(
  DatabaseConnector::createConnectionDetails(
    dbms = "postgresql",
    server = "aaaa.us-east-1.redshift.amazonaws.com/bbbb?ssl=true",
    port = "5439",
    user = "cccc",
    password = "dddd"
  )
)

Your correct in that it does not work that way unfortunately. I’m making some amendments to the DatabaseConnector package to allow for specifying these extended connection properties. I’ll circle back when this is complete. Thanks!

Fantastic, thank you very much!

Hi Gwyn,

The DatabaseConnector package has been updated on Github to allow for adding extra connection settings (like SSL) as part of the configuration. To show you how this might work, I’ve updated your example from above:

connection <- DatabaseConnector::connect(
DatabaseConnector::createConnectionDetails(
dbms = “redshift”,
server = “aaaa.us-east-1.redshift.amazonaws.com/bbbb”,
port = “5439”,
user = “cccc”,
password = “dddd”,
extraSettings = "ssl=true&sslfactory=com.amazon.redshift.ssl.NonValidatingFactory"
)
)

These “extraSetttings” will be included in the underlying JDBC connection and allow for whatever connection settings are required. In the example above, I included an additional parameter for the “sslfactory” that was required when using RedShift here and thought you may also need that setting as well.

Hope this gets you connected.

Thanks Anthony. It still does not connect but the error message has changed. When using the following:

connection <- DatabaseConnector::connect(
  DatabaseConnector::createConnectionDetails(
    dbms = "redshift",
    server = "aaaa.us-east-1.redshift.amazonaws.com/bbbb",
    port = "5439",
    user = "cccc",
    password = "dddd",
    extraSettings = "ssl=true&sslfactory=com.amazon.redshift.ssl.NonValidatingFactory"
  )
)

The error message is now:

Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  : 
  java.sql.SQLException: [Amazon](500150) Error setting/closing connection: Connection timed out: connect.

I also tried with extraSettings = "ssl=true" and the same thing happened.

This looks like a network connection error (e.g a firewall issue). Just to be clear: is this from exactly the same machine where you can successfully connect using RPostgreSQL?

Hi Martijn,
This is indeed from the same machine and indeed the same R session that can successfully connect with RPostgreSQL.
Thanks

Hi Gywn,

Can you verify the port that your using to connect? As a test, I changed my port configuration from the default (5439) to 5438 and received the same error you did. The only other thought I have is that another process is blocking on port 5439 which is causing the connection error. I’d make sure you’ve properly closed and disposed of all connections using RPostgreSQL (or any other apps that may be connected on this port).

Hope this helps.
Anthony

That is the correct port, but I can see now this is not a problem with DatabaseConnector, because manually composing the connection string and running RJDBC::dbConnect results in the same error. I’ll keep investigating. Thanks for your help and the extraSettings update.

t