DQD (Data Quality Dashboard) tool Configuration in CDSW with Hive

Hi,

Could I get some guidance on configuring the DQD tool in a CDSW (Cloudera Data Science Workbench) environment? I can’t seem to find any.

I have, however, managed to install DQD in CDSW in a “Workbench” environment rather than on RStudio Server. Please note that my CDSW is accessed through Citrix, so there is no internet access from within CDSW.

After installing DQD, I am unable to connect to Hive. I have the Hive JDBC URL, but it seems that some additional libraries are required for the connection. Can you please suggest what is missing?

Thanks much,
Rana

Hi there, please check out the documentation for DatabaseConnector here. This is the package that DQD (and most other OHDSI R packages) uses to connect to a database. It doesn’t look like Hive is listed as a supported database system, but you could try searching the Forums here to see if anyone else has done that before.

Same goes for CDSW - I recommend searching the Forums to see if other OHDSI folks have experience setting up the tools in that environment.

Hi Katy, thanks very much for the suggestion. I did search for a thread on CDSW integration with DQD before posting but was unable to find one. Maybe I will check again to see if I get lucky :)

Also, thanks for the link to DatabaseConnector. I did see that Hive is not listed there, since Hive is a data warehouse rather than a database. What I did see listed were the databases that Hive can work on top of (via HMS/HCatalog), such as Oracle and PostgreSQL, and on our side we are using one of them.

Maybe what I am looking for is a configuration, via JDBC or JAR files, that can connect to the Hive services and translate the queries so that the results come back for DQD to act on.
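
To make the two moving parts concrete (the JDBC URL and the driver JARs), here is a minimal Python sketch rather than a working DQD configuration. It assumes a standard HiveServer2 endpoint, whose URLs generally take the form `jdbc:hive2://host:port/database;key=value`, and it assumes the driver JARs were copied by hand into a local folder because the session has no internet access. The host name, folder path, and both helper functions are hypothetical and only for illustration.

```python
from pathlib import Path


def hive_jdbc_url(host, port=10000, database="default", session_vars=None):
    """Build a HiveServer2 JDBC URL: jdbc:hive2://host:port/db;key=value;..."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    # Session variables (for example ssl=true) are appended after semicolons.
    for key, value in (session_vars or {}).items():
        url += f";{key}={value}"
    return url


def find_driver_jars(driver_dir):
    """Return the sorted names of .jar files in driver_dir (empty list if none)."""
    folder = Path(driver_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.jar"))


if __name__ == "__main__":
    # Hypothetical host and folder; replace with the values for your cluster.
    print(hive_jdbc_url("hive.example.internal", session_vars={"ssl": "true"}))
    print(find_driver_jars("/home/cdsw/jdbc_drivers"))
```

If the folder check returns an empty list, the Hive JDBC driver and its dependencies have not reached the session yet, which would be one explanation for a connection failing even though the URL itself is correct.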

Any help on this aspect will be highly appreciated. Thanks again!!
