
Databricks / Spark Update

All,

A few updates on Databricks / Spark support of OHDSI tools:

  1. I’ve introduced some Spark bug fixes in SqlRender; they are in the develop branch: https://github.com/OHDSI/SqlRender/tree/develop. They should reach the master branch soon.

  2. The DDL can now be generated with the CommonDataModel R package rather than taken from a static set of files. Use the R commands below:

    remotes::install_github("OHDSI/CommonDataModel")
    remotes::install_github("OHDSI/SqlRender", ref = "develop")
    cdmDatabaseSchema <- "" # name of intended schema for CDM tables
    ddl <- CommonDataModel::createDdl(cdmVersion = "5.3")
    ddl <- SqlRender::render(ddl, cdmDatabaseSchema = cdmDatabaseSchema)
    ddl <- SqlRender::translate(sql = ddl, targetDialect = "spark")

  3. Full Atlas/WebAPI Spark support is targeted for v2.11.

  4. I’m putting the finishing touches on DatabaseConnector. Some of our code for bulk inserting data failed unit tests.

  5. Please use the new Databricks JDBC driver (the old one had the log4j security vulnerability): JDBC Drivers Download - Databricks

  6. Achilles works; CohortMethod is in progress.
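
To illustrate what the render and translate steps in item 2 do, here is a minimal sketch on a single toy statement. The table definition and the schema name "my_cdm" are made up for illustration and are not part of the real CDM DDL; the exact Spark SQL that translate produces depends on the SqlRender version installed, so no expected output is shown for that step.

```r
library(SqlRender)

# Toy parameterized statement (hypothetical, not the real CDM DDL)
sql <- "CREATE TABLE @cdmDatabaseSchema.person (person_id INTEGER NOT NULL);"

# render() substitutes the @cdmDatabaseSchema parameter
rendered <- render(sql, cdmDatabaseSchema = "my_cdm")

# translate() converts the rendered SQL to the Spark dialect
translated <- translate(sql = rendered, targetDialect = "spark")

cat(rendered, "\n")
cat(translated, "\n")
```

The translated string can then be saved with SqlRender::writeSql(sql = translated, targetFile = "ddl_spark.sql"), where the file name is just an example, and reviewed before it is run against a Databricks cluster.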

Thanks,
Ajit


Hi @Ajit_Londhe that’s great news!

Is Spark support already in place in a branch of WebAPI?

@jon_duke not yet; there’s a dependency on ArachneCommons we’re waiting to get cleared up. But the Odysseus team is working on getting that resolved asap.
