
The new Working Group for Hadoop


(Shawn Dolley) #1

Hello. On the 10/4/16 conference call, we solicited interest in creating a working group around the increasing use of Hadoop (and presumably other ‘big data’ technologies) to stand up the CDM. As stated on the call, a handful of pharma, research institute, and data syndication organizations have now manually stood up (handcrafted) the OMOP CDM on Hadoop, and interest is growing. If you are interested, please respond to this post or email me at sdolley@cloudera.com. (Perhaps to state the obvious: any code Cloudera writes through internal investment will be given back/committed to OHDSI immediately, and we have no intention other than to make it as easy for prospective CDM implementers to select a Hadoop platform as it is to select Postgres, Oracle, or another platform today.) A number of parties have already shown interest in this working group.

(Taha Abdul-Basser) #2

I am interested.

(Mark Danese) #3

@jenniferduryea and I will participate

(Frank DeFalco) #4

I’m interested. Could I suggest that everyone joins the next OHDSI architecture call (10/20 at 1PM EST) to discuss how this would fit into the architecture? I think people on the architecture call could provide useful insights into what it would take to achieve your goal.

(Taha Abdul-Basser) #5

I second Frank’s suggestion concerning joining the Architecture Group call.

(Mohammad Azimi) #6

I would like to participate as well.

(Steve Lyman) #7

I would also like to participate.

(Shawn Dolley) #8

Hi. For people who have contacted me outside of the Forums, via email, I am pointing them to this post in the Forum to register their interest. Question: is the Hadoop Working Group relevant across OHDSI projects (for example, someone involved in the NLP project expressed interest) and if so, where should Forum posts ‘live’? Or can it live within the “CDM Builders” section where it is today?

(Cornelius Raths) #9

Hi. I am interested in participating.

(Malcolm McRoberts) #10

Hello. I would like to participate.

(Jason Poovey) #11

Georgia Tech is interested in participating. We will join the architecture call on 10/20.

(Shawn Dolley) #12

Frank - can you point newbies (me) and others to the page that shows the login or conference call dial-in information for the OHDSI Architecture Call?


I would like to participate. (From UTHealth)


(Derek S. Kane) #14

I’m all in too.

(Frank DeFalco) #15

The next OHDSI Architecture working group call is October 20th at 1PM EST.

The dial-in and WebEx information can be found here:

Looking forward to the discussion.

(Tom White) #16

I’m interested in participating.


I’m currently working on DSE (DataStax Enterprise) support for OHDSI. Does anyone have experience with graph databases for OHDSI?

(alexander) #18

I have some experience with NoSQL databases, MongoDB in particular. What do you have in mind?


I’m trying to convert the CDM (v5) relational database into a graph. I did this successfully for Neo4j, but I am having significant problems with DataStax GraphDB. DataStax works in a Hadoop environment, using HDFS, Solr, Spark, and Cassandra.
Has anyone tried using Cassandra for OHDSI?

(Naga Eskala) #20

Hello, this is Naga Eskala from QuintilesIMS. We use Hadoop to transform HL7-compliant C-CDA XML documents into OMOP CDM 4.0. I am interested to learn what others are doing on this platform.