Need help and guidance to implement CDM


(Mehrdad Forouzanfar) #1


I have introduced myself before, but I am still quite new to OHDSI. I have been following the community's activities and reading materials to understand the CDM and OMOP. I have acquired SEER-Medicare data, and now is the time to practice and implement the OHDSI approach and tools as an RWD infrastructure for my company. I would appreciate any help and advice, and especially someone willing to show me how to start and guide me through the steps.

I have an MD and a PhD in epidemiology. I have been involved in number crunching and data analysis in academia and industry. I use R and Stata, and I am somewhat familiar with Python, database structure, and SQL commands. I have good IT and database support at my company for the technical aspects.


(Kristin Feeney Kostka, MPH) #2

Hi Mehrdad!

Welcome, welcome! As they say, “A journey of a thousand miles begins with a single step.”

To start, what does your technical stack look like? Are you using AWS by any chance? @JamesSWiggins has an incredible reference architecture available and can point you to his CloudFormation templates.

If you’re not on AWS, tell us a bit more about your environment. What kind of database layer are you using? What kind of analytics layer do you have? @Chris_Knoll, @anthonysena, @Ajit_Londhe, and @Frank are friendly folks who can direct you to the right documentation for your architecture.


(Mehrdad Forouzanfar) #3

Hi Kristin,

Thanks for the reply. I have never used AWS (or any web server). We are planning to store the data on a local server/database. I don’t know the answers to your questions, but I would be happy to follow up with my IT colleagues. Could you or other colleagues send me some materials to study?


(Mark Danese) #4

The SEER Medicare data can’t be placed on any cloud server according to the data use agreement. Just FYI. Part of the application asks for the location of the data.

(Mark Danese) #5

You should also talk to people in the oncology workgroup (@rimma @mgurley). They are wrestling with some issues specific to oncology data. For example, one challenge is mapping the SEER location, histology, and behavior information into a single standard concept (e.g., lung is the location; histology codes 9050, 9051, and 9052 might indicate “mesothelioma”; and “/3” indicates malignant behavior). These three ideas don’t fit neatly into one standard concept yet, as far as I am aware. I believe people are working on that, and there may be something available for use or testing. LOINC codes can be used to map other oncology features like grade, stage, etc.
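
To make the three-part coding concrete, here is a minimal Python sketch of how a SEER record's histology, behavior, and site fields might be combined into one lookup key before mapping. The function name `icdo3_key` is hypothetical, and the "HHHH/B-CXX.X" string layout is an assumption for illustration only; please check the actual code format against the vocabulary you load.

```python
import re


def icdo3_key(histology: str, behavior: str, topography: str) -> str:
    """Combine histology, behavior, and site codes into a single string key.

    Assumed layout (illustrative only): "HHHH/B-CXX.X",
    e.g. histology 9050, behavior 3, site C349 -> "9050/3-C34.9".
    """
    histology = histology.strip()
    behavior = behavior.strip()
    # Site codes are often stored without the dot (C349); accept both forms.
    topo = topography.strip().upper().replace(".", "")

    if not re.fullmatch(r"\d{4}", histology):
        raise ValueError(f"histology must be 4 digits, got {histology!r}")
    if not re.fullmatch(r"\d", behavior):
        raise ValueError(f"behavior must be 1 digit, got {behavior!r}")
    if not re.fullmatch(r"C\d{3}", topo):
        raise ValueError(f"site must look like C349 or C34.9, got {topography!r}")

    return f"{histology}/{behavior}-{topo[:3]}.{topo[3]}"


if __name__ == "__main__":
    # Malignant mesothelioma with the site recorded as lung, NOS.
    print(icdo3_key("9050", "3", "C349"))  # 9050/3-C34.9
```

A key like this could then be joined against whatever source-to-concept lookup your ETL uses. Validating the pieces first helps surface malformed records early.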

(Rimma Belenkaya) #6

Thanks to @Dymshyts and his team, precoordinated standard concepts corresponding to the SEER combinations of histology/topography/behavior are in the OMOP vocabulary, ready for testing. Please use these documents for the background information and ETL instructions:


Don’t hesitate to reach out.

(Mehrdad Forouzanfar) #7


(Anton) #8

Hi Mehrdad,

The CDMBuilder tool (link below) has an ETL for SEER -> CDM v5.2. It is a Windows application that supports MS SQL and Amazon Redshift databases. If you need help with CDMBuilder, just let me know.


(Mehrdad Forouzanfar) #9