Getting Started: Can someone validate these steps?

John.F.Morales · April 10, 2019, 5:45am

I’m new and want to confirm my understanding of the steps for a new PatientLevelPrediction study. I have already pulled my dataset (EPIC EHR/Clarity) of risk factors, protective factors, and outcomes. It resides in Oracle, I have Java installed, and I have a .csv with my records of interest. Are these the steps? Please correct me if this isn’t right, or provide clarification.

1a. Navigate the online Atlas tool to get familiar with its major components. It has 2 exploratory data sources already installed. I assume I can’t load my own CDM-formatted data sources there. Is that right?

1b. Download and install the OHDSI In A Box virtual machine. It has a Medicare data set, but can I use it to create a new data set, or is it only for tutorial purposes like the online Atlas?

1c. Download and install White Rabbit/Rabbit In A Hat, Achilles, Usagi, CommonDataModel, Vocabulary, Atlas, WebAPI, and PatientLevelPrediction.

2. Load the data set into WhiteRabbit, Rabbit In A Hat, and Achilles to analyze and map the data.
3. Use the Oracle DDL (CommonDataModel-master.zip) to create the OMOP CDM and Results tables in my Oracle schema.
4. Use Usagi to map source system codes into the OMOP vocabulary, then export as an append to the CDM. Load my code-standardized data and add indices/PKs/FKs. If Usagi doesn’t append the data into the Oracle tables, I need to manually upload the standard code sets. I don’t think I need to use HERMES for vocabulary exploration if I’m using Usagi.
5. Use the local Atlas install to create cohorts. Does this break up my CDM tables into usable subsets for the R packages? Atlas will generate SQL scripts for my local Oracle database, which I then run on my local Oracle. I don’t think I need to use Circe-be?
6. Use R Studio to run the R packages pointing to my local Oracle data source, and analyze the algorithm outputs.

I’m still unsure if I need to use HERMES or Circe.

Thanks, John

John.F.Morales · April 11, 2019, 3:07pm

Received an answer from Jihwan Park. I now understand and have my answer. Thanks!!!

Hi John. I have recently been studying and setting up a CDM database. Based on this experience, I will try to answer your question.

#1 Prepare your data.

You can use Usagi to map your source codes to the OMOP CDM vocabulary.
Using the result from Usagi, you can build your ETL queries. (WhiteRabbit and Rabbit In A Hat can be helpful for designing the ETL queries.)
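
To make the mapping step concrete, here is a minimal sketch in Python of applying a code mapping to source records. It is not how Usagi itself works; it only assumes you have exported the mapping as a CSV with the `source_code` and `target_concept_id` columns of the CDM source_to_concept_map table. The codes, concept IDs, and function names are made-up placeholders. Codes without a mapping get concept ID 0, the OMOP convention for "no matching concept".

```python
import csv
import io

# Hypothetical mapping export; codes and concept IDs are placeholders,
# not real vocabulary entries.
MAPPING_CSV = """source_code,source_vocabulary_id,target_concept_id
LOCAL_DX_001,LOCAL,1001
LOCAL_DX_002,LOCAL,1002
"""


def load_mapping(text):
    """Return {source_code: target_concept_id} from a mapping CSV."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["source_code"]: int(row["target_concept_id"]) for row in reader}


def map_records(records, mapping):
    """Attach a concept ID to each source record; unmapped codes get 0."""
    mapped = []
    for rec in records:
        concept_id = mapping.get(rec["source_code"], 0)
        mapped.append({**rec, "concept_id": concept_id})
    return mapped


mapping = load_mapping(MAPPING_CSV)
records = [
    {"person_id": 1, "source_code": "LOCAL_DX_001"},
    {"person_id": 2, "source_code": "LOCAL_DX_999"},  # not in the mapping
]
mapped = map_records(records, mapping)
```

In a real ETL this lookup would more likely be a SQL join against the mapping table than a Python dictionary.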

#2 How to load your data.
1a, 1b: You can use the Oracle DDL to create the tables, then load your data using your ETL queries. The test dataset and data sources are only examples; you can build your own database.
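
The "create tables, then load" order can be sketched as follows. This is only an illustration under stated assumptions: SQLite (in memory) stands in for Oracle, the DDL is a trimmed subset of the condition_occurrence columns rather than the official CommonDataModel DDL, and `load_conditions` is a hypothetical helper.

```python
import sqlite3

# Trimmed stand-in for the CDM DDL: a few condition_occurrence columns only.
DDL = """
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL,
    condition_concept_id INTEGER NOT NULL,
    condition_start_date TEXT NOT NULL,
    condition_source_value TEXT
);
"""


def load_conditions(conn, rows):
    """Insert already-mapped rows and return the table's row count."""
    conn.executemany(
        "INSERT INTO condition_occurrence "
        "(person_id, condition_concept_id, condition_start_date, "
        "condition_source_value) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM condition_occurrence").fetchone()[0]


conn = sqlite3.connect(":memory:")  # SQLite stands in for Oracle here
conn.executescript(DDL)             # step 1: create the empty table
rows = [
    (1, 1001, "2018-03-14", "LOCAL_DX_001"),
    (2, 0, "2018-05-02", "LOCAL_DX_999"),  # unmapped code kept with concept 0
]
loaded = load_conditions(conn, rows)  # step 2: load the ETL output
```

Keeping the original code in `condition_source_value` means an unmapped record can still be traced back and remapped later.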

#3 Validation of data.

Using Achilles, you can verify your data.
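
Achilles produces many summary statistics and checks. As a rough idea of the kind of thing worth looking at, here is a small Python sketch of two of them: the share of records with concept ID 0 (unmapped) and the record count per concept. It is not Achilles code, and the function names and sample IDs are made up for illustration.

```python
from collections import Counter


def unmapped_share(concept_ids):
    """Fraction of records whose concept ID is 0 (no standard concept)."""
    if not concept_ids:
        return 0.0
    return sum(1 for c in concept_ids if c == 0) / len(concept_ids)


def concept_counts(concept_ids):
    """Record count per concept ID, most frequent first."""
    return Counter(concept_ids).most_common()


# Placeholder concept IDs as they might come out of a loaded table.
ids = [1001, 1001, 0, 1002]
share = unmapped_share(ids)
counts = concept_counts(ids)
```

A high unmapped share usually means the code mapping needs another pass before any analysis.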

#4 Visualization and analysis of data.

Install Atlas to view the data.
You can use R as you mentioned in step 6 of your list.
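
On the question of what cohorts are for: as I understand it, a prediction study needs a target cohort (who you predict for), an outcome cohort (what you predict), and a time-at-risk window. The Python sketch below shows only that idea. It is not the PatientLevelPrediction API; the function name, the 365-day window, and the sample subjects are assumptions for illustration.

```python
from datetime import date, timedelta


def label_outcomes(target, outcome_dates, risk_days=365):
    """Return {subject_id: 0 or 1}.

    A subject is labeled 1 if any outcome date falls after the cohort
    start date and no later than risk_days after it.
    """
    labels = {}
    for subject_id, start in target.items():
        window_end = start + timedelta(days=risk_days)
        dates = outcome_dates.get(subject_id, [])
        labels[subject_id] = int(any(start < d <= window_end for d in dates))
    return labels


# Target cohort: subject_id -> cohort_start_date (made-up subjects).
target = {1: date(2018, 1, 1), 2: date(2018, 1, 1), 3: date(2018, 6, 1)}
# Outcome cohort: subject_id -> outcome dates.
outcomes = {1: [date(2018, 3, 1)], 2: [date(2019, 6, 1)]}
labels = label_outcomes(target, outcomes)
```

Subject 2 has an outcome, but it falls outside the window, so it is labeled 0 just like subject 3, who has no outcome at all.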

I hope this information can help you.

Jihwan.

Jihwan Park
Researcher | Data Scientist
Catholic Cancer Research Institute | Department of Urology Research
Catholic University of Korea