Hello everyone!
I am from the vantage6 development team and am looking into coupling the OMOP CDM to our infrastructure. I’ve tried several things, but am looking for some advice and thoughts on this topic. Let me start by explaining our initial idea, which we came up with without much prior knowledge of the OMOP CDM.
Disclaimer: Sorry if I made any errors, assumptions or silly remarks below, as I am by no means familiar with the OHDSI tools.
Initial Idea
Vantage6 typically requires record-level data for its analyses. So the initial step was to determine which tools we can use from the OHDSI community. The following steps had to be executed:
- Create JSON cohort definition ← Using ATLAS or OHDSI/Capr
- Cohort Creation ← vantage6 uses OHDSI/CirceR and OHDSI/SqlRender to construct valid OMOP CDM SQL. Finally, OHDSI/DatabaseConnector can be used to create the cohort table.
- Data Extraction
  Option 1: we extract data from the OMOP CDM and create a fixed tabular dataframe for the cohort
  Option 2: the user can specify which data items (based on concept ids) are extracted
- vantage6 post processing ← not so interesting for this post
- vantage6 analysis ← not so interesting for this post
This can be done, as we already have a somewhat working prototype. Yay! The downsides of this approach are:
- vantage6 requires a direct connection with the database (WebAPI would be safer/preferable)
- A lot of custom code has to be written for the data extraction step (3), and that code needs to be maintained.
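To make the "custom code" of step (3) a bit more concrete, here is a minimal sketch of what Option 2 could look like. This is not our actual prototype. It assumes an in-memory SQLite database holding a stripped-down version of the OMOP CDM `cohort` and `measurement` tables (only the columns used here), and the function name `extract_features` and the concept ids 1001 and 1002 are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of two OMOP CDM tables; a real CDM has many more
# columns and tables, and would not normally live in SQLite.
SCHEMA = """
CREATE TABLE cohort (
    cohort_definition_id INTEGER,
    subject_id INTEGER,
    cohort_start_date TEXT,
    cohort_end_date TEXT
);
CREATE TABLE measurement (
    person_id INTEGER,
    measurement_concept_id INTEGER,
    measurement_date TEXT,
    value_as_number REAL
);
"""


def extract_features(conn, cohort_id, concept_ids):
    """Return one dict per cohort subject with the latest value per concept id.

    Subjects without any matching measurement are dropped (inner join).
    """
    placeholders = ",".join("?" for _ in concept_ids)
    query = f"""
        SELECT c.subject_id, m.measurement_concept_id, m.value_as_number
        FROM cohort c
        JOIN measurement m ON m.person_id = c.subject_id
        WHERE c.cohort_definition_id = ?
          AND m.measurement_concept_id IN ({placeholders})
        ORDER BY c.subject_id, m.measurement_date
    """
    rows = {}
    for subject_id, concept_id, value in conn.execute(query, [cohort_id, *concept_ids]):
        # Rows arrive ordered by date, so later values overwrite earlier ones.
        rows.setdefault(subject_id, dict.fromkeys(concept_ids))[concept_id] = value
    return [{"subject_id": sid, **values} for sid, values in sorted(rows.items())]


def demo():
    """Build a tiny example database and extract two made-up concepts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO cohort VALUES (?, ?, ?, ?)", [
        (1, 10, "2020-01-01", "2020-12-31"),
        (1, 11, "2020-01-01", "2020-12-31"),
        (2, 12, "2020-01-01", "2020-12-31"),  # different cohort, ignored
    ])
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", [
        (10, 1001, "2020-02-01", 120.0),
        (10, 1001, "2020-03-01", 125.0),
        (10, 1002, "2020-02-01", 80.0),
        (11, 1001, "2020-02-01", 130.0),
        (12, 1001, "2020-02-01", 140.0),
    ])
    return extract_features(conn, 1, [1001, 1002])
```

Calling `demo()` gives one row per subject in cohort 1, with `None` where a requested concept has no measurement. Even this toy version only covers one CDM table and one aggregation rule (latest value), which is exactly why we would rather not maintain this kind of code ourselves.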
So a better solution would be to use OHDSI/FeatureExtraction, preferably in combination with OHDSI/WebAPI. After playing around with both for a bit, I have some questions/remarks:
- The FeatureExtraction module adds a layer on top of the CDM. By that I mean I do not have complete flexibility to extract the data I want and am stuck with the conventions (interface) this package gives me. ← Am I wrong here?
- The WebAPI does not let me access FeatureExtraction directly, only the analyses (webapi/feature-analysis) that make use of this module ← Is there no way to retrieve the FeatureExtraction data or submit a query from this interface?
This seems like a related question: Patient-Level Data export from ATLAS Cohort Definitions - Developers - OHDSI Forums
Please let me know if you have any thoughts on this, much appreciated!
PS: vantage6 is mostly written in Python. To avoid duplicating work, we intend to wrap all the R packages we use in a Python interface, which we will share back with the community.
PPS: I was only able to add two links to this question, so I had to remove all links, sorry about that…