
Gold standard patient discussion - IRIS tool

The last meeting was about the gold standard patient, and I am looking for sites that would be willing to run a short SQL script and share their results, preferably here in the forum (on a general level) and possibly by emailing the actual counts to me and Patrick.

For this purpose, there is a tool called IRIS in the OHDSI GitHub.

The results look like this (this is real data for the CCAE dataset in IMEDS):

MEASURE     RESULT     EXPLANATION
G1     141,805,491        count of patients
G2     20,328,289,601     count of events
D2     90,024,522         count of patients with at least 1 Dx and 1 Rx
D3     112,148,500        count of patients with at least 1 Dx and 1 Proc
D4     5,939,621          count of patients with at least 1 Obs, 1 Dx and 1 Rx
D5     277,975            count of deceased patients
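
To make the measure definitions concrete, here is a minimal sketch of how counts like G1, D2 and D5 could be computed. It is not the actual IRIS SQL: it assumes CDM v5 table names (person, condition_occurrence, drug_exposure, death), takes Dx to mean a condition record and Rx a drug exposure record, and runs against a toy in-memory SQLite database with made-up rows.

```python
import sqlite3

# Toy in-memory database. Table names follow CDM v5 (an assumption);
# only the person_id column needed for the counts is created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY);
CREATE TABLE condition_occurrence (person_id INTEGER);
CREATE TABLE drug_exposure (person_id INTEGER);
CREATE TABLE death (person_id INTEGER);
INSERT INTO person VALUES (1), (2), (3), (4);
INSERT INTO condition_occurrence VALUES (1), (1), (2), (3);
INSERT INTO drug_exposure VALUES (1), (3), (3), (4);
INSERT INTO death VALUES (2);
""")


def scalar(sql):
    """Run a query that returns a single number and return that number."""
    return conn.execute(sql).fetchone()[0]


# G1: count of patients
g1 = scalar("SELECT COUNT(*) FROM person")

# D2: patients with at least 1 Dx and 1 Rx.
# INTERSECT keeps each person_id once, so repeated events are not double counted.
d2 = scalar("""
    SELECT COUNT(*) FROM (
        SELECT person_id FROM condition_occurrence
        INTERSECT
        SELECT person_id FROM drug_exposure
    )
""")

# D5: count of deceased patients
d5 = scalar("SELECT COUNT(DISTINCT person_id) FROM death")

for name, value in [("G1", g1), ("D2", d2), ("D5", d5)]:
    print(name, value)
```

The other measures would follow the same pattern with additional tables; the real scripts in the repository remain the reference for the exact definitions.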

A link to earlier discussion is here

How to participate:

  1. get the .sql file for your database from here: https://github.com/OHDSI/Iris/tree/master/inst/sql/non-parametized
  2. replace the string ‘ccae_cdm4.’ with the path to your CDM tables (it may include .dbo. if you use MS SQL)
  3. run the code
  4. extract the results from the result table (like this: select * from iris_A) (this query is at the end of the file)
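
Step 2 can also be scripted instead of done by hand in an editor. The following is a minimal sketch, assuming the script uses the literal prefix ‘ccae_cdm4.’ everywhere; the function name and the target prefix my_cdm.dbo. are hypothetical placeholders, not part of IRIS.

```python
def retarget_schema(sql_text, new_prefix, old_prefix="ccae_cdm4."):
    """Return sql_text with every occurrence of old_prefix replaced by new_prefix."""
    return sql_text.replace(old_prefix, new_prefix)


# Example with a one-line statement; "my_cdm.dbo." is a made-up target prefix.
example = "SELECT COUNT(*) FROM ccae_cdm4.person;"
converted = retarget_schema(example, "my_cdm.dbo.")
print(converted)
```

For a real run, read the downloaded .sql file into a string, pass it through the function, and save the result under a new name before executing it.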

(In fact, it would be great if Martijn or another expert experienced with the latest way of switching schemas/databases could give me hints on how to rewrite the parameterized version of IRIS; the code is here.)

(Also of note, IRIS can work with both versions of CDM: v4 and v5)

Vojtech, I’ll take the point to run IRIS on the data sets here at Janssen and to provide feedback on the IRIS scripts where applicable. Thanks for the presentation today on the call - it made for a good discussion.

@Vojtech_Huser thanks for the IRIS script - it was nice and easy to use. It took me about 10 minutes to review and convert using SqlRender for our SQL Server PDW environment here. Then it took me 10 minutes to run across 8 databases. So in less than half an hour, I was able to create a nice summary of the populations in each database.

I think everyone in the OHDSI community will find this activity easy and very useful. I’m happy to share our results with you. How would you like to do that? Should we have Lee create some shared server space to compile results or would you like us to just email you directly?

@Vojtech_Huser: yes thanks for this code, very easy to use, I ran it here and have results for everything except D5. Not exactly sure how you wanted us to share these results

@Vojtech_Huser, would you like IRIS reformatted as a standard R package so users just need to:

devtools::install_github("OHDSI/Iris")
library(Iris)
Iris::execute( database-specific-parameters )

best, M

@Vojtech_Huser, we are running the code on Stanford’s STRIDE. I will email you and cc Patrick with the results.

Thank you for your interest in IRIS.

@msuchard - Yes. It would be great to have it in a more user-friendly form.

To clarify how to share the results: it would be nice if you posted to the forum the total execution time and a note that you were able to run it (your experience). If you have ideas about new measures that you think are important (for your dataset or in general), describe those too. Also, say which CDM version you use (v4 or v5) and perhaps which SQL flavor you use (Oracle, Redshift, etc.).

I would like to compare the results. For this initial stage, I think email is fine, with the results pasted into the message body or attached as a .CSV file (it is only a few lines of data).
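
For sites that prefer the .CSV route, here is a minimal sketch of packaging the few result lines together with the CDM version and SQL flavor. The column layout, the function name and the numbers are illustrative assumptions, not a format IRIS prescribes.

```python
import csv
import io


def results_to_csv(rows, cdm_version, sql_dialect):
    """Format (measure, result) pairs as CSV text with site metadata on each row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["measure", "result", "cdm_version", "sql_dialect"])
    for measure, result in rows:
        writer.writerow([measure, result, cdm_version, sql_dialect])
    return buf.getvalue()


# Made-up counts for illustration only.
rows = [("G1", 1000), ("D5", 12)]
csv_text = results_to_csv(rows, "v5", "sql server")
print(csv_text)
```

The resulting text can be pasted into an email body or saved as an attachment.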

For future improvements of IRIS, I think we will focus on v5, also go after EHR-ish data, and highlight the measurements, observations and notes tables.

Here are the results of comparing 17 datasets using IRIS. Datasets and sites are masked with just datasetID to encourage site participation.

(table below shows counts in thousands)

This is a great tool and wonderful resource. Simple and elegant but extremely informative. Thanks Vojtech!!

t