
2020 OMOPed MIMIC project

(Vojtech Huser) #1

This thread is to be used to document work on converting MIMIC demo (and full) data to OMOP.

The prior ETL (https://github.com/MIT-LCP/mimic-omop) has its last commits from 2 years ago and will be refreshed. The Physionet v1.0 release will be based on that code. If we improve the mapping, Physionet release 2 will contain those additions. For now, we will still use that GitHub repo as the one and only repo for the ETL.


OMOPed MIMIC (Argos) Google drive folder link is: https://drive.google.com/open?id=1j-x-rwuYJr2nIs5zxCW6ST_Q-vPc1tfN

Github issue for the conversion: https://github.com/MIT-LCP/mimic-omop/issues/52

This forum thread will be the primary way to communicate updates.

Argos is the dog of Odysseus. It is not an acronym. We just need a name for the project.

(Vojtech Huser) #2

May 20 update: 2 additional researchers from Stanford joined the team (see the central notes in the Google Drive folder). Nicolas Paris plans to run the extract for the demo data and help that way with release 1.0 on Physionet. Thank you to the folks who volunteered to be the primary person for an OMOP table. The MIT Physionet team provided guidance. If you want to volunteer as technical lead, let me know; I will assume that role until we have a volunteer.

We will focus on MIMIC demo data first. We will author release notes on Google Drive (central notes).

The team has 10+ members and growing. N3C even wants to help. (update pending)

(Jose Posada) #3


We started the work on the procedure_occurrence table. The code is here

This is a first attempt and it is open for comments or suggestions. Let’s keep the ball rolling.

(Vojtech Huser) #4

June 3 update:

  • BQ updates
  • some steps done towards a funding proposal (Andrew W. can provide best update)
  • At the folder link, there is a spreadsheet of tasks that we are trying to prioritize. Please add your vote to the tasks you see as important, or add new tasks (file "central spreadsheet" in the Google folder here: https://drive.google.com/open?id=1j-x-rwuYJr2nIs5zxCW6ST_Q-vPc1tfN)
  • I did some work on loading the demo OMOPed data into a SQLite file and running some OHDSI tools on the resulting file

Tasks look like this:

Mapping coverage from prior presentation:

(Juan M. Banda) #5


I still see this effort as a bit disparate and all over the place. Should we all do the same tasks and compare notes? Should we focus on the tables we signed up to do? Are there any deadlines for deliverables? Do we need to use BQ or not (I see that Jose did, but not you)? While I am very willing to contribute and have the results of the full dataset mapped to OMOP, I don’t see structure as to when and how things should be done and what deliverables are expected.

I think we should have a call (and I strongly dislike calls :slight_smile: ) to get all on the same page and structure this effort a bit more, unless I completely missed the point or some parts when this was already discussed.


(Andrew Williams) #6

I’m sending an email out to all those involved shortly to explain the funding possibility. If successful, it would help address some of Juan’s points.

(Michael Kallfelz) #7

The Odysseus team has done some analysis using the existing project on the demo dataset. I added the results to the Central Notes document.

(Vojtech Huser) #8

Should we all do the same tasks and compare notes?

The task overview presented above is one way to collectively decide the most important tasks. Please vote on those and add tasks you see missing.

Should we focus on the tables we signed up to do?

Dividing the work by table was one way to split it up (besides tasks). My goal with the table “stewardship/babysitting” was to let folks think about the quality of the v1 mapping (2 years old) and suggest gaps. The posting by Michael Kallfelz presented a good (but brief) overview for all tables.

Are there any deadlines for deliverables?

No, because some folks are waiting for funds to cover their effort. Others are contributing some time (and not waiting). One deadline is to release v1 of the OMOPed demo data by July 30th or sooner, in a shape that MIT will approve, with release notes and their mandatory fields (see central notes for draft). Nicolas promised to create the CSV files for that. I may have to look for another source since he may be busy.

Do we need to use BQ or not (I see that Jose did, but not you)?

We may use BQ, but only for v2. I think a good plan is to make the mapping platform independent so that implementation in many flavors is possible. BQ has some advantages over the current postgres. What do you think, Juan?

Here is the script to load the data into SQLite and later run OHDSI tools.
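For readers without access to the linked script, the loading step can be sketched roughly as follows. This is not the script referenced above; it is a minimal illustration in Python that assumes the OMOPed demo data is available as one CSV file per OMOP table, each with a header row. The function name `load_csv_dir` and the file paths are hypothetical, and all columns are created without declared types, so a real load would want proper CDM DDL first.

```python
import csv
import sqlite3
from pathlib import Path


def load_csv_dir(csv_dir, db_path):
    """Load every *.csv in csv_dir into a same-named table in a SQLite file.

    Hypothetical helper: assumes one CSV per OMOP table (e.g. person.csv)
    with a header row. Returns a dict of table name -> rows loaded.
    """
    counts = {}
    conn = sqlite3.connect(db_path)
    try:
        for csv_path in sorted(Path(csv_dir).glob("*.csv")):
            table = csv_path.stem.lower()
            with open(csv_path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                header = next(reader, None)
                if not header:
                    continue  # skip files with no header row
                cols = ", ".join(f'"{c}"' for c in header)
                marks = ", ".join("?" for _ in header)
                # Recreate the table so the load can be re-run safely.
                conn.execute(f'DROP TABLE IF EXISTS "{table}"')
                conn.execute(f'CREATE TABLE "{table}" ({cols})')
                conn.executemany(
                    f'INSERT INTO "{table}" VALUES ({marks})',
                    (row for row in reader if row),  # ignore blank lines
                )
            counts[table] = conn.execute(
                f'SELECT COUNT(*) FROM "{table}"'
            ).fetchone()[0]
        conn.commit()
    finally:
        conn.close()
    return counts
```

A call such as `load_csv_dir("omop_demo_csv", "mimic_demo_omop.sqlite")` (both paths are placeholders) would produce a single SQLite file that other tools can then be pointed at.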

Thank you to all who responded on this forum publicly! This helps the momentum. We need more posts with questions like the ones you posted. Regarding a meeting, I am fine with one. @parisni, we probably need a time that accommodates the EU time zone. To prepare for the meeting, can everyone vote and add the tasks they see as important? (spreadsheet file on Google Drive, same link as always)

(Juan M. Banda) #9

Thank you! This clears up a lot of things in one single place. I do agree that it should be platform independent, but it will be harder to normalize across SQLite, Postgres (what I am using), BQ, and others if we let this be a loose requirement at first. However, I see how this would be the quickest way to get something out by the end of July.

Will go and vote for the tasks now.

(Vojtech Huser) #10

The funding for the project was approved (thanks to Andrew Williams). The kick-off happened last week and this week was the first regular meeting. MIMIC-IV will be the input source we will be converting (not MIMIC-III, yay!). MIMIC-IV will probably be released this week by the Physionet team.

(Vojtech Huser) #11

MIMIC-IV was released on Aug 14, 2020. See https://mimic-iv.mit.edu/docs/datasets/ (a demo version will also be released soon, per this week's OHDSI MIMIC meeting).

(Juan M. Banda) #12

There is an OHDSI MIMIC WG? :slight_smile: Can I be added to the meeting mailing list? I don’t see it on the main page (https://www.ohdsi.org/web/wiki/doku.php?id=projects:overview). Pardon my ignorance on this :slight_smile:

(Vojtech Huser) #13

This is a public reply to a U of Florida researcher: the latest info can be obtained here: https://github.com/OHDSI/MIMIC/wiki/Meeting-minutes#monday-august-31-2020---weekly-requirements-meeting