
Harmonizing Biosignals with OMOP CDM (especially ECG)

(Jin choi) #1

Hi. I am Jin Choi from Ajou University Hospital, South Korea. I originally studied biosignals and recently joined OHDSI.

We want to integrate biosignals into the OMOP-CDM database. Are there any researchers who have biosignals and want to store them in a standardized structure? Let’s collaborate.

Our hospital has 1.72 million cases (750,000 patients) of standard 12-lead ECG in XML format (from a GE device, including 500 Hz waveform data).
We also have ECG, pulse, and respiratory rate waveform data for 30,000 patients (from ICU patient monitoring devices).
In addition, we have an OMOPed EMR (CDM v5.3) that can be matched to the above biosignals.

In my opinion, the possible integration methods are:
1. Extract only clinical features from ECG (QT interval, ST elevation, P axis, etc.) and insert them into the measurement or observation table. (Simplest)
2. Put the ECG records into the Measurement table and put the original ECG files’ path in the Measurement_source_value column. Then, we can extract waveform data using this path.
3. Produce a separate CDM extension model for biosignal (CDM for waveform biosignal data)
4. Any method you suggest
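
Methods 1 and 2 above can be combined, and a minimal sketch may make the idea concrete. It assumes the clinical features have already been parsed out of the ECG file. `MeasurementRow` and `ecg_features_to_rows` are hypothetical names used only for illustration, the row carries only a subset of the Measurement columns, and concept ID 0 is the usual OMOP placeholder for a value that has not been mapped yet.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MeasurementRow:
    """Subset of OMOP Measurement columns used in this sketch."""
    person_id: int
    measurement_concept_id: int
    measurement_date: date
    value_as_number: float
    unit_source_value: str
    measurement_source_value: str  # here: path of the original ECG file


def ecg_features_to_rows(person_id, recorded_on, features, ecg_path, concept_ids):
    """Turn extracted ECG features into one Measurement row per feature.

    features:    {feature_name: (value, unit)}, already parsed from the ECG file
    concept_ids: {feature_name: concept_id}; features missing here get 0 (unmapped)
    """
    rows = []
    for name, (value, unit) in features.items():
        rows.append(
            MeasurementRow(
                person_id=person_id,
                measurement_concept_id=concept_ids.get(name, 0),
                measurement_date=recorded_on,
                value_as_number=float(value),
                unit_source_value=unit,
                measurement_source_value=ecg_path,
            )
        )
    return rows
```

One caveat with method 2: in CDM v5.3 the measurement_source_value column is a 50-character string, so a short file identifier may fit better than a full path.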

If you are already doing research using biosignals, we are open to any proposal (external validation, etc.).

Also, below is a list of the clinical features in our ECG data and their mappings.

If you have any advice on our ECG mapping, please tell me.

(Seng Chan You) #2

Great suggestion, @Jinchoi

I agree that features from ECG can be stored in the Measurement table as you suggested.
HL7 aECG seems promising for further standardization of ECG itself.

Storing the path for original data (including DICOM files and ECG) in ‘Measurement_source_value’ column sounds clever.

I hope OHDSI develops a convention for ECG analysis in our network soon.

(lychenus) #3

Just to add, if you allow ECG, then there is a lot of data from sleep medicine that is feasible too. But it includes ECG, EEG, EMG, positionals, and all kinds of mess. And I don’t think anyone can summarize it simply with a table.

But for the moment let’s leave sleep a dream here. For the record, I come from sleep medicine.

(Manlik Kwong) #4

We are now completing a demonstration project that converts MIMIC-IV ICU data to OMOP. This project includes integrating ICU monitor continuous waveform data into the OMOP CDM. While the CDM will not contain the continuous waveform data, the derived measurements (e.g., ECG II, III, V, MCL1) will use the “source_value” to provide the link back to the signal file, as you are also doing. The link therefore retains the original MIMIC-IV waveform source file identifier. At Tufts Medical Center we do the same for all our 12-lead ECG records, using the “source_value” to retain the filename automatically assigned by the ECG Management System/Electrocardiograph.

So the later observations/conditions (from the interpretation statements) and measurements (V4 R-amp, V1 STJ, etc) all retain the source filename in the “source_value” field.

This assumes users will run initial cohort discovery on derived measurements and conditions. If you are then developing new signal processing algorithms, use the “source_value” to pull the waveform data for that stage of the project. It’s a trade-off that avoids bloating the CDM.
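
As a rough sketch of that two-stage workflow (cohort discovery first, waveform retrieval second), the query below runs against a toy in-memory table. The table holds only a few Measurement columns, and the concept IDs 9001 and 9002, the file names, and both function names are made up for illustration; none of this is the actual project code.

```python
import sqlite3


def waveform_files_for_cohort(conn, concept_id, min_value):
    """Stage 1: select on a derived measurement, return the linked source files."""
    cur = conn.execute(
        "SELECT DISTINCT measurement_source_value FROM measurement "
        "WHERE measurement_concept_id = ? AND value_as_number >= ? "
        "ORDER BY measurement_source_value",
        (concept_id, min_value),
    )
    return [row[0] for row in cur.fetchall()]


def build_demo_db():
    """Create a toy measurement table; all IDs and file names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurement ("
        "person_id INTEGER, measurement_concept_id INTEGER, "
        "value_as_number REAL, measurement_source_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?)",
        [
            (1, 9001, 480.0, "wave_a.dat"),
            (2, 9001, 410.0, "wave_b.dat"),
            (3, 9001, 505.0, "wave_c.dat"),
            (3, 9002, 505.0, "wave_c.dat"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    # Stage 2 would open each returned file with a waveform reader.
    print(waveform_files_for_cohort(conn, 9001, 450))
```

Only the file identifiers leave the CDM in this design; the signal processing step reads the waveform files directly, so the database stays small.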

(Manlik Kwong) #5

Hi, I have an OMOP CDM mapping for the GE 12SL electrocardiograph. Send me a message at mkwong@tuftsmedicalcenter.org if you want a copy of the ETL map.

(Daniel C. Phelps) #6

Hi Manlik - I was just thinking about doing that MIMIC-IV to OMOP conversion so I’m thrilled that you are already undertaking it. I’d love to see what you have when you’re done.

(Manlik Kwong) #7

This work is part of the N3C project led by Andrew Williams (Tufts Medical Center), who is a well-known and active member of the OHDSI community, along with the great folks on the Odysseus Data Science team and PhysioNet. Stay tuned - more information about the project and methods is coming soon.

(Jin choi) #8


In fact, I don’t know much about sleep medicine, and we don’t have any data about it.

However, we think that biosignals in sleep medicine can be standardized in the manner discussed above.

Anyway, I recently heard the news that Seoul National University in Korea is opening an AI challenge on sleep medicine. If you’re interested, I think you can send an email and ask about data sharing.


(Jin choi) #9

Thanks for your suggestion. I agree with using HL7 aECG for ECG standardization. I have heard of the recent cooperation between OHDSI and HL7. Our ECGs are currently in MUSE XML format, so we will look for a way to convert them to HL7 aECG and share it with OHDSI.

Thank you for sharing the ETL map for the GE 12SL electrocardiograph.
I will send you an e-mail.

Currently, we only have features that are automatically extracted by GE devices. You seem to be extracting new features from the MIMIC waveform. Could you tell us how you extract new features from the waveform itself?

And when will the MIMIC waveform database integration method be announced? Once you announce it, we will try to transform our ICU biosignal data to be as similar as possible. Once that work is done, we will be able to have a multicenter ICU biosignal research network.


I am glad to hear that you are also interested in the ICU data. Do you work in a hospital? Are there any biosignals in the hospital that could be transformed?
Or do you have any studies (clinical, machine learning, engineering, etc.) to suggest? Our hospital is open to any suggestion.

Anyway, our first target is to extract features from the 12-lead ECG and combine them with the OMOPed EMR database. As soon as we are done, we will share our work with OHDSI.

P.S. Does anyone have an ECG XML file from a vendor other than GE? I understand that the XML format is different for each device vendor. In order to discuss data standardization, it is likely that the formats of multiple vendors need to be shared.
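
To illustrate why sharing several vendors' formats would help, here is a small sketch of a per-vendor tag map that normalizes differently named XML elements into common feature names. The vendor keys, element paths, and feature names are all invented for this example; they are not the real GE MUSE schema or any other vendor's.

```python
import xml.etree.ElementTree as ET

# Hypothetical element paths per vendor, relative to the XML root element.
VENDOR_TAG_MAPS = {
    "vendor_a": {
        "qt_interval_ms": "Measurements/QTInterval",
        "ventricular_rate_bpm": "Measurements/VentricularRate",
    },
    "vendor_b": {
        "qt_interval_ms": "results/qt",
        "ventricular_rate_bpm": "results/heartRate",
    },
}


def extract_features(xml_text, vendor):
    """Return {common_feature_name: value} for one ECG XML document.

    Features whose element is missing or empty are left out of the result.
    """
    root = ET.fromstring(xml_text)
    features = {}
    for name, path in VENDOR_TAG_MAPS[vendor].items():
        node = root.find(path)
        if node is not None and node.text and node.text.strip():
            features[name] = float(node.text)
    return features
```

With real files from two or more vendors, each map could be filled in from the actual schemas, and the downstream OMOP mapping would only need to know the common feature names.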

Jin Choi