Commercial datasets available in the form of flat files

(Gyeol Song) #1

Hi all,

I was looking through the 2019 Data Network and noticed that a lot of data holders own the same database.

Does anyone know which of these databases are available in the form of flat files?

For example, I see there are 5 collaborators who hold the CPRD database in the OMOP CDM format, but when I looked through the CPRD website, it looked like they offer their data through a web-based interface only. How did these 5 data holders manage to convert the CPRD database into the OMOP CDM format without getting their hands on flat files?

Thanks in advance.


(Clair Blacketer) #2

Hi Song,

CPRD has two different data assets that they license. The one you are referring to, which is only accessible through a web interface, is the Aurum asset. This is their newest offering, and they do not license it as flat files. The asset that we have converted to the CDM is known as CPRD GOLD. In the past they provided flat-file copies of it, which facilitated conversion.


(Gyeol Song) #3

Hi @clairblacketer,

Thanks for your feedback.

When you say “in the past”, can I assume that they no longer give out flat copies of CPRD GOLD either?

Thanks again.


(Christian Reich) #4

Most databases are not available at all. You have to define a study and then have the owner execute it on the data and share the result. Some data are commercially available. CPRD is one; IBM, IQVIA and Optum are the biggest data vendors from which you can obtain large-scale aggregated data.

(Gyeol Song) #5

Hi @Christian_Reich, thanks for your feedback.

I’m totally aware of this. I think my question might have been misleading. I was not referring to the individual databases that have already undergone conversion but rather the source databases like the ones you’ve mentioned.

My question is more focused on which of the source databases are available in the form of flat files, in case we might want to do a conversion into the OMOP CDM.

(Clair Blacketer) #6

Hi Song,

I am not sure if they still give out copies of CPRD GOLD. It may be something you can negotiate in your contract should you choose to purchase the data.


(Gyeol Song) #7

Hi @clairblacketer,

Thanks for your reply. I’ll discuss this with CPRD and see if they still have the same policy.

A little late but Happy New Year to you and your family!

Best regards,


(Mark Danese) #8

My understanding is the same as Clair’s. The data you get is formatted the same way, so the ETL process is unchanged. It just applies to your subset of the data.