
New working group: Clinical Trials


(Vojtech Huser) #21

forum response to the proposed agenda

- Actions from last meeting, incl conclusion on group approach for SDTM->OMOP conversion
- Refine use case for testing this approach
- Discuss feasibility / relevance of using clinical trial data from Gates Foundation as test bed to showcase / prove our SDTM->OMOP conversion approach
- Discuss any other clinical trial data sources for this purpose

possible sources are: (each repository offers several studies, not just one example listed below)

https://projectdatasphere.org/projectdatasphere/html/content/310
https://datashare.nida.nih.gov/study/nida-ctn-0056

The NIDA request approval process is instant.


(Andrew Williams) #22

Mike Gurley drew the Oncology WG’s attention to the ICAREdata project. Its relevance to this WG probably has less to do with the SDTM-to-OMOP ETL work than with the possible uses of ETLed RCT data we might be interested in down the road, similar to some of the things in my long post above. But if those working on the ETLs aren’t familiar with it, they may want to check out the GitHub repo of the outfit supporting the project (the Standard Health Record Collaborative) to see if there’s useful code there.


(Andrew Williams) #23

On our last call, the excellent work on ETLs from SDTM drew attention to the need for a standard vocabulary for biomarkers in OMOP. Of the gaps in the CDM that need to be filled to do these ETLs, standardizing the representation of biomarkers stands out to me as the most important and the one with the greatest benefit to analyses of both trial and observational data. That is, representing trial arms and drugs not yet in RxNorm seems less challenging to accommodate and less likely to benefit other areas of OHDSI.

This recent fine paper that Patrick co-authored has a very helpful breakdown of the current impediments to trial replication due to the absence of data in EHR and claims sources. It adds to our understanding of the types of trial data that are potentially available in EHRs but cannot yet be represented in a standard way. In other words, it suggests types of concepts and concept relationships that are common to trials and EHRs and that might be mappable with a minimal extension to the CDM.

Among these, I think biomarkers will help to maximize the targets in OMOP that can be mapped to from trial data in SDTM.

The idea of a biomarker vocab is a bit different from the other domains in the CDM because it is as much about the relationships between concepts as it is about the coverage of the concepts in the domain. I suggest we consider the use of the Human Phenotype Ontology (HPO) for this. The HPO is the object of a very large and very mature biocuration process annotating relationships between concepts based on research evidence, it is already widely used by many researchers, and it has established linkages with standard OMOP vocabs that can function as biomarkers.

This paper describes recent work annotating LOINC concepts for lab results with HPO terms. Similar work is underway for radiologic results as represented in RadLex, which has been proposed by Chan and Kwangsoo for their Radiology CDM extension. Most obviously, the HPO has a strong connection to genomic data, in which it is rooted, and would be an important complement to the oncology extension of the CDM.
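To make the LOINC-to-HPO idea concrete, here is a minimal sketch of how an interpreted lab result could be looked up as an HPO term. Everything in it is an assumption for illustration: the `LOINC_TO_HPO` table, the `interpret` and `lab_to_hpo` helpers, and the reference range are hypothetical, and the two potassium entries are illustrative codes that should be checked against the published annotation files before any real use. It is not the method from the paper.

```python
from typing import Optional

# Hypothetical annotation table: (LOINC code, interpretation flag) -> HPO term ID.
# Illustrative entries only; verify codes against the released LOINC/HPO files.
LOINC_TO_HPO = {
    ("2823-3", "H"): "HP:0002153",  # potassium in serum/plasma, high -> hyperkalemia
    ("2823-3", "L"): "HP:0002900",  # potassium in serum/plasma, low -> hypokalemia
}


def interpret(value: float, low: float, high: float) -> str:
    """Flag a numeric result against a reference range as L, N, or H."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"


def lab_to_hpo(loinc_code: str, value: float, low: float, high: float) -> Optional[str]:
    """Return the HPO term for an abnormal result, or None if nothing is annotated."""
    return LOINC_TO_HPO.get((loinc_code, interpret(value, low, high)))


if __name__ == "__main__":
    # Assumed reference range of 3.5 to 5.0 mmol/L, for illustration only.
    print(lab_to_hpo("2823-3", 6.1, 3.5, 5.0))
    print(lab_to_hpo("2823-3", 4.2, 3.5, 5.0))
```

The point of the sketch is that the mapping key is the pair of lab concept and interpretation, not the lab concept alone, which is why relationships matter as much as concept coverage here.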

Juan has already done extensive work annotating standard OMOP vocab with HPO. So there is much to build on already, and the fit with standard vocabs is good. There is also a natural relationship between the process of biocuration and the relationships that the HPO encodes: the evidence for determining whether a relationship holds comes from trials. A virtuous circle that assists in the extension of the HPO’s biocuration activities could be arranged, driven by the same researchers and organizations who want to use it for ETLing their trial data.

Adding the HPO to the OMOP CDM including its relationships to standard OMOP concepts would add new possibilities for phenotyping and for relating clinical data to knowledge bases used in life sciences. Both of those impacts are potentially large and worthwhile. Perhaps the biggest impact would be a significant extension of the community’s ability to identify valid clinical endpoints in analyses and predictive models.

I would be happy to reach out to Peter Robinson, who is a leader of HPO activities and related algorithm development, to explore this idea.

I am eager to know whether others, particularly those in the trials WG, think it might be interested in this. This is work I think has a good chance of receiving external funding support because of its broad impact and the central role the HPO plays in many national and international research support efforts involving ontologies and knowledge bases.

