
New working group: Clinical Trials

All - the first meeting of the (revived) Clinical Trials WG took place on Friday, with 5+ attendees. A log of the meeting will be posted to the wiki, and future meetings will be recorded, as is the protocol. We are still working out what we will do and cover. To be added to the invite, please email me your direct email address…

Brief overview of proposals from our presentation today:

  • Trial visit (Vocabulary extension)
    Add visit concepts specific for trials, like Baseline, Screening, Follow-up, Cycle #
  • Study (CDM extension)
    Table with study and arm metadata and assignment of persons to a study/arm.
  • Measurement modifiers (CDM extension)
    A table, like the fact_relationship table, that connects a fact (e.g. a condition_occurrence_id) to a concept_id. This allows for a flexible way of connecting more attributes to a record. This proposal is based on the modifier domain in the tranSMART data model.
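
As a rough illustration of the measurement-modifier idea above (not part of the proposal itself), here is a minimal Python sketch of a table that links a fact to a concept_id. The class name `FactModifier`, its column names, and all ID values are assumptions made for the example; the actual table design was still open at this point.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FactModifier:
    """One row of a hypothetical modifier table (column names are assumptions)."""
    domain_concept_id: int    # identifies the table the fact lives in
    fact_id: int              # primary key of the fact, e.g. a condition_occurrence_id
    modifier_concept_id: int  # concept attached to the fact as an extra attribute


def modifiers_for(rows: Iterable[FactModifier],
                  domain_concept_id: int,
                  fact_id: int) -> List[int]:
    """Return the modifier concept_ids attached to one fact, in input order."""
    return [
        r.modifier_concept_id
        for r in rows
        if r.domain_concept_id == domain_concept_id and r.fact_id == fact_id
    ]


# Placeholder IDs for illustration only; they are not real OMOP concept_ids.
CONDITION_DOMAIN = 1
rows = [
    FactModifier(CONDITION_DOMAIN, 1001, 900001),
    FactModifier(CONDITION_DOMAIN, 1001, 900002),
    FactModifier(CONDITION_DOMAIN, 1002, 900003),
]
print(modifiers_for(rows, CONDITION_DOMAIN, 1001))
```

The point of the shape is that any number of attributes can be attached to a record without adding columns to the fact table itself.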

So sorry I missed this! Was this recorded?

Hi Shawn, sorry to say that we have no recording. We will share notes and the presentation asap.


Hi Shawn. Is this working group still active? I’m interested in joining.

Our WG hasn’t met in a while. Here are some relevant developments in this space and related spaces we should keep abreast of along with a couple of suggestions we might ponder until Shawn resumes calls. Hopefully that will be soon!

The NIH Collaboratory is promoting clinical trial data sharing to meet evolving journal and research funding requirements and to increase the value of trial data through reuse. Our WG might consider reaching out to them to help meet the needs they are seeking to address.

They are educating researchers and research organizations about data sharing requirements and data sharing plans, and demoing solutions that help meet related needs. The NIH Data Commons is a prominent example of research sponsors’ aspirations to leverage research data in ways that foster reuse. My sense is that, after years of faltering attempts to promote data sharing, there is now enough momentum, technical maturity, and calibrated understanding of researchers’ incentives to tip the scales toward widespread adoption over the next couple of years if things go well.

An SDTM-to-OMOP ETL solution produced by the OHDSI CT WG that allows semantic interoperability of shared trial data and use of OHDSI tooling on data aggregated across trials would be an enormous benefit to this effort. It would also get more researchers involved in using OMOP and OHDSI tools. At least some of those researchers would then be likely to join OHDSI more fully and expand work in this space.

One solution featured in the NIH Collaboratory’s Sept. 27, 2019 videoconference on trial data sharing was Vivli. This and similar platforms enable researchers to share data, search for shared data, assign credit to researchers and organizations who share their data, track users of shared data, and track publications that use shared data.

Vivli uses DataCite and DOIs to assign credit and track data use. DOIs are Digital Object Identifiers. A DOI is an alphanumeric string assigned to uniquely identify an object. It is tied to a metadata description of the object and to a URL for accessing important details about the object. DOIs can be generated for data sets, software, images, and other research materials.
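
To make the identifier-to-URL link concrete, here is a small Python sketch that splits a DOI into its prefix and suffix and builds the standard `https://doi.org/` resolver URL. The helper names and the example DOI are hypothetical, and this does not call DataCite or Vivli; it only relies on the general DOI convention that the prefix starts with `10.` and is separated from the suffix by the first slash.

```python
from typing import NamedTuple


class Doi(NamedTuple):
    prefix: str  # registrant part, always starts with "10."
    suffix: str  # object-specific part chosen by the registrant


def parse_doi(doi: str) -> Doi:
    """Split a DOI at its first slash; raise ValueError if it is malformed."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return Doi(prefix, suffix)


def resolver_url(doi: str) -> str:
    """Build the resolver URL for a DOI (no URL-encoding of special characters)."""
    parsed = parse_doi(doi)
    return f"https://doi.org/{parsed.prefix}/{parsed.suffix}"


# Hypothetical DOI used purely as an example; it does not identify a real data set.
print(resolver_url("10.1234/example-dataset"))
```

A credit attribution system would store the DOI itself and treat the URL as derived, since the DOI is the persistent part.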

Though this effort is focused on trial data, these sharing, tracking, and reuse issues will be relevant to any data commons type use of the ETL and will increasingly pertain to data collected in routine care, which are the bread and butter of OHDSI research. They are worth bringing to this group’s attention, in part, because data from routine care will increasingly be integrated with trial data.

Here’s why I think that’s true and why we should be aware of the implications for our WG. In 2015, the FDA sponsored pilot projects examining integration of data capture for trials into routine care through EHRs. Last year they issued guidance based on what was learned from those pilots. Their industry guidance covered both the use of electronic source data in general and EHR data in particular in clinical investigations, i.e. in trials.

Health care organizations are adopting solutions offered by the major EHR and CTMS vendors (e.g. Cerner’s PowerTrials, Epic’s BEACON, AllScripts/Veradigm’s partnership with Microsoft, Velos/REDCap integration with EHRs, etc.) that are responsive to the FDA guidance. These solutions will integrate electronic case report forms into the EHR, allow clinicians to see research-related data in the patient’s chart, and support screening, recruitment, and other research workflows using the same systems that support routine care.

As health care organizations adopt these solutions, they will allow trial data to be collected and linked to records of the usual care received by the same patient. This will open up important new possibilities for analyses that more directly evaluate and compare data and evidence generated from trials and from routine care. Such analyses will help advance the amazing work presented by Patrick Ryan and George Hripcsak at this year’s OHDSI Symposium on trial and observational evidence for regulatory decisions.

Apart from trial data, open observational data repositories such as the MIMIC-CXR database hosted on PhysioNet are a fantastic resource that OHDSI should seek to promote and use. Obviously, these efforts are only relevant to the CT WG insofar as the group’s work is designed with respect to the needs of a data sharing solution that encompasses both trial and observational data.

Since the CT work group’s work is being driven in part by Odysseus and the Hyve, it seems relevant to explore whether that should be a design consideration. The OHDSI community would benefit greatly from a strategy for the kind of functions supported by Vivli. I don’t know whether it is the best platform for this sort of thing or if there are better alternatives or if it would be better for OHDSI to develop its own. But I think that there would be much wider study participation and more studies would get done if we were to support or participate in a centralized data sharing platform and incentivize its use through a well-designed credit attribution system. Supporting this for both observational data sets as well as for OMOPed data from trials would be ideal.

I feel especially strongly that we would benefit from a credit attribution system that uses a global persistent unique identifier like a DOI to give credit to community members. The OHDSI community is doing amazing things without many people getting much credit for their contributions. Think of what would happen if their contributions became matters of public record that they could put on CVs, show to superiors, etc.

This isn’t a central issue the CT WG is trying to address either, but it too is related enough to warrant mentioning here. Efforts to implement the ETL from trial data sets (SDTM) to OMOP are likely to be motivated by some effort to promote FAIR principles and hence will require a good credit attribution system. Attribution for the data should be done in concert with attribution for other research products since they are often handled using the same attribution strategy.

In addition to DataCite and ORCID, here are some other relevant efforts in this space worth considering if we want to include a robust strategy for using trial data set identifiers and trial data set contributors in our work:

National Center for Data to Health (CD2H) is a consortium of academic health centers in the US. Its efforts in this space include a promising solution called InvenioRDM. It is being developed:

in partnership with the European Organization for Nuclear Research (CERN), birthplace of the World Wide Web and developers of the Zenodo RDM for the European community

This work is:

driven by user needs and informed by best practices and standards, including those that help define Next Generation Repositories as a foundation for a distributed, globally-networked infrastructure for scholarly communication, discovery, and innovation.
… to build a wide range of features that can help power biomedical research and support data sharing, innovation, knowledge dissemination, and interdisciplinary collaboration.

Another CD2H effort is working on

development of a contribution role ontology (built on CRedIT through the CRedIT ontology) to support modeling of the significant ways in which the translational workforce contributes to research; better understanding of the types of research objects generated; and mining of acknowledgements section of publications to harvest existing contributor roles to serve as a data source to drive additional development

I know it’s a pain reading long posts. I packed a lot into this one because efforts like the NIH Collaboratory, CD2H, PhysioNet, and MCBK are developing robust methods and infrastructure to support open science. There are so many challenges to doing this kind of work well, I think this WG and many others in OHDSI should be aware of what’s relevant in these other networks so we can decide what we might want to contribute to and benefit from.

Here’s my bottom line suggestion for the CT WG: consider defining our motivating use case through engaging with the NIH Collaboratory or with Vivli or another widely used platform. A successful product of that engagement would ensure that the broader research community benefits from the CT WG’s solutions for munging SDTM data to OMOP so they can enjoy the meticulous effort OHDSI has given to the CDM and the inspired OHDSI tools. It might also provide funding if they are sufficiently interested. Engagement with them might simply consist of repeating the prior demonstrations by Odysseus & Medidata and the Hyve.

If Odysseus or the Hyve or another group plans to develop an OHDSI data sharing platform we might just investigate what’s being done by others in this space to learn what’s needed for more OHDSI-specific purposes.


Please add me to the group

Dear Group,
I won’t be able to keep moderating the Clinical Trials working group. Apologies for no September meeting. Some recommendations, thoughts, and next steps are below:

  • I think the group might really be 2 groups. The first would cover the tech around clinical trial data import/conversions, as this seems to be the interest of Vojtech, Greg, Kees/Hyve, the current Gates interest, and maybe others. The second group would be those interested in the #1 voted use case to focus on, which was “Add many completed clinical trials and combine them into a single ‘synthetic’ trial in a common model to perform an in silico clinical trial”

  • Andrew just wrote a long email with different recommendations. Group should consider those as alternate recommendations.

  • There is good momentum within the group imho and certainly a steady set of attendees. And while I recall some reported impatience, I feel proud of what we got off the ground: our process toward getting the catalog of 18 use cases, the presentations we had, and recruiting CDISC folks to join. There seemed to be some real passion.

  • Sonia Araujo has stepped forward and volunteered to lead the group. She will be setting up a con call for other folks who want to volunteer to lead to put their hat in the ring so the majority of interested parties could choose.

I appreciate your support.
Shawn Dolley

202-460-4660
The Shawn Dolley Company LLC
Consultant to Bill & Melinda Gates Foundation
Washington DC
DACS for Clinical Trials
Design, Analyze, Communicate, Sustain


Thanks @shawndolley for having initiated and led this group. And great that @sonia is stepping up to continue this!

At The Hyve we are still very eager to contribute, and to take a role in creating conventions around the CT to OMOP conversion work, which you referred to.


Can you share the details of the WG? I would like to be a part of the group as well.

Thanks
Ajinkya

Hi, I am also interested in joining. I have been working on clinical trial to OMOP mappings, so I am really interested in the WG discussions.

Dear ClinTrials WG members, you should have received an email from me today with notes from our meeting on 11 October, to resume group activity. Great call, and plans made for the WG focus. We will have 2 or 3 calls in 2019 to get things restarted (invites will follow soon), and our fortnightly calls will resume from January 2020. If you didn’t get that email, it means I don’t have you on my WG members list. Please email me at sonia.araujo@iqvia.com so I may add you to that list. Thanks, Sonia.

@EmmaVos, @Ajinkya_Patale, @jliddil1, @tom_snelling - please email me at sonia.araujo@iqvia.com with your name and email address so I may add you to the WG’s distribution list. Thanks, Sonia

Great, I am also interested in this group.

I’d like to join as well.

Forum response to the proposed agenda:

- Actions from last meeting, incl conclusion on group approach for SDTM->OMOP conversion
- Refine use case for testing this approach
- Discuss feasibility / relevance of using clinical trial data from Gates Foundation as test bed to showcase / prove our SDTM->OMOP conversion approach
- Discuss any other clinical trial data sources for this purpose

Possible sources are listed below (each repository offers several studies, not just the one example given):

https://projectdatasphere.org/projectdatasphere/html/content/310
https://datashare.nida.nih.gov/study/nida-ctn-0056

The NIDA request approval process is instant.

Mike Gurley drew the Oncology WG’s attention to the ICAREdata project. Its relevance to this WG probably has less to do with the SDTM-to-OMOP ETL work than with the possible uses of ETLed RCT data we might be interested in down the road, similar to some of the things in my long post above. But if those working on the ETLs aren’t familiar with it, they may want to check out the GitHub repo of the outfit supporting the project (the Standard Health Record Collaborative) to see if there’s useful code there.


On our last call the excellent work on ETLs from SDTM drew attention to the need for a standard vocabulary for biomarkers in OMOP. Of the gaps in the CDM that need to be filled to do these ETLs, standardizing the representation of biomarkers stands out to me as the most important and the one with the greatest benefit to analyses of both trial and observational data. By comparison, representing trial arms and drugs not yet in RxNorm seems less challenging to accommodate and less likely to benefit other areas of OHDSI.

This recent fine paper that Patrick co-authored has a very helpful breakdown of the current impediments to trial replication due to the absence of data in EHR and claims sources. It adds to our understanding of the types of trial data that are potentially available in EHRs but cannot yet be represented in a standard way. In other words, it suggests types of concepts and concept relationships that are common to trials and EHRs and that might be mappable with a minimal extension to the CDM.

Among these, I think biomarkers will help to maximize the targets in OMOP that can be mapped to from trial data in SDTM.

The idea of a biomarker vocab is a bit different from the other domains in the CDM because it is as much about the relationships between concepts as it is about the coverage of the concepts in the domain. I suggest we consider the use of the Human Phenotype Ontology (HPO) for this. The HPO is the object of a very large and very mature biocuration process annotating relationships between concepts based on research evidence, it is already widely used by many researchers, and it has established linkages with standard OMOP vocabs that can function as biomarkers.

This paper describes recent work annotating LOINC concepts for lab results with HPO terms. Similar work is underway for radiologic results as represented in RadLex, which has been proposed by Chan and Kwangsoo for their Radiology CDM extension. Most obviously, the HPO has a strong connection to genomic data, in which it is rooted, and would be an important complement to the oncology extension of the CDM.

Juan has already done extensive work annotating standard OMOP vocabs with the HPO. So there is much to build on already and the fit with standard vocabs is good. There is also a natural relationship between the process of biocuration and the relationships that the HPO encodes: the evidence for determining whether a relationship holds comes from trials. A virtuous circle that assists in the extension of the HPO’s biocuration activities could be arranged, driven by the same researchers and organizations who want to use it for ETLing their trial data.

Adding the HPO to the OMOP CDM including its relationships to standard OMOP concepts would add new possibilities for phenotyping and for relating clinical data to knowledge bases used in life sciences. Both of those impacts are potentially large and worthwhile. Perhaps the biggest impact would be a significant extension of the community’s ability to identify valid clinical endpoints in analyses and predictive models.

I would be happy to reach out to Peter Robinson who is a leader of HPO activities and related algorithm development, to explore this idea.

I am eager to know whether others, particularly those in the trials WG, might be interested in this. This is work I think has a good chance of receiving external funding support because of its broad impact and the central role the HPO plays in many national and international research support efforts involving ontologies and knowledge bases.

For COVID19 studies (including observational studies and registries), I propose that this workgroup take the lead in guiding current PI teams to standardize their data directly into the OMOP CDM (and not SDTM), i.e. to move away from native->SDTM->OMOP and go native->OMOP instead.

This is separate from guidance for EHR data. I mean advanced CRF data (in addition to all the EHR guidance we will now have after the studyathon).

Good morning! I am Qin Ryan, a new member of OHDSI, introduced by Dr. Ana Szarfman. I am a hematologist/oncologist who works at FDA reviewing the efficacy and safety of new therapies, with experience in claims data analysis. I am also a cell and molecular biologist. It is my honor to join your team to work on COVID-19 data. Presently, I am still trying to navigate OHDSI but would love to contribute.
