Steering committee: Proposal to Standardize Network Studies

Continuing the private conversation that came out of the Steering Committee meeting on the forums:

@Christian_Reich, @gregk, I would say:

  1. Creating a clinicaltrials.gov, but for OHDSI studies
  2. Disseminating study code

Let’s not have the “can’t have R packages” discussion again. See here and here.

ARACHNE: I *think* ARACHNE might be useful for OHDSI, but I have yet to see it in action. If we are serious about implementing ARACHNE in the OHDSI network, I recommend:

  1. Creating documentation on how to install ARACHNE (or is it already somewhere? Google can’t find it)
  2. Creating documentation on how to use ARACHNE
  3. Using ARACHNE in some pilot OHDSI network studies
  4. Learning from these experiences to find out real-world requirements for ARACHNE, and making it really good

As Kristin mentioned, I would mostly be interested in getting some minimum meta-data in place to help make some sense of what is in which repo. I moved one of my study repos into ohdsi-studies, and have added some example minimum meta-data elements to the README.

Well, but then we need to actually conclude and come up with a solution that takes care of the various issues. I understand not all studies can be pre-canned, but you have to understand that the opposite is not feasible either: all studies being R packages with more or less unknown (to the security people) content. We need to have the standard studies pre-canned and JSON-configured, and we need a way to do something that breaks that mold. In that case, the institution can implement the former, and have some tight governance oversight bureaucracy for the latter. Makes sense?

I don’t understand the proposal here, mostly because of unfamiliarity with the technology.

Agreed.

Yes, I agree - we did exchange a lot of ideas and opinions, but I have yet to see us come up with an aligned and agreed approach / solution.

But I also do think that @schuemie is right - “No R code” is just as extreme as “Everything should be in JSON and agreed upon”. There will be times when we need to be flexible and be able to exchange the code. However, I do think that we should follow an 80/20 approach, where 80% should be designed in ATLAS and have standard definitions and code output.

Yes, agreed on all. The user manual and installation guide are [here](https://github.com/OHDSI/ArachneUI/tree/develop/manual) - at least for now. I actually debated where these docs should live; I will eventually move some content to the Wiki, or at least have some “Start here” document in the Wiki.

I would love to see us (OHDSI) using ARACHNE in anger on some OHDSI study (or studies) and then learn from that experience. Any ideas what study - or multiple studies - we could use?

@Christian_Reich: ok, sure. I’ve added your wish as item 58 on the Neverending List of Fun Things To Do. Perhaps you can find a skilled R programmer who can do this? I’d be happy to advise.

@gregk: thanks for putting the documentation online. Focusing just on the installation instructions, I have some further recommendations that you can take or leave:

  1. As per @Rijnbeek’s comments, I would add some introduction, and just some context for people (like me) that aren’t completely familiar with the ins and outs of ARACHNE. For example: what is an ARACHNE Portal? An ARACHNE Datanode?
  2. I recommend having an HTML version as well. I would not recommend the Wiki, since I think we’re slowly moving away from that as a community. One example to consider is the Methods Library documentation.
  3. Currently the documentation assumes some Linux variant. You’d have many more people willing to try it (like me) if you could also do it in Windows. (On the database side you’re pushing for OHDSI to support every database platform imaginable, so you should feel the same way about deployment platforms :wink: )

I know writing documentation isn’t the most fun, but it is an essential part of any software product.

Wait. That skilled R programmer exists. He already did it. We were able to execute R code by sharing JSON configuration files between ATLAS instances, and the execution engine could take it and create results. That was a year and a half ago. Since then, somebody broke this model. Can’t remember who. :slight_smile:

@MauraBeaton: could you put the topic below on the agenda for next Steering Workgroup meeting?

@Patrick_Ryan and I have been working some more on the new ohdsi-studies organization in GitHub. We would like to propose the following:

  1. Each OHDSI study should have its own repo in this new GitHub organization. Anyone can request a study repo, and the study lead will become the repo admin, therefore having the rights to do whatever she wants within the repo. We do not restrict what people put in the repo. If they want to post only the protocol and the json, that is fine. Some people may post the study R package.

  2. We would require each repo to have a README file, and the start of this README file should conform to a standard template. An example study that adheres to this template is here.

  3. The README files would be (automatically) scanned to generate a table that will replace the current OHDSI research studies page. The table can be filtered, for example to show only EHDEN studies or only prediction studies.
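The scanning step in item 3 could look roughly like the sketch below. It assumes, purely for illustration, that the README template opens with bullet lines of the form `- Key: **Value**`; the actual template, and the field names `Study type`, `Tags` and `Study status` used here, may differ.

```python
import re

# Hypothetical template line: "- Study type: **Clinical Application**"
FIELD_PATTERN = re.compile(r"^\s*[-*]\s*([^:]+):\s*\**(.*?)\**\s*$")


def parse_readme_header(readme_text):
    """Collect 'Key: Value' bullets from the top of a README into a dict."""
    metadata = {}
    for line in readme_text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            metadata[match.group(1).strip()] = match.group(2).strip()
        elif metadata and line.strip() == "":
            # The first blank line after the bullet block ends the header.
            break
    return metadata


def filter_studies(studies, key, wanted):
    """Keep studies whose comma-separated field `key` contains `wanted`."""
    wanted = wanted.lower()
    return [
        study for study in studies
        if wanted in [v.strip().lower() for v in study.get(key, "").split(",")]
    ]


# Illustrative README content, not taken from a real study repo.
example_readme = """Example Study
=============

- Study type: **Clinical Application**
- Tags: **EHDEN, prediction**
- Study status: **Started**

Free-text description of the study follows here.
"""

example_metadata = parse_readme_header(example_readme)
ehden_only = filter_studies([example_metadata], "Tags", "EHDEN")
```

Each parsed dictionary would become one row of the studies table, and `filter_studies` corresponds to the "show only EHDEN studies" kind of filter.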

The advantages over the current way of working are:

  • Avoiding the mess that is StudyProtocolSandbox. Each study has a clear commit log and issue tracker dedicated to that study.
  • Each study will have a stable URL that can be referenced for example in papers.
  • Having the meta-data in the README file means the study lead can easily update information on the study, hopefully leading to less out-of-date-information on the OHDSI studies.

Another idea is to have a prespecified subfolder in each study repo where the results Shiny app can live. This would replace the current ShinyDeploy repo as a means to deploy study Shiny apps. However, we still have to figure out the mechanics for that.

I second this :slight_smile:

Buttttt… how do we get sites to approve a study without a protocol? :wink:

+1 on this discussion. I’m traveling this week. Will try to hop on. This is near and dear to my heart.

I meant an R package instead of a JSON file. I was just trying to say that I don’t want to restrict what is in the repo (except requiring the README file). Even having a protocol, although highly encouraged, might not be necessary, for example for methods research that is not network research.

To your point, if a repo is used to distribute R study code in a network study, and the sites require a protocol to get IRB approval, it makes sense to provide a protocol in the repo.

Github even provides a mechanism for a template. See https://help.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-template-repository

Can you please provide instructions on how to request membership in the OHDSI Studies organization (and be able to create a study repo in it)?

Also, tagging a study as an informatics study would be a nice bonus.

Thanks @Vojtech_Huser! I’ll look into the template repo, sounds exactly right.

We’d probably need to assign a few administrators that can grant membership and create study repos. Let’s first discuss this proposal in the Steering Committee.

I’m perfectly fine with adding an ‘Informatics Study’ category, but I could use help understanding how those differ from methods research. In my very simple mind, I currently distinguish between studies that generate clinical evidence to be used in clinical decision making (‘clinical application’ studies), and research aimed at developing the methodology for the clinical application studies (‘methods research’ studies). Could you provide a definition for ‘Informatics’ studies?

I see that GitHub only allows an admin of an organization to invite a member. (A would-be member cannot really request it anywhere…)

I am not sure I can define it nicely. I can only provide examples of past OHDSI studies like ConceptPrevalence, DataQuality and ThemisConcepts.

An informatics study is a study that supports improvement in the OHDSI network or the CDM specs, facilitates data standardization, or studies the research network itself (the object of study is the OHDSI network). It is a research study that advances informatics aspects within OHDSI.

After the steering committee meeting I said I’d mail @MauraBeaton with the things to review, but I figured it makes more sense to just post them here.

Please review the original proposal, and let me know if you have objections. Please make counter-proposals if you do.

Two things were already mentioned during the meeting:

  1. Not everyone agreed with the study states as defined here. Please make counter-proposals.

  2. In addition to ‘Methods Research’ and ‘Clinical Application’, @Vojtech_Huser advocated for adding an ‘Informatics Study’ category. Please provide a definition of this category that clearly distinguishes it from Methods Research.

Counter proposal: In between “Started” and “Design Finalized” should be an explicit “Feasibility” step.

Rationale: There is significant legwork between a Started project and a Design Finalized stage that is its own legitimate stage of progress. First commit is somewhat meaningless in comparison to the first time you test your Ts, Cs, or Os and begin to understand if your design actually works. This is a bit of a hole we fall into – often this is the stage we spend a considerable amount of time in. The limitation of current labels is we run the risk of underestimating the amount of community work going into studies – because many studies may linger at Started for an extended period of time without looking like they’re making progress. Of course, the commits in GitHub provide some transparency into what’s happening. It could be useful to have a tag for studies that are actively working through feasibility. This would also be a way to understand who may need community assistance – for instance, potentially using this self-reported tag as a mechanism for @SCYou’s Study Nurturing Committee to solicit new “nurturees” and help get past feasibility into a design finalized.

Examples:

  1. After migrating @BridgetWang’s Tofacitinib RA safety study to the new OHDSI Studies GitHub, it felt a bit disingenuous to tag this as “Design Finalized”. The study protocol was written and the study code’s been generated numerous times – but there are still some lingering design issues to debug before we can really say we’ve “finalized” the overall design such that we’d run it on other databases or publish the results / Shiny app.
  2. @mattspotnitz’s IUD studies are also “started” but I’d argue the design is still being massaged as they address issues identified during feasibility tests. Hint hint: @cukarthik and @mattspotnitz – a proper study repo would be really nice :wink:

I know fewer labels is generally the right approach. However, I feel strongly that if we’re using this as a way to pull metadata we should at least be giving some granularity into the abyss that exists between “Started” and “Finalized”. That’s my two cents. :moneybag:

By ‘Feasibility’, do you mean the state when the study has been determined to be feasible?

Note that the stages have deliberately been named after achieved milestones, not activities (that may or may not be taking place)

@schuemie thanks for pointing to this post.

This has my full support: it makes studies much easier to manage and removes the dependency between the studies in one repo, which was kind of dirty.

I like the template proposed.

Something to consider is whether we would like to make the conditions/drugs that have been studied searchable as well, as is done on clinicaltrials.gov. Maybe we can require each study to add a file with a list of conditions/drugs etc., since this is searchable in the search box at the top of GitHub if you select “in this organization”.

We may also consider getting some information from GitHub using the GitHub API (for example: https://developer.github.com/v3/repos/#list-organization-repositories) for a page on ohdsi.org. I think we should be more proud of all the work we are doing and have it prominent on the OHDSI website. I could imagine a webpage that shows things like:

  • Some explanation on the purpose of the repo and link to it.
  • Latest study by tag
  • Most active study repo
  • etc.
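As a rough sketch of how such a page could be fed, the code below uses the "list organization repositories" endpoint of the GitHub REST API. Several choices here are assumptions rather than part of the proposal: that the organization is `ohdsi-studies`, that study tags are stored as GitHub repository topics, and that the most recent push is an acceptable proxy for "most active". Pagination and rate limits for unauthenticated requests are ignored.

```python
import json
import urllib.request

# Assumed organization name; one page of up to 100 repos is fetched.
ORG_REPOS_URL = "https://api.github.com/orgs/ohdsi-studies/repos?per_page=100"


def fetch_org_repos(url=ORG_REPOS_URL):
    """Download one page of repository objects for the organization."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def most_recently_pushed(repos):
    """Return the repo with the latest push, a crude proxy for 'most active'."""
    # pushed_at is an ISO 8601 UTC timestamp, so string order is time order.
    return max(repos, key=lambda repo: repo["pushed_at"])


def latest_by_topic(repos):
    """Map each topic (tag) to the most recently created repo carrying it."""
    latest = {}
    for repo in repos:
        for topic in repo.get("topics", []):
            current = latest.get(topic)
            if current is None or repo["created_at"] > current["created_at"]:
                latest[topic] = repo
    return latest
```

A page generator could call `fetch_org_repos()` on a schedule and render the results of the two helper functions as the "most active study repo" and "latest study by tag" entries.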

I agree we can post our EHDEN studies here as well with a tag. We did get some requests for the possibility of private studies as well, though that may require a different solution, but let’s not worry about this yet. I suggest we encourage in Europe that, if studies are registered at ENCePP (http://www.encepp.eu), a link is added to the study repo hosted under this organisation.

Yes, this is an ongoing discussion, and we will also work on this in EHDEN in the first quarter of 2020 by actually using ARACHNE in use cases together with Odysseus.

Yes. Feasibility complete would be when the study hypothesis has achieved a test of viability for continued study. Activities at this point may include basic characterizations to ensure adequate cohort sizes and, in more sophisticated study designs, provisional testing to understand potential issues in covariate balance – the kind of things you’d want to know before you run a full PLP or PLE package.

This would be different than design finalized where the full design is completely pre-specified and published. A study can achieve “Feasibility” when an investigator has completed the legwork to create a partial pre-specified study design. At this phase, there may still be gaps in the study design that require addressing.

A study would stay in the “Feasibility” phase until the full study package is finished (e.g. all errors related to executing the package – things like highly correlated covariates or other design flaws – are resolved). The material difference between “Feasibility” and “Design Finalized” is that one is an interim work product and the other is a final work product.

Thanks for the nudge, @krfeeney :slight_smile:

I have an updated repo after working out some kinks, but I don’t know how to move it to the ohdsi-studies group. As mentioned in the study post, we have two repos. The first one, based on the Columbia study that was published, is here, and the second one, based on a different T/C for claims, is here.

I think only owners (or members with some rights) of your two repos and the organization (ohdsi-studies) can move the repos between the two.

Ok, sounds good. I added a ‘Feasibility Established’ state. But I recommend adding no more!
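Because the states are named after achieved milestones, they can be treated as an ordered scale. The sketch below is only an illustration of that idea: it includes just the three states named in this thread (the real list may contain more), and the names `StudyStatus`, `LABELS` and `has_reached` are made up for the example.

```python
from enum import IntEnum


class StudyStatus(IntEnum):
    """Milestones named in this thread, in order; the full list may be longer."""
    STARTED = 1
    FEASIBILITY_ESTABLISHED = 2
    DESIGN_FINALIZED = 3


# Maps the label a study lead writes in the README to its position in the order.
LABELS = {
    "Started": StudyStatus.STARTED,
    "Feasibility Established": StudyStatus.FEASIBILITY_ESTABLISHED,
    "Design Finalized": StudyStatus.DESIGN_FINALIZED,
}


def has_reached(status_label, milestone):
    """True if a study's self-reported status is at or past `milestone`."""
    return LABELS[status_label] >= milestone
```

With an ordering like this, a query such as "studies that have not yet established feasibility" becomes a simple comparison rather than a list of label strings.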
