
Skeletons in the ATLAS Closet

Happy New Year fellow OHDSI Collaborators!

I’m supporting an OHDSI colleague as they navigate their first PLP study. Their environment has an ATLAS installation of v2.11.1. (Reader’s note: this is a release from July 2022 – only one release behind the most current release, v2.12, issued in October 2022. Do not start scoffing… there’s more to this plot.)

It’s been a while since I’ve pulled a package from ATLAS and relied on the ATLAS PLP JSONs to populate a study package. My colleague’s interest in building the PLP in ATLAS is to give his co-investigators the ability to follow along with his design choices.

This brings up an old discussion, Database platform support revisited, in which @gregk touted:

We know this is true, but we’re finding that what ATLAS spits out has a lot of quirks. As of this past summer, ATLAS appears to have quite a bit of pain related to the skeletons:

Now, for those of you scoffing at using v2.11: the ATLAS v2.12 release notes do not indicate that the PLP skeletons were changed or updated:

Which is to say, unless it’s missing from the release documentation, the ATLAS version is a distraction from the underlying problem: the ATLAS PLP JSONs are “very old” (according to @jreps’s GitHub response above).

We’re trying to navigate a path forward for this study. If the ATLAS PLP JSONs are too stale, we’ll move toward recreating the same design in our own from-scratch study package. Jenna’s comments in the GitHub tickets suggest this is how most folks currently do it. (As someone who teaches a lot of tutorials with ATLAS, this makes me quite sad. :sob: )

I thought I’d put myself out there and publicly start a discussion on two points:

  1. Is there anything coming up in 2023 in the ATLAS workgroup (@anthonysena @Chris_Knoll et al) to tackle the JSON compatibility issue? (I know there have always been issues with skeletons, but I remember days when that was primarily a PLE problem, not a PLP problem.)
  2. Are there others out there struggling with this? It looks like we’re not the first to get stuck here. If so, it’d be great to curate some cheats from other study designers on how they’re working through this. (@TheCedarPrince and I have talked about being better about sharing code snippets and other lessons learned from running our own studies.) I would imagine my ERG friends would like these tips too.

To anyone else who’s been too afraid to say these things: you’re welcome. It’s out there now. :slight_smile:


I heard from a little bird who uses MS Teams in lieu of a forum reply that there are learnings from Strategus that may help make this easier…

I was given this link and told to try this approach: Instructions · OHDSI/PatientLevelPredictionModule Wiki · GitHub

I am not up to speed on this yet and am learning more. Posting here because nobody should feel bad if they don’t know these things.
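From skimming those instructions, the flow looks roughly like the sketch below. I haven’t run this yet, so treat the function arguments and file names as assumptions to verify against the wiki:

```r
# Rough sketch of the Strategus flow from the linked instructions.
# Not verified end to end; argument names and file names are
# assumptions -- check the wiki before relying on them.
library(Strategus)

# 1. Load an analysis specification that includes the
#    PatientLevelPredictionModule settings (built per the wiki).
analysisSpecifications <- ParallelLogger::loadSettingsFromJson(
  fileName = "plpAnalysisSpecifications.json"
)

# 2. Describe where and how to execute against the CDM.
executionSettings <- createCdmExecutionSettings(
  connectionDetailsReference = "myCdm",
  workDatabaseSchema = "scratch",
  cdmDatabaseSchema = "cdm",
  cohortTableNames = CohortGenerator::getCohortTableNames(cohortTable = "plp_cohorts"),
  workFolder = "strategusWork",
  resultsFolder = "strategusResults"
)

# 3. Execute all modules in the specification.
execute(
  analysisSpecifications = analysisSpecifications,
  executionSettings = executionSettings
)
```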


Hi @krfeeney,

To expand more broadly on the second point you mentioned earlier and open up the discussion even further: as an early-career researcher, principal investigator of a network study, site coordinator, and study participant (the person running someone else’s study), I have definitely encountered issues here and there.

Not exclusively with software, however, but rather with the hidden curriculum and oral tradition we share amongst ourselves. Some of the issues I have personally encountered:

  1. How do you coordinate a multi-site research study as a first-time site coordinator? Both in the personal sense (e.g. communication approaches, time management) and in the technical sense?
  2. What principles should guide whether I include or exclude a site? How can I “trust” a site’s data?
  3. How do you set expectations for data partners running the study (e.g. how much labor to expect, how the results from the study will be used, how study versions are kept in sync across sites as fixes and changes land)?
  4. How do you build “off-ramps” into a study in case a site no longer wishes to continue or has limited time?
  5. How do I contribute to packages when I run into an issue?
  6. What should I do when I cannot find a package that does the analysis I need in my study?
  7. How do I know I have constructed a “good” phenotype definition for my study?

These are just some issues off the top of my head. I don’t bring any of these talking points up just to complain or critique, but rather to open discussion around some of the ways I am personally trying to tackle them:

  • For issues such as 1, 2, and 4, in my current study on chronic mental illness and health disparities, some of my collaborators and I are beginning to draft papers, documentation, and even a new chapter for the Book of OHDSI (if you are interested in joining myself, @krfeeney , @agolozar , @DY_Lee , and others in this effort, please let me know in either a comment or a message on Teams).
  • For an issue like 3, I have adopted semantic versioning for running and versioning my study, added on-site unit tests to verify the package runs correctly at a given site, and prioritized extensive documentation (see the sketch after this list for the kind of on-site test I mean). With my collaborators, this has been hugely helpful in quickly pinpointing issues, enabling them to open (several :sweat_smile: ) actionable issues on our study repo, and making a promise to collaborators about the quality of the code they will run.

Digression: anecdotally, among the 5 or so sites that have run the first stage of my study, it takes only 30 minutes to, at worst, 3 hours of manual labor (i.e. running tests, setting the database connection, clicking through an RMarkdown notebook, instantiating renv environments, and reporting results).

  • As the herculean OHDSI Titans @Paul_Nagy and @Adam_Black know, for issues like 5 or 6, I continue to advocate for contributing guidelines to help spread the load of contributing to new packages or developing further documentation. I haven’t been able to do much here beyond opening issues as detailed as I can make them, filing the occasional docs PR, and then either testing fixes or finding an alternative workaround, such as building some of my own tools. Two things here that I want to draw attention to:

    • Objectives and Key Results are underway for the OHDSI Open Source Workgroup to think about how we can build out HADES and the OHDSI open source ecosystem as a whole! PLEASE, if you want to share your thoughts or get involved, message me on Teams or comment here, and I will advocate for your ideas at the workgroup as best I can.
    • Google Summer of Code is picking up! I have mentored GSoC students over the past couple of years in other ecosystems, and with the right mentorship they can accomplish a ton, get paid, and enrich the ecosystem. I know folks on the board of NumFOCUS and GSoC and can make connections about how OHDSI could join GSoC initiatives.
  • Personally, issue 7 is the one that terrifies me the most and has caused some long nights of personal concern. However, I heartily laud the work done by @Gowtham_Rao , @Evan_Minty , and others in the Phenotype Development Workgroup to really tackle this issue, as well as the work done by @noemie and @tonysun at Columbia. I am currently working with some collaborators on submitting phenotype definitions for chronic mental illness (such as depression, bipolar disorder, and suicidality) to the new Phenotype Library process – if you are interested in collaborating on this effort, do not hesitate to comment or DM me.
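To make the on-site unit test idea above concrete, here is a minimal sketch of the kind of check I ship with a study package (testthat plus DatabaseConnector; the connection details and schema name are hypothetical placeholders each site swaps in):

```r
# Minimal sketch of an on-site sanity test shipped with a study package.
# The connection details and schema below are hypothetical placeholders.
library(testthat)
library(DatabaseConnector)

connectionDetails <- createConnectionDetails(
  dbms = "postgresql",
  server = "localhost/ohdsi",
  user = "study_user",
  password = Sys.getenv("STUDY_DB_PASSWORD")
)

test_that("site database is reachable and the CDM has persons", {
  connection <- connect(connectionDetails)
  on.exit(disconnect(connection))
  result <- querySql(connection, "SELECT COUNT(*) AS person_count FROM cdm.person")
  expect_gt(result$PERSON_COUNT, 0)  # DatabaseConnector upper-cases column names
})
```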

Are these the best approaches? Probably not. Are there more unspoken concerns I can’t name off the top of my head? Absolutely. Are we equipped as a community to rigorously address these issues? :100:

I love the provocative nature of Kristin’s title for this post, as it is spot on. There have been growing pains across the OHDSI ecosystem – growth I personally see as a hugely fantastic thing – that I think we must bring out of the closet. I have only shared a few here, along with what I am trying to do about them.

Thoughts?


Please let me know if you have any questions or need assistance with the phenotype submission and evaluation process. There is a definition for Major Depressive Disorder here, but it has not completed the full process to be accepted (i.e. there are no accepted definitions in the OHDSI Phenotype Library for mental illness yet).

My understanding is that the R packages generated by Atlas are more of a starting point and may require some tweaking by someone familiar with R package development and the HADES tools. I could be mistaken, though, as I have seen them run unaltered using the Arachne Execution Engine.
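For flavor, the CodeToRun.R that ships with a generated skeleton looks something like the sketch below. The package name and argument names vary by skeleton version, so treat all of them as illustrative assumptions:

```r
# Illustrative only: the rough shape of the CodeToRun.R an ATLAS-generated
# PLP skeleton ships with. Package, function, and argument names vary by
# skeleton version; everything here is an assumption.
library(SkeletonPredictionStudy)  # hypothetical name of the generated package

connectionDetails <- DatabaseConnector::createConnectionDetails(
  dbms = "postgresql",
  server = "localhost/ohdsi",
  user = "study_user",
  password = Sys.getenv("DB_PASSWORD")
)

execute(
  connectionDetails = connectionDetails,
  cdmDatabaseSchema = "cdm",
  cohortDatabaseSchema = "scratch",
  cohortTable = "plp_cohorts",
  outputFolder = "./results"
)
```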

Please share your experience using Strategus with the HADES workgroup if you give it a try!

Thanks, @Adam_Black!

Yes, older releases of ATLAS produced workable packages. They needed tweaking, but they could run. What comes out now isn’t even close to a skeleton, so we’re completely rebuilding the package.

Hi @krfeeney - just wanted to follow up on a few points in this thread.

Re: Atlas v2.11.1 vs. Atlas v2.12.0: the changes between these versions included an update to the Hydra library, which is where the PLE & PLP study skeletons live. So moving to v2.12.0 will provide newer versions of these skeletons. That said, we have discovered an open issue in v2.12.0 for the PLP skeletons, noted here:

I’ve put in a proposed fix, which includes updating this skeleton as mentioned in that issue, for anyone who wants to take a look; it is under review with @Konstantin_Yaroshove and team. The fix should be made available in v2.12.1 (hopefully by the end of this month).
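(For anyone following along: Hydra is the piece that turns the ATLAS-exported JSON into a study package. A minimal sketch of that step, assuming the current Hydra R package API – double-check the function names against the Hydra documentation:)

```r
# Minimal sketch: hydrating an ATLAS-exported study specification into
# an R study package. Assumes the Hydra R package API; file and folder
# names are illustrative.
library(Hydra)

# Load the study specification JSON exported from ATLAS.
specifications <- loadSpecifications(fileName = "plpStudySpecification.json")

# Generate ("hydrate") the study package skeleton into a folder.
hydrate(specifications = specifications, outputFolder = "output/plpStudy")
```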

Over the last 3-4 years since we introduced the PLE & PLP editors into Atlas, we’ve discovered that it is a lot of work to develop the core HADES packages, update the corresponding skeletons, and then get those updates into Atlas. The release cycle for Atlas in particular has been slow, so it feels like even when we update the skeletons in Atlas, they are already a bit stale.

I won’t dive into future solutions as I think that is an area for collaboration this year and my hope is that we can make use of the Strategus project to help us with future PLE & PLP studies.


Thanks, @anthonysena – super helpful context. I really appreciate the breadcrumbs here for understanding the evolution of ATLAS releases relative to the skeletons.

I’m probably 3-4 years behind in my skeleton knowledge now. :wink:

As a study designer, I would love to find ways to support this. I’ll have to get better at lurking on the Strategus discussions. (Maybe @schuemie or @jennareps can remind me which area of MS Teams to go lurk in. :wink: )
