
Github projects and best practices for repo organization

Mentioning those that I think are most involved on this side of things, apologies if not. @Patrick_Ryan @schuemie @lee_evans @msuchard @Adam_Black @Paul_Nagy

Github Projects
This is likely a much larger topic but I’ll start with the use case: The Oncology WG, among others, would greatly benefit from the ability to use the updated ‘Projects’ within Github, but given these are considered organization wide, we do not have the permissions to create one. Is there a way to enable us to do this?

There are already a few of these newer projects in the OHDSI Github but thus far any attempts to follow suit have fallen flat. I’m guessing it’s likely difficult or risky to allow this organization wide as it hasn’t been widely enabled by default and I’m wondering if there’s a way to enable a larger subset of users to create them, perhaps using permissions given with ‘Github teams’ ?

Benefits:

  • The project management it enables is much more flexible, and you are able to slice up your project boards however you see fit. This would enable us to show our entire road map and each piece along with its readiness for use, what the current iteration is focusing on, the list of studies with statuses and dependencies, and so on.
  • This would allow us to reference tickets from different repositories. We often have a task, ticket, or milestone that is dependent on a vocabulary issue or something else outside of the repo.

I know that the GIS WG would also benefit from this utility and there is the potential for leveraging it towards the “work package repository” effort that @Andrew has been pushing for (imagine how wonderful it would be if we could create a specific issue template or label to designate a ticket as a “work package”, which resides in whichever repository it is most relevant, and the work package repo could scrape them all in an automated way)
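To make the "scrape them all in an automated way" idea a bit more concrete, here is a rough sketch of what that could look like, assuming work packages are marked with a label literally named "work package" (that label name is my assumption, nothing like it is set up yet). It builds a query for GitHub's REST issue search endpoint and groups the returned issues by repository. The helper names and the sample issue titles are hypothetical, and the sketch does not make any network call itself.

```python
from collections import defaultdict
from urllib.parse import urlencode

# GitHub REST endpoint for searching issues across repositories.
SEARCH_URL = "https://api.github.com/search/issues"


def build_search_url(org, label):
    """Build a search URL for open issues in `org` carrying `label`.

    The label name is an assumption; no such convention exists yet.
    """
    query = f'org:{org} is:issue is:open label:"{label}"'
    return f"{SEARCH_URL}?{urlencode({'q': query, 'per_page': 100})}"


def group_by_repo(issues):
    """Group issue titles by repository name.

    Each item is expected to look like an entry from the search
    response's `items` list, which carries a `repository_url` field
    ending in `/repos/<owner>/<repo>`.
    """
    grouped = defaultdict(list)
    for issue in issues:
        repo = issue["repository_url"].rsplit("/", 1)[-1]
        grouped[repo].append(issue["title"])
    return dict(grouped)


# Hypothetical sample data shaped like search results (titles are made up).
sample_items = [
    {"title": "Define episode table conventions",
     "repository_url": "https://api.github.com/repos/OHDSI/OncologyWG"},
    {"title": "Add missing regimen concepts",
     "repository_url": "https://api.github.com/repos/OHDSI/Vocabulary-v5.0"},
    {"title": "Validate ETL unit tests",
     "repository_url": "https://api.github.com/repos/OHDSI/OncologyWG"},
]

url = build_search_url("OHDSI", "work package")
work_packages = group_by_repo(sample_items)
```

A real version would fetch `url` with an authenticated request and page through the results; the grouping step would stay the same.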

Repo Organization
Somewhat separately, this brings up a larger topic of Github repo organization best practices. For many working groups or efforts in OHDSI the different components/utilities are split up into different repos, while in others a single repo holds many packages of disparate functionality. For an example of the latter, the OncologyWG repo currently houses data documentation/DDLs/scripts/CSVs for the model, an ETL that contains an R package, a set of SQL scripts, and Ruby unit testing, and 4-5 other R packages for various use cases, among other things. Additionally, as mentioned above, the working group often depends on issues and work which is managed outside of that repository (vocabulary, Koios, etc.). Consequently, not only is the repo overloaded but it is also insufficient to cover all of the project management needs of the WG.

Point being, if we are going to have any sort of best practices or guidelines within OHDSI for repo management, we need to decide between the typical Github organization of each package being within its own repo, or sticking with the “working groups have a repo and shove it all in there” approach. Personally I would side with normal Github conventions and split up separate packages into separate repositories, but in order to effectively manage a project that spans multiple repositories, with emphasis on transparency and enabling collaborative development, we would need a mechanism like the updated Github Projects described in the first point above.

Relevant link:

Thanks @rtmill. First, the administrative thing: I believe you have admin rights on the OncologyWG repo and you should have rights to create a Project within your repo. At least that’s what I’m seeing on my side; maybe we do a quick screenshare to verify.

The broader issue: I would like for repo admins to feel empowered to use whatever Github tools they find help them advance our goal. I do not think we’re at a point where we can enforce a consistent practice across all repos within the OHDSI git organization, namely for the reason you point out, which is that we have different people using repos for different purposes.

For open-source software development, I think the OHDSI Git repo strategy that makes sense is that we have a repo for each software package (whether it be a HADES R package or a web-based tool like ATLAS), structured to store source code and the associated documentation supporting that source code. It seems to me Projects might be a nice way to manage a development roadmap and release checklists, but I’ll defer to our package maintainers to decide how they’d like to take advantage of this new feature (and perhaps, @schuemie, this could be a discussion at a future HADES meeting if we want to converge on a shared approach at least across HADES packages).

For workgroup activities that are not focused on open-source development of a specific package, some groups have found GitHub useful for organizing their activities. Personally, I’ve found MSTeams helpful for workgroup management, meetings and document sharing, but I have no concerns with anyone who is leading a workgroup and has a preference to use Git tools instead. If a workgroup activity matures to the point that the group is producing a specific software package that has a defined intent, specifications, and implementation, then I would recommend that a new Git repo be created (rather than having it nested within the WG document store).

With regards to the use of Github Projects: As Patrick mentioned, there’s a repo-level Projects feature that you should already be able to use. If you need cross-repo Projects I think that would be fine, as long as there’s a way to grant those rights to users without simultaneously giving them rights to delete other people’s repos, etc. I’m sure there is, I’m just not a Github expert, so if someone can tell me how to do that I will.

For repo organization: within HADES we have strict rules on repo organization, but that is because we aim to meet a certain quality standard for our software. For other repos I would think allowing flexibility would make more sense. If you feel that within your workgroup the repos are organized sub-optimally you should discuss within your workgroup. If you need more repos, let us know and we can create them for you.

Thanks both for the feedback. I did some more digging and it would appear the most straightforward short term solution would be to request that someone with the adequate permissions creates a project and then gives me admin access to that project to manage it from there on. See below:

Not a long-term fix, since we would have to keep asking you all for things, but this would certainly help us push forward. My github username is the same as the forum: rtmill. Project name would be… ‘Oncology Release 2.0’? Not sure it matters too much, can likely change it later as admin.

Any help would be appreciated!

Relevant doc around project permissions: Managing access to your projects - GitHub Docs

Here you go: Oncology Release 2.0 · GitHub

Let me know if I messed anything up.

Awesome! Thanks a ton

@rtmill I found the Github project feature helpful for the hack-a-thons as a way to prioritize and organize issues and plan to use it again.

If you’re doing some reorganization of the oncology workgroup files, I think it would be nice to have a separate repo for the OncologyRegimenFinder.
