You are right, of course: in a fully federated remote network setting, there is no immediate use case for a graph database. It would not make sense to replace the relational database (because of the ‘common’ in CDM), and without direct access to all the data it is not possible to pull everything into a graph.
I would argue the following: different data storage paradigms and their associated ecosystems open routes for different kinds of applications. Let’s assume, for the sake of argument, that it is possible to pull all data into a graph database. This would make it possible to apply graph algorithms and analyses to all three of the main use cases you describe: clustering for stratification, weighted ranking algorithms for population-level estimates, and graph neural networks for patient-level prediction. In theory, you could extract all the data from the OMOP CDM, build networks with some R package, and run the analysis. In practice, it is much easier when the data already lives in a graph database.
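To make the "extract, build a network, analyse" route a bit more concrete, here is a minimal sketch in plain Python. It assumes rows shaped like `(person_id, condition_concept_id)` as they might come out of the `condition_occurrence` table; all IDs are made up, the function names are my own, and connected components are only a crude stand-in for a real clustering algorithm.

```python
from collections import defaultdict
from itertools import combinations

# Toy rows shaped like (person_id, condition_concept_id).
# All IDs are invented for illustration and are not real OMOP concept IDs.
rows = [
    (1, 101), (1, 102),
    (2, 101), (2, 102),
    (3, 103),
    (4, 103), (4, 104),
    (5, 105),
]


def patient_projection(rows):
    """Project the bipartite patient-condition graph onto patients.

    Returns {(patient_a, patient_b): number_of_shared_conditions}.
    """
    patients_by_condition = defaultdict(set)
    for person_id, concept_id in rows:
        patients_by_condition[concept_id].add(person_id)

    weights = defaultdict(int)
    for patients in patients_by_condition.values():
        for a, b in combinations(sorted(patients), 2):
            weights[(a, b)] += 1
    return dict(weights)


def connected_components(nodes, edges):
    """Group nodes that are linked by at least one edge (toy 'clustering')."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


weights = patient_projection(rows)
groups = connected_components({person for person, _ in rows}, weights)
print(weights)
print(groups)
```

With a graph database, the projection and the algorithm would typically be a query plus a library call rather than hand-written code, which is the practical advantage I mean.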
I don’t really know how a graph database fits into the overall architecture of OHDSI/OMOP CDM. My interest comes mostly from general curiosity and a personal preference for graphs.
I am actively working on modeling the standardized vocabularies in Neo4j. I’ve been using different types of medical terminology as part of Neo4j applications for years, and I have always struggled with versioning and mappings: outdated mapping projects, incomplete mappings, and similar problems. The vocabularies of the OMOP CDM are a fantastic resource that somehow flies under the radar, i.e. you have to dig into the documentation to find out about them.
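As a rough illustration of why the vocabularies lend themselves to a graph model, here is a small plain-Python sketch that treats concepts as nodes and `concept_relationship` rows as typed edges, then follows "Maps to" edges to reach standard concepts. The field names follow the OMOP `concept` and `concept_relationship` tables, but the concepts, IDs and the `standard_targets` helper are invented for this example; this is not my actual Neo4j model.

```python
from collections import defaultdict

# Nodes: concept_id -> properties. IDs and names are invented for illustration.
concepts = {
    1: {"concept_name": "Local code A", "vocabulary_id": "LocalVocab", "standard_concept": None},
    2: {"concept_name": "Standard concept X", "vocabulary_id": "StdVocab", "standard_concept": "S"},
    3: {"concept_name": "Local code B", "vocabulary_id": "LocalVocab", "standard_concept": None},
}

# Typed edges shaped like (concept_id_1, relationship_id, concept_id_2).
relationships = [
    (1, "Maps to", 2),
    (2, "Maps to", 2),      # a standard concept mapping to itself
    (2, "Mapped from", 1),
    # concept 3 has no mapping at all (the "incomplete mappings" problem)
]

# Adjacency index keyed by (source concept, relationship type).
edges = defaultdict(list)
for source_id, relationship_id, target_id in relationships:
    edges[(source_id, relationship_id)].append(target_id)


def standard_targets(concept_id):
    """Follow 'Maps to' edges and keep only targets flagged as standard."""
    return [
        target_id
        for target_id in edges.get((concept_id, "Maps to"), [])
        if concepts[target_id]["standard_concept"] == "S"
    ]


for concept_id in sorted(concepts):
    print(concept_id, concepts[concept_id]["concept_name"], "->", standard_targets(concept_id))
```

In a graph database the same lookup becomes a one-hop traversal over a relationship type, and unmapped concepts like the third one are easy to spot as nodes without an outgoing mapping edge.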
I’m not sure whether it’s possible to use them (licensing, terms, etc.), but they would definitely benefit many graph-based data integration projects.