Hi @MaximMoinat, could you share an ETL example in Python with me too? I’m currently working on ETL for EHR data and use Python as my main language.
I started pyomop before I saw @jbadger3's inspectomop. pyomop is similar but may be easier to extend for ETL and machine learning: https://github.com/dermatologist/pyomop (I have not tested it much yet).
Hi all and @jbadger3 ,
Thank you very much for suggesting the different packages. I spent some time studying both pyomop and inspectomop.
I am currently leaning towards inspectomop, which uses SQLAlchemy to connect to different database back-ends. Our database is unlikely to grow beyond 500GB any time soon, but we might need to move to Redshift to allow connections from multiple users. (Another reason is that I am not an expert in SQL, so I would rather learn SQLAlchemy.)
In contrast, I think pyomop still uses raw SQL queries, which are stored in the file sqldict.py. How does that make the package easier to extend for machine learning? I guess that after each query we just convert the result to a dataframe (which both packages offer) and then use the dataframe for data exploration or machine learning?
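To make the "query, then dataframe, then machine learning" idea concrete, here is a minimal sketch. It does not use the API of pyomop or inspectomop. As an assumption for illustration, it uses the standard-library sqlite3 module with two heavily trimmed, made-up stand-ins for the OMOP person and condition_occurrence tables, and the names FEATURE_SQL and fetch_features are hypothetical.

```python
import sqlite3

# In-memory database standing in for a real OMOP CDM instance (assumption).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name

# Trimmed, illustrative versions of two OMOP tables with made-up rows.
# The condition concept ids are arbitrary example values.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    year_of_birth INTEGER,
    gender_concept_id INTEGER
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER
);
INSERT INTO person VALUES (1, 1950, 8507), (2, 1980, 8532), (3, 1975, 8532);
INSERT INTO condition_occurrence VALUES
    (10, 1, 201826), (11, 1, 316866), (12, 2, 201826);
""")

# One row per person: birth year plus a count of recorded conditions.
FEATURE_SQL = """
SELECT p.person_id,
       p.year_of_birth,
       COUNT(c.condition_occurrence_id) AS n_conditions
FROM person AS p
LEFT JOIN condition_occurrence AS c ON c.person_id = p.person_id
GROUP BY p.person_id, p.year_of_birth
ORDER BY p.person_id
"""


def fetch_features(connection):
    """Run the feature query and return one dict per person."""
    return [dict(row) for row in connection.execute(FEATURE_SQL)]


rows = fetch_features(conn)

# Plain feature matrix (list of lists) that an ML library could consume.
X = [[r["year_of_birth"], r["n_conditions"]] for r in rows]
```

If pandas is installed, `pandas.read_sql_query(FEATURE_SQL, conn)` should return the same result as a DataFrame, and with SQLAlchemy the connection would be replaced by an engine for the target back-end. Either way, the machine learning step only sees the resulting table, not how the query was written.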
By the way, @jbadger3, thank you for your long insightful reply. Are you still updating the package on your personal repos, or have you completely migrated to OHDSI’s repos?
(I downloaded it from your personal link 1 or 2 weeks ago)
Cheers, and have a nice weekend, all,
Hung
Great to see developments in Python packages for OHDSI.
Sure, a lot of what we develop at The Hyve is open source: https://github.com/thehyve/ohdsi-etl-caliber
I would also be curious about how easy it might be to train PyTorch models on OHDSI data. Glad to see there are tools that are starting to make it easier!
Hi everyone, I realize this thread is old, but I am interested in connecting with people using Python with OHDSI tools. I see the risk mentioned above about keeping R and Python tools in sync, so does anyone have guidance on how the community is thinking about incorporating Python in ways that don’t duplicate the existing functionality of the R tools?
Here is one interesting Python package to check out.
At IKNL we started to develop a Python package which wraps the OHDSI tools (DatabaseConnector, Circe, FeatureExtraction, etc.) using the rpy2 library. We are still figuring out how to make it as Pythonic an OHDSI experience as we can. We are happy to accept contributions and consider suggestions from the community.
https://python-ohdsi.readthedocs.org
We started the development to couple our federated learning framework vantage6 to OMOP data sources. Related post: "Feature Extraction for vantage6 Federated Analysis" (OHDSI Forums, Developers category).