Hi OHDSI family,
I have a background in biomedical physics and currently work as a full-stack programmer for large-scale web applications. I’m also halfway through a master’s program in Systems Biology at Ghent University in Belgium.
I’m very eager to pivot my career into the fascinating field of biomedical sciences. Having completed most of the coursework in mathematics, statistics, clinical study design, and machine learning—and with over a decade of programming experience—I believe I can be a valuable asset to a research group.
Many friends and alumni active in the biomedical field have told me that building a machine learning model is only a small part of their overall workload. A significant portion involves setting up CI/CD pipelines, creating user-friendly frontends, and handling data preprocessing. Tasks like mapping datasets into usable structures and managing large relational or NoSQL databases often take up more time than the modeling itself.
Interestingly, those are the areas where I bring the most experience. Designing robust data pipelines, managing large databases, and deploying full-stack applications are the core of what I do.
To start applying for relevant positions, I’m putting together a small portfolio to showcase these skills, and I would really appreciate feedback from the OHDSI community to help me focus my efforts.
Here are a few project ideas I’m currently considering:
- An agent-based model to study the effect of fasting on chemotherapy effectiveness by simulating tumor progression in a 2D or 3D lattice.
- Training a model to detect arthritis in knee X-rays. I recently had an X-ray (which fortunately showed no signs of arthritis) and found an existing ML model that had mediocre performance. After reviewing the author’s code, I identified several opportunities for improvement and would like to build a more accurate version.
- Exploring data visualization using Apache Superset. I keep an offline database of reported drug side effects (20M+ records) that I use to practice building meaningful visual dashboards.
- Contributing to OHDSI by adding support for debiased machine learning estimators (I wrote a separate post on this).
I was actually introduced to OHDSI through a conversation with an acquaintance, in which I mentioned that I was struggling to categorize symptoms and drugs for my Superset dashboards. It turns out that this challenge is one of the core motivations behind OHDSI: standardizing medical terminology across databases. It was reassuring to discover that I wasn’t alone in facing this issue.
Thanks for reading! I’m excited to contribute and to learn from this amazing community.