How to handle multiple diagnoses/drug_era/etc. for one patient ID in patient-level prediction

@kangyh9659:

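That’s the nature of health data. Most people have no diagnoses, some have one, a few have several, and even fewer have many. How many illnesses can one person have? 10? 100? The result is called a sparse matrix: one row per patient, one column per diagnosis (or drug, etc.), and most cells are 0, meaning "not recorded" rather than missing.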
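If you do want to try it by hand in Python, here is a minimal sketch of the idea, not how any particular package does it. It assumes your diagnosis or drug_era rows come as (person_id, concept_id) pairs; the ids, the `cohort` list and the function names are all made up for illustration.

```python
from collections import Counter

# Hypothetical long-format rows: one (person_id, concept_id) pair per record.
records = [
    (1, 101),
    (1, 102),
    (1, 101),  # the same concept recorded twice for patient 1
    (3, 102),
]
# Patient 2 has no records at all but is still part of the cohort.
cohort = [1, 2, 3]


def to_sparse_triplets(records, cohort):
    """Collapse long-format rows into COO-style (rows, cols, values) lists."""
    row_index = {pid: i for i, pid in enumerate(cohort)}
    kept = [(pid, cid) for pid, cid in records if pid in row_index]
    concepts = sorted({cid for _, cid in kept})
    col_index = {cid: j for j, cid in enumerate(concepts)}
    # Count how often each (patient, concept) pair occurs.
    counts = Counter((row_index[pid], col_index[cid]) for pid, cid in kept)
    rows, cols, values = [], [], []
    for (i, j), n in sorted(counts.items()):
        rows.append(i)
        cols.append(j)
        values.append(n)
    return rows, cols, values, concepts


def to_dense(rows, cols, values, n_rows, n_cols):
    """Expand the triplets to a dense list of lists (small examples only)."""
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, v in zip(rows, cols, values):
        dense[i][j] = v
    return dense


rows, cols, values, concepts = to_sparse_triplets(records, cohort)
dense = to_dense(rows, cols, values, len(cohort), len(concepts))
print(dense)
```

Each patient ends up as exactly one row no matter how many records they have, and a patient with no records is simply a row of zeros. Only the non-zero cells are stored in the three lists. If SciPy is available, they can be handed to `scipy.sparse.coo_matrix((values, (rows, cols)), shape=(len(cohort), len(concepts)))` instead of building the dense version.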

You may want to save yourself the effort. The OHDSI PatientLevelPrediction package already takes care of all those issues, and many others. It’s written in R, not Python, though.

Good luck.