Hi everyone,
I am currently trying to implement a reweighting algorithm.
I need some help or advice on how the covariate matrix should be handled.
The packages I use and my custom implementation expect features organized as:
samples × features
i.e., each row represents a patient and each column represents a feature.
However, the cyclops library documentation does not make clear how features are handled internally. I inspected some internal variables, and it seems that:
- Most features are one-hot encoded: for example, lab values become indicators like “high sodium within 1 year of index”
- Only a few chosen features, such as comorbidity index, are integer-valued and regularized afterwards
My questions are:
- Are these one-hot features stored only for patients who have a corresponding record?
- Would it make sense to transform this into a covariate matrix as described above? I plan to simply set the value to 0 when a patient does not have the corresponding record.
- How are these variables regularized?
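
To make the second question concrete, here is a minimal sketch of what I have in mind. It assumes the covariates can be exported as long-format (patient id, covariate name, value) triples; that layout, the covariate names, and the function name build_covariate_matrix are all my own placeholders, not anything taken from the library.

```python
# Hypothetical long-format records: (patient_id, covariate_name, value).
# The ids, names, and values are made up for illustration only; they are
# not the library's actual internal representation.
records = [
    (101, "high_sodium_within_1y", 1),
    (101, "comorbidity_index", 3),
    (102, "comorbidity_index", 1),
    (103, "high_sodium_within_1y", 1),
]


def build_covariate_matrix(records, patient_ids=None):
    """Pivot long-format records into a dense samples x features matrix.

    Any (patient, feature) pair without a record keeps the value 0.
    Pass patient_ids explicitly to include patients with no records at all;
    every patient id appearing in records must then be in that list.
    """
    if patient_ids is None:
        patient_ids = sorted({pid for pid, _, _ in records})
    feature_names = sorted({name for _, name, _ in records})

    # Map ids and names to row and column positions.
    row_of = {pid: i for i, pid in enumerate(patient_ids)}
    col_of = {name: j for j, name in enumerate(feature_names)}

    # Start from all zeros, then fill in the observed entries.
    matrix = [[0] * len(feature_names) for _ in patient_ids]
    for pid, name, value in records:
        matrix[row_of[pid]][col_of[name]] = value
    return patient_ids, feature_names, matrix


patient_ids, feature_names, X = build_covariate_matrix(records)
```

With this, a patient who has no records at all would end up as an all-zero row (if included via patient_ids), which is exactly the case I am unsure about: a 0 would then mean "no record" rather than "measured and normal".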
Thanks in advance for any help.