
What sorts of external validations do people here generally do on ML models?

Specifically, I'm asking those of you doing validations like the ones in https://github.com/ohdsi-studies/PORPOISE – what metrics are you using to judge model performance, and how are you specifying the features/preprocessing the model requires? (Is it just arbitrary code, or is there some JSON/YAML-style spec for required columns/transforms?)

We’re currently building a framework for semi-federated, privacy-preserving model validation and would love any details you can share!


In general we use the PatientLevelPrediction package to do external validations. It supports quite a number of metrics. Personally, I use AUROC and AUPRC for discrimination, smooth calibration plots for calibration, and net benefit for clinical utility.
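
For anyone less familiar with these metrics, here is a minimal sketch (in Python, not the PatientLevelPrediction package itself) of how the discrimination metrics and net benefit can be computed from a vector of predicted risks and observed outcomes. The function names and the use of scikit-learn are my own illustration; the net benefit formula is the standard decision-curve one.

```python
# Illustrative sketch only -- not PatientLevelPrediction output.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def discrimination_metrics(y_true, y_prob):
    """AUROC and AUPRC for a binary outcome."""
    return {
        "auroc": roc_auc_score(y_true, y_prob),
        "auprc": average_precision_score(y_true, y_prob),
    }

def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one risk threshold: TP/n - FP/n * (t / (1 - t))."""
    y_true = np.asarray(y_true)
    pred_pos = np.asarray(y_prob) >= threshold
    n = len(y_true)
    tp = np.sum(pred_pos & (y_true == 1))
    fp = np.sum(pred_pos & (y_true == 0))
    return tp / n - fp / n * (threshold / (1 - threshold))
```

Sweeping `net_benefit` over a range of thresholds and comparing against "treat all" and "treat none" gives the usual decision curve for clinical utility.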

The package does take care of which features are used in the model and how they should be transformed. For example, continuous features might be normalized by the maximum value observed in the training set so that they top out at 1; that max value is stored with the model, and the same feature in the external validation set is divided by the same value. I guess that falls under arbitrary code, although in most cases our models are stored as JSON. See here for example. If you poke around in that folder you can see how the full results are stored with all the preprocessing info.
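
To make the idea concrete, here is a rough sketch of that pattern: learn the normalization parameters on the training data, serialize them next to the model, and re-apply them unchanged to the external validation data. The JSON key and function names are hypothetical, not the actual PatientLevelPrediction schema.

```python
# Illustrative sketch only -- not the actual PatientLevelPrediction JSON format.
import json
import pandas as pd

def fit_max_scaler(train_df, continuous_cols):
    """Learn the per-column max on the training set."""
    return {col: float(train_df[col].max()) for col in continuous_cols}

def save_preprocessing(params, path):
    """Store the learned parameters alongside the model (hypothetical key)."""
    with open(path, "w") as f:
        json.dump({"maxNormalization": params}, f, indent=2)

def apply_max_scaler(df, params):
    """Divide each continuous column by the max stored from training,
    so the external validation data is scaled exactly as the training data was."""
    out = df.copy()
    for col, max_val in params.items():
        out[col] = out[col] / max_val
    return out
```

The important point is that nothing is re-fit on the validation database; the stored parameters travel with the model.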
