
What is events per variable for logistic regression?

Hello Everyone,

I am working on a binary logistic regression problem. My labels are in a 12:78 ratio (roughly 13% versus 87%).

Meaning I have 1200 records for label 1 and 7800 records for label 0. I read online that there is a 1:10 rule of thumb, which gives the number of predictors that can be used without much risk of overfitting. As I understand it, in our case that would be 1200/10 = 120.

So we can use up to 120 predictors without much risk of overfitting.
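To make the arithmetic concrete, here is a minimal Python sketch of how I understand the rule. The function names `max_predictors` and `events_per_variable` are made up for illustration, and I am assuming the rule counts records in the smaller outcome class (label 1 here) as the "events". With 1200 events and a 1:10 rule, 1200 / 10 works out to 120.

```python
def max_predictors(n_events: int, n_non_events: int, epv: int = 10) -> int:
    """Largest number of candidate predictors allowed by an EPV rule of thumb.

    Assumption: the limiting count is the smaller of the two outcome classes.
    """
    limiting = min(n_events, n_non_events)
    return limiting // epv


def events_per_variable(n_events: int, n_non_events: int, n_predictors: int) -> float:
    """Events per variable (EPV) for a given number of candidate predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    limiting = min(n_events, n_non_events)
    return limiting / n_predictors


# Counts from my data: 1200 records with label 1, 7800 with label 0.
print(max_predictors(1200, 7800))           # 1200 // 10
print(events_per_variable(1200, 7800, 12))  # 1200 / 12
```

If the model includes categorical predictors, I believe each dummy variable counts as a separate parameter, so `n_predictors` would be the number of estimated coefficients rather than the number of columns in the raw data.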

However, I also see that a minimum sample size can be calculated, and that this is referred to as Events per Variable (EPV).

Can you help me understand this?

I understand that these rules of thumb may not be reliable and that newer findings question them, but I am first trying to understand what this is.
