Confidence intervals for Precision-Recall curve

Hi,
We are developing a set of Patient-Level-Prediction models.
During this process it was suggested that we report a confidence interval (CI) for the AUCPR.
I have not been able to identify a good way to do this and was wondering if someone could help?

I checked the source code for both plotPrecisionRecall() and evaluatePLP() (pr.curve), but I was not able to find a good way to add the computation of the CI.

Thank you

Best Julie

@cssdenmark

It seems to me the AUPRC is calculated in line 77 in evaluatePlp:

pr <- PRROC::pr.curve(scores.class0 = positive, scores.class1 = negative)

I don’t think the PRROC package supports computing CIs. Did you have a method in mind for that? If not, this chapter might help:

https://link.springer.com/chapter/10.1007/978-3-642-40994-3_29

They suggest using either binomial or logit intervals for the CIs and show with simulations that these have sufficient coverage. Bootstrapping also works, but it only approaches sufficient coverage from below as the sample size increases, and it is more computationally expensive.

They have R code with those methods on github:

For example, the function on line 257 uses the logit method. You only need the AUPRC estimate, which is already calculated in evaluatePLP (auprc), the number of positive samples, and your alpha. The number of positive samples is probably already available as well, as length(positive) or something similar.
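To give an idea of what the two interval types look like, here is a rough sketch in R. This is my own illustration, not the code from the chapter's repository: the function name auprc_ci and its arguments are made up, and it assumes the sample size n in both formulas is the number of positive samples. The clipping of the binomial interval to [0, 1] is also my addition.

```r
# Sketch of binomial and logit confidence intervals for an AUPRC estimate.
# auprc : point estimate of the area under the PR curve, strictly in (0, 1)
# n_pos : number of positive samples (assumed to be the n in both formulas)
# alpha : 1 - confidence level (0.05 gives a 95% interval)
auprc_ci <- function(auprc, n_pos, alpha = 0.05,
                     method = c("logit", "binomial")) {
  method <- match.arg(method)
  stopifnot(auprc > 0, auprc < 1, n_pos > 0, alpha > 0, alpha < 1)

  z <- qnorm(1 - alpha / 2)  # standard normal quantile

  if (method == "binomial") {
    # Normal approximation on the original scale
    se <- sqrt(auprc * (1 - auprc) / n_pos)
    ci <- auprc + c(-1, 1) * z * se
    ci <- pmin(pmax(ci, 0), 1)  # keep the interval inside [0, 1]
  } else {
    # Normal approximation on the logit scale, then transform back
    eta <- qlogis(auprc)                          # log(auprc / (1 - auprc))
    tau <- 1 / sqrt(n_pos * auprc * (1 - auprc))  # standard error of eta
    ci <- plogis(eta + c(-1, 1) * z * tau)
  }

  c(lower = ci[1], upper = ci[2])
}

# Example with made-up numbers: AUPRC of 0.30 estimated from 200 positives
auprc_ci(0.30, n_pos = 200, method = "logit")
auprc_ci(0.30, n_pos = 200, method = "binomial")
```

With the objects from evaluatePlp you would pass the AUPRC value and length(positive) instead of the made-up numbers. The logit version has the advantage that the interval always stays inside (0, 1) without any clipping.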

Hope this helps,
Egill
