Just thinking further about metrics for assessing CIs.
If we are really interested in effect estimation, then we want confidence intervals evaluated against the true value:
mean CI width
variance of CI width
bias (point estimate or CI midpoint versus true value)
see the Kang and Schmeiser CI scatterplots (e.g., CI half width versus midpoint)
(they are much like Martijn’s scatter plots)
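These estimation-oriented metrics can be sketched from simulated CIs. Everything in this sketch (true value 2.0, per-replication standard errors, 1000 replications) is an illustrative assumption, not from the notes:

```python
# Illustrative simulation: mean CI width, variance of CI width, and bias
# of the CI midpoint versus the true value. All numbers are assumptions.
import numpy as np

rng = np.random.default_rng(0)
true_value = 2.0
n_sims = 1000

se = rng.uniform(0.2, 0.3, n_sims)       # per-replication standard error
est = rng.normal(true_value, se)         # point estimates
lower, upper = est - 1.96 * se, est + 1.96 * se

width = upper - lower
midpoint = (lower + upper) / 2
half_width = width / 2                   # y-axis of a Kang-Schmeiser scatterplot

mean_width = width.mean()
var_width = width.var(ddof=1)
bias = midpoint.mean() - true_value
coverage = np.mean((lower <= true_value) & (true_value <= upper))

print(f"mean width {mean_width:.3f}, var width {var_width:.4f}, "
      f"bias {bias:+.3f}, coverage {coverage:.3f}")
```

Plotting `half_width` against `midpoint` would give the Kang-Schmeiser-style scatterplot mentioned above.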
If we want to discover associations, then we want confidence intervals relative to no effect (1), and the true value is irrelevant apart from its direction:
this is really just a hypothesis test (p-value)
specificity is set at .95 (95% coverage of negative controls after calibration)
sensitivity is the proportion of positive controls whose CI excludes no effect (1)
can derive the relation of sensitivity to the CI: a CI centered on the effect size excludes 1 when (CI width / 2) < EffectSize - 1
ROC area calculated from the single (sensitivity, specificity) operating point at alpha = .05
(or perhaps generate a fuller curve by varying alpha: .2, .1, .05, .03, .01)
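The specificity/sensitivity/ROC idea can be sketched with simulated controls. The control counts, the assumed positive-control effect (RR = 2), the standard errors, and the normal approximation on the log scale are all illustrative assumptions:

```python
# Illustrative sketch: specificity on negative controls, sensitivity on
# positive controls, and an ROC-style curve from a grid of alpha levels.
import math
import numpy as np

rng = np.random.default_rng(1)
se = 0.2
neg = rng.normal(0.0, se, 200)            # log RR estimates, true RR = 1
pos = rng.normal(math.log(2), se, 200)    # log RR estimates, true RR = 2

def p_value(log_est, se):
    """Two-sided p-value from a normal approximation."""
    z = abs(log_est / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

p_neg = np.array([p_value(x, se) for x in neg])
p_pos = np.array([p_value(x, se) for x in pos])

points = [(0.0, 0.0)]
for alpha in (0.2, 0.1, 0.05, 0.03, 0.01):
    spec = np.mean(p_neg >= alpha)        # negatives not flagged
    sens = np.mean(p_pos < alpha)         # positives flagged
    points.append((1 - spec, sens))
points.append((1.0, 1.0))
points.sort()

# Trapezoidal area under the (false positive rate, sensitivity) points:
fpr, tpr = zip(*points)
auc = sum((fpr[i + 1] - fpr[i]) * (tpr[i + 1] + tpr[i]) / 2
          for i in range(len(fpr) - 1))
spec_05 = np.mean(p_neg >= 0.05)
sens_05 = np.mean(p_pos < 0.05)
print(f"spec@.05 {spec_05:.3f}, sens@.05 {sens_05:.3f}, AUC ~ {auc:.3f}")
```

Note that under this normal approximation the CI-based rule (95% CI excludes 1) and the p < .05 rule coincide, so either view gives the same sensitivity and specificity.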
Just noticing: when we do p-value calibration and report coverage on negative controls, we really should also report power on the positive controls.
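A much simplified stand-in for that idea: fit a systematic-error distribution on negative controls by method of moments, then report both coverage on the negatives and power on the positives using the same calibrated test. The bias model and every parameter here are assumptions for illustration, not the actual empirical-calibration method:

```python
# Simplified calibration sketch: negative controls (true RR = 1) carry
# systematic error; fit its mean/variance by method of moments, then
# recompute significance against that fitted null. Positive controls
# (assumed true RR = 2) share the same systematic error. All made up.
import numpy as np

rng = np.random.default_rng(2)
se = 0.2
n = 200
bias_neg = rng.normal(0.25, 0.1, n)       # systematic error, log scale
bias_pos = rng.normal(0.25, 0.1, n)

neg = rng.normal(bias_neg, se)            # log RR estimates, true effect 0
pos = rng.normal(np.log(2) + bias_pos, se)

# Method-of-moments fit of the systematic error distribution:
mu_hat = neg.mean()
tau2_hat = max(neg.var(ddof=1) - se**2, 0.0)

def significant(log_est):
    """True if the calibrated two-sided test rejects at alpha = .05."""
    z = (log_est - mu_hat) / np.sqrt(se**2 + tau2_hat)
    return abs(z) > 1.96

coverage = 1 - np.mean([significant(x) for x in neg])   # want ~0.95
power = np.mean([significant(x) for x in pos])          # report this too

uncal_coverage = 1 - np.mean(np.abs(neg / se) > 1.96)
print(f"coverage (calibrated) {coverage:.3f} vs (uncalibrated) "
      f"{uncal_coverage:.3f}; power on positives {power:.3f}")
```

Reporting `coverage` and `power` together makes the trade-off visible: calibration can restore nominal coverage while costing power on the positive controls.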
- Keebom Kang, Bruce Schmeiser, (1990) Graphical Methods for Evaluating and Comparing Confidence-Interval Procedures. Operations Research 38(3):546-553. http://dx.doi.org/10.1287/opre.38.3.546