
First step: Defining the broad research approach

Welcome to the task force @SCYou!

I would argue that the very fact that real negative controls likely have strong unmeasured confounding makes them ideal for evaluating methods! We want to evaluate how well methods perform in the real world, not in a simulated ideal world (note that @aschuler’s approach also introduces unmeasured confounding).

One very important thing I realize we haven’t discussed: should we evaluate methods that try to quantify the risk attributable to an exposure, or methods for comparative effectiveness? In other words, methods tend to answer one of these questions:

  1. What is the change in risk of outcome X due to exposure to A?
  2. What is the change in risk of outcome X due to exposure to A compared to exposure to B?

Question 1 can often be answered by reformulating it as question 2 by picking a comparator believed to have no effect on the risk. For example, in our Keppra and angioedema study we picked phenytoin as a comparator because we were certain it did not cause angioedema, allowing us to estimate the effect of Keppra.

I must confess I’m mostly interested in question 1, since comparative effectiveness methods can be viewed as answering question 1 by picking a ‘null comparator’ as argued above. But we could create two gold standards, one for question 1 methods and one for question 2 methods.

@aschuler, there is at least one thing we can do to evaluate unmeasured confounding: we can compare an evaluation using true negative controls to an evaluation using your simulation framework where the relative risk is 1 (no effect). If the simulation procedure is realistic enough, those two evaluations should generate the same results.
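A minimal sketch of what that comparison could look like, assuming we have per-control effect estimates and standard errors on the log scale (the function names and data layout are illustrative, not part of any existing package):

```python
# Sketch: compare a method's behavior on real negative controls against
# simulated null (true RR = 1) controls. All names are hypothetical.
import numpy as np
from scipy import stats

def null_coverage(log_rr, se, alpha=0.05):
    """Fraction of (1 - alpha) CIs that contain the null (log RR = 0)."""
    log_rr, se = np.asarray(log_rr), np.asarray(se)
    z = stats.norm.ppf(1 - alpha / 2)
    return np.mean((log_rr - z * se <= 0) & (0 <= log_rr + z * se))

# log_rr_real, se_real: estimates for the real negative controls
# log_rr_sim, se_sim:   estimates for simulated controls with RR = 1
# If the simulation is realistic, the two coverages should be similar, and the
# distributions of standardized estimates should agree, e.g.:
# stats.ks_2samp(np.asarray(log_rr_real) / np.asarray(se_real),
#                np.asarray(log_rr_sim) / np.asarray(se_sim))
```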


FYI: I’ve put my notes and slides of yesterday’s meeting on the Wiki.

In summary, I think we decided to:

  1. Focus on creating a ‘benchmark’ for population-level estimation methods that shows how well methods work in general
  2. Go with synthesizing positive controls by injecting outcomes on top of negative controls (at least for now; a sketch of the basic idea follows this list)

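To make the injection idea concrete, here is a minimal sketch under a simple Poisson assumption; the per-subject rate model and all names are purely illustrative, not the actual implementation:

```python
# Sketch of outcome injection: add simulated outcomes during exposed time on
# top of a negative control so the true relative risk equals a chosen target.
import numpy as np

rng = np.random.default_rng(0)

def inject_outcomes(exposed_days, observed_outcomes, target_rr):
    """Extra outcomes per subject so the overall rate during exposure becomes
    target_rr times the background rate (the negative control's true RR is 1)."""
    background_rate = observed_outcomes.sum() / exposed_days.sum()
    extra_rate = (target_rr - 1.0) * background_rate
    return rng.poisson(extra_rate * exposed_days)

# Example: 1,000 subjects with 30-365 days of exposed time each
exposed_days = rng.integers(30, 365, size=1000)
observed = rng.poisson(0.001 * exposed_days)          # negative control outcomes
extra = inject_outcomes(exposed_days, observed, 2.0)  # synthesize true RR = 2
```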
Based on @saradempster’s suggestion I’ve created a template protocol for establishing the benchmark. I hope everyone will join in filling in this protocol!

You can find the link to the protocol template in this topic.

Thanks Martijn! I will take a look ASAP. @schuemie - how do you want to receive comments, i.e. posted here or in the document itself?

Just thinking further about metrics for assessing CIs.

If we are really interested in effect estimation, then we want confidence intervals w.r.t. the true value (a short sketch of these metrics follows the list):

  • coverage

  • mean CI width

  • variance of CI width

  • bias (point estimate or CI midpoint versus true value)

  • see Kang and Schmeiser CI scatterplots [1] (e.g., CI half width versus midpoint)
    (they are much like Martijn’s scatter plots)
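A minimal sketch of these metrics, assuming arrays of CI bounds and true effect sizes on the log scale (all names illustrative):

```python
# Sketch of the CI metrics above for a set of controls with known true effects.
import numpy as np

def ci_metrics(ci_lower, ci_upper, true_log_rr):
    ci_lower, ci_upper = np.asarray(ci_lower), np.asarray(ci_upper)
    true_log_rr = np.asarray(true_log_rr)
    width = ci_upper - ci_lower
    midpoint = (ci_lower + ci_upper) / 2.0
    return {
        "coverage": np.mean((ci_lower <= true_log_rr) & (true_log_rr <= ci_upper)),
        "mean_width": width.mean(),
        "var_width": width.var(ddof=1),
        "bias": np.mean(midpoint - true_log_rr),
        # for a Kang & Schmeiser-style plot, scatter half width vs. midpoint
        "half_width": width / 2.0,
        "midpoint": midpoint,
    }
```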

If we want to discover associations, then we want confidence intervals w.r.t. no effect (1), and the true value is irrelevant other than its direction (sketch after the list):

  • this is really just a hypothesis test (p-value)

  • specificity is set at .95 (95% coverage of negative controls after calibration)

  • sensitivity is the proportion of positive controls whose CI excludes no effect (1)
    can derive the relation of sensitivity to the CI: (CI width / 2) < effect size - 1

  • ROC area calculated based on the point estimates of specificity and sensitivity
    (or perhaps we could generate a curve by varying alpha: .2, .1, .05, .03, .01)
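A minimal sketch of these discovery metrics, assuming we already have calibrated p-values for the negative and positive controls (the alpha grid is the one suggested above; everything else is illustrative):

```python
# Sketch: specificity/sensitivity at several alpha levels plus ROC area,
# computed from calibrated p-values for negative and positive controls.
import numpy as np
from sklearn.metrics import roc_auc_score

def discovery_metrics(p_negative, p_positive, alphas=(0.2, 0.1, 0.05, 0.03, 0.01)):
    p_negative, p_positive = np.asarray(p_negative), np.asarray(p_positive)
    operating_points = [
        {
            "alpha": a,
            "specificity": np.mean(p_negative >= a),  # negative controls not flagged
            "sensitivity": np.mean(p_positive < a),   # positive controls flagged (power)
        }
        for a in alphas
    ]
    labels = np.r_[np.zeros(len(p_negative)), np.ones(len(p_positive))]
    scores = -np.r_[p_negative, p_positive]           # smaller p-value = stronger signal
    return operating_points, roc_auc_score(labels, scores)
```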

Just noticing that when we do p-value calibration and report coverage, we really should also report power on positive controls.

  1. Keebom Kang, Bruce Schmeiser (1990). Graphical Methods for Evaluating and Comparing Confidence-Interval Procedures. Operations Research 38(3):546-553. http://dx.doi.org/10.1287/opre.38.3.546

George

@hripcsak: moved this discussion here

Hi @saradempster! Just add comments to the document itself.
