From: Patrick Ryan ryan@ohdsi.org
Sent: Fri, Jan 30, 2015 at 4:48 PM
i added notes to our leadership team meeting minutes. it was the most lively leadership team meeting…because we were delinquent and talked science instead :) so, is circumcision a good or bad negative control outcome?
on the call i made the argument that circumcision is highly confounded by age.
i’d like to add to that discussion that it’s also confounded by gender
…and apparently, in CCAE, confounded by index year as well!!!
and there’s a gender * age interaction, which currently none of our methods would handle…
– Graph here showing Achilles plot of Prevalence per 1000 People against Age Deciles –
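(As an aside, a minimal sketch of what 'handling' a gender * age interaction would mean in a simple outcome model; the data and variable names below are entirely synthetic and purely illustrative, not any OHDSI method.)

```python
# Illustrative only: estimating an age * gender interaction explicitly,
# rather than adjusting for age and gender as separate main effects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 5000
df = pd.DataFrame({
    "age": rng.integers(0, 90, n),
    "male": rng.integers(0, 2, n),
})
# A circumcision-like outcome: essentially male-only and concentrated at
# young ages, i.e. driven by the age-gender interaction rather than by
# age or gender alone.
logit_p = -6 + 5 * df["male"] - 0.08 * df["age"] * df["male"]
df["outcome"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# 'age * male' expands to age + male + age:male, so the interaction term
# is estimated instead of being ignored.
model = smf.logit("outcome ~ age * male", data=df).fit()
print(model.summary())
```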
and i owe christian an apology: it’s not that the ICD9 diagnosis code for circumcision (V502) maps to a standard condition…it’s that it doesn’t map to a standard concept at all! http://www.ohdsi.org/web/hermes/index.html#/concept/44820443
hopefully this mapping will be fixed in the next version of the V5 vocabulary.
From: Jon Duke [mailto:jonduke@regenstrief.org]
Sent: Tuesday, February 03, 2015 3:20 AM
Patrick,
It was a fun conversation indeed! Guess that’s what happens when we go ‘off-topic’
I have a follow-up question related to how we articulate our calibration methods.
For the original OMOP research calibration approach (fixed outcome, multiple true-negative exposures), I had a very easy time selling colleagues on its logic and importance. My explanation was something along these lines:
When we want to look at a possible exposure-outcome relationship, we have to consider whether the source dataset has some inherent undetectable factors that make the outcome appear more (or less) likely than it really is. So for each outcome we care about, we take a set of exposures that should have absolutely no causal relationship with it, and look to see if our dataset suggests otherwise. If so, we can calibrate our significance requirements so that outcomes that skew positive or negative at baseline, for whatever reason, can be interpreted in context.
I’ll usually give the example of our testicular cancer referral center at IU, whose doctors are more attuned to certain diagnoses or side effects that might go undocumented (or unnoticed) in a less specialized environment.
Anyway, the idea has good face validity, because everyone knows that doctors may choose to code or not code things for lots of arbitrary reasons.
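To make the mechanics concrete, here is a rough sketch of the calibration step being described, with invented numbers; the production method fits the empirical null by maximum likelihood rather than the crude moment fit used here.

```python
# Rough sketch of p-value calibration against an empirical null, in the
# spirit of the original approach (fixed outcome, many negative-control
# exposures). All numbers are invented for illustration.
import numpy as np
from scipy import stats

# Estimated log relative risks (and standard errors) for exposure-outcome
# pairs we believe to be truly null.
null_log_rr = np.array([0.10, -0.05, 0.25, 0.15, 0.30, 0.05, 0.20, -0.10])
null_se     = np.array([0.08,  0.10, 0.07, 0.12, 0.08, 0.11, 0.09,  0.10])

# Crude moment fit of the systematic error distribution N(mu, sigma^2):
# the spread of the null estimates beyond their sampling error is taken
# to be bias. (The real method fits this by maximum likelihood.)
mu = null_log_rr.mean()
sigma = np.sqrt(max(null_log_rr.var(ddof=1) - (null_se ** 2).mean(), 0.0))

def calibrated_p(log_rr, se):
    """Two-sided p-value tested against the empirical null rather than
    the textbook null of 'no effect and no bias'."""
    z = (log_rr - mu) / np.sqrt(sigma ** 2 + se ** 2)
    return 2 * stats.norm.sf(abs(z))

# An estimate that clears p < 0.05 under the naive null...
naive_p = 2 * stats.norm.sf(abs(0.35 / 0.15))
# ...may no longer do so once the bias seen in the negative controls
# is taken into account.
print(round(naive_p, 3), round(calibrated_p(0.35, 0.15), 3))
```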
For the new approach (fixed exposure, multiple true-negative outcomes), I am having a slightly harder time making the same clear articulation. The story is now basically this, I believe:
For each exposure-outcome relationship, we look at that exposure and ask: what outcomes should it not cause? Then we look at our data to see if it shows associations between this exposure and those outcomes. If spurious associations are found, we can calibrate our analyses accordingly.
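For comparison, a sketch of how the null set is assembled in this reversed orientation; the outcome names and the estimate_effect() stand-in below are hypothetical, and the calibration itself proceeds exactly as in the previous sketch.

```python
# Reversed orientation: one exposure of interest, many outcomes it is
# believed not to cause. Each negative-control outcome is pushed through
# the same analysis as the outcome of interest, and the resulting
# estimates form the empirical null used for calibration.
# Names and numbers are invented; estimate_effect() is a placeholder.
import numpy as np

rng = np.random.default_rng(1)
negative_control_outcomes = [
    "ingrowing nail", "sprained ankle", "sunburn", "cataract", "tinnitus",
]

def estimate_effect(exposure, outcome):
    """Stand-in for the real design (e.g. a propensity-score-adjusted
    cohort analysis); here it simply fakes a slightly biased estimate."""
    return rng.normal(0.1, 0.15), 0.12  # (log relative risk, standard error)

null_estimates = [
    estimate_effect("drug of interest", o) for o in negative_control_outcomes
]
# These (log RR, SE) pairs play the same role as null_log_rr / null_se in
# the previous sketch: fit the empirical null to them, then calibrate the
# estimate for the outcome we actually care about.
print(null_estimates)
```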
My articulation problem is that this orientation feels somehow less obvious. It seems completely logical that systematic biases would travel with outcomes (our dataset “overcalls” myocardial infarction). But telling the same story with exposures (our dataset “overcalls” things caused by taking clopidogrel) just feels weird. I am sure the statistics stack up both ways, and probably better for the latter given our shift, but it feels off-kilter to me when explaining it to people.
Can you guys help me to a clearer interpretation?
From: Martijn Schuemie schuemie@ohdsi.org
Sent: Mon, Feb 2, 2015 at 3:18 PM
Allow me to take a stab at this:
The reason for the shift is that we’re currently using the CohortMethod a lot, and many epidemiologists believe the cohort design to be the best epi design there is, since it’s the design most similar to the holy RCT. As you know, in the cohort design we start with two cohorts: people starting the treatment of interest, and a comparator group. Often the comparator group is defined as people taking another drug for the same indication, a so-called ‘active comparator’. We could think of those two groups as two arms of an RCT, if it weren’t for one catch: the assignment wasn’t random. There’s a reason why some people got the one drug and others got the other, and this can cause channeling / confounding by indication.
We then do all sorts of fancy tricks with propensity scores and doubly robust methods to try to eliminate this confounding. There have been many neat papers suggesting (typically using n=1) that these tricks indeed reduce confounding, but you don’t know how much is left. This is where the negative controls and calibration come into play: we can use them to map the residual bias in the design. This bias includes the confounding by indication, but also any tendency for misclassification of outcomes in the system, and more.
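For readers less familiar with the design being described, here is a very condensed sketch of those steps on synthetic data (fit a propensity score, match, compare the matched cohorts); names and numbers are made up, and the real implementation (the CohortMethod package) is of course far more involved.

```python
# Condensed sketch of the workflow described above: two drug cohorts with
# non-random assignment, a propensity score to adjust for it, and a crude
# matched comparison. Entirely synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Baseline covariates that drive both drug choice and the outcome:
# this is the channeling / confounding by indication.
age = rng.normal(60, 10, n)
severity = rng.normal(0, 1, n)
X = np.column_stack([age, severity])

# Treatment assignment depends on the covariates (not random).
p_treat = 1 / (1 + np.exp(-(0.03 * (age - 60) + 0.8 * severity)))
treated = rng.random(n) < p_treat

# The outcome depends on the covariates but not on the drug: a true null,
# i.e. what a negative control is supposed to look like.
p_outcome = 1 / (1 + np.exp(-(-3 + 0.02 * (age - 60) + 0.5 * severity)))
outcome = rng.random(n) < p_outcome

# 1. Fit the propensity score.
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Greedy 1:1 nearest-neighbor matching on the propensity score.
comparators = list(np.where(~treated)[0])
matches = []
for i in np.where(treated)[0]:
    if not comparators:
        break
    j = min(comparators, key=lambda k: abs(ps[i] - ps[k]))
    matches.append((i, j))
    comparators.remove(j)

# 3. Compare outcome risk in the matched cohorts. Because the true effect
#    is null, any remaining difference is residual bias -- exactly what
#    the negative controls are used to map.
t_risk = outcome[[i for i, _ in matches]].mean()
c_risk = outcome[[j for _, j in matches]].mean()
print(f"risk ratio after matching: {t_risk / c_risk:.2f}")
```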
Hope this helps!
From: Nigam Shah nigam@stanford.edu
Sent: Mon, Feb 2, 2015 at 6:45 PM
Is there anything to point at which shows how calibrating with a fixed exposure and multiple true-negative outcomes quantifies (or maps) the bias in the design? It seems like this way of calibrating could also obviate the need for propensity score matching.
From: Patrick Ryan [mailto:ryan@ohdsi.org]
Sent: Monday, February 02, 2015 7:30 PM
i started this thread, so this is my fault, but i think this is a good conversation for the Researchers Forum rather than keeping it private amongst ourselves.