My 2 cents re the potential differences between observed and unobserved treatment and why it matters.
To me it seems likely that systematic patient differences affect both whether patients are observed for three years and their treatment pattern. For example, at Kaiser, duration of observation for a given patient is a function of plan membership. Membership churn fluctuates over time and across Kaiser regions; for some years/regions it was around 20%.
In general, four things determined whether a patient stayed with Kaiser year-to-year and hence whether they met a multi-year observation inclusion criterion: 1) Individual's loss of employment; 2) Employer's contractual choice - whether they offered Kaiser as an option that year; 3) Individual's change of employer or relocation; 4) Mortality.
All four could cause systematic differences in both who was observed in the EHR over some multi-year interval and how their condition was treated.
Loss of employment affects the affordability of available care options. Change in care provider due to employer offerings or a change in job/location affects the character of care - e.g., Kaiser has a fairly systematic, medical-group-led approach to disseminating, monitoring, and supporting adherence to clinical practice guidelines for major chronic conditions. So the unobserved care that patients received after leaving Kaiser was likely to differ in character and, as a group, to be more varied than the care captured in the Kaiser EHR. And to the extent that patients die before meeting a minimum-duration inclusion criterion because of how they were treated, important differences in care patterns are excluded outright.
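To make that mechanism concrete, here is a deliberately crude toy simulation (every parameter and link function below is invented for illustration; nothing is estimated from Kaiser or OHDSI data). A single latent severity factor drives both treatment choice and year-to-year disenrollment, and the treated share among patients meeting a 3-year criterion ends up differing from the treated share in the full population:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Latent severity drives BOTH treatment choice and year-to-year
# disenrollment (job loss, plan change, mortality) in this toy model.
severity = rng.normal(size=n)

# Sicker patients are more likely to receive the intensive regimen.
intensive = rng.random(n) < 1 / (1 + np.exp(-severity))

# Annual probability of leaving the data source also rises with severity.
p_leave = 1 / (1 + np.exp(-(severity - 2.0)))

# A patient meets a 3-year continuous-observation criterion only if they
# stay in the plan all three years.
meets_3y = (rng.random((3, n)) > p_leave).all(axis=0)

print(f"Intensive share, full population:     {intensive.mean():.3f}")
print(f"Intensive share, observed >= 3 years: {intensive[meets_3y].mean():.3f}")
```

The size of the gap is meaningless here (the parameters are made up); the point is only that any factor tied to both observation duration and treatment will move the two numbers apart.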
Whether or not patients with chronic conditions go unobserved for similar reasons in other OHDSI data sources as they do in Kaiser's EHR, I think the potential-for-bias question is the same: Are there things that affect both A) whether patients are observed for a given duration and B) how they are treated when they are not observed?
My hunch is that the answer to that question is probably Yes for all OHDSI data source types. But whether my hunch is correct or not, it is a suspicion many will harbor, especially when analyzing care for a chronic condition where treatment length is expected to exceed the duration criterion. So, in my opinion, the arguments for defending a proposed lack of bias are worth sharpening. Even better would be a clever analytic strategy that accurately estimates the bias. I'm not sure that sensitivity analyses without a source of ground truth do that very well.
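For what it's worth, one blunt but transparent alternative to a conventional sensitivity analysis is a partial-identification (worst-case bound) framing: report what the estimate would be under an explicit range of assumptions about the unobserved patients. A minimal sketch, with purely hypothetical numbers:

```python
def bounded_estimate(p_observed, frac_observed, p_unobserved_range=(0.0, 1.0)):
    """Manski-style bounds on the population treatment proportion when
    the treatment rate in the unobserved fraction is only assumed to
    lie somewhere in p_unobserved_range."""
    lo_u, hi_u = p_unobserved_range
    frac_missing = 1.0 - frac_observed
    return (p_observed * frac_observed + lo_u * frac_missing,
            p_observed * frac_observed + hi_u * frac_missing)

# Hypothetical numbers: 55% of observed patients on a regimen, but only
# 60% of the starting cohort met the 3-year duration criterion.
print(bounded_estimate(0.55, 0.60))                # assumption-free: (0.33, 0.73)
print(bounded_estimate(0.55, 0.60, (0.40, 0.70)))  # plausible range: (0.49, 0.61)
```

The assumption-free bounds are usually too wide to be useful, which is exactly the point: any narrower claim rests on an assumption about the unobserved group, and this framing forces that assumption into the open.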
Though it may be a difficult problem, I think the soundness of OHDSI-based science will be greater the more able we are to characterize the factors that affect the completeness of samples and their potential to distort our inferences, whether the analytic aims are descriptive or hypothesis tests. The potential for bias to have important effects on the accuracy of how we characterize care, or on the validity of statistical hypothesis tests, seems very real to me. I think this is an issue that goes to the heart of questions about the research value of observational data sources in general, not just OHDSI. Developing best practices for addressing it as rigorously as possible would be an important contribution.
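As one modest first step toward that kind of best practice, a study could routinely fit and report a model of who meets the observation-duration criterion as a selection diagnostic. A minimal sketch, assuming scikit-learn and entirely simulated covariates (in a real study these would come from the CDM's person and observation_period tables):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 50_000

# Simulated stand-ins for baseline covariates one might pull from an
# OMOP CDM (age, comorbidity count, coverage type) - all invented here.
X = np.column_stack([
    rng.normal(55, 15, n),     # age
    rng.poisson(2, n),         # comorbidity count
    rng.integers(0, 2, n),     # 1 = commercial coverage, 0 = other
])

# Simulated label: does the patient have >= 3 years of continuous
# observation? The coefficients below are made up for the sketch.
logit = -0.5 + 0.01 * (X[:, 0] - 55) - 0.3 * X[:, 1] + 0.8 * X[:, 2]
meets_3y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression().fit(X, meets_3y)
print("log-odds per unit (age, comorbidities, coverage):",
      model.coef_[0].round(2))
```

Strong, clinically meaningful predictors of meeting the criterion would be a red flag that the "observed >= N years" cohort is not exchangeable with the full population.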