We’ve started working on the SCCS R package, but there are many questions still open.
In the original MSCCS design we added all drugs to the model and fitted it once; the (exponentiated) beta for each drug was its relative risk estimate. We could, however, fit the model once per drug and treat the drug of interest differently from the other drugs, which can be considered covariates. For example, we could define the risk window for the target drug to be only the length of exposure, but add x days to the end of exposures to other drugs. That would create a memory of drugs taken in the past, to be used in assessing the risk on a particular day. We could add different covariates for the same drug, with different persistence windows, and we could add conditions, procedures, visits, observations, etc. to the covariate list. We could specify different persistence windows for the different types of covariates (drugs, procedures, etc.), or even per concept. We can go nuts here, but should we?
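To make the risk-window idea concrete, here is a minimal Python sketch (not code from the SCCS package). The `Exposure` type, the function names, and the 30-day persistence value are all made up for illustration; dates are treated as inclusive.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Exposure(NamedTuple):
    # Hypothetical record of one drug exposure (inclusive dates).
    drug: str
    start: date
    end: date


def risk_window(exposure, target_drug, persistence_days):
    """Target drug: window is the exposure itself.
    Any other drug: window is extended by persistence_days past the end."""
    extra = 0 if exposure.drug == target_drug else persistence_days
    return exposure.start, exposure.end + timedelta(days=extra)


def active_covariates(exposures, day, target_drug, persistence_days):
    """Return the set of drugs whose risk window covers the given day."""
    active = set()
    for exposure in exposures:
        start, end = risk_window(exposure, target_drug, persistence_days)
        if start <= day <= end:
            active.add(exposure.drug)
    return active


# Two identical exposures; only drugB (a covariate) gets the 30-day memory.
exposures = [
    Exposure("drugA", date(2020, 1, 1), date(2020, 1, 10)),
    Exposure("drugB", date(2020, 1, 1), date(2020, 1, 10)),
]
during = active_covariates(exposures, date(2020, 1, 5), "drugA", 30)
after = active_covariates(exposures, date(2020, 1, 20), "drugA", 30)
```

Ten days after both exposures end, only `drugB` is still flagged, which is the "memory" effect described above. Per-type or per-concept persistence would just mean looking up `persistence_days` from a table instead of passing a single number.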
Another issue is picking up causal intermediaries. One example that comes to mind is when I was trying to fit a model for UGIB, and the drug with the highest coefficient was the drug you take before your esophagogastroduodenoscopy to confirm the UGIB diagnosis. Clearly that drug didn’t cause the UGIB, but it is masking whatever did. How do we detect these and remove them from the model?
Yet another issue is contraindication. If a drug is contraindicated for a particular outcome, it will appear to have an increased relative risk because the outcome will be less prevalent before initiation of the treatment (else the doctor would not be following guidelines). Some people remove a period of observation time just prior to treatment initiation to get around this (e.g. Tata et al.), but this doesn’t seem optimal.
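For reference, dropping a pre-exposure period could look roughly like the sketch below. This only illustrates the general idea, not the specific procedure used by the cited authors; the function name, the inclusive-date convention, and the 30-day value are assumptions.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def exclude_pre_exposure(obs_start, obs_end, treatment_start, washout_days):
    """Return the observation intervals (inclusive dates) that remain after
    dropping the washout_days immediately before treatment_start."""
    gap_start = treatment_start - timedelta(days=washout_days)
    intervals = []

    # Time before the removed gap, clamped to the observation period.
    pre_end = min(obs_end, gap_start - ONE_DAY)
    if pre_end >= obs_start:
        intervals.append((obs_start, pre_end))

    # Time from treatment initiation onwards, clamped the same way.
    post_start = max(obs_start, treatment_start)
    if post_start <= obs_end:
        intervals.append((post_start, obs_end))

    return intervals


# Hypothetical person observed for all of 2020, treated from June 1.
kept = exclude_pre_exposure(
    date(2020, 1, 1), date(2020, 12, 31), date(2020, 6, 1), 30
)
```

Here the 30 days from May 2 through May 31 are removed, leaving January 1 to May 1 and June 1 to December 31. Any outcome falling in the removed gap would have to be dropped as well, which is part of why this approach loses information.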
Lastly, there’s the issue of outcome occurrences affecting the probability of future occurrences and/or the end of observation. Some options here are censoring on the date of first occurrence, eliminating all but the first occurrence (but keeping all observation time), or modeling the dependency.
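The first two options differ only in what happens to the observation time after the first outcome. A small Python sketch, with hypothetical function names and inclusive dates (modeling the dependency is not attempted here):

```python
from datetime import date


def _outcomes_in_window(obs_start, obs_end, outcome_dates):
    # Keep only outcomes inside the observation period, earliest first.
    return sorted(d for d in outcome_dates if obs_start <= d <= obs_end)


def censor_at_first(obs_start, obs_end, outcome_dates):
    """Option 1: observation ends on the date of the first outcome."""
    outcomes = _outcomes_in_window(obs_start, obs_end, outcome_dates)
    if not outcomes:
        return (obs_start, obs_end), []
    return (obs_start, outcomes[0]), outcomes[:1]


def keep_first_only(obs_start, obs_end, outcome_dates):
    """Option 2: drop later outcomes but keep all observation time."""
    outcomes = _outcomes_in_window(obs_start, obs_end, outcome_dates)
    return (obs_start, obs_end), outcomes[:1]


# Hypothetical person with two outcomes during 2020.
outcome_dates = [date(2020, 7, 1), date(2020, 3, 1)]
censored = censor_at_first(date(2020, 1, 1), date(2020, 12, 31), outcome_dates)
first_only = keep_first_only(date(2020, 1, 1), date(2020, 12, 31), outcome_dates)
```

Both variants keep only the March 1 outcome; the censored version also cuts the observation period at that date, so exposures after March 1 no longer contribute person-time.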
I would hereby like to kick off the discussion around these topics. Feel free to add more, especially @msuchard, @tshaddox, @David_Madigan, and @Patrick_Ryan!