I’m very happy to see everyone participating in this discussion!
@jon_duke: Yes, I agree we shouldn't reinvent the wheel, and should steal from existing lists where we agree with them. I also agree with @Patrick_Ryan's point that we should practice what we preach for some time before we make our recommendations 'official'.
@bcs: I was thinking we could have general recommendations that apply to all types of studies, and specific recommendations for specific types of studies. I dislike the detection-refinement-confirmation classification (but that is another discussion); it seems to make more sense to classify by study design (cohort method, SCCS, etc.).
We already have a demonstration of using negative controls in a CER setting, as you can see in this CohortMethod vignette. And yes, I think all studies should use negative controls.
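For anyone who hasn't worked through the vignette yet, here is a minimal, self-contained sketch of the idea (not the actual CohortMethod code; the effect estimates and the number of negative controls are made up for illustration): estimate effects for outcomes we believe are not caused by the exposure, and check whether the estimates behave as expected under the null.

```r
# Sketch of the negative-control idea (illustrative data only).
# In a real study the log effect estimates and standard errors would
# come from the same analysis pipeline as the outcome of interest.

set.seed(123)

# Hypothetical estimates for 25 negative control outcomes: the true
# effect is 1 (logRr = 0), but residual bias shifts the estimates.
negControls <- data.frame(
  logRr   = rnorm(25, mean = 0.1, sd = 0.15),  # slight systematic bias
  seLogRr = runif(25, min = 0.05, max = 0.2)
)

# Fraction of negative controls with p < 0.05; this should be close to
# 5% if the analysis has nominal operating characteristics.
p <- 2 * pnorm(-abs(negControls$logRr / negControls$seLogRr))
mean(p < 0.05)
```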
The reason I mentioned CohortMethod as a 'proven and tested method' is that it uses various mechanisms for validation, including unit tests of both the package itself and the Cyclops package it depends on. (It is far from complete, though; writing unit tests is hard work.)
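To make the unit-testing point concrete, here is a trivial testthat-style example. The function being tested is hypothetical, not taken from CohortMethod; the point is only to show how small a useful test can be:

```r
library(testthat)

# Hypothetical helper we might want to validate: recover the standard
# error on the log scale from a 95% confidence interval.
computeSeLogRr <- function(ci95Lb, ci95Ub) {
  (log(ci95Ub) - log(ci95Lb)) / (2 * qnorm(0.975))
}

test_that("SE is recovered from a symmetric CI on the log scale", {
  se <- 0.1
  logRr <- 0.5
  lb <- exp(logRr - qnorm(0.975) * se)
  ub <- exp(logRr + qnorm(0.975) * se)
  expect_equal(computeSeLogRr(lb, ub), se)
})
```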
Maybe we should first define some overall principles that we think are important, and turn these into concrete ‘rules’ later on. I would say the overall principles are:
- Transparency: others should be able to reproduce your study in every detail.
- Be explicit up front about what you want to measure and how: this avoids hidden multiple testing (fishing expeditions, p-value hacking).
- (Empirical) validation of your analysis: you should have evidence that your analysis does what you say it does, for example by showing that the statistics produced have nominal operating characteristics (e.g. p-value calibration; see the sketch after this list), showing that important assumptions are met (e.g. covariate balance), and using unit tests to validate pieces of code.
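As a concrete illustration of the p-value calibration point: the EmpiricalCalibration package implements the real method, but the core idea can be sketched in a few lines. This is deliberately simplified (it fits the empirical null with moment estimates and ignores the uncertainty in the fitted null, both of which the real method handles properly), and all numbers are made up:

```r
# Simplified sketch of p-value calibration using an empirical null
# distribution estimated from negative controls.

set.seed(42)

# Estimates for 30 negative controls (true effect = 1, plus systematic error):
ncLogRr   <- rnorm(30, mean = 0.15, sd = 0.1)
ncSeLogRr <- rep(0.05, 30)

# Fit a crude empirical null: mean and SD of the systematic error.
# (The real method uses maximum likelihood, accounting for ncSeLogRr.)
mu    <- mean(ncLogRr)
sigma <- sqrt(max(var(ncLogRr) - mean(ncSeLogRr^2), 0))

# Calibrate the p-value for the outcome of interest:
logRr   <- 0.3
seLogRr <- 0.1
traditionalP <- 2 * pnorm(-abs(logRr / seLogRr))
calibratedP  <- 2 * pnorm(-abs((logRr - mu) / sqrt(sigma^2 + seLogRr^2)))
c(traditional = traditionalP, calibrated = calibratedP)
```

The calibrated p-value asks whether the estimate stands out from what we observe for the negative controls, rather than from a theoretical null that assumes no systematic error.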