Standard table 1

I like the topic, and I argued for reporting some basic dataset characteristics in this forum reply in the Alendronate study thread.

I created an R function (and file) that can be added to OHDSI studies so that the exported .zip file includes some minimum data about a dataset. To my disappointment, I did not get an email reply from Marc when I emailed him the .R file with the function (the patch he suggested).

The generation of some data about a dataset is described on GitHub, in the DataQuality package ([readme section here](https://github.com/OHDSI/StudyProtocolSandbox/tree/master/DataQuality#1generate-miad-minumum-information-about-a-dataset)).

The Kahn paper argues for table 1a (not table 1). I think OHDSI (and similar) studies are unique and novel in their use of multiple datasets, and readers deserve to be told more about the datasets that were included. There are two levels to describe: the aggregated dataset and the individual datasets. Both are important. Science is enhanced by contrasting results across datasets (as seen in the treatment pathway study).
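To make the two levels concrete, here is a rough sketch in Python, purely for illustration. It is not the R function I mentioned and not part of the DataQuality package; the dataset names, the fields (`persons`, `female`), and the counts are all made up. It assumes each site shares only aggregate counts, and it builds one row per individual dataset plus one row for the aggregated population.

```python
def describe_datasets(counts):
    """Return one summary row per dataset, plus a final row for the pooled total.

    `counts` maps a dataset name to a dict with hypothetical keys
    'persons' and 'female' (both plain integer counts).
    """
    rows = []
    total_persons = 0
    total_female = 0
    for name, c in counts.items():
        rows.append({
            "dataset": name,
            "persons": c["persons"],
            "pct_female": round(100 * c["female"] / c["persons"], 1),
        })
        total_persons += c["persons"]
        total_female += c["female"]
    # The aggregated row is derived from the per-dataset counts only,
    # so no patient-level data has to leave a site.
    rows.append({
        "dataset": "ALL (aggregated)",
        "persons": total_persons,
        "pct_female": round(100 * total_female / total_persons, 1),
    })
    return rows


# Made-up toy counts, not taken from any real dataset.
toy_counts = {
    "site_A": {"persons": 200, "female": 120},
    "site_B": {"persons": 300, "female": 120},
}

rows = describe_datasets(toy_counts)
for row in rows:
    print(row)
```

With these toy numbers the aggregated row alone would hide the fact that the two sites differ noticeably in composition, which is the point of reporting both levels.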

Instead of replicating table 1 and having a deep academic discussion about what goes in or out (which is valid), I think there is an additional problem that is unique to multi-dataset (“OHDSI-like”) studies: we need a discussion of a table “1a” (or even a “1b” that goes beyond the data quality covered in 1a) that communicates the composition of the datasets that together comprise the “aggregated” population.