Our claims data (recently converted to OMOP) spans only 365 days, and the study requires at least 365+90 days, so we cannot participate.
Before the study is executed, I would like to suggest that this OHDSI study could be the first to present data on data quality in addition to Table 1 (overview of the population).
A paper by Kahn et al. argued for a Table 1a that would present data quality information for observational studies.
See it here.
In terms of metadata, the current code only retrieves the contents of the cdm_source table (excerpt):
createMetaData <- function(connectionDetails, cdmDatabaseSchema, exportFolder) {
  conn <- DatabaseConnector::connect(connectionDetails)
  sql <- "SELECT * FROM @cdm_database_schema.cdm_source"
Perhaps this could be extended before final execution. Since I am the one suggesting it, I would be happy to contribute to the coding.
The OHDSI DataQuality package (StudyProtocolSandbox/DataQuality at master · OHDSI/StudyProtocolSandbox · GitHub) tries to reduce the number of attributes about a dataset that a site would share: instead of all Achilles results data, it uses a much reduced subset of parameters that is vastly less “sensitive”.
This could also be just a tiny handful of parameters: (1) size of the dataset population, as a size category rather than an exact size (perhaps counting just patients with at least 365+90 days in at least one observation period); (2) “level of inpatient-ness” (% of outpatient visits out of all visits recorded); (3) level of claims-ness vs. EHR-ness (e.g., % of patients with weight recorded); and some temporal span measures. I recently made a proposal to the Data Quality Collaborative for MIAD (minimum information about a dataset), a concept similar to MIAME (minimum information about a microarray experiment).
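
To make the handful of parameters above more concrete, here is a minimal sketch in base R that computes them from data frames shaped like the OMOP tables observation_period, visit_occurrence and measurement. It is only an illustration under stated assumptions, not code from the study package: the function name summarizeDataset, the size bins, and the default concept IDs (9202 for outpatient visit, 3025315 for body weight) are assumptions that each site should check against its own vocabulary.

```r
# Hypothetical helper (not part of any OHDSI package): derive a few coarse,
# low-sensitivity descriptors from data frames shaped like OMOP CDM tables.
summarizeDataset <- function(observationPeriod, visitOccurrence, measurement,
                             minDays = 365 + 90,
                             outpatientConceptId = 9202,   # assumed concept ID
                             weightConceptId = 3025315) {  # assumed concept ID
  startDates <- as.Date(observationPeriod$observation_period_start_date)
  endDates <- as.Date(observationPeriod$observation_period_end_date)

  # Patients with at least minDays in at least one observation period
  periodDays <- as.numeric(endDates - startDates)
  eligible <- unique(observationPeriod$person_id[periodDays >= minDays])
  allPersons <- unique(observationPeriod$person_id)

  # (1) Report a size category instead of the exact count (bins are arbitrary)
  breaks <- c(0, 1e3, 1e4, 1e5, 1e6, 1e7, Inf)
  labels <- c("<1k", "1k-10k", "10k-100k", "100k-1M", "1M-10M", ">=10M")
  sizeCategory <- as.character(cut(length(eligible), breaks = breaks,
                                   labels = labels, right = FALSE))

  # (2) Percentage of outpatient visits out of all recorded visits
  pctOutpatient <- 100 * mean(visitOccurrence$visit_concept_id == outpatientConceptId)

  # (3) Percentage of patients with at least one weight measurement
  hasWeight <- measurement$measurement_concept_id == weightConceptId
  personsWithWeight <- intersect(unique(measurement$person_id[hasWeight]), allPersons)
  pctWithWeight <- 100 * length(personsWithWeight) / length(allPersons)

  # Temporal span: days between earliest start and latest end of observation
  spanDays <- as.numeric(max(endDates) - min(startDates))

  data.frame(size_category = sizeCategory,
             pct_outpatient_visits = pctOutpatient,
             pct_patients_with_weight = pctWithWeight,
             observation_span_days = spanDays,
             stringsAsFactors = FALSE)
}

# Toy data, for illustration only
observationPeriod <- data.frame(
  person_id = c(1, 2, 3),
  observation_period_start_date = c("2010-01-01", "2010-01-01", "2012-01-01"),
  observation_period_end_date = c("2011-12-31", "2010-12-31", "2013-06-30"),
  stringsAsFactors = FALSE)
visitOccurrence <- data.frame(person_id = c(1, 1, 2, 3),
                              visit_concept_id = c(9202, 9202, 9201, 9202))
measurement <- data.frame(person_id = 1, measurement_concept_id = 3025315)

result <- summarizeDataset(observationPeriod, visitOccurrence, measurement)
print(result)
```

On a real CDM the same quantities would more likely be computed in SQL on the database side and exported next to the cdm_source metadata; the in-memory version here is only meant to pin down the definitions.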