I would like to continue the metadata discussion at the upcoming CDM WG call.
I created a modified table proposal that may address some of the points raised during the last discussion.
The key is not to confuse metadata with data characterization as done by Achilles (the achilles_results table). An ETL or data warehouse insider knows a lot about a warehouse, and the point of metadata is to capture some of this "insider" knowledge so that a user (or analyst) can quickly get "semi-intimate" with the data just by reading some smart, organized notes made by the insider.
Perhaps we can propose a shell and let the community decide how best to use it to capture some useful metadata content, and then in phase 2 make the metadata tighter and better. The perfect should not be the enemy of the good in phase 1.
Perhaps every WG member can provide examples of metadata that they would like to capture (and post here).
Mine would be:
- dataset is updated once a year (or monthly or ...)
- dataset reflects only data from clinical trial (not routine care)
- Achilles is executed after each data refresh. achilles_results are always available
- dataset has drug order data as well as pharmacy dispensation data (can study 'patient did not fill his prescription' questions)
- weight data comes from Health Risk Assessment done by health plan (not from EHR)
- PHR data is present (in OBSERVATION table) but not mapped to any standard concepts and there are no plans to do this mapping
- procedure data are in local codes only (any phenotype using procedures has to tweak the standard code in the phenotype to the local code)
- dataset has EHR data only (and no claims data; site has no affiliated health plan)
- dataset has claims data with "sparse lab data" (e.g., only results that come from an accessible source, such as LabCorp). Such data does not reflect all lab data; inpatient lab results are not available.
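To make the "shell" idea concrete, here is a minimal sketch of how notes like the ones above could be held as simple records and looked up by an analyst. It is only an illustration: the `MetadataNote` structure, its field names (`topic`, `note`, `cdm_table`), and the helper functions are my own assumptions and are not taken from the table proposal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MetadataNote:
    """One free-text insider note. All field names are hypothetical."""
    topic: str                       # short category, e.g. "refresh"
    note: str                        # the note itself, written by the insider
    cdm_table: Optional[str] = None  # table the note refers to, if any


# A few of the example notes from the list above, as records.
NOTES: List[MetadataNote] = [
    MetadataNote("refresh", "Dataset is updated once a year."),
    MetadataNote("source", "Dataset has EHR data only; no claims data."),
    MetadataNote(
        "mapping",
        "PHR data is present but not mapped to any standard concepts.",
        "OBSERVATION",
    ),
    MetadataNote(
        "characterization",
        "Achilles is executed after each data refresh.",
        "achilles_results",
    ),
]


def notes_for_table(notes: List[MetadataNote], table: str) -> List[MetadataNote]:
    """Return the notes attached to a given table (case-insensitive)."""
    wanted = table.lower()
    return [
        n for n in notes
        if n.cdm_table is not None and n.cdm_table.lower() == wanted
    ]


def notes_for_topic(notes: List[MetadataNote], topic: str) -> List[MetadataNote]:
    """Return the notes filed under a given topic."""
    return [n for n in notes if n.topic == topic]


if __name__ == "__main__":
    for n in notes_for_table(NOTES, "observation"):
        print(f"[{n.topic}] {n.note}")
```

The notes stay free text on purpose, in the spirit of phase 1; a tighter, coded structure could come in phase 2.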
UPDATE: after the April CDM WG meeting, the proposal was updated with phase 1 and phase 2 scope, and the use cases were updated.