All, CDM v6.0 was released over two years ago and it seems like adoption is still low. One of the main drivers is that the tools we all use to define our cohorts and run our studies do not yet support this version. For the record, this is not meant to shame our wonderful developers. They have been working extremely hard to add new features to ATLAS, CohortMethod, PatientLevelPrediction, etc. This is simply meant to point out that something needs to be done to help move v6.0 forward. I spoke with @schuemie along with the other members of the HADES workgroup to understand what we can do as a community to increase support of this CDM version.
As a reminder, one of the biggest changes from v5.3.1 to v6.0 made all *_DATE values optional and all *_DATETIME values required. This was part 2 of a three-part effort to eventually remove the date values in favor of the datetime values. As Martijn pointed out, many data partners do not have time-of-day precision in their data, so the time must be imputed. The current convention is to set the time to midnight (00:00:00) if the time is unknown. This then presents a problem: how do we know when a time was assigned the value of midnight vs. when an event actually occurred at midnight? This question seems to be the biggest barrier to wider adoption, and I talked through some solutions with both the HADES group and the CDM working group. I am eager to have the community weigh in on this, so please read through the options and comment/discuss below.
1. Add a precision column for each datetime value. With this option we would continue on our path to eventual deprecation of the dates and add a precision column for each datetime, indicating the finest level of detail given in the source data (day, hour, minute, second). We would need to think about how the precision column affects our current coding practices. However, both @msuchard and @hripcsa mentioned the use of a datetime precision column in other projects, and this seems to be the way many other groups are handling this sort of uncertainty.
2. Revert *_DATE fields back to required for now. This option is the lowest barrier to entry to get the current tools working on v6.0. The plan, as discussed in the CDM workgroup meeting, would be to revert the requirements for now to allow our backlog to catch up. Many people have been waiting for the oncology tables, bugfixes, and updates, and this would give us the time to deliver those while still allowing use of the tools as-is.
3. Some combination of 1 and 2. This option would start with #2 as a way to clear out our backlog. We would then write up a formal proposal for #1 and go through the proper process to get it approved. The idea here is to give everyone some time to work up to #1, as we are not yet clear on how much rework such an implementation would require.
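
To make the midnight ambiguity and the precision-column idea from #1 a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: `Precision`, `EventTime`, `from_source`, and `has_known_time` are hypothetical names, not part of the CDM or any HADES package, and the actual column names and allowed values would be defined in the formal proposal.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from enum import Enum
from typing import Optional


class Precision(Enum):
    """Hypothetical precision levels, mirroring (day, hour, minute, second)."""
    DAY = "day"
    HOUR = "hour"
    MINUTE = "minute"
    SECOND = "second"


@dataclass
class EventTime:
    """A datetime value paired with a hypothetical precision column."""
    event_datetime: datetime
    precision: Precision


def from_source(
    source_date: date,
    source_time: Optional[time] = None,
    precision: Precision = Precision.SECOND,
) -> EventTime:
    """Build a datetime from source data, imputing midnight when time is unknown."""
    if source_time is None:
        # Current convention: unknown time is set to midnight (00:00:00).
        # The precision value records that only the day is actually known.
        return EventTime(datetime.combine(source_date, time(0, 0, 0)), Precision.DAY)
    return EventTime(datetime.combine(source_date, source_time), precision)


def has_known_time(event: EventTime) -> bool:
    """True when the time of day came from the source rather than imputation."""
    return event.precision is not Precision.DAY


# Same stored datetime, different meaning:
imputed = from_source(date(2020, 1, 15))                    # time unknown
observed = from_source(date(2020, 1, 15), time(0, 0, 0))    # really at midnight
```

Here `imputed` and `observed` carry identical datetime values, and only the precision value tells them apart. Analytic code that depends on time of day could then filter on something like `has_known_time` and fall back to date-level logic otherwise.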
We are interested in your feedback. This decision of course affects everyone, and we in the CDM workgroup want to make sure we take everyone’s thoughts into account before settling on a decision.
@Patrick_Ryan, @Rijnbeek, @krfeeney, @ericaVoss, @MaximMoinat