Thanks, @hripcsa. Made one minor edit: person_time --> birth_time.
Re dropping dates, we need to hear from the coders what it would take. My sense is that we are talking about a year of work to get rid of the _DATE fields rather than a day of work. Most things we do touch those fields and they would all change. Software, tools, front ends, phenotype definitions, ETL, etc.
George
I just wanted to articulate the metadata aspect, and @bailey has done it for me, thank you
At Montefiore, we had different precision of procedure date/time coming from two EMRs: ambulatory procedures were recorded as 2016/10/01 00:00 because they were timestamped at midnight of the outpatient visit date, while the other EMR had the actual time stamp. Now, as I focus on the periop domain, surgical procedure time stamps are critical to analyzing effectiveness of robotic procedures and so many other aspects. So, I'd have to throw away Monte data when doing this type of comparative effectiveness research (this use case was brought up by someone else at the CDM tutorial training run). I hope these examples address @Christian_Reich's and @Patrick_Ryan's concerns about applicability.
My great concern, for years, has been about "00:00" time, indicating missing time and occurring in the majority of databases I have worked with. Does it mean midnight or missing time? In cases where time truly matters (LOS, procedure duration, time between drug administration and an event or vice versa, etc.), knowing about missing time matters.
Date/time precision is an addition to the proposal of having combined Date/time fields and conventions for representing missingness. When precision is not important, ignore precision field and run queries w/o case statements. When precision and/or time matters, utilize these fields. I disagree with @bailey regarding placement of these fields to a metadata table, because it would impact both query performance and ETL.
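To illustrate the idea, here is a minimal sketch in Python, not a proposed implementation. The precision codes and the function name are hypothetical and do not come from any CDM release. The sketch assumes that each datetime carries a precision value next to it, so that "00:00" with day precision means "time unknown" while "00:00" with second precision means a real midnight.

```python
from datetime import datetime
from typing import Optional

# Hypothetical precision codes (an assumption, not part of any CDM release).
PRECISION_DAY = "day"        # only the calendar date is known
PRECISION_SECOND = "second"  # the time of day is a real measurement


def duration_minutes(start: datetime, start_precision: str,
                     end: datetime, end_precision: str) -> Optional[float]:
    """Return the duration in minutes, or None if either time of day is unknown."""
    if PRECISION_DAY in (start_precision, end_precision):
        return None
    return (end - start).total_seconds() / 60.0


# Ambulatory record stamped at midnight of the visit date: time is missing.
ambulatory = duration_minutes(datetime(2016, 10, 1, 0, 0), PRECISION_DAY,
                              datetime(2016, 10, 1, 0, 0), PRECISION_DAY)

# Surgical record with actual time stamps.
surgical = duration_minutes(datetime(2016, 10, 1, 8, 15), PRECISION_SECOND,
                            datetime(2016, 10, 1, 10, 45), PRECISION_SECOND)

# A procedure that really did start at midnight is still usable.
true_midnight = duration_minutes(datetime(2016, 10, 1, 0, 0), PRECISION_SECOND,
                                 datetime(2016, 10, 1, 0, 30), PRECISION_SECOND)
```

Queries that do not care about time of day would simply never look at the precision values.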
I understand that the tools will have to be "trained" to recognize these fields and it may be a longer term solution that we could address in the next CDM release. However, I know that additions of these metadata will significantly extend CDM in its ability to
a. handle variety of temporal data
b. remove ambiguity and imprecision of interpreting temporal data
c. integrate data from multiple data sources
d. streamline handling of temporal data in the software tools, especially those great ones that build query language around temporal events
Thank you for your attention to this long statement and important matter.
@hripcsa It'll be important both to hear from "core" coders what the impact of changing from date to datetime would be on OHDSI-developed tools, and to make some guess at how much breakage there will be in individual users' code. The latter may be more important in the short term, since I expect some people don't have resources to fix things easily. So I'm not trying to minimize the burden of an incompatible change, just trying to balance that with the burden of redundancy in the long term. Is there a history in the group of deprecating fields that would inform the discussion?
@rimma I could be persuaded either way on the metadata. For me, the key question is how often it'll vary within a table. I'm not sure either has an advantage in query performance: using a metadata table forces a join when you want to use it; putting it in the fact table makes every table scan longer. It's all a question of usage pattern.
Perhaps Chris, Patrick, and others who are in the middle of it can comment. My sense is that this is like a $1,000,000 or $500,000 investment to make a sudden move from dates to timestamps for all the coding and six months to a year moratorium on other work, but I may be overestimating it.
A more practical solution is to add timestamp and then over time remove date. But it is still an open question of whether we should eliminate date or not.
@rimma's proposal is additive to the timestamp proposal and could be added now or in the future. I don't think it is worth chasing how to encode narrative data, for example, because it is too complex. Sometimes you know the season but not the year or you know the day of week but not the week. And sometimes you want to store the variance of the time (how well you know it to be true) which is different from granularity. And then in practice, you discard all that and just use the timestamp anyway. The main cost of the proposal is disk space, speed associated with the disk space, and complexity to adopters.
There is pressure right now to add timestamp to the key patient tables.
George
True. I missed that. Done.
Still thinking through the proposals.
OMOP and OHDSI's principle for the CDM has been to include only those things that are provably and significantly useful for research, as opposed to representing everything there is to know about the data. To take an example, PCORnet has many ways of saying I don't know something, whereas OMOP has one. I don't think anyone builds practical cohorts based on how someone said I don't know, and I don't think real EHRs and payer data sets encode different versions of "I don't know" in any reasonable way.
I think adding a timestamp qualifies as significantly useful. You just can't do inpatient or ICU research without granularity finer than day.
For storing granularity, I think it probably needs more work at this point. If we make it required, then I think most will pick a default of days or seconds and misrepresent what is actually stored. If we make it optional, then the few who use it won't be able to share their definitions on the network. The semantics are complex and not implemented in RDBMSs. E.g., if "what was the age at an event" subtracts two month granularities, you end up with a 2-month granularity or a probability distribution. I would probably simplify and just cut it back to a month, but still there is a lot to implement. If we store granularity and don't use it, then I think the cost is complexity and space. I think adding it later won't detract from the proposal.
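The age example can be made concrete with a rough sketch in Python. The function names are hypothetical. It assumes both the birth and the event are known only to the month and treats each as an interval covering that whole month, so the age comes out as a range rather than a single number.

```python
import calendar
from datetime import date
from typing import Tuple


def month_bounds(year: int, month: int) -> Tuple[date, date]:
    """First and last calendar day of a month known only at month granularity."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def age_range_days(birth_ym: Tuple[int, int],
                   event_ym: Tuple[int, int]) -> Tuple[int, int]:
    """Smallest and largest possible age in days at the event."""
    birth_lo, birth_hi = month_bounds(*birth_ym)
    event_lo, event_hi = month_bounds(*event_ym)
    youngest = (event_lo - birth_hi).days  # born as late, event as early as possible
    oldest = (event_hi - birth_lo).days    # born as early, event as late as possible
    return youngest, oldest


# Born in January 2000, event in March 2000, both at month granularity.
youngest, oldest = age_range_days((2000, 1), (2000, 3))
spread = oldest - youngest  # width of the uncertainty, roughly two months
```

Here the possible age runs from 30 to 90 days, a spread of 60 days, which is the 2-month granularity mentioned above. Turning that range into a probability distribution, or cutting it back to a month, is a further design choice the sketch does not make.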
George
I have updated the page.
http://www.ohdsi.org/web/wiki/doku.php?id=documentation:next_cdm:time
George
+1!
we use timestamps for operations research and UX, as well as outcomes research in single LOS or episode.
Sorry for the newbie question…
How do I find out what was determined with this topic? I reviewed http://www.ohdsi.org/web/wiki/doku.php?id=documentation:next_cdm:time but did not seem to find a definitive answer.
In our group we are having some questions asked around requirements I created vs what should be developed.
Example…
Drug_Exposure_Start_Date
- Requirements = ONLY date MM/DD/YYYY.
- Our DEV team wants to see if (MM/DD/YYYY HH:MM:SS) format should still be allowed in Drug_Exposure_Start_Date
Hi. On the date question, see the documentation for CDM version 5.1.0 (http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm). drug_exposure_start_date is only a date and is required. drug_exposure_start_datetime is only a datetime (timestamp) and is optional.
George
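For a DEV team checking requirements, the rule above might be sketched in Python as follows. The class name is hypothetical, and the check that the optional datetime falls on the same calendar day as the required date is an assumption for illustration, not something the CDM documentation is quoted as stating here.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class DrugExposureDates:  # hypothetical helper, not part of any OHDSI tool
    drug_exposure_start_date: date                           # required, date only
    drug_exposure_start_datetime: Optional[datetime] = None  # optional timestamp

    def __post_init__(self) -> None:
        d = self.drug_exposure_start_date
        # datetime is a subclass of date in Python, so reject it explicitly.
        if isinstance(d, datetime) or not isinstance(d, date):
            raise TypeError("drug_exposure_start_date must be a date with no time part")
        dt = self.drug_exposure_start_datetime
        # Assumed consistency rule: the timestamp, if given, matches the date.
        if dt is not None and dt.date() != d:
            raise ValueError("drug_exposure_start_datetime does not match the date")


# Date only: valid.
date_only = DrugExposureDates(date(2017, 1, 5))

# Date plus the optional timestamp: valid.
with_time = DrugExposureDates(date(2017, 1, 5), datetime(2017, 1, 5, 8, 30, 0))
```

Passing a value with a time part as drug_exposure_start_date raises a TypeError in this sketch; the time belongs in drug_exposure_start_datetime.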