Handling Inconsistent Discharge Dates in Visit Occurrence Table

G-Accad · October 10, 2024, 3:29pm

Hello everyone,

I am currently working with the Visit Occurrence table using EHR inpatient data (NHS-APC). In this context, visit_start_date corresponds to the admission date and visit_end_date to the discharge date. However, I’ve encountered a significant issue: around 11% of the records have a discharge date earlier than the admission date.
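
For readers who want to quantify this kind of problem in their own data, here is a minimal Python sketch. It assumes the visits are available as dictionaries with `visit_start_date` and `visit_end_date` held as `datetime.date` values. The sample rows and the helper name `find_inverted_visits` are made up for illustration and are not part of the original post.

```python
from datetime import date


def find_inverted_visits(visits):
    """Return the visits whose end date is earlier than their start date."""
    return [v for v in visits if v["visit_end_date"] < v["visit_start_date"]]


# Illustrative rows only, not real NHS-APC data.
visits = [
    {"visit_occurrence_id": 1, "visit_start_date": date(2020, 3, 1), "visit_end_date": date(2020, 3, 4)},
    {"visit_occurrence_id": 2, "visit_start_date": date(2020, 5, 9), "visit_end_date": date(2020, 5, 2)},
    {"visit_occurrence_id": 3, "visit_start_date": date(2021, 1, 15), "visit_end_date": date(2021, 1, 15)},
    {"visit_occurrence_id": 4, "visit_start_date": date(2021, 7, 2), "visit_end_date": date(2021, 6, 30)},
]

inverted = find_inverted_visits(visits)
share = len(inverted) / len(visits)
print(f"{len(inverted)} of {len(visits)} visits ({share:.0%}) end before they start")
```

Same-day visits (start equal to end) are deliberately not flagged, since only an end date strictly before the start date is inconsistent.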

I’m seeking advice on how to handle these inconsistencies. Would it be best to remove the affected rows entirely, given the data integrity concerns? Or would it be more appropriate to replace the incorrect discharge dates with the corresponding admission dates?

Any guidance or suggestions would be greatly appreciated.

Thank you!

MPhilofsky · October 10, 2024, 11:03pm

Hello @G-Accad and welcome to OHDSI!

Do you have access to the source data? Are you able to do chart review? If yes, then I would take 20 random records and review the charts. Did these visits actually happen? Is the start or end date correct? Do these 20 random records all have the same issue (i.e. the same default end date)? etc. If you can create a reliable transformation rule that reflects what actually happened, then you can apply it in your ETL. If you answer no to any of these questions or you aren’t sure, don’t bring these visits into the OMOP CDM. The only way to produce reliable RWE is to use reliable RWD.
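
The sampling step could be sketched in Python roughly as follows. This is only an illustration under assumptions: the inconsistent records are already collected in a list of dictionaries, the helper names `sample_for_review` and `end_date_counts` are hypothetical, and the placeholder end date in the sample data is invented.

```python
import random
from collections import Counter
from datetime import date


def sample_for_review(bad_visits, n=20, seed=0):
    """Pick up to n records at random for manual chart review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return rng.sample(bad_visits, min(n, len(bad_visits)))


def end_date_counts(records):
    """Count how often each end date occurs, to expose a shared default value."""
    return Counter(r["visit_end_date"] for r in records)


# Illustrative inconsistent records with an invented placeholder end date.
bad_visits = [
    {
        "visit_occurrence_id": i,
        "visit_start_date": date(2020, 1, 1 + i % 28),
        "visit_end_date": date(1900, 1, 1),
    }
    for i in range(50)
]

review_sample = sample_for_review(bad_visits)
counts = end_date_counts(review_sample)
print(counts.most_common(3))
```

If a single end date dominates the counts, that suggests a default or placeholder value rather than a data-entry error in individual records. The counts do not replace the chart review itself, which is what confirms whether the visits happened.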

G-Accad · October 15, 2024, 3:30pm

Thank you, Melanie!

Your advice was very helpful. It looks like all the affected end dates share the same year, 1800, which I believe represents “N/A” in NHS data. For the rows where we can’t determine the end date, would you recommend dropping them?

Thanks again!

MPhilofsky · October 15, 2024, 6:10pm

You’re welcome, @G-Accad!

When in doubt, throw them out!

You need to dig deeper. Are these real visits? Did a person have an interaction with a healthcare provider? If yes, set the visit end date equal to the start date. If not, drop the record.
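
As a rough sketch, the rule above could look like this in Python. It assumes the placeholder end dates are the ones with year 1800, as reported earlier in the thread, and it assumes you already have some way of deciding whether a visit is real. That decision is represented here by a hypothetical `is_real_visit` callback, and the `confirmed_ids` set is made-up example data.

```python
from datetime import date

SENTINEL_YEAR = 1800  # year reported in this thread as the apparent "N/A" marker


def resolve_end_date(visit, is_real_visit):
    """Return a cleaned copy of the visit, or None if it should be dropped."""
    if visit["visit_end_date"].year != SENTINEL_YEAR:
        return dict(visit)  # end date is not a placeholder, keep as-is
    if is_real_visit(visit):
        # Real encounter with an unknown end date: use the start date.
        return {**visit, "visit_end_date": visit["visit_start_date"]}
    return None  # cannot confirm the visit happened: drop it


# Made-up example: visit 1 is confirmed by review, visit 2 is not.
confirmed_ids = {1}
visits = [
    {"visit_occurrence_id": 1, "visit_start_date": date(2020, 5, 9), "visit_end_date": date(1800, 1, 1)},
    {"visit_occurrence_id": 2, "visit_start_date": date(2020, 6, 1), "visit_end_date": date(1800, 1, 1)},
    {"visit_occurrence_id": 3, "visit_start_date": date(2020, 7, 1), "visit_end_date": date(2020, 7, 3)},
]

cleaned = []
for v in visits:
    resolved = resolve_end_date(v, lambda x: x["visit_occurrence_id"] in confirmed_ids)
    if resolved is not None:
        cleaned.append(resolved)

print([(v["visit_occurrence_id"], v["visit_end_date"]) for v in cleaned])
```

Returning a copy rather than editing the record in place keeps the source rows intact, so the transformation can be audited or re-run with a different rule.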