I’m working on bringing the whole of Estonia’s EHR documents (yes!) into the OMOP model, and the same issue strikes…
In short, for approximately 1% of cases the patient’s birth year is not recorded in our EHR, because recording it is not mandatory.
Unfortunately, birth year is mandatory in OMOP. What concerns me most is the recommendation “For data sources where the year of birth is not available, the approximate year of birth is derived based on any age group categorization available.” (https://github.com/OHDSI/CommonDataModel/wiki/PERSON)
I think this suggestion is meant for data sources whose structure lacks birth years completely, but in my view making assumptions about the birth year is not good practice anyway.
However, a missing birth year does not necessarily indicate poor data quality. For several research questions (e.g. estimating healthcare costs for a nation) it is not a necessary parameter. The same goes for gender. I’d prefer a missing birth year to an “estimated year of birth”.
Thus, from my point of view, data quality always depends on the question we are trying to answer. Sometimes it is important to have the birth year, but sometimes it is more important to have complete data (of the nation) with the costs, even without exact birth years or genders.
I would be very concerned about dropping 1% of the data just because the birth year is a mandatory field. Any suggestions on how to solve this?