The LOCATION table has all these mostly US-derived fields, but in essence it is intended to let you stratify by location. So, each location record should be some reasonable geographic aggregation of patients or providers. In other words, you would write something like group by location_id. If you put individual addresses in there, apart from making the data very sensitive because of identifiable information, it will kill that idea.
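To make the stratification idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The cut-down person table (only person_id and location_id) and the sample rows are assumptions made up for illustration, not a real CDM instance.

```python
import sqlite3


def count_by_location(rows):
    """Count persons per location_id with a plain GROUP BY.

    `rows` is a list of (person_id, location_id) tuples (made-up sample data).
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical, cut-down person table: just the two columns needed here.
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, location_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO person (person_id, location_id) VALUES (?, ?)", rows
    )
    result = conn.execute(
        "SELECT location_id, COUNT(*) FROM person "
        "GROUP BY location_id ORDER BY location_id"
    ).fetchall()
    conn.close()
    return result


# Three persons share location 1, one person is in location 2.
sample = [(1, 1), (2, 1), (3, 1), (4, 2)]
print(count_by_location(sample))
```

If every person had their own location_id (one per street address), each group would have a count of 1 and the stratification would tell you nothing.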
Having said all that: I would make a location_id either for each of the full 8 digit postal codes, or only for the first 6. The choice depends on what you think is a reasonable level of aggregation. Given that in the US the ZIP codes have 5 digits, and there are a whole lot more Americans than Swedes on this planet, I’d go with the 6. In the US, folks often even roll up to the first 3 digits of the ZIP code. So, you may even consider using only the first 4 digits (county + town).
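As a rough sketch of that roll-up, the function below assigns one location_id per postal code prefix. The function name, the prefix_len parameter, the way ids are numbered, and the sample codes are all assumptions for illustration; a real ETL would persist the mapping in the LOCATION table.

```python
def assign_location_ids(postal_codes, prefix_len=6):
    """Map each postal code to a location_id shared by all codes
    that have the same leading `prefix_len` digits.

    Ids are handed out sequentially in order of first appearance
    (an arbitrary choice for this sketch).
    """
    prefix_to_id = {}
    assignment = {}
    for code in postal_codes:
        # Keep digits only, so "123 456 78" and "12345678" are treated alike.
        digits = "".join(ch for ch in code if ch.isdigit())
        prefix = digits[:prefix_len]
        if prefix not in prefix_to_id:
            prefix_to_id[prefix] = len(prefix_to_id) + 1
        assignment[code] = prefix_to_id[prefix]
    return assignment


# Made-up 8 digit codes: the first two share their first 6 digits.
codes = ["12345678", "12345699", "12399999"]
print(assign_location_ids(codes, prefix_len=6))
print(assign_location_ids(codes, prefix_len=3))
```

A shorter prefix_len gives coarser locations with more people in each, which is the trade-off described above.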
What we don’t have is a way to do those aggregations in a database-independent way. The addition of European data is nicely pushing this idea, since from a US-centric point of view all you think about is ZIP codes.