Friends:
As a result of the work of the GIS Working Group, we are happy to announce the addition of a new “Geography” domain and two new vocabularies to the OMOP Standardized Vocabularies. For each country where we believe an OMOP CDM exists, the domain contains a full set of hierarchical information:
- Country
- States or counties
- Cities and municipalities
- Districts
So, as usual, the CONCEPT_ANCESTOR table will contain the hierarchy for each country: its states, counties, cities, etc.
This supports the use case of stratifying data by geographical information at the same level (if the data contains that information, of course). The strata can be useful as model covariates, for clustering (e.g. of infectious diseases), and for visualization. Because of the hierarchy, it is also possible to roll up and aggregate geographical information. E.g., if you have detailed address information (postal codes, cities, etc.), you can report it at higher levels like county or state.
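To make the roll-up idea concrete, here is a minimal sketch using an in-memory SQLite database. Everything in it is an assumption for illustration: the concept IDs and names are made up, the concept_class_id values ('Country', 'State', 'City') are placeholders rather than the actual class names in the vocabulary, and location_counts is a hypothetical pre-aggregated table of person counts per geography concept. It relies only on the usual CONCEPT_ANCESTOR convention that each concept is also listed as its own ancestor.

```python
import sqlite3

# Toy data only: all concept IDs, names, and class labels are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    concept_class_id TEXT          -- placeholder level labels, not real ones
);
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
-- Hypothetical helper table: persons counted per geography concept.
CREATE TABLE location_counts (
    region_concept_id INTEGER,
    n_persons INTEGER
);
""")

conn.executemany("INSERT INTO concept VALUES (?, ?, ?)", [
    (1, "Country A", "Country"),
    (10, "State X", "State"),
    (11, "State Y", "State"),
    (100, "City X1", "City"),
    (101, "City X2", "City"),
    (110, "City Y1", "City"),
])

# Every concept is its own ancestor, plus all higher levels.
conn.executemany("INSERT INTO concept_ancestor VALUES (?, ?)", [
    (1, 1), (1, 10), (1, 11), (1, 100), (1, 101), (1, 110),
    (10, 10), (10, 100), (10, 101),
    (11, 11), (11, 110),
    (100, 100), (101, 101), (110, 110),
])

# Counts recorded at mixed levels: three cities and one state.
conn.executemany("INSERT INTO location_counts VALUES (?, ?)", [
    (100, 5), (101, 7), (110, 3), (11, 2),
])


def roll_up(conn, level):
    """Sum person counts at the given (placeholder) hierarchy level."""
    return conn.execute(
        """
        SELECT c.concept_name, SUM(lc.n_persons)
        FROM location_counts lc
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = lc.region_concept_id
        JOIN concept c
          ON c.concept_id = ca.ancestor_concept_id
        WHERE c.concept_class_id = ?
        GROUP BY c.concept_name
        ORDER BY c.concept_name
        """,
        (level,),
    ).fetchall()


print(roll_up(conn, "State"))
```

Because of the self-referencing rows, counts recorded directly at the state level are included alongside counts recorded at city level, so mixed-granularity data aggregates cleanly.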
We obtain the information from an Open Source initiative similar to ours: OpenStreetMap. The vocabulary_id is ‘OSM’. We also integrated the US Census Bureau Regions and Divisions (vocabulary_id = ‘US Census’).
Note that this is not perfect. If you find discrepancies, please let us know.
We are also going to publish instructions on how to assign the right Geography concept to each LOCATION record. This may involve substantial computation, since it requires finding points in polygons. A region_concept_id field will also be added to the LOCATION table.
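Until those instructions are out, here is a rough sketch of what a point-in-polygon assignment could look like. It is not the method the working group will publish: it uses the classic ray-casting test on planar coordinates, and the region polygons and concept IDs (9001, 9002) are hypothetical. Real boundaries have holes and multiple parts, and a production workflow would more likely use a GIS library with a spatial index.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly connected back to the first. Coordinates are treated
    as planar, which is only an approximation for real geography.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_region(lon, lat, regions):
    """Return the first concept ID whose polygon contains the point."""
    for concept_id, polygon in regions.items():
        if point_in_polygon(lon, lat, polygon):
            return concept_id
    return None


# Hypothetical regions keyed by made-up Geography concept IDs.
regions = {
    9001: [(0, 0), (10, 0), (10, 10), (0, 10)],
    9002: [(10, 0), (20, 0), (20, 10), (10, 10)],
}

print(assign_region(5, 5, regions))
print(assign_region(15, 5, regions))
print(assign_region(25, 5, regions))
```

A linear scan over all polygons like this gets expensive quickly with many LOCATION records and detailed boundaries, which is presumably where the "serious computation" comes in.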