Hello All,
I am wondering whether anyone else has shared experience of managing source data mappings. I have historically handled this with a source_concept.csv and a source_concept_relationship.csv file, which are appended to over time.
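For context, here is a minimal sketch of what I mean by "appending over time" - the column layout is my assumption (mirroring the OMOP CONCEPT table columns), not necessarily the exact files I described:

```python
import csv
from datetime import date

# Assumed column layout, based on the OMOP CONCEPT table (illustrative only).
CONCEPT_FIELDS = [
    "concept_id", "concept_name", "domain_id", "vocabulary_id",
    "concept_class_id", "standard_concept", "concept_code",
    "valid_start_date", "valid_end_date", "invalid_reason",
]

def append_source_concept(row: dict, path: str = "source_concept.csv") -> None:
    """Append a single local/source concept row to the CSV."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=CONCEPT_FIELDS)
        writer.writerow(row)

append_source_concept({
    "concept_id": 2000000001,  # local concept IDs kept above 2 billion
    "concept_name": "Example local lab code",
    "domain_id": "Measurement",
    "vocabulary_id": "LOCAL_LAB",
    "concept_class_id": "Lab Test",
    "standard_concept": "",
    "concept_code": "LAB-123",
    "valid_start_date": date(1970, 1, 1).isoformat(),
    "valid_end_date": date(2099, 12, 31).isoformat(),
    "invalid_reason": "",
})
```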
I have avoided the source-to-concept-map table for the reasons laid out in other forum posts (e.g. it being a hangover/legacy table, etc.).
Using CSV files with git looks like a good solution at first - until someone opens them in Excel (often leading to type/formatting/rounding issues), adds a batch of concepts with overlapping concept_ids, and so on.
I think the approach is doable by one person, but scales poorly.
I have mocked up a lightweight CRUD REST API in front of a Postgres instance, which makes it easy to submit or delete mappings and lets others contribute without the issues that come with git and CSV: for example, automatic ID generation, constraints to prevent duplicate mappings (enforcing many-to-one relationships), and so on.
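To make the idea concrete, here is a rough sketch of the kind of table and insert the API sits on - the table/column names, connection string, and concept_id value are all illustrative, not the actual implementation:

```python
import psycopg2

# Illustrative schema: the database itself enforces the rules that git/CSV cannot.
DDL = """
CREATE TABLE IF NOT EXISTS source_concept_map (
    mapping_id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,  -- automatic ID generation
    source_vocabulary text   NOT NULL,
    source_code       text   NOT NULL,
    target_concept_id bigint NOT NULL,
    -- each source code maps to at most one target concept (many-to-one),
    -- so duplicate/conflicting mappings are rejected at the database level
    UNIQUE (source_vocabulary, source_code)
);
"""

UPSERT = """
INSERT INTO source_concept_map (source_vocabulary, source_code, target_concept_id)
VALUES (%s, %s, %s)
ON CONFLICT (source_vocabulary, source_code) DO NOTHING
RETURNING mapping_id;
"""

def add_mapping(conn, vocab: str, code: str, concept_id: int):
    """Insert a mapping; returns the new mapping_id, or None if it already exists."""
    with conn.cursor() as cur:
        cur.execute(UPSERT, (vocab, code, concept_id))
        row = cur.fetchone()
    conn.commit()
    return row[0] if row else None

if __name__ == "__main__":
    conn = psycopg2.connect("dbname=mappings")  # connection string is an assumption
    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.commit()
    print(add_mapping(conn, "LOCAL_LAB", "LAB-123", 3004410))  # concept_id is illustrative
```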
My question is: what are other people's experiences with this? I have heard that most people use git, but I want to avoid it for the reasons above - the only way around them seems to be pre-commit hooks or GitHub Workflows to check there are no duplicate entries and the like, by which point you have effectively recreated a database on top of GitHub, and that feels wrong!
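For completeness, the kind of check I mean is something like this (a hypothetical pre-commit script; it assumes the CSV has a header row with a concept_id column):

```python
#!/usr/bin/env python3
# Hypothetical pre-commit check: fail the commit if source_concept.csv
# contains duplicate concept_ids.
import csv
import sys
from collections import Counter

def duplicate_concept_ids(path: str = "source_concept.csv") -> list[str]:
    with open(path, newline="") as f:
        counts = Counter(row["concept_id"] for row in csv.DictReader(f))
    return [cid for cid, n in counts.items() if n > 1]

if __name__ == "__main__":
    dupes = duplicate_concept_ids()
    if dupes:
        print(f"Duplicate concept_ids found: {dupes}")
        sys.exit(1)
```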
Am I overthinking this? Have others used a database to handle this?
Many thanks!