We have Truven claims data that was extracted from the flat files provided to us and loaded into the CDM tables in the standard OMOP format.
However, when I tried to run Achilles so that I could view the data in Atlas, I found that almost all of the source tables (Person, Visit_Occurrence, Procedure_Occurrence, Death, etc.) contain duplicate records. The key fields in each of these tables (for example, person_id in the Person table) have duplicate values.
In the Observation_Period table, there are duplicate records on the key field (observation_period_id). However, if I consider the combination of observation_period_id and person_id, there are no duplicates.
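For reference, the kind of check I mean is a GROUP BY / HAVING COUNT(*) > 1 query on the key column(s). Below is a minimal, self-contained sketch using Python's built-in sqlite3 module with a toy in-memory table. The helper name duplicate_keys and the sample rows are made up for illustration; against the real CDM database the same SQL pattern would be run through whatever driver that database uses.

```python
import sqlite3


def duplicate_keys(conn, table, key_columns):
    """Return one row per key value that occurs more than once: (key..., count).

    Hypothetical helper for illustration. Table and column names are
    interpolated directly, so pass trusted identifiers only.
    """
    cols = ", ".join(key_columns)
    sql = (
        f"SELECT {cols}, COUNT(*) AS n FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


# Toy data (assumption): observation_period_id 1 is reused for two persons.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation_period (observation_period_id INTEGER, person_id INTEGER)"
)
conn.executemany(
    "INSERT INTO observation_period VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 12)],
)

# Duplicates on the single key column.
dup_single = duplicate_keys(conn, "observation_period", ["observation_period_id"])
# Duplicates on the (observation_period_id, person_id) combination.
dup_pair = duplicate_keys(
    conn, "observation_period", ["observation_period_id", "person_id"]
)

print(dup_single)  # [(1, 2)]
print(dup_pair)    # []
```

Running the same check once per table, on its key column, gives a count of how many keys are affected before trying to add primary key constraints.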
I am looking for input from anyone who has been in this situation before and has suggestions on how to address this issue. My plan is to create the primary keys on most of the tables, run Achilles, and see how it goes.
Will an Achilles run even complete in this kind of scenario?