
How to identify synthetic data

For a project, we want to mix real and synthetic data in the same OMOP database. We were thinking of telling them apart using the _type_concept_id fields, but the currently allowed type values don’t seem to cover this meaning. Which of the current values would be the best fit for this purpose? Should we propose a new type?

This does not look like a common use-case, so there is no established solution. Maybe the way to do it is to extend tables of interest with an extra custom flag field?
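
As a rough illustration of the extra-flag idea, here is a minimal sketch using an in-memory SQLite database. The column name `is_synthetic` and the cut-down `person` table are assumptions made for the example; they are not part of the OMOP CDM.

```python
import sqlite3


def add_synthetic_flag(conn: sqlite3.Connection, table: str) -> None:
    """Add a hypothetical, non-standard is_synthetic column (0 = real, 1 = synthetic)."""
    # The table name is interpolated directly, so only pass trusted names.
    conn.execute(
        f"ALTER TABLE {table} ADD COLUMN is_synthetic INTEGER NOT NULL DEFAULT 0"
    )


conn = sqlite3.connect(":memory:")

# Cut-down stand-in for the OMOP person table (most columns omitted).
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
)
add_synthetic_flag(conn, "person")

# Rows loaded without the flag default to 0, i.e. real data.
conn.execute("INSERT INTO person (person_id, year_of_birth) VALUES (1, 1970)")
# Synthetic rows set the flag explicitly.
conn.execute(
    "INSERT INTO person (person_id, year_of_birth, is_synthetic) VALUES (2, 1985, 1)"
)

real_ids = [
    row[0]
    for row in conn.execute(
        "SELECT person_id FROM person WHERE is_synthetic = 0 ORDER BY person_id"
    )
]
synthetic_ids = [
    row[0]
    for row in conn.execute(
        "SELECT person_id FROM person WHERE is_synthetic = 1 ORDER BY person_id"
    )
]
```

Keep in mind that standard tools built for the CDM would not know about such a column, so any filtering on it would have to be done in your own queries.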

Yeah, I would suggest doing so. I mean, you can create a custom standard type concept (with a concept_id above 2,000,000,000, the range reserved for local concepts) and use it for synthetic data.
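
A minimal sketch of the custom type concept approach, again with an in-memory SQLite database. The tables are simplified subsets of the CDM tables, and the concept_id `2000000001`, the concept name, the `'Local'` vocabulary and the concept code are all placeholders invented for the example. The id `32817` is assumed to be the "EHR" type concept; check it against your own vocabulary before relying on it.

```python
import sqlite3

# Hypothetical local concept_id; custom concepts live above 2,000,000,000.
SYNTHETIC_TYPE_CONCEPT_ID = 2_000_000_001
# Assumed to be the "EHR" type concept; verify in your vocabulary tables.
REAL_SOURCE_TYPE_CONCEPT_ID = 32817

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified subset of the OMOP concept table (validity dates omitted).
    CREATE TABLE concept (
        concept_id       INTEGER PRIMARY KEY,
        concept_name     TEXT NOT NULL,
        domain_id        TEXT NOT NULL,
        vocabulary_id    TEXT NOT NULL,
        concept_class_id TEXT NOT NULL,
        standard_concept TEXT,
        concept_code     TEXT NOT NULL
    );
    -- Simplified subset of condition_occurrence.
    CREATE TABLE condition_occurrence (
        condition_occurrence_id   INTEGER PRIMARY KEY,
        person_id                 INTEGER NOT NULL,
        condition_concept_id      INTEGER NOT NULL,
        condition_type_concept_id INTEGER NOT NULL
    );
    """
)

# Register the custom type concept; every value here is a placeholder.
conn.execute(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?, ?, ?)",
    (
        SYNTHETIC_TYPE_CONCEPT_ID,
        "Synthetic data",
        "Type Concept",
        "Local",
        "Type Concept",
        "S",
        "SYNTHETIC",
    ),
)

# One real and one synthetic record, told apart only by the type concept.
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (1, 100, 0, REAL_SOURCE_TYPE_CONCEPT_ID),
        (2, 200, 0, SYNTHETIC_TYPE_CONCEPT_ID),
    ],
)

synthetic_rows = [
    row[0]
    for row in conn.execute(
        "SELECT condition_occurrence_id FROM condition_occurrence "
        "WHERE condition_type_concept_id = ? ORDER BY condition_occurrence_id",
        (SYNTHETIC_TYPE_CONCEPT_ID,),
    )
]
real_rows = [
    row[0]
    for row in conn.execute(
        "SELECT condition_occurrence_id FROM condition_occurrence "
        "WHERE condition_type_concept_id <> ? ORDER BY condition_occurrence_id",
        (SYNTHETIC_TYPE_CONCEPT_ID,),
    )
]
```

The same filter on the relevant `*_type_concept_id` column would apply to the other event tables, so real and synthetic records can share a table and still be separated in a query.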

Do you want to use the "Standard algorithm" type concept?

We assumed that "Standard algorithm" referred to the selection process, not to the data itself, but it was one of the alternatives we considered.

No. Type concepts finish the sentence “this record was obtained from …”. It does not talk about the data manipulation process.
