Dear All,
The following concept: Athena (4190366: Basis of cancer diagnosis) is under domain “Condition”.
However, semantically, I think it should be under the domain “Observation”.
When I looked into this, I found that the original term, Athena (3575222), is actually in the “Observation” domain. But it was later upgraded to the “Condition” domain.
May I know the reason for the upgrade from domain “Observation” to “Condition”?
You are totally right, @wint. This is not a condition.
However, I am not sure why you even want to use this concept. What is called “basis” here is what we call a Concept Type (I know, not the best term, but it came about for historical reasons). Would you want to use those and put them into condition_type_concept_id instead?
Nope! 4190366 is not a type concept. Type concepts identify the provenance of a record. “Basis of cancer diagnosis” is not provenance. Put it back in the attic (Observation) until a real use case comes up.
Thanks very much @Christian_Reich and @MPhilofsky .
I have a dataset (cancer registry) that contains the field “Diagnosis basis”. It holds the basis of diagnosis for each patient, such as “death certificate only”, “based on cytological evidence”, “based on histological evidence”, “clinical investigation including imaging”, etc. So my thought is to capture this in the Observation table with observation_concept_id “4190366” (Basis of cancer diagnosis) and a value_as_concept_id such as “Cancer diagnosis based on cytological evidence”, etc.
Does this make sense?
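To make the proposal concrete, here is a minimal Python sketch of what one such Observation row could look like. Only the concept ID 4190366 comes from this thread. The function name, the lookup table, and the type concept parameter are hypothetical, and every value concept ID is left at 0 (the OMOP convention for "no matching concept") because the real IDs would have to be looked up in Athena.

```python
from datetime import date

# Concept discussed in this thread ("Basis of cancer diagnosis").
BASIS_OF_CANCER_DIAGNOSIS = 4190366

# Hypothetical lookup table. The real value concept IDs are NOT known here
# and must be looked up in Athena; 0 stands in for "no matching concept".
VALUE_CONCEPTS = {
    "death certificate only": 0,
    "based on cytological evidence": 0,
    "based on histological evidence": 0,
    "clinical investigation including imaging": 0,
}


def to_observation_row(person_id, diagnosis_date, diagnosis_basis, type_concept_id):
    """Build one Observation row (as a dict) from a registry 'Diagnosis basis' value.

    type_concept_id is supplied by the caller; it describes the provenance of
    the record and is unrelated to the basis of diagnosis itself.
    """
    source_value = diagnosis_basis.strip()
    return {
        "person_id": person_id,
        "observation_concept_id": BASIS_OF_CANCER_DIAGNOSIS,
        "observation_date": diagnosis_date,
        "observation_type_concept_id": type_concept_id,
        # Unknown source values fall back to 0 instead of raising an error.
        "value_as_concept_id": VALUE_CONCEPTS.get(source_value.lower(), 0),
        "value_as_string": source_value,
        "observation_source_value": "Diagnosis basis",
    }
```

Keeping the raw text in value_as_string means the source value survives even while value_as_concept_id is still unmapped.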
Yes, your question is something I have kept asking myself too. For now, I am bringing it over to OMOP because it is in the source data. We do not have many fields, only a little over 40. As a result, I am thinking of bringing everything over to OMOP to ensure comprehensiveness.
To me, the information gives an overview of the overall basis of diagnosis for the cohort. As an ETL engineer, I am thinking about how to ensure the source data is fully represented in the OMOP schema, so that users get a better overview of the dataset itself.
What would be the suggested workflow in this case? Are we supposed to bring all the source data into OMOP? Or should we be selective, based on what we need to do or analyze in the study? Or do we go hybrid, i.e. try to bring all the source data into OMOP, but for fields that are difficult to map (or for which no vocabulary is available) and unlikely to be used in the study, put a note in the design document on why they are not being OMOPed? Currently, I am doing hybrid, but I still try my best to map whatever can be mapped from the source data.
You should bring in all data which easily maps to the OMOP CDM (drugs, conditions, procedures, etc.) and all the data which has a use case/is used to answer a study question. Taking this advice into account, you wouldn’t bring in “basis for diagnosis”. BUT, you hypothetically say, “it’s a small dataset and my bosses will be happy I achieved 100% mapping rate”. Which is great, until the vocabulary or source data is updated and these concepts become non-standard. Now you have DQD errors, bosses are unhappy, and you have to:
1. Analyze the source of the error: concepts are now non-standard, or new source values were added
2. Go digging through Athena or use Usagi to find an appropriate standard concept
3. Update the mapping
4. Possibly update the ETL SQL, depending on how you did the mapping
5. Update your design document
6. Run your QA checks (iterate as needed)
7. Run DQD (iterate as needed)
8. Then push it to production
9. Notify all the stakeholders
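The first step above, working out whether the error comes from concepts that turned non-standard or from new source values, can be sketched in Python. This is only an illustration under stated assumptions: the function name and the in-memory dictionaries are hypothetical stand-ins for your mapping table and for the concept_id and standard_concept columns of the CONCEPT table, where 'S' marks a standard concept.

```python
def find_mapping_problems(mapping, standard_flags, source_values):
    """Split mapping issues into the two causes named in the first step.

    mapping:        source value -> mapped concept_id (hypothetical in-memory form)
    standard_flags: concept_id -> standard_concept flag ('S' means standard)
    source_values:  values currently present in the source data
    """
    # Mapped values whose target concept is missing or no longer standard.
    non_standard = sorted(
        value for value, concept_id in mapping.items()
        if standard_flags.get(concept_id) != "S"
    )
    # Values that appeared in the source but have no mapping at all.
    unmapped = sorted(set(source_values) - set(mapping))
    return {"non_standard": non_standard, "unmapped": unmapped}
```

Everything this returns still has to go through the remaining steps by hand; the sketch only tells you which of the two causes you are dealing with.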
Or document it to say it’s not brought in at this time because it doesn’t easily map and there isn’t a use case (a real use case, not a hypothetical “it’s good for…”), but it will be brought in when researchers want to use the data.
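The rule of thumb in this reply (bring a field in if it maps easily or has a real use case, otherwise document why it is deferred) could be tracked with a small field inventory. The Python sketch below is one possible way to do that; the class, the function names, and the note wording are all hypothetical and not part of any OHDSI tool.

```python
from dataclasses import dataclass


@dataclass
class SourceField:
    name: str
    maps_easily: bool   # e.g. drugs, conditions, procedures
    has_use_case: bool  # a real study question, not a hypothetical one
    note: str = ""


def etl_decision(field):
    """Return 'include' if the field maps easily or has a use case, else 'defer'."""
    return "include" if field.maps_easily or field.has_use_case else "defer"


def deferred_notes(fields):
    """Produce one design-document line per deferred field."""
    return [
        f"{f.name}: not brought in at this time "
        f"({f.note or 'no easy mapping and no use case'})"
        for f in fields
        if etl_decision(f) == "defer"
    ]
```

When a researcher later asks for a deferred field, setting has_use_case to True moves it into the "include" group and drops its note from the list.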
Thank you @MPhilofsky for the input. I think you have stated the pains that I have been going through.
I think your suggestion is the most practical way forward. Unless there is a “use case” for a difficult field, we will document why it is not being brought in at this time.
Thanks very much for your time and help. Appreciate it very much.