Hi there -
I’m running into an issue with the concept_ancestor
table but I’m not sure if it’s an issue with my understanding of how the table functions or some other issue.
My understanding is that if I find an “ancestor” concept then it should have all descendants concepts, including those of descendants further downstream. So in this example if I were to look for descendants of Heart disease (concept_id = 321588) then this should be a “superset” of concepts that would include any descendants of Atrial fibrillation (concept_id = 313217). Is that accurate?
Assuming this is the case, I tested my understanding with this SQL query where I expect for the result to be num_descendant_not_in_parent = 0
.
with parent_descendents as (
select distinct ca.descendant_concept_id
from vocab.concept_ancestor ca
where ca.ancestor_concept_id = 321588 -- heart disease
),
child_descendents as (
select distinct ca.descendant_concept_id
from vocab.concept_ancestor ca
where ca.ancestor_concept_id in (select descendant_concept_id from parent_descendents)
),
all_descendents as (
select descendant_concept_id
from parent_descendents
union
select descendant_concept_id
from child_descendents
),
compare_descendents as (
select
all_descendents.descendant_concept_id as all_descendent_concept_id,
pd.descendant_concept_id as parent_descendent_concept_id,
cd.descendant_concept_id as child_descendent_concept_id
from
all_descendents
left join
parent_descendents pd
on all_descendents.descendant_concept_id = pd.descendant_concept_id
left join
child_descendents cd
on all_descendents.descendant_concept_id = cd.descendant_concept_id
)
select
count(distinct all_descendent_concept_id) as num_descendant_not_in_parent
from
compare_descendents
where
parent_descendent_concept_id is null
;
That code does return 0 as I expected. However, I have another situation where this is not true. I’m interested in finding all surfactant drug exposures and am having a problem with this same “jump” to descendants.
For example, in our Atlas instance I see that lung surfactants - combinations; inhalant, parenteral (concept_id = 21603535) has DRC and does show the records correctly. For example: Lung Surfactants → calfactant (concept_id = 19089891) → calfactant 35 MG/ML (concept_id = 19089893) and so on.
But when I try to run a query like this I get 0 results:
select *
from cdm.drug_exposure
where drug_concept_id in (
select ca.descendant_concept_id
from vocab.concept_ancestor ca
where ca.ancestor_concept_id = 21603535 -- lung surfactants - combinations; inhalant, parenteral
)
;
So I tried this ancestor concept using the above SQL to compare descendent pairs:
with parent_descendents as (
select distinct ca.descendant_concept_id
from vocab.concept_ancestor ca
where ca.ancestor_concept_id = 21603535 -- lung surfactants - combinations; inhalant, parenteral
),
child_descendents as (
select distinct ca.descendant_concept_id
from vocab.concept_ancestor ca
where ca.ancestor_concept_id in (select descendant_concept_id from parent_descendents)
),
all_descendents as (
select descendant_concept_id
from parent_descendents
union
select descendant_concept_id
from child_descendents
),
compare_descendents as (
select
all_descendents.descendant_concept_id as all_descendent_concept_id,
pd.descendant_concept_id as parent_descendent_concept_id,
cd.descendant_concept_id as child_descendent_concept_id
from
all_descendents
left join
parent_descendents pd
on all_descendents.descendant_concept_id = pd.descendant_concept_id
left join
child_descendents cd
on all_descendents.descendant_concept_id = cd.descendant_concept_id
)
select
count(distinct all_descendent_concept_id) as num_descendant_not_in_parent
from
compare_descendents
where
parent_descendent_concept_id is null
;
But that query returned num_descendant_not_in_parent = 543
.
So I’m wondering: am I using the concept_ancestor
table correctly? If not, how should I use it? What is the “right” way to find all surfactant drugs?
Thanks!
Will