
Addition of source_concept_id drops RC to zero

I have created a cohort in Atlas using standard condition_concept_ids for a Condition, with the additional attribute of specific condition_source_concept_ids to allow an exact comparison to the source data, which are coded with ICD9CM/ICD10CM. When I add the condition_source_concept_id as an attribute of the Condition cohort entry event, my record count drops to 0. I also tried adding the condition_source_concept_id requirement as an inclusion criterion instead of an attribute of the cohort entry event, but the record count also drops to 0. I verified that my data is in the CDM and mapped correctly, verified that I have Persons with both the condition_concept_id and the condition_source_concept_id, and dropped all other requirements/criteria from the cohort in Atlas.

@Adam_Black is helping me troubleshoot the issue and we think it comes down to this part of the SQL:
SELECT co.*
FROM cdm_531.CONDITION_OCCURRENCE co
JOIN Codesets codesets on ((co.condition_concept_id = codesets.concept_id and codesets.codeset_id = 0)
AND (co.condition_source_concept_id = codesets.concept_id and codesets.codeset_id = 1))
The current working theory is the above will always return 0 rows since codeset_id cannot be both 0 and 1. Thoughts? Ideas?

Suppose I have 11,752 patients in my (Synpuf 110k) cdm with at least one condition_occurrence record where the condition_concept_id is 192450 and the condition_source_concept_id is 44829302.

I want to create a cohort that captures exactly these patients. Concretely the cohort is
Cohort entry: The first date where a person has a condition occurrence record where condition_concept_id is 192450 and the condition_source_concept_id is 44829302.
Cohort exit: End of continuous observation
Cohort inclusion criteria: None

It seems like I should be able to do this in Atlas using the “Condition Source Concept” attribute.

I created the cohort in Atlas using two concept sets where

  • ‘Retention of urine - Standard’ contains the one standard concept (192450)
  • ‘Retention of urine - Source’ contains the one source concept (44829302)

I expect this cohort to capture 11,752 patients but instead it captures 0.

When looking at the cohort SQL code I found that the codesets temp table has two codeset_ids as expected.

The condition occurrence criteria section of the cohort generation SQL contains the following code:

 SELECT co.* 
  FROM @cdm_database_schema.CONDITION_OCCURRENCE co
  JOIN Codesets codesets on ((co.condition_concept_id = codesets.concept_id and codesets.codeset_id = 0) AND (co.condition_source_concept_id = codesets.concept_id and codesets.codeset_id = 1))

When I execute just this piece I get back zero rows, because the join condition requires a single codesets row to have codeset_id equal to both 0 and 1, which always results in an empty set.

It seems like what I’m after requires two joins with the codesets table.
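To make the difference concrete, here is a minimal sketch that reproduces the empty result and the two-join alternative on an in-memory SQLite database. The table layouts are cut down to the columns that matter, the rows are toy data, and the aliases `cs` and `cns` are my own names; this is not the SQL that Circe generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_source_concept_id INTEGER
);
CREATE TABLE codesets (codeset_id INTEGER, concept_id INTEGER);

-- Toy rows: person 1 has both concepts on one record, person 2 only the standard one.
INSERT INTO condition_occurrence VALUES (1, 192450, 44829302), (2, 192450, 0);
INSERT INTO codesets VALUES (0, 192450), (1, 44829302);
""")

# Single join, shaped like the generated SQL: one codesets row must satisfy
# both halves of the condition, so codeset_id would have to be 0 and 1 at once.
single_join = conn.execute("""
    SELECT co.person_id
    FROM condition_occurrence co
    JOIN codesets cs
      ON (co.condition_concept_id = cs.concept_id AND cs.codeset_id = 0)
     AND (co.condition_source_concept_id = cs.concept_id AND cs.codeset_id = 1)
""").fetchall()

# Two joins: each alias is matched against its own concept set.
two_joins = conn.execute("""
    SELECT co.person_id
    FROM condition_occurrence co
    JOIN codesets cs
      ON co.condition_concept_id = cs.concept_id AND cs.codeset_id = 0
    JOIN codesets cns
      ON co.condition_source_concept_id = cns.concept_id AND cns.codeset_id = 1
""").fetchall()

print(single_join)  # []
print(two_joins)    # [(1,)]
```

With two aliases, the standard concept and the source concept are each checked against a different row of the codesets table, so both conditions can hold for the same condition_occurrence record.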

@Chris_Knoll, @anthonysena - Am I missing something about how the “Condition Source Concept” attribute should work?

Yep, that looks like a bug. Previously, these two fields were OR’d together in a single join to the codesets table, which is fine for the OR case. When the logic was switched to AND, the single join was kept, and that led to this behavior. Shockingly, this has been an issue for a very long time (October 2018!).

Not to diminish the severity of the issue, this only happens when you specify both the standard and non-standard concept in the same criteria, and usually the case is looking for one or the other. But the specific use case was to say that the standard concept and non-standard concept were both found in the same record, so that case is bugged. Thank you very much for pointing that out, and I’ll be sure to add a test case for this in our circe-be repository.

2 Likes

I’ve published a release of Circe 1.9.4 to incorporate the bug fix that you identified. This also adds test cases so we don’t miss this in the future. This version will be incorporated into the next release of WebAPI 2.9. (should be released soon)

I’ve also updated the R-wrapper for Circe: CirceR. If you would like to try out the new behavior, you can load the R package from github.

1 Like

Thanks very much for your reply @Chris_Knoll!

I think that changing the AND to an OR would still give unexpected results, since it would include all patients with either a condition_concept_id = 192450 OR a condition_source_concept_id = 44829302.


My reading of “Filter Condition Occurrences by the Condition Source Concept” is that the query should capture records where condition_concept_id = 192450 AND condition_source_concept_id = 44829302. “Filter” implies making the result set smaller.

If I change AND to OR I will capture more than the target 11,752 patients.
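A toy illustration of that difference, again on an in-memory SQLite database rather than a real CDM. The three rows are hypothetical and chosen only so that one person matches both concepts on the same record, one matches only the standard concept, and one matches only the source concept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_source_concept_id INTEGER
);
CREATE TABLE codesets (codeset_id INTEGER, concept_id INTEGER);

-- Hypothetical rows: person 1 has both concepts on one record,
-- person 2 only the standard concept, person 3 only the source concept.
INSERT INTO condition_occurrence VALUES
    (1, 192450, 44829302),
    (2, 192450, 0),
    (3, 0, 44829302);
INSERT INTO codesets VALUES (0, 192450), (1, 44829302);
""")

# OR in a single join: a record qualifies if either concept matches.
or_persons = [row[0] for row in conn.execute("""
    SELECT DISTINCT co.person_id
    FROM condition_occurrence co
    JOIN codesets cs
      ON (co.condition_concept_id = cs.concept_id AND cs.codeset_id = 0)
      OR (co.condition_source_concept_id = cs.concept_id AND cs.codeset_id = 1)
    ORDER BY co.person_id
""")]

# AND via two joins: a record qualifies only if both concepts match.
and_persons = [row[0] for row in conn.execute("""
    SELECT DISTINCT co.person_id
    FROM condition_occurrence co
    JOIN codesets cs
      ON co.condition_concept_id = cs.concept_id AND cs.codeset_id = 0
    JOIN codesets cns
      ON co.condition_source_concept_id = cns.concept_id AND cns.codeset_id = 1
    ORDER BY co.person_id
""")]

print(or_persons)   # [1, 2, 3]
print(and_persons)  # [1]
```

The OR version returns a superset of the AND version, which is the opposite of what a "filter" should do.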

If I separate the standard concept and source concept criteria I can get the desired cohort using Atlas.


image
@MPhilofsky Maybe this approach will work for your cohort.

Thanks! I’ll try it out.

@Adam_Black

I tried the above approach and still received a count = 0, along with this error: “‘all events’ are selected and cohort exit criteria has not been specified”. This is the exit criteria:

To stay true to the original cohort definition, I am unable to add a censoring event.

@Chris_Knoll Is there a way to add in a condition_source_concept_id criteria without adding a condition_concept_id criteria? I’ve tried different definitions, but I have only found the condition_source_concept_id attribute is available when there is a specified condition_concept_id criteria. I know OHDSI revolves around the standard concept_id, but for testing purposes, I need to define the cohort exactly as defined against the source EHR system.

Yes, just leave the ‘standard’ concept set as ‘Any’ (i.e., Any Condition, Any Drug Exposure) and specify a concept set for the Source Concept. The result is a search on condition_source_concept_id that does not consider condition_concept_id at all.

@Adam_Black demonstrated this in his screenshot under the ‘Restrict Initial Events’: he is looking for ‘Any Condition’ with a source concept of ‘Retention of urine’. ‘Any Condition’ means we don’t look at the standard condition_concept_id at all.
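In SQL terms, "Any Condition" plus a source concept set amounts to joining the codesets table on condition_source_concept_id only. The sketch below shows that shape on an in-memory SQLite database with toy rows and picks the first matching date per person as the cohort entry. The query is my own illustration of the idea, not the SQL Circe emits, and it assumes the source concept set has codeset_id 1 as in the example earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_source_concept_id INTEGER,
    condition_start_date TEXT
);
CREATE TABLE codesets (codeset_id INTEGER, concept_id INTEGER);

-- Toy rows: person 1 has two qualifying records, person 2 has none.
INSERT INTO condition_occurrence VALUES
    (1, 192450, 44829302, '2009-03-01'),
    (1, 192450, 44829302, '2008-01-15'),
    (2, 192450, 0,        '2008-05-05');
INSERT INTO codesets VALUES (1, 44829302);
""")

# No condition on condition_concept_id: only the source concept is checked.
entry_events = conn.execute("""
    SELECT co.person_id, MIN(co.condition_start_date) AS cohort_start_date
    FROM condition_occurrence co
    JOIN codesets cns
      ON co.condition_source_concept_id = cns.concept_id AND cns.codeset_id = 1
    GROUP BY co.person_id
    ORDER BY co.person_id
""").fetchall()

print(entry_events)  # [(1, '2008-01-15')]
```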

Yes, that’s why we changed it from OR to AND: the attributes in a criteria were AND’d together in all other cases, so why would source concept work any differently?

The problem was that when we made that switch, we couldn’t just change the OR to AND in the join clause, we actually required a new JOIN. So, that was the mistake and you’ve presented the proper fix. That’s what we’re doing now in the latest CIRCE version.

1 Like

Got it. I don’t need to specify all the criteria Adam has above. Just the “any condition” and the source condition concepts. Thanks!

1 Like

Good point @MPhilofsky. I guess all I needed for my example query was the source_concept_id since it will always map to the same standard. (Is it possible for a source concept to map to one standard for a period of time and a different standard for another period of time?)

image
image

Much simpler!
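On the question of whether a source concept can map to different standard concepts over time: the thread does not answer it, but one way to look is the validity window on CONCEPT_RELATIONSHIP, which in the OMOP vocabulary tables carries valid_start_date, valid_end_date and invalid_reason. The sketch below runs that kind of check on an in-memory SQLite table. The concept IDs 1001, 2001 and 2002 and all dates are made up for illustration and say nothing about how 44829302 is actually mapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT,
    valid_start_date TEXT,
    valid_end_date TEXT,
    invalid_reason TEXT
);

-- Hypothetical rows: source concept 1001 has one retired and one current mapping.
INSERT INTO concept_relationship VALUES
    (1001, 2001, 'Maps to', '1970-01-01', '2015-12-31', 'D'),
    (1001, 2002, 'Maps to', '2016-01-01', '2099-12-31', NULL);
""")

# List every 'Maps to' target for one source concept with its validity window.
mappings = conn.execute("""
    SELECT concept_id_2, valid_start_date, valid_end_date, invalid_reason
    FROM concept_relationship
    WHERE concept_id_1 = ? AND relationship_id = 'Maps to'
    ORDER BY valid_start_date
""", (1001,)).fetchall()

for row in mappings:
    print(row)
```

Running the same query against a real vocabulary schema for a given source concept would show whether more than one 'Maps to' row exists for it and which rows are still valid.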
