Atlas ConceptSet confusion

Sigfried_Gold · June 14, 2017, 10:16am

I’m trying to understand the Atlas implementation of ConceptSets and chanced upon somebody’s definition on the public site: http://www.ohdsi.org/web/atlas/#/conceptset/956/details

What I’m trying to figure out is whether, if you check both Mapped and Descendants, you get the mapped concepts of the descendants or the descendants of the mapped concepts. It’s not clear to me what’s happening.

And, judging from that weird concept set, I don’t think it was clear to whoever defined it either, since they asked for both mapped and descendants not just from the ICD9 concepts, but also from the SNOMED concepts.

What seems most weird to me is that asking for either mapped or descendants from the ICD9 concepts doesn’t appear to grab anything extra – whereas, presumably what one would want is the SNOMED concepts that they map to (and, ideally, either the mapped concepts of their descendants or the descendants of the SNOMED concepts). Since that didn’t work, apparently, whoever made it must have navigated from the ICD9 codes they started with to the SNOMED concepts, and then asked for mapped and descendants of one of those.

With SNOMED, unlike ICD9, asking for mapped does appear to work, and so it includes all the ICD, READ, and OXMIS concepts mapping to them. It seems the opposite of what one would want.

Am I misunderstanding something here?

Patrick_Ryan · June 14, 2017, 12:27pm

The preferred approach to all OHDSI analyses is to use standard concepts,
such that analyses can be portable across the OHDSI community. Depending
on the analysis, sometimes you need to create a set of standard concepts (a
conceptset) to represent one entity. For example, you want to define a
disease by two or more concepts or by one concept and all of its
descendants. The conceptset function in ATLAS allows you to create a
conceptset expression that serves this purpose. ‘Descendants’ allows you
to specify that when instantiating the conceptset, any concept with this
flag = YES will also include all descendant concept_ids from the
CONCEPT_ANCESTOR table. If any concept in your conceptset expression is
marked ‘Exclude’, then it will be removed from the included conceptset upon
execution. If a concept is marked ‘Exclude’=TRUE and ‘Descendants’=TRUE,
then the final conceptset will exclude this concept plus all of its
descendants. These conceptsets can then be used in any analysis performed
against any CDM by looking for these concepts in the standard _CONCEPT_ID
fields throughout the CDM. For example, from our Sisyphus Challenge
specification of the target cohort (
http://www.ohdsi.org/web/atlas/#/cohortdefinition/99321), you can see how
conceptsets were defined for diseases like ‘osteoporosis’ and ‘hip
fracture’ using standard concepts from SNOMED and drugs like ‘alendronate’
using standard concepts from RxNorm and then used in the cohort definition
to look for records in the CONDITION_OCCURRENCE and DRUG_EXPOSURE tables,
respectively.
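
As a rough illustration of the ‘Descendants’ and ‘Exclude’ behaviour described above, here is a minimal Python sketch. It is not the ATLAS implementation: the `CONCEPT_ANCESTOR` dictionary is a toy stand-in for the real table, the `resolve` helper is hypothetical, and every concept id is made up.

```python
# Toy stand-in for the CONCEPT_ANCESTOR table:
# ancestor concept id -> set of descendant concept ids (including itself).
# All ids are invented for illustration.
CONCEPT_ANCESTOR = {
    100: {100, 101, 102, 103},
    101: {101, 103},
    102: {102},
    103: {103},
    200: {200},
}


def resolve(expression):
    """Resolve a conceptset expression into a set of concept ids.

    Each item is a dict with 'concept_id' and optional boolean
    flags 'descendants' and 'exclude'.
    """
    included, excluded = set(), set()
    for item in expression:
        ids = {item["concept_id"]}
        if item.get("descendants"):
            ids |= CONCEPT_ANCESTOR.get(item["concept_id"], set())
        # Excluded items (and their descendants, if flagged) are
        # collected separately and subtracted at the end.
        (excluded if item.get("exclude") else included).update(ids)
    return included - excluded


expression = [
    {"concept_id": 100, "descendants": True},
    {"concept_id": 101, "descendants": True, "exclude": True},
    {"concept_id": 200},
]
resolved = resolve(expression)
print(sorted(resolved))
```

In this toy data, excluding 101 with ‘Descendants’ also drops its child 103, even though 103 was pulled in as a descendant of 100.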

Within the CDM, most domain tables also allow for a _SOURCE_CONCEPT_ID field,
which is an optional, non-standard structure to allow for using unique
OHDSI identifiers for each source vocabulary entity. For example, ICD-9-CM
is a non-standard vocabulary, and some organizations with source data
containing ICD-9-CM diagnoses opt to store the ICD-9 source value in the
CONDITION_SOURCE_VALUE field, then the OHDSI vocabulary identifier for that
same ICD9 in the CONDITION_SOURCE_CONCEPT_ID field, and then store the
standard concept that the ICD9 maps to in the CONDITION_CONCEPT_ID field.

While it is not the preferred approach for OHDSI research, we extended the
conceptset expression in ATLAS to allow for construction of a conceptset
that could be used to query the _SOURCE_CONCEPT_ID fields as well. The
basic idea is that a user may want to have a conceptset that contains
non-standard source concepts. There would be 2 ways to achieve that: 1)
by selecting non-standard source concepts from the vocabulary, thereby
explicitly including them, and 2) by selecting a standard concept and using
the CONCEPT_RELATIONSHIP to pull in all non-standard source concepts that
roll up to the standard. So, the example for #1 would be to simply pick an
ICD-9 concept. The example for #2 would be to pick a SNOMED concept for a
disease, mark ‘Mapped’=TRUE and then return back all source codes,
including ICD9, that belong to that SNOMED concept. The value of approach
#2 vs. #1 is that the approach is source vocabulary agnostic, and it can
allow for a more succinct conceptset expression (you have to select fewer
concepts to express the same idea). The value of approach #1 is you can
explicitly define the set of concepts you want without having to know
anything about the vocabulary mappings within and between vocabularies.

The ‘Descendants’ flag and the ‘Mapped’ flag will have no impact on a
non-standard source concept, because source concepts are not contained in
the CONCEPT_ANCESTOR table and no non-standard source concept maps into
another source concept. Instead, source concepts map to standard concepts,
and the ‘Mapped’ flag would only have an impact if you apply it to a
standard concept when trying to include all source concepts that fall
beneath it. So, if you want all components of a piece of the ICD9
pseudo-hierarchy (e.g. all 4-digit and 5-digit codes beneath a 3-digit
code), you’d need to either find the standard concepts that contain these
source codes, or select each source code individually. Once you have
created a conceptset that contains non-standard concepts, you can then
perform your analysis by linking it up to the _SOURCE_CONCEPT_ID fields in
each domain…but just remember that this approach won’t work universally
across all CDMs, since you’ve now bound it to be source vocabulary dependent.
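
To make the ‘Mapped’ behaviour concrete, here is a small sketch using Python's built-in `sqlite3`. It assumes a cut-down CONCEPT_RELATIONSHIP table holding only ‘Maps to’ rows; the `mapped_source_concepts` helper, the query, and all concept ids are illustrative guesses at the idea, not the query ATLAS actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_relationship (
    concept_id_1    INTEGER,
    concept_id_2    INTEGER,
    relationship_id TEXT
);
-- Made-up ids: 1 is a standard concept; 11, 12 and 13 are source
-- concepts that map to it. Standard concepts are assumed to map
-- to themselves.
INSERT INTO concept_relationship VALUES
    (11, 1, 'Maps to'),
    (12, 1, 'Maps to'),
    (13, 1, 'Maps to'),
    (1,  1, 'Maps to');
""")


def mapped_source_concepts(concept_id):
    """Return the concepts that 'Maps to' the given concept (hypothetical helper)."""
    rows = conn.execute(
        "SELECT concept_id_1 FROM concept_relationship "
        "WHERE concept_id_2 = ? AND relationship_id = 'Maps to' "
        "AND concept_id_1 <> concept_id_2",
        (concept_id,),
    ).fetchall()
    return {row[0] for row in rows}


from_standard = mapped_source_concepts(1)   # applied to a standard concept
from_source = mapped_source_concepts(11)    # applied to a source concept
print(sorted(from_standard), sorted(from_source))
```

Applied to the standard concept, the lookup returns the source concepts that roll up to it. Applied to a source concept, it returns nothing, because nothing maps into a source concept, which is why the flag appears to do nothing there.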

Sigfried_Gold · June 14, 2017, 3:10pm

Thanks, Patrick. I was misunderstanding.

I still don’t understand the advantage of your example 2 over just finding the SNOMED codes and using those. After you’ve found the SNOMED concepts, what extra benefit do you get from including Mapped for those concepts? If the _SOURCE_CONCEPT_ID matches the mapped codes, won’t the target concept id match the standard codes?

In the example I linked (http://www.ohdsi.org/web/atlas/#/conceptset/956/details) the author seems to have misunderstood also. That concept set includes a number of ICD9 concepts (with Mapped and Descendants checked), but you can delete all of them with no impact on the concepts actually included in the concept set because they are already included by checking Mapped on one of the SNOMED concepts.

Patrick_Ryan · June 14, 2017, 3:48pm

There is no advantage; I would generally prefer to create standard
conceptsets using standard concepts. But if, for some extenuating reason,
you decided you wanted to do a source concept-based analysis, then you
would want your conceptset to include ‘mapped’ source concepts in order to apply
your conceptset to the SOURCE_CONCEPT_ID field in one of the domain tables.
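
A toy comparison of the two ways of applying a conceptset, written as a hedged Python sketch. The rows imitate CONDITION_OCCURRENCE with invented ids, `persons_matching` is a hypothetical helper, and the second dataset assumes a CDM whose ETL left the source concept field unpopulated (shown here as 0).

```python
# Made-up CONDITION_OCCURRENCE rows from a CDM that populates both fields.
cdm_with_source_ids = [
    {"person_id": 1, "condition_concept_id": 1, "condition_source_concept_id": 11},
    {"person_id": 2, "condition_concept_id": 1, "condition_source_concept_id": 13},
    {"person_id": 3, "condition_concept_id": 2, "condition_source_concept_id": 21},
]

# Same rows as they might look in a CDM that did not populate the
# optional source concept field (assumption: stored as 0).
cdm_without_source_ids = [
    {**row, "condition_source_concept_id": 0} for row in cdm_with_source_ids
]


def persons_matching(rows, field, concept_ids):
    """Return sorted person ids whose `field` value is in the conceptset."""
    return sorted({r["person_id"] for r in rows if r[field] in concept_ids})


standard_set = {1}           # a standard concept
source_set = {11, 12, 13}    # its 'mapped' source concepts

by_standard = persons_matching(
    cdm_with_source_ids, "condition_concept_id", standard_set)
by_source = persons_matching(
    cdm_with_source_ids, "condition_source_concept_id", source_set)
by_source_unpopulated = persons_matching(
    cdm_without_source_ids, "condition_source_concept_id", source_set)
print(by_standard, by_source, by_source_unpopulated)
```

When both fields are populated consistently, the standard and source conceptsets pick out the same people; the source-based version finds nobody in the CDM that skipped the optional field.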