Hi @Shanshan4Q33! Welcome to the community! I co-lead the OHDSI Education Workgroup and am delighted to see you asking questions about how to build concept sets. (When you have a chance, introduce yourself to the community on our longstanding welcome thread: Welcome to OHDSI! - Please introduce yourself)
Now, your topic at hand.
Let’s start from the basics. If you’ve never found your way to EHDEN Academy, that’s a great place to start, with on-demand modules to train you up on the common data model, vocabularies and the various tools and methodologies we have as a community. It’s free to use. I’d highly recommend these courses as they’ll provide much more detail than I’m about to give.
It all starts with searching for the clinical attribute you’re capturing in the data. For example, if I’m running a study on total joint arthroplasty and want to build a concept set of those procedures, I’d likely be starting with either: a) a list of strings (“knee arthroplasty”, “hip arthroplasty”, “shoulder arthroplasty”) or b) a list of clinical codes (e.g. CPT4 codes). One way to do this is to use ATHENA to look up how these terms exist in the vocabulary tables. ATHENA is a vocabulary viewer, and its sister is the Search tab in ATLAS, which can give you more information about the number of records that use that term. You can learn more about ATHENA in this 10-Minute Tutorial by @mik : https://youtu.be/2WdwBASZYLk
One fundamental principle here is that our vocabulary tables aren’t “choosing” anything. They’re a compiled standard leveraging ontological decision rules from a variety of sources. As end users, we’re looking up the term in the tables, and the vocabulary CONCEPT table tells you whether that concept is standard or non-standard and which domain it would be stored in within the common data model structure.
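To make that lookup a bit more concrete, here is a minimal sketch of what searching the CONCEPT table for a string looks like. It assumes a tiny in-memory SQLite table with a handful of made-up rows: the concept IDs, names, codes and the TOY_STANDARD / TOY_SOURCE vocabulary names are placeholders, not real vocabulary content. The column names follow the CONCEPT table, where standard_concept is 'S' for standard, 'C' for classification and empty otherwise.

```python
import sqlite3

# Made-up rows for illustration only; these are NOT real OMOP concepts.
# Columns: concept_id, concept_name, domain_id, vocabulary_id,
#          standard_concept, concept_code
TOY_ROWS = [
    (1, "Toy knee arthroplasty", "Procedure", "TOY_STANDARD", "S", "T-001"),
    (2, "Toy knee arthroplasty, local billing code", "Procedure", "TOY_SOURCE", None, "T-002"),
    (3, "Toy hip arthroplasty", "Procedure", "TOY_STANDARD", "S", "T-003"),
]


def build_toy_vocab():
    """Create an in-memory table shaped like a slice of CONCEPT."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE concept ("
        " concept_id INTEGER PRIMARY KEY,"
        " concept_name TEXT,"
        " domain_id TEXT,"
        " vocabulary_id TEXT,"
        " standard_concept TEXT,"
        " concept_code TEXT)"
    )
    conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?, ?, ?)", TOY_ROWS)
    return conn


def lookup(conn, term):
    """Return (concept_id, concept_name, domain_id, status) for name matches."""
    sql = """
        SELECT concept_id, concept_name, domain_id,
               CASE standard_concept
                    WHEN 'S' THEN 'standard'
                    WHEN 'C' THEN 'classification'
                    ELSE 'non-standard'
               END AS status
        FROM concept
        WHERE LOWER(concept_name) LIKE '%' || LOWER(?) || '%'
        ORDER BY concept_id
    """
    return conn.execute(sql, (term,)).fetchall()


if __name__ == "__main__":
    for row in lookup(build_toy_vocab(), "knee arthroplasty"):
        print(row)
```

The point is only that the table reports the status and the domain; nothing in the query "chooses" a concept for you. Against a real vocabulary you would run the same kind of query on your own CONCEPT table, or just use ATHENA or the ATLAS Search tab.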
Simplest answer: there are multiple ontologies that are capable of representing measurement attributes. The LOINC vocabulary is one and SNOMED is another. The vocabularies are a best-of-breed approach to compiling all the ways that information can be stored, inclusive of attributes like key-pairs. Measurements are particularly interesting because they have multiple components of standardization: 1) the measurement as it is collected, 2) the value set of the measurement (e.g. a qualitative or quantitative value) and 3) the unit of measurement (as applicable).
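As a rough illustration of those three components, here is a small sketch of how a single measurement record carries them. The field names mirror columns in the MEASUREMENT table (measurement_concept_id, value_as_number, value_as_concept_id, unit_concept_id); the class, the helper function and the concept IDs are hypothetical and only there for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MeasurementRecord:
    """Toy stand-in for one row of the MEASUREMENT table."""
    measurement_concept_id: int                 # 1) what was measured
    value_as_number: Optional[float] = None     # 2a) quantitative result
    value_as_concept_id: Optional[int] = None   # 2b) qualitative result
    unit_concept_id: Optional[int] = None       # 3) unit, when applicable


def value_kind(rec: MeasurementRecord) -> str:
    """Say which kind of value the record carries."""
    if rec.value_as_number is not None:
        return "quantitative"
    if rec.value_as_concept_id is not None:
        return "qualitative"
    return "no value recorded"


# Placeholder concept IDs, not real vocabulary content.
numeric_lab = MeasurementRecord(9001, value_as_number=7.2, unit_concept_id=9100)
pos_neg_lab = MeasurementRecord(9002, value_as_concept_id=9200)

if __name__ == "__main__":
    print(value_kind(numeric_lab))
    print(value_kind(pos_neg_lab))
```

Each of the three pieces is standardized separately, which is why building concept sets for measurements often means thinking about more than one concept per record.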
It’ll depend on what your use case is. If you intend on running a network study (aka an analysis across more than one OMOP CDM from different source systems), you need to lean to the standards. But yes, you’re right in assuming that for a non-standard to exist in a data set it has to have been coded that way in the source system and then be retained in the non-standard concept column for that domain.
Exclude = does what it says – will create a negation for that concept ID
Descendants = pulls all the children of that parent concept from that level of detail and below
Map = a mysterious feature that few understand or dare to use. @Chris_Knoll or @anthonysena might have some sage wisdom. Generally, people don’t use this feature because it veers into the world of “off label” OMOP vocabulary use.
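If it helps to see Exclude and Descendants in action, here is a minimal sketch of how a concept set expression could be resolved into a final list of concept IDs. This is my simplified reading of the behaviour described above, not ATLAS's actual implementation: gather everything that is included (with descendants where ticked), gather everything that is excluded the same way, then subtract. The ConceptSetItem class, the resolve function and the toy hierarchy are all hypothetical; the pairs mimic the ancestor/descendant rows of CONCEPT_ANCESTOR, which also lists each concept as its own ancestor. The Mapped option is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptSetItem:
    concept_id: int
    include_descendants: bool = False
    is_excluded: bool = False


# Toy (ancestor_concept_id, descendant_concept_id) pairs; made-up IDs.
# Hierarchy: 100 -> 110, 100 -> 120 -> 121
TOY_ANCESTOR = [
    (100, 100), (100, 110), (100, 120), (100, 121),
    (110, 110),
    (120, 120), (120, 121),
    (121, 121),
]


def expand(item, ancestor_pairs):
    """Return the item's concept plus its descendants if requested."""
    ids = {item.concept_id}
    if item.include_descendants:
        ids.update(d for a, d in ancestor_pairs if a == item.concept_id)
    return ids


def resolve(items, ancestor_pairs):
    """Included concepts minus excluded concepts."""
    included, excluded = set(), set()
    for item in items:
        target = excluded if item.is_excluded else included
        target.update(expand(item, ancestor_pairs))
    return included - excluded


if __name__ == "__main__":
    expression = [
        ConceptSetItem(100, include_descendants=True),
        ConceptSetItem(120, include_descendants=True, is_excluded=True),
    ]
    print(sorted(resolve(expression, TOY_ANCESTOR)))  # [100, 110]
```

Note the difference it makes whether Descendants is ticked on the excluded item: excluding 120 without descendants would remove 120 itself but leave its child 121 in the set.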
Hope that helps you get started in your journey!
Have you joined the OHDSI MS Teams? There are tips here on how to make the most of your time in the community: Join the Journey – OHDSI
Best,
Kristin