Missing concepts in the CONCEPT table (Athena Vocabulary)

bea-estrada · July 21, 2025, 8:07pm

Hello everyone! I’m transforming data from a database of inpatients at a high-complexity hospital in Chile, but I’m having trouble with the semantic mapping. Using USAGI and Athena, I mapped more than 2,000 concepts, ensuring that each one was standard, valid, and in the Condition domain. I also loaded the updated vocabularies into the CDM. The problem is that many of the mapped concepts don’t appear among the loaded concepts (Athena vocabulary). Here is a list of 10 concepts that don’t appear (there are many more):

concept_id	concept_description	concept_code
22350	Edema of larynx	51599000
23653	Foreign body in esophagus	47609003
31317	Dysphagia	40739000
75128	Injury of chest wall	65978000
80182	Dermatomyositis	396230008
133002	Acute osteomyelitis	409780002
196455	Hepatorenal syndrome	51292008
199860	Hernia of abdominal cavity	52515009
201340	Gastritis	4556007
313217	Atrial fibrillation	49436004

Has this happened to any of you? How did you solve it?

rookie_crewkie · July 22, 2025, 9:06am

Hello @bea-estrada, welcome to the community.

These 10 concepts from your example all belong to SNOMED vocabulary — did you make sure to include SNOMED when downloading vocabularies from Athena as CSV? If not, this could explain why they’re present online but not in the downloaded CSV files.
Another community member had a somewhat similar problem: Missing "maps to" values in ICD10 download
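
To see exactly which mapped IDs are absent from the loaded concept table, an anti-join is usually enough. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for the CDM, and the `mapped` table, the reduced column list, and the two pre-loaded rows are assumptions made for the example. The same SELECT should carry over to a real CDM schema with minor changes.

```python
import sqlite3

# Hypothetical stand-in for the CDM: an in-memory SQLite database with a
# cut-down concept table. Against a real CDM, run the SELECT on your own schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, "
    "concept_name TEXT, vocabulary_id TEXT)"
)
# Assumption for the demo: only two of the mapped concepts were loaded.
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?)",
    [(31317, "Dysphagia", "SNOMED"), (201340, "Gastritis", "SNOMED")],
)

# Target concept_ids produced by the mapping (a subset of the list above).
mapped_ids = [22350, 31317, 201340, 313217]
conn.execute("CREATE TEMP TABLE mapped (concept_id INTEGER)")
conn.executemany("INSERT INTO mapped VALUES (?)", [(i,) for i in mapped_ids])

# Anti-join: keep the mapped IDs that have no matching row in concept.
missing = [
    row[0]
    for row in conn.execute(
        "SELECT m.concept_id FROM mapped m "
        "LEFT JOIN concept c ON c.concept_id = m.concept_id "
        "WHERE c.concept_id IS NULL ORDER BY m.concept_id"
    )
]
print(missing)  # [22350, 313217]
```

Looking at how the missing IDs cluster (by vocabulary or by ID range) can help tell a skipped vocabulary apart from a partially failed load.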

bea-estrada · July 22, 2025, 7:27pm

Hi @rookie_crewkie! Yes, I did: more than 1 million SNOMED concepts were loaded into my database.

rookie_crewkie · July 23, 2025, 9:53am

Hello @bea-estrada,

Good, thanks for confirming that. It is strange, though, that I’m getting a different number of SNOMED concepts (27-FEB-25 vocabulary version): 1,089,088, which is 81,569 more than in your case.

Let’s try comparing the distributions:

SELECT standard_concept, count(1) FROM concept
WHERE vocabulary_id = 'SNOMED' GROUP BY 1 ORDER BY 1;
/*
|standard_concept|_col1  |
|----------------|-------|
|S               |346,648|
|                |742,440|
*/

SELECT domain_id, count(1) FROM concept
WHERE vocabulary_id = 'SNOMED' GROUP BY 1 ORDER BY 1;
/*
|domain_id          |_col1  |
|-------------------|-------|
|Condition          |163,376|
|Device             |218,097|
|Drug               |254,991|
|Gender             |10     |
|Geography          |685    |
|Language           |878    |
|Meas Value         |5,122  |
|Meas Value Operator|7      |
|Measurement        |40,148 |
|Metadata           |2,378  |
|Observation        |268,976|
|Procedure          |84,582 |
|Provider           |708    |
|Race               |466    |
|Relationship       |405    |
|Route              |216    |
|Spec Anatomic Site |41,129 |
|Specimen           |2,094  |
|Type Concept       |3,395  |
|Unit               |1,362  |
|Visit              |63     |
*/

SELECT concept_id / 1000000 AS range_1M, count(1) FROM concept
WHERE vocabulary_id = 'SNOMED' GROUP BY 1 ORDER BY 1;
/*
|range_1M|_col1  |
|--------|-------|
|0       |41,156 |
|1       |9,685  |
|3       |148,542|
|4       |297,250|
|35      |24,956 |
|36      |36,107 |
|37      |95,355 |
|40      |85,706 |
|42      |11,693 |
|43      |2,797  |
|44      |28,667 |
|45      |60,038 |
|46      |247,136|
*/

If you spot any differences, it might make sense to reload the vocabularies from Athena.
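
Comparing the distributions by eye gets tedious, so here is a minimal sketch of how the two sets of counts could be diffed. The `diff_counts` helper is hypothetical, the reference numbers are copied from the range query above, and the local numbers are invented purely for illustration. Note that all ten example IDs are below 1,000,000, so range 0 is the first bucket worth checking.

```python
def diff_counts(reference, local):
    """Return {key: (reference_count, local_count)} for keys whose counts differ."""
    keys = sorted(set(reference) | set(local))
    return {
        k: (reference.get(k, 0), local.get(k, 0))
        for k in keys
        if reference.get(k, 0) != local.get(k, 0)
    }

# Reference counts per 1M concept_id range (subset of the output above).
reference = {0: 41156, 1: 9685, 3: 148542, 4: 297250}
# Hypothetical local counts, invented for this example only.
local = {0: 12000, 1: 9685, 3: 148542, 4: 297250}

for key, (ref_n, loc_n) in diff_counts(reference, local).items():
    print(f"range {key}: reference={ref_n:,} local={loc_n:,} missing={ref_n - loc_n:,}")
# range 0: reference=41,156 local=12,000 missing=29,156
```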

bea-estrada · July 24, 2025, 1:13am

@rookie_crewkie I also had several problems uploading it to the CDM. Is there any possibility that you could share the download link for your vocabulary with me? I would really appreciate it.

rookie_crewkie · July 24, 2025, 9:51am

@bea-estrada,

What kind of problems did you have? Some records might’ve been skipped due to CSV parsing errors (quoting/delimiter issues, etc), although I don’t think that it’s the case for the concepts in your example.
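
To illustrate how a quoting issue can silently skip records, here is a small sketch using Python's `csv` module. It assumes the files are tab-delimited with no quoting convention, so a double quote inside a name is just a character. The sample rows and the name with the stray quote are invented for the demonstration. A parser that treats `"` as a quote character swallows every following row into one field, which also makes that field far longer than expected.

```python
import csv
import io

# Hypothetical tab-delimited sample; row 1 has a name that starts with an
# unbalanced double quote (invented for this example).
sample = (
    "concept_id\tconcept_name\tvocabulary_id\n"
    '1\t"Invented name with a stray quote\tSNOMED\n'
    "2\tGastritis\tSNOMED\n"
    "3\tDysphagia\tSNOMED\n"
)

# Default dialect: '"' opens a quoted field, so the tabs and newlines that
# follow are absorbed into concept_name until the input ends.
default_rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

# Quoting disabled: every tab is a delimiter and '"' is an ordinary character.
raw_rows = list(csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE))

# Data rows seen by each parser (header excluded).
print(len(default_rows) - 1, len(raw_rows) - 1)  # 1 3
```

Import tools usually expose a similar setting; whichever one is used, the point is to make sure no character in the data is interpreted as a quote.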

Sorry, but it doesn’t work like that: the links are per account and short-lived anyway. It shouldn’t be too difficult, though: create a new download in Athena, get the link to a zip with the CSVs, and load them into your database alongside the current ones so you can compare them.
Athena requires an account to download the vocabularies, but it’s open for registration and free for everyone.

bea-estrada · July 24, 2025, 2:04pm

@rookie_crewkie, I downloaded all the concepts last week. When I tried to load the tab-delimited (\t) concept table, it reported that the concept_name field exceeded 255 characters; in fact, I had to allow up to 1,000 characters for it to load without errors. On reviewing the file, I also found inconsistencies with the delimiters: some concepts were loaded without their fields being separated correctly, especially those associated with drugs. I will download the concepts again and verify that the same number of SNOMED concepts is loaded as in yours. That should work fine. Thank you very much for responding.
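
One way to verify the counts independently of the database loader is to count rows per vocabulary straight from the raw file and compare the result with `SELECT count(1) ... WHERE vocabulary_id = 'SNOMED'`. The sketch below is a rough illustration: `count_by_vocabulary` is a hypothetical helper, the sample rows stand in for the real file (the last row is invented), and it assumes a tab-delimited file with a header line and no quoting.

```python
import io

def count_by_vocabulary(lines):
    """Count data rows per vocabulary_id in a tab-delimited file with a header."""
    header = next(lines).rstrip("\r\n").split("\t")
    col = header.index("vocabulary_id")  # look the column up by name
    counts = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        counts[fields[col]] = counts.get(fields[col], 0) + 1
    return counts

# Tiny stand-in for the downloaded file; the last row is invented.
sample = io.StringIO(
    "concept_id\tconcept_name\tdomain_id\tvocabulary_id\n"
    "31317\tDysphagia\tCondition\tSNOMED\n"
    "201340\tGastritis\tCondition\tSNOMED\n"
    "1\tInvented example row\tDrug\tRxNorm\n"
)
file_counts = count_by_vocabulary(sample)
print(file_counts)  # {'SNOMED': 2, 'RxNorm': 1}
```

With the real download, the same function could be given an open file handle instead of the `io.StringIO` sample. If the file count and the database count disagree, the loader dropped or merged rows.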

bea-estrada · July 27, 2025, 3:51pm

@rookie_crewkie I finally solved it. Loading through pgAdmin did not load the data correctly. I did it from the console instead, and everything is fine! Thanks so much for your help.

rookie_crewkie · July 28, 2025, 9:33am

Glad it helped. Good luck with your project!