Oncology Derived SEER Stage concepts

Hello,

The concept names for the Derived SEER Pathological Stage use Roman numerals, while the concept names for the Derived SEER Clinical Stage are a mix of Roman and Arabic numerals. Should the nomenclature be consistent across both sets of concepts?

Derived SEER Path Stg Grp (35918668)

Stage 0
Stage 0A
Stage 0is
Stage I
Stage IA
Stage IA1
Stage IA2
Stage IB
Stage IB1
Stage IB2
Stage IC
Stage II
Stage IIA
Stage IIA1
Stage IIA2
Stage IIB
Stage IIC
Stage III
Stage IIIA
Stage IIIB
Stage IIIC
Stage IIIC1
Stage IIIC2
Stage IS
Stage IV
Stage IVA
Stage IVA1
Stage IVA2
Stage IVB
Stage IVC
Unknown

Derived SEER Clin Stg Grp (35918772)

Stage 0
Stage 0A
Stage 0is
Stage 2
Stage 2A
Stage 2A1
Stage 2A2
Stage 2B
Stage 2C
Stage 3
Stage 3A
Stage 3B
Stage 3C
Stage 3C1
Stage 3C2
Stage 4
Stage 4A
Stage 4A1
Stage 4A2
Stage 4B
Stage 4C
Stage I
Stage IA
Stage IA1
Stage IA2
Stage IB
Stage IB1
Stage IC
Stage IS
Stage OC
Unknown

Also, both sets are missing concepts for Stage IA3, Stage IEA, Stage IIIA2, Stage IIID and Stage IIIE. These are less frequently assigned but would be nice to have.

Thanks,
Tina

Queries

select concept_name, concept_code, vocabulary_id
from concept
where concept_id in (
    select concept_id_2
    from concept_relationship
    where concept_id_1 = 35918772 -- Derived SEER Clin Stg Grp
    and relationship_id = 'Has Answer')
order by concept_name;

select concept_name, concept_code, vocabulary_id
from concept
where concept_id in (
    select concept_id_2
    from concept_relationship
    where concept_id_1 = 35918668 -- Derived SEER Path Stg Grp
    and relationship_id = 'Has Answer')
order by concept_name;

@tseto:

We decided to turn all of them into Arabic numerals. In the source vocabularies, there is an ugly mix of Roman and Arabic. Roman numerals are really bad for searching, because when you search for I (=1) you will also get II, III, IV, VI, etc.

Will look at the missing ones. Thanks, and please keep it coming.
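
The search problem described above can be illustrated with a small sketch. It loads a toy, in-memory `concept` table and runs a prefix search for stage 1 in both notations. The table contents are illustrative only: the Roman-numeral names come from the lists in this thread, while the Arabic-numeral names (such as "Stage 1" and "Stage 1A") are assumed renamings, not confirmed concept names. The `search` helper is hypothetical.

```python
import sqlite3

# Toy in-memory table. The Arabic-numeral names are assumed renamings,
# not concept names confirmed in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("create table concept (concept_name text)")
names = [
    "Stage I", "Stage IA", "Stage II", "Stage IIA", "Stage III", "Stage IV",
    "Stage 1", "Stage 1A", "Stage 2", "Stage 2A", "Stage 3", "Stage 4",
]
conn.executemany("insert into concept values (?)", [(n,) for n in names])


def search(prefix):
    """Return concept names that start with the given prefix, sorted."""
    rows = conn.execute(
        "select concept_name from concept "
        "where concept_name like ? order by concept_name",
        (prefix + "%",),
    )
    return [row[0] for row in rows]


# Searching for stage 1 in Roman notation also pulls in stages II, III and IV.
roman_hits = search("Stage I")
# The same search in Arabic notation returns only stage 1 and its substages.
arabic_hits = search("Stage 1")

print(roman_hits)
print(arabic_hits)
```

With Roman numerals every stage that begins with "I" matches, so the caller has to filter the result afterwards. With Arabic numerals the prefix alone is enough.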

Thanks, Christian. I have one more minor correction. The concept_name ‘Stage 0is’ has the concept_code ‘3610@01S’ in the ‘Derived SEER Clin Stg Grp’, but it should be ‘3610@0IS’. This error occurs only in the clinical stage group; the path concept_code is correct.
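
A check along these lines could catch similar typos. This is only a sketch: it assumes the part of the concept_code after the `@` is the stage label from the concept_name in upper case, which fits the ‘Stage 0is’ example but is not a documented rule. The helper names are hypothetical.

```python
def expected_code_suffix(concept_name):
    """Derive the code suffix from the name.

    Assumption: the suffix is the stage label in upper case,
    e.g. 'Stage 0is' gives '0IS'.
    """
    prefix = "Stage "
    if concept_name.startswith(prefix):
        label = concept_name[len(prefix):]
    else:
        label = concept_name
    return label.upper()


def has_matching_code(concept_name, concept_code):
    """Compare the text after '@' in the code with the expected suffix."""
    suffix = concept_code.split("@", 1)[-1]
    return suffix == expected_code_suffix(concept_name)


# The code as currently published: digit one where the letter I belongs.
reported = has_matching_code("Stage 0is", "3610@01S")
# The corrected code.
corrected = has_matching_code("Stage 0is", "3610@0IS")

print(reported, corrected)
```

Run over the output of the queries above, a check like this would list every answer concept whose code and name disagree.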

I’m revisiting adding oncology staging data to our OMOP. I downloaded the latest Athena vocabulary but noticed the corrections to the NAACCR vocabulary for clinical and path stage have not been done yet. Are we to use the NCIt vocabulary for stage? I only see TNM concepts in the NCIt vocabulary but not staging concepts. Is there any documentation on how to integrate staging data? I tried looking in https://github.com/OHDSI/OncologyWG but didn’t find anything.

@tseto, have you seen the ‘Cancer Modifier’ vocab #141 in Athena?

It has staging concepts.
