First of all, thank you for publishing the Oncology OnRamp documentation. It has been very helpful for reviewing and refining our oncology ETL and mapping strategy in OMOP CDM. We are currently going through the cancer-related domains step by step, and today I wanted to start with the “Nodes” section. I will likely post additional observations and questions regarding other oncology domains later as well.
While reviewing the current Cancer Modifier “Nodes” concepts, I noticed a potential limitation related to real-world surgical pathology workflows.
The current vocabulary mainly provides highly granular station- or level-specific lymph node concepts (e.g. 2R, 4L, 10R, Axillary Level I, Axillary Level II, etc.). These work well when nodal metastasis is documented at an exact anatomical level.
However, in real-world pathology workflows, lymph nodes are frequently submitted and reported as grouped nodal packets rather than as individually separated stations or levels. Common examples include:
- “LN #2,#4: 3/12 positive”
- “Paratracheal lymph nodes: metastatic carcinoma”
- “Intrapulmonary lymph nodes: 1/6 positive”
- “Axillary lymph nodes: 2/15 positive”
In these situations, exact decomposition into individual stations or nodal levels is often not possible. Mapping the same metastatic count to multiple level-specific concepts can introduce double counting and distort downstream analyses such as nodal burden calculation or staging reconstruction.
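To make the double-counting concern concrete, here is a minimal sketch. It assumes, purely for illustration, that the packet "LN #2,#4: 3/12 positive" gets mapped to two station-level concepts (labelled `2R` and `4R` here, since the report does not give laterality) and that a downstream analysis sums counts across rows. The variable names and row layout are hypothetical, not an actual ETL.

```python
# Hypothetical grouped packet taken from the report line "LN #2,#4: 3/12 positive".
packet = {"label": "LN #2,#4", "positive": 3, "examined": 12}

# Assumption for illustration only: the ETL maps the packet to two
# station-level concepts because no grouped concept is available.
station_concepts = ["2R", "4R"]

# Naive mapping: copy the packet counts onto every station-level concept.
naive_rows = [
    {"concept": c, "positive": packet["positive"], "examined": packet["examined"]}
    for c in station_concepts
]

# Summing per-concept rows, as a nodal burden calculation might do.
naive_positive = sum(r["positive"] for r in naive_rows)
naive_examined = sum(r["examined"] for r in naive_rows)

print(f"reported: {packet['positive']}/{packet['examined']}")
print(f"after naive mapping: {naive_positive}/{naive_examined}")
```

With two target concepts the reported 3/12 becomes 6/24 after summation, and the inflation grows with the number of stations the packet is spread across.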
Previously, broader SNOMED concepts such as:
- Structure of paratracheal lymph node
- Structure of intrapulmonary lymph node
- Axillary lymph node structure
were sometimes used because they better reflected actual specimen grouping in surgical pathology workflows.
Of course, under the current guidance, these cases could technically be represented using measurement_concept_id = 0 and a SNOMED concept in measurement_source_concept_id. However, this significantly reduces downstream usability because many OHDSI tools and analyses primarily rely on standard concepts in measurement_concept_id.
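As a rough illustration of the usability problem, the sketch below mimics a cohort-style filter that selects rows by standard concept. All concept IDs are placeholders I made up for the example (they are not real OMOP concept_ids), and the row layout is simplified to the two fields discussed above.

```python
# Placeholder IDs for illustration only; these are NOT real OMOP concept_ids.
HYPOTHETICAL_SNOMED_PARATRACHEAL_LN = 900000001
HYPOTHETICAL_NODE_2R = 900000002
HYPOTHETICAL_NODE_4R = 900000003

measurements = [
    # Station-level finding mapped to a standard concept.
    {
        "measurement_id": 1,
        "measurement_concept_id": HYPOTHETICAL_NODE_2R,
        "measurement_source_concept_id": HYPOTHETICAL_NODE_2R,
    },
    # Grouped packet: no standard target, so only the source concept is kept.
    {
        "measurement_id": 2,
        "measurement_concept_id": 0,
        "measurement_source_concept_id": HYPOTHETICAL_SNOMED_PARATRACHEAL_LN,
    },
]

# A concept set built from standard concepts, as many tools would do.
node_concept_set = {HYPOTHETICAL_NODE_2R, HYPOTHETICAL_NODE_4R}

selected = [
    m["measurement_id"]
    for m in measurements
    if m["measurement_concept_id"] in node_concept_set
]
unmapped = [
    m["measurement_id"] for m in measurements if m["measurement_concept_id"] == 0
]

print("selected by standard concept:", selected)
print("only reachable via source concept:", unmapped)
```

The grouped-packet row is still in the data, but any analysis that filters on `measurement_concept_id` alone never sees it.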
In breast cancer specifically, several clinically important regional nodal groups are currently missing or only partially represented in the Cancer Modifier vocabulary. For example:
- Axillary Level I and II are available, but Axillary Level III is missing
- There is currently no broader “Axillary lymph nodes” concept for grouped axillary nodal packets
- Internal mammary lymph nodes (IMN)
- Supraclavicular lymph nodes
All of these are clinically important regional nodal groups in breast cancer staging and pathology workflows.
I was wondering whether it might make sense to introduce intermediate regional nodal basin concepts and hierarchical relationships into the Cancer Modifier vocabulary.
A possible hierarchy could look something like this:
- Superior mediastinal lymph nodes
  - Paratracheal lymph nodes
    - 2R Upper paratracheal
    - 2L Upper paratracheal
    - 4R Lower paratracheal
    - 4L Lower paratracheal
  - 3A Prevascular lymph nodes
  - 3P Retrotracheal lymph nodes
- Intrapulmonary lymph nodes
  - 10R/L Hilar
  - 11R/L Interlobar
  - 12R/L Lobar
  - 13R/L Segmental
  - 14R/L Subsegmental
- Axillary lymph nodes
  - Axillary Level I
  - Axillary Level II
  - Axillary Level III
  - Intramammary lymph nodes
- Internal mammary lymph nodes
- Supraclavicular lymph nodes
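To sketch how such a hierarchy might be used, the snippet below encodes part of the proposed tree as a child-to-parent map and rolls up findings recorded at the group level. The names `PARENT`, `ancestors`, and `positives_under` are hypothetical helpers for this example; in the vocabulary itself the same idea would presumably be expressed through hierarchical relationships between concepts rather than a Python dict.

```python
# Child -> parent map for part of the proposed hierarchy (illustrative only).
PARENT = {
    "Paratracheal lymph nodes": "Superior mediastinal lymph nodes",
    "2R Upper paratracheal": "Paratracheal lymph nodes",
    "2L Upper paratracheal": "Paratracheal lymph nodes",
    "4R Lower paratracheal": "Paratracheal lymph nodes",
    "4L Lower paratracheal": "Paratracheal lymph nodes",
    "Axillary Level I": "Axillary lymph nodes",
    "Axillary Level II": "Axillary lymph nodes",
    "Axillary Level III": "Axillary lymph nodes",
}


def ancestors(concept):
    """Return the concept itself followed by all of its ancestors."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def positives_under(group, findings):
    """Sum positive nodes for findings recorded at or below the given group."""
    return sum(f["positive"] for f in findings if group in ancestors(f["concept"]))


# Each packet is stored once, at the level the pathology report supports.
findings = [
    {"concept": "Paratracheal lymph nodes", "positive": 3, "examined": 12},
    {"concept": "Axillary lymph nodes", "positive": 2, "examined": 15},
]

mediastinal = positives_under("Superior mediastinal lymph nodes", findings)
axillary = positives_under("Axillary lymph nodes", findings)
station_2r = positives_under("2R Upper paratracheal", findings)

print(mediastinal, axillary, station_2r)
```

Each packet is counted exactly once at the basin level, and nothing is attributed to a specific station such as 2R that the report did not actually identify.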
I think this could better align the vocabulary with real-world pathology workflows while preserving analytic usability and avoiding forced arbitrary assignment to overly granular nodal concepts.
Thank you!