@Christian_Reich
The use case for connecting NLP-derived pathology findings to the pathology procedure is that people will want to see the textual evidence behind an NLP-derived or chart-abstracted data point. Most real-world, historical pathology findings are stuck in clinical text. For oncology, this will be very important.
But maybe a better approach would be:
- The pathology procedure belongs in the ‘PROCEDURE_OCCURRENCE’ table.
- The pathology report that records the pathology findings of the pathology procedure belongs in the ‘NOTE’ table.
- The pathology report in the ‘NOTE’ table should be related to the pathology procedure in the ‘PROCEDURE_OCCURRENCE’ table via the note_event_id/note_event_field_concept_id (or via FACT_RELATIONSHIP in CDM 5.X).
- The pathology findings (like anatomic site, histology, grade, staging and lymphatic invasion, etc.) belong in the ‘MEASUREMENT’ domain/table.
- The pathology findings in the ‘MEASUREMENT’ domain/table should be related to the pathology report via the addition of two new fields to the NOTE_NLP table:
  - note_nlp_event_id
  - note_nlp_event_field_concept_id
These two new fields would replace ‘note_nlp_concept_id’, which is already deficient: it can only represent non-EAV structures like ‘CONDITION_OCCURRENCE’ or ‘PROCEDURE_OCCURRENCE’, not EAV structures like ‘MEASUREMENT’ or ‘OBSERVATION’.
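To make the proposed linkage concrete, here is a rough sketch of how a finding could be traced back to its textual evidence and to the pathology procedure. It uses an in-memory SQLite database with heavily trimmed tables. The two NOTE_NLP columns are the proposed fields and do not exist in the CDM today; the field concept ids are negative placeholders rather than real concept ids, and the rows are made-up toy data.

```python
import sqlite3

# Placeholders, NOT real OMOP concept ids. They stand in for the field
# concepts that would identify MEASUREMENT.measurement_id and
# PROCEDURE_OCCURRENCE.procedure_occurrence_id.
MEASUREMENT_ID_FIELD_CONCEPT = -1
PROCEDURE_ID_FIELD_CONCEPT = -2


def build_demo_db():
    """Create trimmed-down tables and one made-up pathology case."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE procedure_occurrence (
            procedure_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            procedure_source_value TEXT
        );
        CREATE TABLE note (
            note_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            note_text TEXT,
            note_event_id INTEGER,
            note_event_field_concept_id INTEGER
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            measurement_source_value TEXT,
            value_source_value TEXT
        );
        CREATE TABLE note_nlp (
            note_nlp_id INTEGER PRIMARY KEY,
            note_id INTEGER,
            snippet TEXT,
            note_nlp_event_id INTEGER,               -- proposed field
            note_nlp_event_field_concept_id INTEGER  -- proposed field
        );
    """)
    conn.execute("INSERT INTO procedure_occurrence VALUES (500, 100, 'breast biopsy')")
    conn.execute(
        "INSERT INTO note VALUES (1, 100, "
        "'Invasive ductal carcinoma, histologic grade 2.', 500, ?)",
        (PROCEDURE_ID_FIELD_CONCEPT,),
    )
    conn.execute("INSERT INTO measurement VALUES (10, 100, 'histologic grade', '2')")
    conn.execute(
        "INSERT INTO note_nlp VALUES (1, 1, 'histologic grade 2', 10, ?)",
        (MEASUREMENT_ID_FIELD_CONCEPT,),
    )
    return conn


def trace_measurement(conn, measurement_id):
    """Return (snippet, note_id, procedure_occurrence_id) for a finding."""
    sql = """
        SELECT nn.snippet, n.note_id, p.procedure_occurrence_id
        FROM note_nlp nn
        JOIN note n ON n.note_id = nn.note_id
        LEFT JOIN procedure_occurrence p
               ON p.procedure_occurrence_id = n.note_event_id
              AND n.note_event_field_concept_id = ?
        WHERE nn.note_nlp_event_id = ?
          AND nn.note_nlp_event_field_concept_id = ?
        ORDER BY nn.note_nlp_id
    """
    params = (PROCEDURE_ID_FIELD_CONCEPT, measurement_id, MEASUREMENT_ID_FIELD_CONCEPT)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    print(trace_measurement(build_demo_db(), 10))
```

Because the event id is paired with a field concept, the same two columns could point at OBSERVATION or any other table without further schema changes.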
@Andrew, the undocumented convention of NOTE_NLP only representing the ‘Condition’ domain seems very limiting.
There is already a type concept of ‘NLP Derived’, so the ‘measurement_type_concept_id’ could be set to ‘NLP Derived’. All NLP-related metadata could continue to live in the NOTE_NLP table, but all the analytical and UI tools could begin using NLP data today instead of it being stranded in the NOTE_NLP table. Folks who don’t trust NLP could filter out any ‘NLP Derived’ data using the type concept.
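As a small illustration of that filter, something like the following would do, again against a trimmed MEASUREMENT table in SQLite. The type concept ids are negative placeholders, not the real ids of ‘NLP Derived’ or any other type concept; a real query would look them up in the vocabulary. The rows are toy data.

```python
import sqlite3

# Placeholders, NOT real OMOP concept ids.
NLP_DERIVED_TYPE_CONCEPT = -10  # stands in for the 'NLP Derived' type concept
OTHER_TYPE_CONCEPT = -20        # stands in for any non-NLP type concept


def build_measurements():
    """Create a trimmed MEASUREMENT table with one NLP and one non-NLP row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            measurement_source_value TEXT,
            measurement_type_concept_id INTEGER
        )
    """)
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?)",
        [
            (1, 100, "histologic grade", NLP_DERIVED_TYPE_CONCEPT),
            (2, 100, "histologic grade", OTHER_TYPE_CONCEPT),
        ],
    )
    return conn


def measurement_ids(conn, include_nlp=True):
    """List measurement ids, optionally dropping NLP-derived rows."""
    sql = "SELECT measurement_id FROM measurement"
    params = ()
    if not include_nlp:
        sql += " WHERE measurement_type_concept_id <> ?"
        params = (NLP_DERIVED_TYPE_CONCEPT,)
    sql += " ORDER BY measurement_id"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = build_measurements()
    print(measurement_ids(conn))                     # all rows
    print(measurement_ids(conn, include_nlp=False))  # NLP-derived rows dropped
```

Tools that ignore the type concept would see both rows, and anyone who distrusts NLP adds a single predicate.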