What you do with an EAV-type situation, where you have a MEASUREMENT_CONCEPT_ID and a VALUE_AS_CONCEPT_ID combination
Here’s how we handle it. We have an NLP pipeline that extracts TNM staging data from surgical pathology reports. So it’ll show up in the note as “Lorem ipsum dolor sit amet T1N0MX consectetur adipiscing elit. Nulla rutrum facilisis…” and our pipeline will extract a row for the report with “Tstage”, “Nstage”, and “Mstage” columns - in this case, the values of those would be 1, 0, and X.
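As a rough illustration only (the actual pipeline isn't shown here), the extraction step could be sketched with a simple regular expression. The pattern, the `extract_tnm` function name, and the set of allowed stage values are all assumptions made for this example; real pathology reports will need a much more tolerant pattern.

```python
import re

# Hypothetical pattern: an optional "p" prefix followed by compact T/N/M categories.
# Real reports vary widely (spacing, subcategories such as "T1a", "c"/"y" prefixes).
TNM_PATTERN = re.compile(
    r"\bp?T(?P<Tstage>[0-4X])N(?P<Nstage>[0-3X])M(?P<Mstage>[01X])\b"
)


def extract_tnm(note_text):
    """Return a dict with Tstage/Nstage/Mstage, or None if no staging string is found."""
    match = TNM_PATTERN.search(note_text)
    return match.groupdict() if match else None


staging = extract_tnm("Lorem ipsum dolor sit amet T1N0MX consectetur adipiscing elit.")
print(staging)
```

For the example note text above, this yields one row per report with the three stage columns populated.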
In instances where there’s a standard concept for the NLP-extracted value, we use that as the note_nlp_concept_id - so in the above, the note_nlp_concept_id is 40481057 (SNOMED for “pT1a category.”)
In instances where we need the EAV structure because the value doesn’t exist as a CONCEPT_ID, we use the lexical_variant column to store the results. So take PHQ-9 (a screening instrument for depression) for example. We set the note_nlp_concept_id as 3042932 (LOINC for “Patient Health Questionnaire 9 item (PHQ-9) total score [Reported]”) and then put the actual score (e.g. 13, 17, 21) in lexical_variant. This is probably not ideal but it’s the only way we could think of to allow for the EAVish structure you’re describing here.
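To make the two cases concrete, here is a minimal sketch of how a NOTE_NLP row might be assembled under this convention. The lookup tables and the `to_note_nlp_row` helper are hypothetical, the concept IDs are simply the ones quoted above (not re-verified here), and storing the raw value in lexical_variant for the first case is an assumption of the sketch.

```python
# Hypothetical lookup: extracted values that have a standard concept of their own.
VALUE_CONCEPTS = {("Tstage", "1"): 40481057}

# Hypothetical lookup: "attribute" concepts for values with no concept of their own.
ATTRIBUTE_CONCEPTS = {"PHQ-9": 3042932}


def to_note_nlp_row(note_id, attribute, value):
    """Build a partial NOTE_NLP row (only the columns discussed here)."""
    value = str(value)
    # Case 1: the extracted value itself maps to a standard concept.
    concept_id = VALUE_CONCEPTS.get((attribute, value))
    if concept_id is None:
        # Case 2 (EAV-ish): use the attribute's concept and keep the value as text.
        concept_id = ATTRIBUTE_CONCEPTS[attribute]
    return {
        "note_id": note_id,
        "note_nlp_concept_id": concept_id,
        "lexical_variant": value,
    }


print(to_note_nlp_row(1, "Tstage", "1"))
print(to_note_nlp_row(2, "PHQ-9", 13))
```

Note that in the second case the score is only recoverable by parsing lexical_variant, which is the main drawback of the workaround.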
- Whether or not the NOTE_NLP table must only contain Conditions
As you can doubtless see from the above, we would argue that it shouldn’t only contain conditions, because sometimes what you’re extracting is a measurement. There are plenty of other good use cases for representing NLP-derived data that maps to other domains (NLP-extracted ejection fraction data from echocardiography reports, etc.). Ultimately, I’d vote for what @mgurley describes above as the best way to move forward:
There already is a type concept of ‘NLP Derived’, so the ‘measurement_type_concept_id’ could be ‘NLP Derived’. All NLP-related metadata could continue to live within the NOTE_NLP table. But all the analytical and UI tools could begin using NLP data today, instead of it being stranded in the NOTE_NLP table. Folks that don’t trust NLP could filter out any ‘NLP Derived’ data using the type concept.
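The filtering step in that proposal could look something like the sketch below. The rows are plain dicts standing in for MEASUREMENT records, `exclude_nlp_derived` is a hypothetical helper, and the type concept IDs are placeholders: look up the real ‘NLP Derived’ concept ID in your own vocabulary tables before using anything like this.

```python
# Placeholders only, not real concept IDs.
NLP_DERIVED_TYPE_CONCEPT_ID = -1
OTHER_TYPE_CONCEPT_ID = -2


def exclude_nlp_derived(measurements, nlp_type_concept_id=NLP_DERIVED_TYPE_CONCEPT_ID):
    """Keep only rows whose type concept is not the 'NLP Derived' one."""
    return [
        row for row in measurements
        if row["measurement_type_concept_id"] != nlp_type_concept_id
    ]


rows = [
    # PHQ-9 score written to MEASUREMENT by an NLP pipeline.
    {"measurement_id": 1, "measurement_concept_id": 3042932,
     "value_as_number": 13, "measurement_type_concept_id": NLP_DERIVED_TYPE_CONCEPT_ID},
    # The same kind of score from a structured source.
    {"measurement_id": 2, "measurement_concept_id": 3042932,
     "value_as_number": 17, "measurement_type_concept_id": OTHER_TYPE_CONCEPT_ID},
]

trusted = exclude_nlp_derived(rows)
print([row["measurement_id"] for row in trusted])
```

In a real CDM this would just be a WHERE clause on measurement_type_concept_id; the point is that the NLP provenance stays queryable without keeping the data out of the domain tables.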