NLP Workgroup Discussion Thread

Today the NLP workgroup met.
There was an email discussion calling for a schema to capture NLP results.

I would like to initiate a discussion on such a schema. The proposal is at the wiki page below:

http://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg#proposal_for_concepts_detected_by_nlp

Please edit the wiki page to add your input to this schema or use the forum to discuss it.

An earlier schema discussion is at this link.

The presentation suggests supporting multiple types of NLP-detected entities.

These would be:

Procedure
Disease/Disorder
Medication
Lab
Sign/Symptom
Anatomical Site

Perhaps we can pick one of the domains (e.g., Disease/Disorder) and further discuss the columns for this domain. Will we need multiple note_nlp_xxxxx tables then?
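
To make the single-table versus per-domain-table question concrete, here is a minimal Python sketch. Everything in it is hypothetical and not taken from the wiki proposal: the NlpMention record, its column names, and the example rows are made up for illustration. It assumes one shared row shape with an entity-type column, from which per-domain views are derived by filtering.

```python
from collections import defaultdict
from dataclasses import dataclass

# Entity types listed in the presentation.
ENTITY_TYPES = {
    "Procedure", "Disease/Disorder", "Medication",
    "Lab", "Sign/Symptom", "Anatomical Site",
}


@dataclass(frozen=True)
class NlpMention:
    """Hypothetical row shape for one NLP-detected concept (illustration only)."""
    note_id: int
    entity_type: str
    text: str
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")
        if not 0 <= self.start < self.end:
            raise ValueError("offsets must satisfy 0 <= start < end")


def split_by_domain(mentions):
    """Group rows by entity type, emulating per-domain tables from one table."""
    tables = defaultdict(list)
    for mention in mentions:
        tables[mention.entity_type].append(mention)
    return dict(tables)


# Made-up example rows.
mentions = [
    NlpMention(1, "Disease/Disorder", "type 2 diabetes", 10, 25),
    NlpMention(1, "Medication", "metformin", 40, 49),
    NlpMention(2, "Disease/Disorder", "hypertension", 5, 17),
]
by_domain = split_by_domain(mentions)
```

Under this assumption, separate note_nlp_xxxxx tables would only be needed if a domain requires columns that the other domains lack.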

Hi,

I’m pursuing some ideas utilizing the OMOP CDM ‘NOTE’ and ‘NOTE_NLP’ tables, but I don’t have any sample data. Does anyone know of any publicly available sample data sets for these tables?

Cheers!

James

James,

You can use MTSamples (http://mtsamples.com/), which is a collection of fake notes. We have annotated some MTSamples notes and can share them with you too. Thanks.

Hua

Is this group active? Do you have WG calls?

Yes, check out the WG info here: https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg

Hi,

I would also like to contribute and learn. How can I join the NLP workgroup?
If you can, please add me: tmshah@ismnet.com

Thanks,
Tarun Shah

Hello, I would like to be part of this WG as well. My email is selvasathappan36@gmail.com

Hi Hua. May I ask you to share the annotated samples with me too? I joined the OHDSI community a few days ago and I’m thinking of joining the NLP working group. My email is asivura@icloud.com

Hello,

I am interested in learning NLP. I am a beginner. If there are any tasks that I can volunteer for, please let me know. Do you have any opportunities in your project?

We have the dataset at the OHDSI NLP GitHub: https://github.com/OHDSI/NLPTools.

Here is a link to the CLAMP outputs, which are similar to the annotated data: https://github.com/OHDSI/NLPTools/tree/master/clamp-wrapper/output_xmi

Sure. Welcome! Why don’t you join our monthly meeting and see what is going on here? Then you can decide which project you can contribute to. The call-in information can be found here: https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg (at the bottom of the page). Thanks.

Hua

Sure, HuaXu, thanks. I will join the Sep 11th call without fail.

Thanks for sharing. But it’s the output of the CLAMP pipeline, isn’t it? It would be great to have some manually checked data to use as ground truth.

I also have a question about tooling to label data manually. Could you recommend anything?

I’ve learned a lot listening in on the working group calls but have a fundamental question:

I cannot, for the life of me, work out how to get the CLAMP wrapper configured and running.

I’m labeling a messy problem history data set… I think I would like to use Usagi to identify consistent concepts associated with my headers but will want to use CLAMP to pick out concepts associated with the free text in the subsequent “explain” parts.

My tentative development plan was going to be to have human annotators mark up a gold-standard set of our problem lists in CLAMP and then use its machine learning tooling to refine the stock OHDSI pipeline. Will this approach work?

I’m also very interested in practical ways to use the NLP objects you’ve designed. My first instinct was that when I discovered (for example) a specific medication, I would write a row to the DRUG_EXPOSURE table. But am I jumping too far from the intent of the NLP work?

These may be remedial questions - but I’d be grateful for a little help coming up to speed with ways to operationalize this tooling.

Kate
UM School of Dentistry
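
One way to picture the medication question above is the following minimal Python sketch, which treats NLP output as candidates and only promotes affirmed, mapped mentions to DRUG_EXPOSURE-style rows. Everything here is an assumption for illustration, not the workgroup's recommended practice: the MedicationMention shape, the negated flag, the example concept IDs, and the NLP_DERIVED_TYPE_CONCEPT_ID placeholder are hypothetical and would need to be replaced by whatever the pipeline and vocabulary actually provide.

```python
from dataclasses import dataclass
from datetime import date

# Placeholder only: look up the real type concept for NLP-derived records
# in your vocabulary before using this for anything.
NLP_DERIVED_TYPE_CONCEPT_ID = 0


@dataclass(frozen=True)
class MedicationMention:
    """Hypothetical shape of one medication mention found by an NLP pipeline."""
    person_id: int
    note_date: date
    drug_concept_id: int  # 0 means the mention could not be mapped
    negated: bool         # True for text such as "denies taking aspirin"


def to_drug_exposure_candidates(mentions):
    """Keep affirmed, mapped mentions and shape them like DRUG_EXPOSURE rows."""
    rows = []
    for m in mentions:
        if m.negated or m.drug_concept_id == 0:
            continue  # skip negated or unmapped mentions
        rows.append({
            "person_id": m.person_id,
            "drug_concept_id": m.drug_concept_id,
            "drug_exposure_start_date": m.note_date,
            "drug_type_concept_id": NLP_DERIVED_TYPE_CONCEPT_ID,
        })
    return rows


# Made-up mentions; the concept IDs are not real vocabulary entries.
mentions = [
    MedicationMention(1, date(2018, 9, 1), 1001, negated=False),
    MedicationMention(1, date(2018, 9, 1), 1002, negated=True),
    MedicationMention(2, date(2018, 9, 3), 0, negated=False),
]
candidates = to_drug_exposure_candidates(mentions)
```

Keeping the result as a list of candidates, rather than inserting rows directly, leaves room for a review step before anything reaches the clinical tables.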

Hi, is this group still active, with calls etc.? I’d really like to contribute, since I’m making extensive use of NLP with the OMOP CDM and have several possible improvements I’d love to discuss.
