NLP Workgroup Discussion Thread

Vojtech_Huser · March 9, 2016, 7:58pm

Today the NLP workgroup met.
There was an email discussion calling for a schema to capture NLP results.

I would like to initiate a discussion on such a schema. The proposal is at the wiki page below:

http://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg#proposal_for_concepts_detected_by_nlp

Please edit the wiki page to add your input to this schema or use the forum to discuss it.

Vojtech_Huser · March 15, 2016, 4:35pm

An earlier schema discussion is available at this link.

The presentation suggests supporting multiple types of NLP-detected entities.

These would be:

Procedure
Disease/Disorder
Medication
Lab
Sign/Symptom
Anatomical Site

Perhaps we can pick one of the domains (e.g., Disease/Disorder) and further discuss the columns for this domain. Will we need multiple note_nlp_xxxxx tables then?
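For illustration only, here is a minimal Python sketch of the single-table alternative: one row type for every detected entity, with a column recording the entity type, plus a helper that shows how the same rows would split into per-type tables. The field names (`entity_type`, `lexical_variant`, `concept_id`) are assumptions made for discussion, not the wiki proposal or an agreed schema.

```python
from collections import defaultdict
from dataclasses import dataclass

# Entity types listed in the presentation.
ENTITY_TYPES = {
    "Procedure", "Disease/Disorder", "Medication",
    "Lab", "Sign/Symptom", "Anatomical Site",
}


@dataclass
class NlpEntityRow:
    """One NLP-detected entity. All field names are hypothetical."""
    note_id: int
    entity_type: str      # discriminator column instead of one table per type
    lexical_variant: str  # text as it appeared in the note
    concept_id: int       # mapped concept; 0 if unmapped

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")


def split_by_type(rows):
    """Group rows by entity type, as separate per-type tables would."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.entity_type].append(row)
    return dict(grouped)
```

If the columns turn out to be nearly identical across entity types, a single table with a type column may be enough; if each type needs its own columns, the per-type split becomes more attractive.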

JamesSWiggins · June 6, 2018, 5:04pm

Hi,

I’m pursuing some ideas utilizing the OMOP CDM ‘NOTE’ and ‘NOTE_NLP’ tables, but I don’t have any sample data. Does anyone know of any publicly available sample data sets for these tables?

Cheers!

James

HuaXu · June 13, 2018, 2:45pm

James,

You can use MTSamples - http://mtsamples.com/, which is a collection of fake notes. We have annotated some MTSamples notes and can share them with you too. Thanks.

Hua

SELVA_MUTHU_KUMARAN · June 14, 2019, 2:32pm

Is this group active? Do you have WG calls?

HuaXu · June 19, 2019, 1:27am

Yes, check out the WG info here: https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg

TMS · August 26, 2019, 12:37pm

Hi,

I would also like to contribute and learn. How can I join the NLP workgroup?
If you can, please add me - tmshah@ismnet.com

Thanks,
Tarun Shah

SELVA_MUTHU_KUMARAN · August 27, 2019, 1:51am

Hello, I would like to be part of this WG as well. My email is selvasathappan36@gmail.com

alexander · August 27, 2019, 4:01pm

Hi Hua. May I ask you to share the annotated samples with me too? I joined the OHDSI community a few days ago and I’m thinking of joining the NLP working group. My email is asivura@icloud.com

SELVA_MUTHU_KUMARAN · August 28, 2019, 9:57am

Hello,

I am interested in learning NLP. I am a beginner. If there are any tasks that I can volunteer for, please let me know. Do you have any opportunities in your project?

HuaXu · September 3, 2019, 6:00pm

We have the dataset at OHDSI NLP GitHub: https://github.com/OHDSI/NLPTools.

Here is a link to CLAMP outputs, which are similar to the annotated data: https://github.com/OHDSI/NLPTools/tree/master/clamp-wrapper/output_xmi

HuaXu · September 3, 2019, 6:02pm

Sure. Welcome! Why don’t you join our monthly meeting and see what is going on here? Then you can decide which project you can contribute to. The call-in information can be found here: https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg (at the bottom of the page). Thanks.

Hua

SELVA_MUTHU_KUMARAN · September 6, 2019, 3:33am

Sure, HuaXu, thanks. I will join the Sep 11th call without fail.

alexander · September 6, 2019, 6:32am

Thanks for sharing. But it’s the output of the CLAMP pipeline, isn’t it? It would be great to have some manually checked data to use as ground truth.

I also have a question about tooling for labeling data manually. Could you recommend anything?

Kate_Weber · October 17, 2019, 4:18pm

I’ve learned a lot listening in on the working group calls but have a fundamental question:

I cannot, for the life of me, work out how to get the CLAMP wrapper configured and running.

I’m labeling a messy problem history data set… I think I would like to use Usagi to identify consistent concepts associated with my headers but will want to use CLAMP to pick out concepts associated with the free text in the subsequent “explain” parts.

My tentative development plan was going to be to have human annotators mark up a gold-standard set of our problem lists in CLAMP and then use its machine learning tooling to refine the stock OHDSI pipeline. Will this approach work?

I’m also very interested in practical ways to use the NLP objects you’ve designed. My first instinct was going to be when I discovered (for example) a specific medication, that I would write a row to the DRUG_EXPOSURE table. But am I jumping too far from the intent of the NLP work?
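As a rough sketch of that idea, the Python below turns one NLP-detected medication mention into a candidate DRUG_EXPOSURE row. The input dictionary keys (modeled loosely on NOTE_NLP fields), the `NLP_TYPE_CONCEPT_ID` placeholder, the use of the note date as the exposure date, and the rule of skipping negated or unmapped mentions are all assumptions for illustration. The output column names are meant to follow the OMOP CDM DRUG_EXPOSURE table and should be checked against the CDM version in use.

```python
from datetime import date

# Placeholder only: look up the type concept your site uses for NLP-derived records.
NLP_TYPE_CONCEPT_ID = 0


def to_drug_exposure(mention, person_id, note_date):
    """Return a candidate DRUG_EXPOSURE row, or None if the mention is skipped."""
    # Skip mentions flagged as negated/absent (assumed 'Y'/'N' flag).
    if mention.get("term_exists") != "Y":
        return None
    # Skip mentions with no mapped concept (missing or 0).
    if not mention.get("note_nlp_concept_id"):
        return None
    return {
        "person_id": person_id,
        "drug_concept_id": mention["note_nlp_concept_id"],
        # Assumption: the note date stands in for the exposure dates.
        "drug_exposure_start_date": note_date,
        "drug_exposure_end_date": note_date,
        "drug_type_concept_id": NLP_TYPE_CONCEPT_ID,
        "drug_source_value": mention.get("lexical_variant"),
    }
```

Whether such rows belong in DRUG_EXPOSURE at all is the open question here; the sketch only shows the mechanics, with the type concept marking the row as NLP-derived so it can be filtered out later.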

These may be remedial questions - but I’d be grateful for a little help coming up to speed with ways to operationalize this tooling.

Kate
UM School of Dentistry

alvaroabascar · April 29, 2020, 1:40pm

Hi, is this group still active, with calls, etc.? I’d really like to contribute, since I’m making extensive use of NLP with the OMOP CDM and have several possible improvements I’d love to discuss.