An earlier schema discussion is available at this link.
The presentation suggests supporting multiple types of NLP-detected entities.
These would be:
Procedure
Disease/Disorder
Medication
Lab
Sign/Symptom
Anatomical Site
Perhaps we can pick one of the domains (e.g., Disease/Disorder) and further discuss the columns for this domain. Will we need multiple note_nlp_xxxxx tables then?
I'm pursuing some ideas utilizing the OMOP CDM "NOTE" and "NOTE_NLP" tables, but I don't have any sample data. Does anyone know of any publicly available sample data sets for these tables?
You can use MTSamples - http://mtsamples.com/, which is a collection of fake notes. We have annotated some MTSamples notes and can share with you too. Thanks.
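Until real or annotated notes are at hand, a couple of hand-written rows can stand in for the two tables. The following is only a minimal sketch: it uses an in-memory SQLite database, a reduced subset of the NOTE and NOTE_NLP columns, a made-up note, a placeholder concept id (1001), and a hypothetical system name ("toy-matcher"), none of which come from this thread.

```python
import sqlite3

# In-memory database with a reduced subset of NOTE / NOTE_NLP columns.
# "offset" is quoted because OFFSET is an SQL keyword.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE note (
    note_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    note_date TEXT,
    note_title TEXT,
    note_text TEXT
);
CREATE TABLE note_nlp (
    note_nlp_id INTEGER PRIMARY KEY,
    note_id INTEGER,
    snippet TEXT,
    "offset" TEXT,
    lexical_variant TEXT,
    note_nlp_concept_id INTEGER,
    nlp_system TEXT,
    nlp_date TEXT,
    term_exists TEXT
);
""")

# A made-up note; not taken from MTSamples or any real record.
note_text = "Patient denies chest pain. Started metformin 500 mg daily."
conn.execute(
    "INSERT INTO note VALUES (?, ?, ?, ?, ?)",
    (1, 42, "2020-01-15", "Progress note", note_text),
)

# One detected mention, located with a plain substring search.
variant = "metformin"
start = note_text.find(variant)
conn.execute(
    'INSERT INTO note_nlp (note_nlp_id, note_id, snippet, "offset", '
    "lexical_variant, note_nlp_concept_id, nlp_system, nlp_date, term_exists) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (1, 1, note_text[max(0, start - 10):start + len(variant) + 10],
     str(start), variant,
     1001,            # placeholder concept id, not a real vocabulary id
     "toy-matcher",   # hypothetical NLP system name
     "2020-01-16", "Y"),
)

# Join the mention back to its note.
rows = conn.execute(
    'SELECT n.person_id, p.lexical_variant, p."offset" '
    "FROM note_nlp p JOIN note n ON n.note_id = p.note_id"
).fetchall()
```

Rows like these are enough to try out queries and ETL logic, but they are no substitute for annotated notes when evaluating an NLP system.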
Hi Hua. Could you share the annotated samples with me too? I joined the OHDSI community a few days ago and I'm thinking of joining the NLP working group. My email is asivura@icloud.com
I am interested in learning NLP and am a beginner. If there are any tasks I can volunteer for, please let me know. Do you have any opportunities in your project?
Sure. Welcome! Why don't you join our monthly meeting and see what is going on here? Then you can decide which project you can contribute to. The call-in information can be found here: https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:nlp-wg (at the bottom of the page). Thanks.
Thanks for sharing. But it's the output of the CLAMP pipeline, isn't it? It would be great to have some manually checked data to use as ground truth.
I also have a question about tooling to label data manually. Could you recommend anything?
I've learned a lot listening in on the working group calls but have a fundamental question:
I cannot, for the life of me, work out how to get the CLAMP wrapper configured and running.
I'm labeling a messy problem history data set... I think I would like to use Usagi to identify consistent concepts associated with my headers, but I will want to use CLAMP to pick out concepts associated with the free text in the subsequent "explain" parts.
My tentative development plan is to have human annotators mark up a gold-standard set of our problem lists in CLAMP and then use its machine learning tooling to refine the stock OHDSI pipeline. Will this approach work?
I'm also very interested in practical ways to use the NLP objects you've designed. My first instinct, on discovering (for example) a specific medication, was to write a row to the DRUG_EXPOSURE table. But am I jumping too far from the intent of the NLP work?
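To make the question concrete, here is a minimal sketch of what that step could look like. It is not an endorsed OHDSI approach, only an illustration under stated assumptions: the concept ids (1001, 2001), the domain lookup dictionary standing in for the CONCEPT table, and the type concept value 0 are all placeholders I made up, and the rule of skipping mentions whose term_exists is not "Y" is my own guess at a sensible filter.

```python
# Hypothetical stand-in for a CONCEPT table lookup (concept_id -> domain_id).
CONCEPT_DOMAIN = {1001: "Drug", 2001: "Condition"}

# Placeholder: replace with whatever type concept marks NLP-derived records.
NLP_DRUG_TYPE_CONCEPT_ID = 0


def nlp_rows_to_drug_exposures(note, note_nlp_rows, first_id=1):
    """Turn asserted drug mentions from NOTE_NLP into DRUG_EXPOSURE-like dicts."""
    exposures = []
    next_id = first_id
    for row in note_nlp_rows:
        concept_id = row["note_nlp_concept_id"]
        # Keep only mentions whose concept falls in the Drug domain.
        if CONCEPT_DOMAIN.get(concept_id) != "Drug":
            continue
        # Skip negated or otherwise non-asserted mentions.
        if row.get("term_exists") != "Y":
            continue
        exposures.append({
            "drug_exposure_id": next_id,
            "person_id": note["person_id"],
            "drug_concept_id": concept_id,
            # Assumption: the note date is the best available start date.
            "drug_exposure_start_date": note["note_date"],
            "drug_exposure_end_date": note["note_date"],
            "drug_type_concept_id": NLP_DRUG_TYPE_CONCEPT_ID,
            "drug_source_value": row["lexical_variant"],
        })
        next_id += 1
    return exposures


# Made-up example data.
note = {"note_id": 1, "person_id": 42, "note_date": "2020-01-15"}
note_nlp_rows = [
    {"note_id": 1, "lexical_variant": "metformin",
     "note_nlp_concept_id": 1001, "term_exists": "Y"},
    {"note_id": 1, "lexical_variant": "aspirin",
     "note_nlp_concept_id": 1001, "term_exists": "N"},   # negated mention
    {"note_id": 1, "lexical_variant": "diabetes",
     "note_nlp_concept_id": 2001, "term_exists": "Y"},   # not a drug
]
drug_exposures = nlp_rows_to_drug_exposures(note, note_nlp_rows)
```

Even in this toy form, the sketch shows where the judgment calls sit: which mentions count as asserted, what date to assign, and how to flag the record as NLP-derived so it can be separated from structured data later.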
These may be remedial questions - but I'd be grateful for a little help coming up to speed with ways to operationalize this tooling.
Hi, is this group still active and still holding calls? I'd really like to contribute, since I make extensive use of NLP with the OMOP CDM and have several possible improvements I'd love to discuss.