With help from @ShinSeojeong and my colleagues, we have released our first draft of the Genetic CDM on Google Drive.
The model is based on the ISO standard for reporting NGS results (ISO/TS 20428, ‘Health informatics - Data elements and their metadata for describing structured clinical genomic sequence information in electronic health records’).
This is our first draft and we need your thorough review and comments!
I agree with @rimma's comment: it is important to know that ‘there is no mutation in certain genes’. We need to figure out how to record the target genes of a targeted NGS test.
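To make the target-gene idea concrete, here is a minimal sketch of one possible approach. It is not part of our draft: the `target_gene` table, all column names, and the sample rows are hypothetical and only for illustration. If the genes covered by each test are recorded, a covered gene with no row in variant_occurrence can be read as ‘tested, no mutation found’ rather than ‘not tested’.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: which genes each NGS test actually covered.
CREATE TABLE target_gene (
    procedure_occurrence_id INTEGER,
    gene_symbol TEXT
);
-- Only a few illustrative columns; the draft table has more.
CREATE TABLE variant_occurrence (
    variant_occurrence_id INTEGER PRIMARY KEY,
    procedure_occurrence_id INTEGER,
    gene_symbol TEXT,
    hgvs_c TEXT
);
""")

# Made-up example: test 1 covered three genes and found one variant.
conn.executemany(
    "INSERT INTO target_gene VALUES (?, ?)",
    [(1, "EGFR"), (1, "KRAS"), (1, "TP53")],
)
conn.execute("INSERT INTO variant_occurrence VALUES (1, 1, 'TP53', 'c.524G>A')")


def genes_without_variants(conn, procedure_occurrence_id):
    """Return covered genes that have no recorded variant for this test."""
    rows = conn.execute(
        """
        SELECT t.gene_symbol
        FROM target_gene t
        LEFT JOIN variant_occurrence v
          ON v.procedure_occurrence_id = t.procedure_occurrence_id
         AND v.gene_symbol = t.gene_symbol
        WHERE t.procedure_occurrence_id = ?
          AND v.variant_occurrence_id IS NULL
        ORDER BY t.gene_symbol
        """,
        (procedure_occurrence_id,),
    ).fetchall()
    return [r[0] for r in rows]


print(genes_without_variants(conn, 1))  # ['EGFR', 'KRAS']
```

Genes outside the panel never appear in the result, so they stay ‘unknown’ instead of being counted as negative.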
Thank you, @davidfasel, for your comments.
Basic variant info / Annotations:
Basically, I agree with David. Annotation data is useful, but it changes rapidly and is very large. I would also prefer to leave this data in external data sources, if possible. That is why we added a separate table for annotation or basic variant info.
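As a rough illustration of why a separate table helps, here is a minimal sketch, assuming an annotation table keyed on the variant occurrence. The column names, the `refresh_annotations` helper, the source name, and the sample values are all hypothetical and not taken from our draft. The point is that annotations from one source can be replaced wholesale when that source changes, without touching the observed variants.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Observed variants: written once, from the NGS result.
CREATE TABLE variant_occurrence (
    variant_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    gene_symbol TEXT,
    hgvs_c TEXT
);
-- Annotations: volatile, refreshed from external sources.
CREATE TABLE variant_annotation (
    variant_occurrence_id INTEGER,
    annotation_source TEXT,
    annotation_value TEXT,
    annotation_date TEXT
);
""")
db.execute("INSERT INTO variant_occurrence VALUES (1, 100, 'GENE_A', 'c.1A>G')")


def refresh_annotations(db, source, rows, as_of):
    """Replace every annotation from one source in a single transaction."""
    with db:
        db.execute(
            "DELETE FROM variant_annotation WHERE annotation_source = ?",
            (source,),
        )
        db.executemany(
            "INSERT INTO variant_annotation VALUES (?, ?, ?, ?)",
            [(vid, source, value, as_of) for vid, value in rows],
        )


# The same variant is re-annotated later; only variant_annotation changes.
refresh_annotations(db, "example_source", [(1, "uncertain significance")], "2018-01-01")
refresh_annotations(db, "example_source", [(1, "likely pathogenic")], "2018-06-01")

current = db.execute(
    """
    SELECT v.gene_symbol, a.annotation_value
    FROM variant_occurrence v
    JOIN variant_annotation a USING (variant_occurrence_id)
    """
).fetchall()
print(current)  # [('GENE_A', 'likely pathogenic')]
```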
Interpretation/Report
I think we can store the original pathology report and the genetic pathology report in the ‘note’ table of the existing CDM.
Scope
As @Christian_Reich said, I think it would be hard to define a ‘limited variant with a clinical interpretation’ (in our model, information on clinical implications should be stored in the ‘variant_annotation’ table). I don’t think the data for thousands of variants is in itself overwhelming for the CDM, compared with what the current CDM already holds: we already store every single device, medication, and note. The current variant_occurrence table in our model has 23 columns, and most patients have a single NGS result.
Use case
One of my ongoing projects is developing machine learning models to predict outcomes in cancer patients using combined genomic and clinical data. Thanks to the great contributions of @Rijnbeek, @jennareps, @schuemie and their colleagues, it won’t be that hard to build this by modifying the feature extraction package and using the patient-level prediction package.
Another ambitious goal of mine is converting existing open genomic databases of cancer patients into the OMOP-CDM. This would make it possible to leverage accumulated genomic and clinical databases to generate better evidence from combined genetic and clinical information. Collaboration with the oncology group is absolutely essential for this goal, so that we can capture information from existing oncology registries in the OMOP-CDM.