Main Page

From HealthNLP-Cancer

Revision as of 14:11, 6 August 2019 by Guergana

1 Welcome to the Cancer Deep Phenotype Extraction (DeepPhe) project
2 Who We Are
3 Funding
4 Publications and presentations crediting DeepPhe
5 DeepPhe Software
6 DeepPhe Gold Set
7 Qualitative Interviews
8 Project materials/ WIKIs to tasks
9 Communication
10 Scrum Sprints
11 Meeting Notes
12 Licensing
13 Contact
14 Getting started

Welcome to the Cancer Deep Phenotype Extraction (DeepPhe) project

Who We Are

Boston Children's Hospital/Harvard Medical School
- Guergana Savova (PI)
- Timothy Miller
- Sean Finan
- David Harris
- Chen Lin
- past members: Dmitriy Dligach (currently faculty at Loyola University, Chicago), Pei Chen, James Masanz

University of Pittsburgh
- Harry Hochheiser (site PI)
- Zhou Yuan
- past members - through June 2017: Rebecca Crowley Jacobson (MPI), Roger Day, Adrian Lee, Robert Edwards, John Kirkwood, Kevin Mitchell, Eugene Tseytlin, Girish Chavan, Melissa Castine; Liz Legowski (through Jan 2015), Olga Medvedeva, Mike Davis

Vanderbilt University
- Jeremy Warner (site PI)
- Alicia Beeghly-Fadiel

Dana-Farber Cancer Institute
- Elizabeth Buchbinder

Funding

The project described is supported by the National Cancer Institute at the US National Institutes of Health. It is part of the NCI's Informatics Technology for Cancer Research (ITCR) Initiative (http://itcr.nci.nih.gov/). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Publications and presentations crediting DeepPhe

Hochheiser H; Jacobson R; Washington N; Denny J; Savova G. 2015. Natural language processing for phenotype extraction: challenges and representation. AMIA Annual Symposium. Nov 2015, San Francisco, CA.
Dmitriy Dligach, Timothy Miller, Guergana K. Savova. 2015. Semi-supervised Learning for Phenotyping Tasks. AMIA Annual Symposium. Nov 2015, San Francisco, CA.
Lin, Chen; Dligach, Dmitriy; Miller, Timothy; Bethard, Steven; Savova, Guergana. 2015. Layered temporal modeling for the clinical domain. Journal of the American Medical Informatics Association. http://jamia.oxfordjournals.org/content/early/2015/10/31/jamia.ocv113
Lin, Chen; Miller, Timothy; Dligach, Dmitriy; Bethard, Steven; Savova, Guergana. 2016. Improving Temporal Relation Extraction with Training Instance Augmentation. BioNLP workshop at the Association for Computational Linguistics conference. Berlin, Germany, Aug 2016
Timothy A. Miller, Sean Finan, Dmitriy Dligach, Guergana Savova. Robust Sentence Segmentation for Clinical Text. Abstract presented at the Annual Symposium of the American Medical Informatics Association, San Francisco, CA, 2015.
Hochheiser, Harry; Castine, Melissa; Harris, David; Savova, Guergana; Jacobson, Rebecca. 2016. An Information Model for Cancer Phenotypes. BMC Medical Informatics and Decision Making. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-016-0358-4
Ethan Hartzell, Chen Lin. 2016. Enhancing Clinical Temporal Relation Discovery with Syntactic Embeddings from GloVe. International Conference on Intelligent Biology and Medicine (ICIBM 2016). December 2016, Houston, Texas, USA
Dligach, Dmitriy; Miller, Timothy; Lin, Chen; Bethard, Steven; Savova, Guergana. 2017. Neural temporal relation extraction. European Chapter of the Association for Computational Linguistics (EACL 2017). April 3-7, 2017. Valencia, Spain.
Lin, Chen; Miller, Timothy; Dligach, Dmitriy; Bethard, Steven; Savova, Guergana. 2017. Representations of Time Expressions for Temporal Relation Extraction with Convolutional Neural Networks. BioNLP workshop at the Association for Computational Linguistics conference. Vancouver, Canada, Friday August 4, 2017
Timothy A. Miller, Dmitriy Dligach, Chen Lin, Steven Bethard, Guergana Savova. Feature Portability in Cross-domain Clinical Coreference. Abstract presented at the Annual Symposium of the American Medical Informatics Association, Chicago, IL, 2016.
Castro SM, Tseytlin E, Medvedeva O, Mitchell K, Visweswaran S, Bekhuis T, Jacobson RS. 2017. Automated annotation and classification of BI-RADS assessment from radiology reports. J Biomed Inform. 2017 May;69:177-187. doi: 10.1016/j.jbi.2017.04.011. PMID: 28428140; PMCID: PMC5706448 [Available on 2018-05-01]
Timothy A. Miller, Steven Bethard, Hadi Amiri, Guergana Savova. Unsupervised Domain Adaptation for Clinical Negation Detection. Proceedings of the 16th Workshop on Biomedical Natural Language Processing. 2017.
Timothy A. Miller, Dmitriy Dligach, Steven Bethard, Chen Lin, and Guergana Savova. Towards generalizable entity-centric coreference resolution. Journal of Biomedical Informatics, 69; 251-258. 2017.
Miller, T; Bethard, S; Amiri, H; Savova, G. 2017. Unsupervised Domain Adaptation for Clinical Negation Detection. BioNLP workshop at the Association for Computational Linguistics conference. Vancouver, Canada, Friday August 4, 2017
Savova, G., Tseytlin, E., Finan, S., Castine, M., Miller, T., Medvedeva, O., Harris, D., Hochheiser, H., Lin, C., Chavan, G., Jacobson R. 2017. DeepPhe: A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Cancer Research 77(21), November 2017. DOI: 10.1158/0008-5472.CAN-17-0615.
Savova, G., Tseytlin, E., Finan, S., Castine, M., Miller, T., Medvedeva, O., Harris, D., Hochheiser, H., Lin, C., Chavan, G., Jacobson R. 2017. DeepPhe - A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Annual Symposium of the American Medical Informatics Association (AMIA). Nov 2017. Washington DC.
Savova, G; Miller, T. 2018. DeepPhe and Extraction of Oncology Patient Phenotypes from Unstructured Text Using NLP and Other AI Tools. Presentation to Dana Farber Cancer Institute. January 24 2018. Boston, MA.
Warner, Jeremy. 2018. Improving Cancer Diagnosis and Care: Patient Access to Oncologic Imaging and Pathology Expertise and Technologies. The National Cancer Policy Forum of the National Academies of Sciences, Engineering, and Medicine. http://www.nationalacademies.org/hmd/Activities/Disease/NCPF/2018-FEB-12/Videos/Session%204%20Videos/32%20Warner.aspx

DeepPhe Software

DeepPhe software documentation for developers is available in code

DeepPhe Gold Set

Process for Deidentification of Source Documents.
Process for Selection of Gold Set Source Documents.
DeepPhe UPMC Training/Development/Test splits
- training set:
  - all documents for Breast Cancer patients 03, 11, 92, 93 for a total of 48 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Breast Cancer patients extended 04,05,06,09,10,12,13,14,18,19,20,22,23,26,27,30,31,32,33,34,35,40,41,42,43,38,39,46,47 for a total of 954 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Melanoma patients 05, 06, 18, 19, 25, 28, 30, 33, 34, 42 for a total of 233 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\trainSet\DeepPhe DevSet Phenotype Annotations.xlsm
  - all documents for Ovarian Cancer patients 3, 4, 7, 8, 12, 13, 16, 17, 18, 20, 24, 25, 26, 27, 30, 31, 32, 34, 37, 38, 41, 42, 43, 44, 46, 48 for a total of 1675 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\trainSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\trainSet\DeepPhe_ovCa_Train_Set_Phenotype_Annotations_GOLD.xlsm
- development set:
  - all documents for Breast Cancer patients 02, 21 for a total of 42 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Breast Cancer patients extended 01,15,16,17,28,29,36,37,44,45,07,08,24,25 for a total of 457 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Melanoma patients 07, 32, 43 for a total of 215 documents, of which only 211 were processed (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\devSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\devSet\DeepPhe DevSet Phenotype Annotations.xlsm
  - all documents for Ovarian Cancer patients 9, 11, 19, 28, 29, 35, 39, 47 for a total of 562 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\devSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\devSet\DeepPhe_ovCa_Dev_Set_Phenotype_Annotations_GOLD.xlsm
- test set:
  - all documents for Breast Cancer patients 01 (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest\DeepPhe Test Phenotype Annotations v2.xlsm
  - all documents for Breast Cancer extended for patients 01, 02, 09,10,12,15,17,18,19,20,23,24,27,32,36,39,44,63, 76, 100, 101, 104, 106, 109, 111, 114, 115, 117, 118, 119, 120, 121, 123, 125, 126, 129, 130, 132, 136, 137, 138, 142, 143, 155, 156, 158, 174, 181, 189, 197 for phenotyping level testing use (\\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest\); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest\DeepPhe Test Phenotype Annotations v2.xlsm
  - all documents for Melanoma patients 02, 03, 11, 12, 14, 16, 24, 27, 41, 44 for a total of 229 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\testSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\testSet\DeepPhe TestSet Phenotype Annotations.xlsm
  - all documents for Ovarian Cancer patients 15, 21, 33, 36, 40, 45, 49, 50 for a total of 559 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\testSet); gold annotations are in \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\testSet\DeepPhe_ovCa_Test_Set_Phenotype_Annotations_GOLD.xlsm
- Use the training set to develop the algorithms and the development set to report results and perform error analysis. The test set is reserved for the final evaluation reported in publications.
SEER Project Train/Dev/Test Splits
Clinical Genomics Gold Set
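
The UPMC splits listed above are defined per patient, so a quick sanity check before training or evaluation is to confirm that no patient appears in more than one split. The following is a minimal sketch in Python, not part of the DeepPhe software: the patient IDs are copied from the Ovarian Cancer entries on this page, while the names OVARIAN_SPLITS, find_overlaps and split_for are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Ovarian Cancer patient IDs, copied from the train/dev/test entries on this page.
# The dictionary name and layout are illustrative assumptions.
OVARIAN_SPLITS = {
    "train": {3, 4, 7, 8, 12, 13, 16, 17, 18, 20, 24, 25, 26, 27,
              30, 31, 32, 34, 37, 38, 41, 42, 43, 44, 46, 48},
    "dev": {9, 11, 19, 28, 29, 35, 39, 47},
    "test": {15, 21, 33, 36, 40, 45, 49, 50},
}


def find_overlaps(splits):
    """Return {(split_a, split_b): shared_ids} for each pair of splits sharing a patient."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(splits.items(), 2):
        shared = ids_a & ids_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


def split_for(patient_id, splits):
    """Return the name of the split containing patient_id, or None if it is unassigned."""
    for name, ids in splits.items():
        if patient_id in ids:
            return name
    return None


if __name__ == "__main__":
    # An empty result means the splits are disjoint at the patient level.
    print(find_overlaps(OVARIAN_SPLITS))
    print(split_for(9, OVARIAN_SPLITS))
```

Checking at the patient level rather than the document level matters here because each patient contributes many documents; a patient shared between the training and test sets would leak information into the final evaluation.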

Qualitative Interviews

Project materials/ WIKIs to tasks

Liquid Planner link (project management): https://app.liquidplanner.com/space/26220/dashboard
Templates for describing stakeholders.
Software development policies and repositories.
Data Repository and Policies.
Adopted Standards and Conventions for NLP annotations (task 1.4.2)
Gold Set Selection
Entity Mention and Template Evaluation Statistics
Phenotype Evaluation Statistics (with DeepPhe v1)
Phenotype Evaluation Statistics (with DeepPhe v2)
Modeling
- Phenotyping Rules
- Breast Cancer Model
- Melanoma Model
- Ovarian Cancer Model
- Cancer phenotype modeling notes
- Layered cancer phenotyping
  - Episode modeling
- FHIR modeling
- Domain Modeling Notes/Questions
  - Breast Cancer Domain Notes/Questions
- Validation of models with domain experts
- Competency questions to be used for validation of models.
- Analysis tasks potentially requiring episode labels
- Representations of the models.
- Historical pages
  - CEM Cancer phenotype models: models describing the original CEM Models
- Value decomposition issues https://docs.google.com/document/d/1riAHoLRdEmp4Ah9Z8NXN-ABkcAW9nnfNXQ5_md5rgYs/edit

Presentations

How to effectively use LiquidPlanner for DeepPhe: https://www.dropbox.com/s/1f6nkhx3yxh4v9q/LiquidPlanner%20for%20Deep-Phe.pptx
DeepPhe Rule Driven Architectures: https://www.dropbox.com/s/hl70zkvjs1ftt5a/DeepPhe%20Rule%20Driven%20Architectures.pptx

Communication

Weekly team meetings
Tools we use for communication are listed in our Communications Plan.

Scrum Sprints

Meeting Notes

January 25, 2018 Rules and Ontology Development Meeting
January 18, 2018 Rules and Ontology Development Meeting
January 11, 2018 Rules and Ontology Development Meeting
January 5, 2018 Rules and Ontology Development Meeting
December 21, 2017 Rules and Ontology Development Meeting
December 14, 2017 Rules and Ontology Development Meeting
November 16, 2017 Rules and Ontology Development Meeting
November 9, 2017 Rules and Ontology Development Meeting
November 2, 2017 Rules and Ontology Development Meeting
October 24, 2017 Rules and Ontology Development Meeting
October 17, 2017 Rules and Ontology Development Meeting
October 12, 2017 Rules and Ontology Development Meeting
October 5, 2017 Melanoma Rules and Ontology Meeting
September 28, 2017 Melanoma Rules and Ontology Meeting
September 14, 2017 Melanoma Rules and Ontology Meeting
September 7, 2017 Melanoma Rules and Ontology Meeting
August 24, 2017 Melanoma Rules Meeting
August 17, 2017 Melanoma Rules Meeting
August 10, 2017 Melanoma Rules Meeting
August 3, 2017 Melanoma Rules Meeting
August 27, 2015 Research Meeting
August 3, 2015 Modeling Meeting
July 20, 2015 Modeling Meeting
July 7, 2015 Bi-weekly team meeting
July 1, 2015 Scrum Sprint - 1
June 26, 2015 Software architecture meeting
June 23, 2015 Bi-weekly team meeting
June 9, 2015 Bi-weekly team meeting
May 12, 2015 Team meeting:DeepPhe demo
May 5, 2015 Team meeting:DeepPhe demo
April 28, 2015 Bi-weekly team meeting
April 13, 2015 Bi-weekly team meeting
March 17, 2015 Bi-weekly team meeting
February 23, 2015 Model prioritization meeting
February 17, 2015 Bi-weekly team meeting
February 3, 2015 Bi-weekly team meeting
January 28, 2015 BCH team meeting
January 20, 2015 Bi-weekly team meeting
January 6, 2015 Bi-weekly team meeting
December 9, 2014 BCH team meeting
December 9, 2014 Bi-weekly team meeting
November 20, 2014 BCH team meeting
November 11, 2014 Bi-weekly team meeting
November 11, 2014 BCH team meeting
November 4, 2014 BCH team meeting
November 3, 2014 PI meeting
October 27, 2014 Bi-weekly team meeting: Avillach's presentation on tranSMART, cTAKES and PCORI
October 14, 2014 Bi-weekly team meeting: agenda and notes
September 30, 2014 Bi-weekly team meeting: agenda and notes
September 2, 2014 Bi-weekly team meeting: agenda and notes
August 19, 2014 Bi-weekly team meeting: agenda and notes
August 5, 2014 Bi-weekly team meeting: agenda and notes
July 22, 2014 Bi-weekly team meeting: agenda and notes
July 15, 2014 Bi-weekly team meeting: agenda and notes
July 10, 2014 Hochheiser visit to Savova group
June 24, 2014 Bi-weekly team meeting: agenda and notes
June 10, 2014 Bi-weekly team meeting: agenda and notes
June 3, 2014 All hands kick-off meeting
May 08, 2014 NCIP collaboration with UT (Bernstam/Xu)

Licensing

Licensing policies for DeepPhe software and ontological models.

Contact

If you need assistance or if you have further questions about the project, contact us at the DeepPhe group.

Getting started

Consult the User's Guide for information on using the wiki software.

