Main Page

From HealthNLP-Cancer

Revision as of 14:56, 8 May 2024 by Guergana

1 Welcome to the Cancer Deep Phenotype Extraction project
2 Who We Are
3 Funding
4 Cancer Deep Phenotyping for Cancer Surveillance (DeepPhe*CR)
5 Cancer Deep Phenotyping for Translational Science (DeepPhe)
6 Contact
7 Getting started

Welcome to the Cancer Deep Phenotype Extraction project

Our goal is to develop novel information extraction methods for the automatic, unsupervised, or minimally supervised extraction of specific discrete cancer-related data from various types of unstructured electronic medical records. Our two main use cases are cancer deep phenotyping for translational science (DeepPhe) and a platform for cancer surveillance by cancer registries (DeepPhe*CR).

Who We Are

Boston Children's Hospital/Harvard Medical School

Guergana Savova (MPI for DeepPhe and DeepPhe*CR)
Timothy Miller
Sean Finan
David Harris
Chen Lin
past members -- Dmitriy Dligach (currently faculty at Loyola University, Chicago), James Masanz

University of Pittsburgh

Harry Hochheiser (MPI for DeepPhe and DeepPhe*CR)
Zhou Yuan
John Levander
past members - through June 2017: Rebecca Crowley Jacobson (MPI), Roger Day, Adrian Lee, Robert Edwards, John Kirkwood, Kevin Mitchell, Eugene Tseytlin, Girish Chavan, Melissa Castine; Liz Legowski (through Jan 2015), Olga Medvedeva, Mike Davis

Rhode Island Hospital (Brown University)

Jeremy Warner (MPI for DeepPhe and DeepPhe*CR)
Ece Uzun
Don Dizon
Sandeep Jain
Alex VanHelene

University of Kentucky/Kentucky Cancer Registry

Eric Durbin (MPI for DeepPhe*CR)
Isaac Hands
Jong Jeong
Ramakanth (Rama) Kavuluru
David Rust
Lisa Witt

Dana-Farber Cancer Institute

University of Minnesota

Piet de Groen

Vanderbilt University

Douglas B. Johnson
past members - Alicia Beeghly-Fadiel

Funding

The project described is supported by the National Cancer Institute at the US National Institutes of Health. It is part of the National Cancer Institute's Informatics Technology for Cancer Research (ITCR) Initiative (http://itcr.nci.nih.gov/) and the Surveillance, Epidemiology, and End Results Program (SEER; https://seer.cancer.gov/) at the US National Cancer Institute (NCI). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Cancer Deep Phenotyping for Cancer Surveillance (DeepPhe*CR)

Scrum Sprints

Goals DeepPhe-CR July 2019 - June 2020

Goals DeepPhe-CR July 2020 - June 2021

Goals DeepPhe-CR July 2021 - June 2022

Goals DeepPhe-CR July 2022 - June 2023

Goals DeepPhe-CR July 2023 - June 2024

Project materials

Publications and presentations

Peer-reviewed publications:

2023: Bitterman DS, Goldner E, Finan S, Harris D, Durbin EB, Hochheiser H, Warner JL, Mak RH, Miller T, Savova GK. An End-to-End Natural Language Processing System for Automatically Extracting Radiation Therapy Events From Clinical Texts. Int J Radiat Oncol Biol Phys. 2023 Sep 1;117(1):262-273. doi: 10.1016/j.ijrobp.2023.03.055. Epub 2023 Mar 27. PMID: 36990288; PMCID: PMC10522797.
2020: Durbin, Eric; Hochheiser, Harry; Petkov, Valentina; Rivera, Donna; Savova, Guergana; Warner, Jeremy. 2020. Tools and software to automate and normalize the cancer data abstraction workflow. Workshop at the annual conference of the North American Association of Central Cancer Registries (NAACCR). June 2020. Philadelphia, PA.
2020: Zhou Yuan, Sean Finan, Jeremy Warner, Guergana Savova, Harry Hochheiser 2019. Interactive Exploration of Longitudinal Cancer Patient Histories Extracted From Clinical Text. JCO Clin Cancer Inform. 2020 May;4:412-420. doi: 10.1200/CCI.19.00115
2019: Guergana Savova, Ioana Danciu, Folami Alamudun, Timothy Miller, Chen Lin, Danielle S Bitterman, Georgia Tourassi and Jeremy L Warner. 2019. Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records. Cancer Research. doi: 10.1158/0008-5472.CAN-19-0579

Pre-prints:

2023: Hochheiser H, Finan S, Yuan Z, Durbin EB, Jeong JC, Hands I, Rust D, Kavuluru R, Wu XC, Warner JL, Savova G. DeepPhe-CR: Natural Language Processing Software Services for Cancer Registrar Case Abstraction. medRxiv [Preprint]. 2023 Oct 26:2023.05.05.23289524. doi: 10.1101/2023.05.05.23289524. PMID: 37205575; PMCID: PMC10187451.

Presentations:

2019: Savova, Guergana and Hochheiser, Harry. “Cancer Deep Phenotype Extraction from Electronic Medical Records”. Data Science Seminar Series. National Cancer Institute, National Institutes of Health. Oct 2019.
2019: Warner, Jeremy, Durbin, Eric, Petkov, Valentina and Savova, Guergana. 2019. Tools and Software to Automate and Normalize the Cancer Data Abstraction Workflow. Workshop at 2019 Conference of the North American Association of Central Cancer Registries and the International Association of Cancer Registries. June 9-13, 2019. Vancouver, BC, Canada

Websites

Cancer Deep Phenotyping for Translational Science (DeepPhe)

Publications and presentations

Peer-reviewed publications:

Lin, Chen; Miller, Timothy; Dligach, Dmitriy; Amiri, Hadi; Bethard, Steven and Savova, Guergana. 2018. Self-training improves Recurrent Neural Networks performance for Temporal Relation Extraction. LOUHI 2018: The Ninth International Workshop on Health Text Mining and Information Analysis. Oct 31-Nov 1, 2018. Brussels, Belgium. https://aclanthology.coli.uni-saarland.de/papers/W18-5619/w18-5619
Malty, Andrew M., Jain, Sandeep K., Yang, Peter C., Harvey, Krysten, Warner, Jeremy L. Computerized approach to creating a systematic ontology of hematology/oncology regimens. JCO Clinical Cancer Informatics. 2018 May 11. http://ascopubs.org/doi/full/10.1200/CCI.17.00142
Miller, Timothy; Dligach, Dmitriy; Bethard, Steven; Lin, Chen; Savova, Guergana. 2017. Towards Generalizable Entity-Centric Clinical Coreference Resolution. Journal of Biomedical Informatics. Vol. 69, May 2017, pp. 251-258. https://doi.org/10.1016/j.jbi.2017.04.015; http://www.sciencedirect.com/science/article/pii/S1532046417300850
Castro SM, Tseytlin E, Medvedeva O, Mitchell K, Visweswaran S, Bekhuis T, Jacobson RS. 2017. Automated annotation and classification of BI-RADS assessment from radiology reports. J Biomed Inform. 2017 May;69:177-187. doi: 10.1016/j.jbi.2017.04.011. PMID: 28428140; PMCID: PMC5706448 [Available on 2018-05-01] DOI:10.1016/j.jbi.2017.04.011 https://www.sciencedirect.com/science/article/pii/S1532046417300813
Lin, Chen; Miller, Timothy; Dligach, Dmitriy; Bethard, Steven; Savova, Guergana. 2017. Representations of Time Expressions for Temporal Relation Extraction with Convolutional Neural Networks. BioNLP workshop at the Association for Computational Linguistics conference. Vancouver, Canada, Friday August 4, 2017. https://aclanthology.coli.uni-saarland.de/papers/W17-2341/w17-2341
Miller, T; Bethard, S; Amiri, H; Savova, G. 2017. Unsupervised Domain Adaptation for Clinical Negation Detection. BioNLP workshop at the Association for Computational Linguistics conference. Vancouver, Canada, Friday August 4, 2017 https://aclanthology.coli.uni-saarland.de/papers/W17-2320/w17-2320
Savova, G., Tseytlin, E., Finan, S., Castine, M., Miller, T., Medvedeva, O., Harris, D., Hochheiser, H., Lin, C., Chavan, G., Jacobson R. 2017. DeepPhe - A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Annual Symposium of the American Medical Informatics Association (AMIA). Nov 2017. Washington DC https://amia2017.zerista.com/event/member/389439
Savova, G., Tseytlin, E., Finan, S., Castine, M., Miller, T., Medvedeva, O., Harris, D., Hochheiser, H., Lin, C., Chavan, G., Jacobson R. 2017. DeepPhe: A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Cancer Research 77(21), November 2017 DOI: 10.1158/0008-5472.CAN-17-0615. https://www.ncbi.nlm.nih.gov/pubmed/29092954
Dligach, Dmitriy; Miller, Timothy; Lin, Chen; Bethard, Steven; Savova, Guergana. 2017. Neural temporal relation extraction. European Chapter of the Association for Computational Linguistics (EACL 2017). April 3-7, 2017. Valencia, Spain. https://aclanthology.coli.uni-saarland.de/papers/E17-2118/e17-2118
Lin, Chen; Miller, Timothy; Dligach, Dmitriy; Bethard, Steven; Savova, Guergana. 2016. Improving Temporal Relation Extraction with Training Instance Augmentation. BioNLP workshop at the Association for Computational Linguistics conference. Berlin, Germany, Aug 2016 https://aclanthology.coli.uni-saarland.de/papers/W16-2914/w16-2914
Hochheiser, Harry; Castine, Melissa; Harris, David; Savova, Guergana; Jacobson, Rebecca. 2016. An Information Model for Computable Cancer Phenotypes. BMC Medical Informatics and Decision Making. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-016-0358-4 https://www.ncbi.nlm.nih.gov/pubmed/27629872
Ethan Hartzell, Chen Lin. 2016. Enhancing Clinical Temporal Relation Discovery with Syntactic Embeddings from GloVe. International Conference on Intelligent Biology and Medicine (ICIBM 2016). Medical Informatics Thematic Track. December 2016, Houston, Texas, USA
Dmitriy Dligach, Timothy Miller, Guergana K. Savova. 2015. Semi-supervised Learning for Phenotyping Tasks. AMIA Annual Symposium. Nov 2015, San Francisco, CA. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765699/
Lin, Chen; Dligach, Dmitriy; Miller, Timothy; Bethard, Steven; Savova, Guergana. 2015. Multilayered temporal modeling for the clinical domain. Journal of the American Medical Informatics Association. 2016 Mar;23(2):387-95. doi: 10.1093/jamia/ocv113 https://www.ncbi.nlm.nih.gov/pubmed/26521301

Peer-reviewed other:

Beeghly-Fadiel, Alicia; Warner, Jeremy; Finan, Sean; Masanz, James; Hochheiser, Harry; Savova, Guergana. (under review). Deep Phenotype Extraction to Facilitate Cancer Research: Extending DeepPhe to Ovarian Cancer. American Association for Cancer Research (AACR) 2019. March 29-April 3, 2019. Atlanta, GA.
Yuan, Zhou; Finan, Sean; Warner, Jeremy; Savova, Guergana; Hochheiser, Harry. 2018. Toward Longitudinal Visual Analytics for Cancer Patient Trajectories Extracted from Clinical Text. 2018 Workshop on Visual Analytics and Healthcare, Demonstration Presentation. AMIA 2018, Nov 3-7, 2018. San Francisco, CA.
Chen Lin, Timothy A. Miller, Hadi Amiri, David Harris, Samuel M. Rubinstein, Jeremy Warner, Guergana K. Savova, Ph.D. 2018. Classification of electronic medical records of breast cancer and melanoma patients into clinical episodes. 30th Anniversary AACR Special Conference Convergence: Artificial Intelligence, Big Data, and Prediction of Cancer. Oct 14-17, 2018. Newport, RI, USA.
Warner, Jeremy; Elhadad, Noemie; Bastarache, Lisa; Gotz, David; Savova, Guergana. 2018. Panel - Didactic: Computable Longitudinal Patient Trajectories. Annual Symposium of the American Medical Informatics Association. November, 2018. San Francisco, CA. (peer-reviewed panel)
Savova G, Tseytlin E, Finan S, Castine M, Miller T, Medvedeva O, Harris D, Hochheiser H, Lin C, Chavan G, Warner JL, Jacobson R. DeepPhe – a natural language processing system for extracting cancer phenotypes from clinical records. Annual conference of the North American Association of Central Cancer Registries (NAACCR). Pittsburgh, PA.
Warner JL, Harris D, Rubinstein S, Finan S, Lin C, Miller T, Amiri H, Hochheiser H, Savova G. Capturing high-resolution temporal cancer phenotypes using DeepPhe. Annual conference of the North American Association of Central Cancer Registries (NAACCR). Pittsburgh, PA.
Yang PC, Malty A, Jain SK, Harvey K, Finan S, Warner JL. 2018. A Comprehensive Ontology of Hematology/Oncology Regimens. Annual conference of the North American Association of Central Cancer Registries (NAACCR). Pittsburgh, PA.
Hochheiser H; Jacobson R; Washington N; Denny J; Savova G. 2015. Natural language processing for phenotype extraction: challenges and representation. AMIA Annual Symposium. Nov 2015, San Francisco, CA. (peer-reviewed panel)

Invited presentations:

Savova, Guergana. 2019. Cancer Deep Phenotype Extraction from Electronic Medical Records. Molecular Med Tri-con. March 10-15, 2019. San Francisco, CA, USA
Savova G. 2018. Software and Research Challenges for Clinical NLP. Dana Farber Cancer Institute; 2018 October; Boston, MA, USA.
Savova, Guergana. 2018. Cancer Deep Phenotype Extraction from Electronic Medical Records (DeepPhe). College of American Pathologists Pathology Electronic Reporting meeting (CAP PERT). July 29, 2018. Montreal, QC, Canada.
Warner, Jeremy. 2018. A Comprehensive Ontology of Hematology/Oncology Regimens. College of American Pathologists Pathology Electronic Reporting meeting (CAP PERT). July 29, 2018. Montreal, QC, Canada.
Savova, G; Miller, T. 2018. DeepPhe and Extraction of Oncology Patient Phenotypes from Unstructured Text Using NLP and Other AI Tools. Presentation to Dana Farber Cancer Institute. January 24 2018. Boston, MA.
Warner, Jeremy. 2017. Supporting cancer registries through automated extraction of pathology and chemotherapy regimen information. CDC/NCI/FDA/VA Clinical Natural Language Processing Workshop. Atlanta, GA.
Savova, Guergana. 2017. Select Applications of Natural Language Processing in Biomedicine. Natural Language Processing Symposium, Boston University, Boston, MA. November, 2017.
Jacobson, Rebecca. 2017. Invited presentation at Ohio State University James Cancer Center Grand Rounds, January 20th, 2017
Jacobson, Rebecca. 2017. Invited presentation at Case Western University Comprehensive Cancer Center Seminar Series, March 10th, 2017
Jacobson, Rebecca. 2016. Invited presentation of cTAKES and DeepPhe to NCI in January, 2016. Gaithersburg, MD
Jacobson, Rebecca. 2016. Invited presentation in CBIIT Speaker Series, February 17, 2016. Gaithersburg, MD
Jacobson, Rebecca. 2016. Invited presentation at University of Pittsburgh Cancer Informatics (UPCI) External Advisory Board, March 8, 2016
Finan, Sean. 2016. cTAKES/deepPhe presentation at the ITCR workshop at CI4CC in Napa, CA
Jacobson, Rebecca. 2016. Invited presentation at SEER PI meeting in New Mexico, March 16, 2016
Jacobson, Rebecca. 2016. Invited presentation at University of Michigan Department of Learning Health Sciences, April 6th, 2016
Jacobson, Rebecca. 2016. Invited presentation at Pathology Informatics 2016, Pittsburgh PA, May 24th, 2016
Jacobson, Rebecca. 2016. Invited presentation at University of Pittsburgh Cancer Institute Scientific Retreat, Greensburg, PA, June 16th, 2016
Jacobson, Rebecca and Savova, Guergana. 2016. Invited presentation at SEER meeting in Gaithersburg, MD, December 10, 2016
Jacobson, Rebecca and Savova, Guergana. Invited presentation of cTAKES/DeepPhe to NCI in October, 2015

Other:

Interview with Uduak Thomas of the GenomeWeb magazine. May 16, 2014. https://www.genomeweb.com/informatics/upitt-bch-team-use-696k-grant-develop-nlp-based-tools-extract-phenotype-data-emr#.W3HF1NJKi70
Project website: cancer.healthnlp.org
Github repository: https://github.com/DeepPhe
Listed on the ITCR website, Tools: https://itcr.cancer.gov/informatics-tools

Information Extracted by DeepPhe

Cancer – body location, laterality, stage, clinical TNM, path TNM
Tumor – body location, laterality, diagnosis, tumor type, histologic type, cancer type, extent, grade
Specific to BrCA – clockface position, quadrant, ER/PR/HER2
Specific to OvCa – CA-125
Specific to melanoma – Clark level, Breslow depth
Specific to prostate cancer – Gleason score, PSA
Medications
Procedures
Radiotherapy
Comorbidities
Episodes –
- Pre-diagnostic: a tumor is mentioned PRIOR to a malignant diagnosis
- Diagnostic: a tumor is mentioned WITH a malignant diagnosis
- Decision making: discussion of potential treatments AFTER an established diagnosis
- Treatment: a treatment is mentioned DURING the treatment episode
- Follow-up: discussion appearing AFTER the treatment episode ends
- Unknown: episode category unsettled
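The episode categories above form a small closed label set, so they can be represented as an enumeration. The following is a minimal sketch only: the names `Episode` and `parse_episode` are hypothetical and are not taken from the DeepPhe code base, and the fallback to `UNKNOWN` for unrecognized labels is an assumption of this sketch.

```python
from enum import Enum


class Episode(Enum):
    """Episode categories from the list above (descriptions paraphrased)."""

    PRE_DIAGNOSTIC = "tumor mentioned prior to a malignant diagnosis"
    DIAGNOSTIC = "tumor mentioned with a malignant diagnosis"
    DECISION_MAKING = "potential treatments discussed after an established diagnosis"
    TREATMENT = "treatment mentioned during the treatment episode"
    FOLLOW_UP = "discussion appearing after the treatment episode ends"
    UNKNOWN = "episode category unsettled"


def parse_episode(label: str) -> Episode:
    """Map a free-text label such as 'Follow-up' to an Episode member.

    Hypothetical helper: unrecognized labels fall back to Episode.UNKNOWN.
    """
    key = label.strip().upper().replace("-", "_").replace(" ", "_")
    return Episode.__members__.get(key, Episode.UNKNOWN)
```

For example, `parse_episode("Decision making")` yields `Episode.DECISION_MAKING`, while a label outside the list yields `Episode.UNKNOWN`.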

DeepPhe Software

DeepPhe release is available in

code

DeepPhe Gold Set

Process for Deidentification of Source Documents.
Process for Selection of Gold Set Source Documents.
DeepPhe Training/Development/Test splits
- training set:
  - all documents for Breast Cancer patients 03, 11, 92, 93 for a total of 48 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Breast Cancer patients extended 4,5,6,9,10,12,13,14,18,19,20,22,23,26,27,30,31,32,33,34,35,38,39,40,41,42,43,46,47 for a total of 954 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Melanoma patients 05, 06, 18, 19, 25, 28, 30, 33, 34, 42 for a total of 233 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\trainSet\DeepPhe DevSet Phenotype Annotations.xlsm
  - all documents for Ovarian Cancer patients 3, 4, 7, 8, 12, 13, 16, 17, 18, 20, 24, 25, 26, 27, 30, 31, 32, 34, 37, 38, 41, 42, 43, 44, 46, 48 for a total of 1675 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\trainSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\trainSet\DeepPhe_ovCa_Train_Set_Phenotype_Annotations_GOLD.xlsm
  - all documents for Colorectal cancer patients 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27, 32, 33, 34, 35, 40, 41, 42, 43, 48, 49, 50, 51, 56, 57, 58, 59, 64, 65, 66, 67, 72, 73, 74, 75, 80, 81, 82, 83, 88, 89, 90, 91, 96, 97, 98, 99, 104, 105, 106, 107, 112, 113, 114, 115, 120, 121, 122, 123, 128, 129, 130, 131, 136, 137, 138, 139, 144, 145, 146, 147, 152, 153, 154, 155, 160, 161, 162, 163, 168, 169, 170, 171, 176, 177, 178, 179, 184, 185, 186, 187, 192, 193, 194, 195, 200, 201, 202, 203, 208, 209, 210, 211, 216, 217
- development set:
  - all documents for Breast Cancer patients 02, 21 for a total of 42 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Breast Cancer patients extended 7,8,15,16,17,24,25,28,29,36,37,44,45 for a total of 457 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedDev\DeepPhe Gold Phenotype Annotations_v2.xlsm
  - all documents for Melanoma patients 07, 32, 43 for a total of 215 documents, of which only 211 were processed (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\devSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\devSet\DeepPhe DevSet Phenotype Annotations.xlsm
  - all documents for Ovarian Cancer patients 9, 11, 19, 28, 29, 35, 39, 47 for a total of 562 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\devSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\devSet\DeepPhe_ovCa_Dev_Set_Phenotype_Annotations_GOLD.xlsm
  - all documents for Colorectal cancer patients 4, 5, 12, 13, 20, 21, 28, 29, 36, 37, 44, 45, 52, 53, 60, 61, 68, 69, 76, 77, 84, 85, 92, 93, 100, 101, 108, 109, 116, 117, 124, 125, 132, 133, 140, 141, 148, 149, 156, 157, 164, 165, 172, 173, 180, 181, 188, 189, 196, 197, 204, 205, 212, 213
- test set:
  - all documents for Breast Cancer patients 01 (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest\DeepPhe Test Phenotype Annotations v2.xlsm
  - all documents for Breast Cancer extended for patients 01, 02, 63, 76, 100, 101, 104, 106, 109, 111, 114, 115, 117, 118, 119, 120, 121, 123, 125, 126, 129, 130, 132, 136, 137, 138, 142, 143, 155, 156, 158, 174, 181, 189, 197 for phenotyping level testing use (\\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest\); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\breast\UPMCextendedTest\DeepPhe Test Phenotype Annotations v2.xlsm
  - all documents for Melanoma patients 02, 03, 11, 12, 14, 16, 24, 27, 41, 44 for a total of 229 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\testSet); gold annotations are \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\melanoma\testSet\DeepPhe TestSet Phenotype Annotations.xlsm
  - all documents for Ovarian Cancer patients 15, 21, 33, 36, 40, 45, 49, 50 for a total of 559 documents (in BCH \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\testSet); gold annotations are in \\rc-fs\chip-nlp\Public\DeepPhe\DeepPheDatasets\ovarian\final_dataset\testSet\DeepPhe_ovCa_Test_Set_Phenotype_Annotations_GOLD.xlsm
  - all documents for Colorectal cancer patients 6, 7, 14, 15, 22, 23, 30, 31, 38, 39, 46, 47, 54, 55, 62, 63, 70, 71, 78, 79, 86, 87, 94, 95, 102, 103, 110, 111, 118, 119, 126, 127, 134, 135, 142, 143, 150, 151, 158, 159, 166, 167, 174, 175, 182, 183, 190, 191, 198, 199, 206, 207, 214, 215
- Use the training set for developing the algorithms, and the development set for reporting results and error analysis. The test set is used only for the final evaluation that goes into publications.
SEER Project Train/Dev/Test Splits
Clinical Genomics Gold Set
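Because the splits above are defined per patient, one way to guard against leakage between training, development, and test data is to confirm that no patient appears in more than one split. The sketch below illustrates this with the melanoma patient IDs listed above; the names `MELANOMA_SPLITS`, `check_disjoint`, and `split_for` are hypothetical and are not part of the DeepPhe software.

```python
from typing import Dict, Optional, Set

# Melanoma patient IDs copied from the split list above (leading zeros dropped).
MELANOMA_SPLITS: Dict[str, Set[int]] = {
    "train": {5, 6, 18, 19, 25, 28, 30, 33, 34, 42},
    "dev": {7, 32, 43},
    "test": {2, 3, 11, 12, 14, 16, 24, 27, 41, 44},
}


def check_disjoint(splits: Dict[str, Set[int]]) -> Dict[int, str]:
    """Return a patient -> split map, raising if a patient is in two splits."""
    assignment: Dict[int, str] = {}
    for name, patient_ids in splits.items():
        for pid in patient_ids:
            if pid in assignment:
                raise ValueError(
                    f"patient {pid} is in both {assignment[pid]!r} and {name!r}"
                )
            assignment[pid] = name
    return assignment


def split_for(patient_id: int,
              splits: Dict[str, Set[int]] = MELANOMA_SPLITS) -> Optional[str]:
    """Look up the split of a patient; None if the patient is not listed."""
    return check_disjoint(splits).get(patient_id)
```

Running the check once before any training or evaluation run makes it harder to report development or test results on a patient that was also used for training.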

Qualitative Interviews

Software Development Goals: Phase 2

Scrum Sprints

Project materials/ WIKIs to tasks

Archive

Communication

Weekly team meetings
Tools we use for communication are listed in our Communications Plan.

Meeting Notes

Meeting notes

Contact

If you have further questions about the project, contact guergana dot savova at childrens dot harvard dot edu.

Getting started

Consult the User's Guide for information on using the wiki software.

