Difference between revisions of "User Personae"

From HealthNLP-Deepphe

Revision as of 13:23, 3 August 2016

Presented here is a series of stakeholder or user descriptions, referred to here as personae, which informed preliminary development of the cancer models.

Translational Scientist with “Dry Bench” Bioinformatics skills

Background

  • PhD-trained scientist in a wide range of fields relevant to cancer (e.g. genetics, pharmacology, molecular biology, immunology)
  • Analytically trained and familiar with statistical methods, including genomics/bioinformatics.
  • Unfamiliar with Natural Language Processing (NLP) concepts
  • Unfamiliar with NLP tools and resources
  • Limited familiarity with OO programming languages
  • Familiar with text manipulation languages (e.g. Python, Perl, Ruby)

Premise/Story

Cancer biologists are unraveling the genomic and molecular changes that drive tumors towards specific behaviors such as progression and metastasis. Identifying these molecular drivers will require information about the specific cancer behaviors that they produce. This class of users will examine data for case finding and to classify cases based on outcome.

Expectations

  • Population-level statistics, summarization, and comparisons.
  • Graphical displays, including bar charts, error bars, etc.
  • Inferential statistics
  • Export to statistical software (SAS, SPSS, RapidMiner, R)

Information needs

  • Demographic data
  • Treatment data
  • Disease progression, metastasis and other outcomes (e.g. RECIST criteria)
  • Available biomarkers and other clinical molecular information not in structured format (e.g. Oncotype Scores)

Current tools and limitations

  • Mac desktop, Linux and Windows computing
  • Some familiarity with DBMS and data management principles
  • Knowledge and use of statistical software (e.g. SAS, SPSS, RapidMiner, R), but time required to extract and format data is substantial.
  • Routine access to PHI clinical text for work, able to interpret clinical text reports, but in-depth review is too error-prone and time-consuming.

Clinical Translational Scientist

Background

  • MD-, DrPH-, or RN-trained scientist in a wide range of clinical specialties (e.g. oncology, surgery, medicine, pathology, epidemiology)
  • Expert understanding of clinical oncology, cancer therapeutics, and patient management
  • Familiar with statistical methods, including genomics/bioinformatics, but typically relies on statisticians, bioinformaticists and other collaborators and colleagues for analysis.
  • Unfamiliar with NLP concepts
  • Unfamiliar with NLP tools and resources
  • Unfamiliar with OO programming languages
  • Unfamiliar with text manipulation languages (e.g. Python, Perl, Ruby)

Premise/story

This class of users is interested in extracting phenotype features from a set of documents, often for correlating specific features, treatments, and/or outcomes with molecular characterizations of tumors. They may also be interested in risk factors and other co-morbidities that provide insight into cancer biology (e.g. immune function). They care more about the quality of the phenotype information, and its effect on the quality of the scientific conclusions to be developed, than about the details of the phenotype extraction methods.

Expectations

  • Basic inferential statistics, summarization, and comparisons.
  • Graphical displays, including bar charts, error bars, etc.

Information needs

  • Demographic data
  • Treatment data
  • Disease progression, metastasis and other outcomes (e.g. RECIST criteria)
  • Available biomarkers and other clinical molecular information not in structured format (e.g. Oncotype Scores)

Current tools and limitations

  • Enterprise Windows computing
  • Limited familiarity with DBMS and data management principles
  • Use of statistical software (e.g. SAS, SPSS, RapidMiner, R), but time required to extract and format data is substantial.
  • Routine access to PHI clinical text for work, able to interpret clinical text reports. Most able to interpret complex temporal and other relations in text, but in-depth review is too error-prone and time-consuming.

Population Health Scientist/Health Care Outcomes Analyst

Background

  • Broad possibilities, including MD, PhD, MPH/DrPH, MBA.
  • Analytically trained and familiar with statistical methods, but not necessarily in genomics/bioinformatics.
  • Unfamiliar with NLP concepts
  • Unfamiliar with NLP tools and resources
  • Unfamiliar with OO programming languages
  • Unfamiliar with text manipulation languages (e.g. Python, Perl, Ruby)
  • Routine access to PHI clinical text for work, easily interprets clinical text reports
  • Some familiarity with DBMS and data management principles
  • Cares more about accuracy of results, confidence in results, summary information, pointers to WHY particular calls were made (chain of evidence) than the details of the implementation.

Premise/story

Cancer care has significant implications in terms of costs of care, effectiveness of different treatments, and values assigned to different outcomes. This class of users will examine data to study the efficacy of treatment regimens across different patient groups, to identify factors that might influence costs or improve outcomes, and to otherwise understand how to allocate limited resources to optimize outcomes.

Expectations

  • Population-level statistics, summarization, and comparisons.
  • Graphical displays, including bar charts, error bars, etc.
  • Basic inferential statistics
  • Export to statistical software (SAS, SPSS, S+, RapidMiner, R)
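
As a rough illustration of the export expectation, the sketch below writes patient-level summaries to a CSV file, a format that SAS, SPSS, RapidMiner, and R can all import. The field names (patient_id, stage, treatment, outcome) and the function name export_summaries are hypothetical and are not taken from the actual NLP output.

```python
import csv
import io

# Hypothetical patient-level summary fields; not the actual output schema.
FIELDS = ["patient_id", "stage", "treatment", "outcome"]


def export_summaries(records, handle):
    """Write one CSV row per patient summary.

    Missing fields become empty cells; fields not listed in FIELDS are dropped.
    """
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"patient_id": "P001", "stage": "II", "treatment": "chemotherapy"},
        {"patient_id": "P002", "stage": "IV", "outcome": "progression"},
    ]
    buffer = io.StringIO()
    export_summaries(sample, buffer)
    print(buffer.getvalue())
```

A flat file with one row per patient is assumed here because every listed statistical package reads it without extra tooling; a real export would need to follow whatever data dictionary the analysts agree on.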

Information needs

  • Demographic data
  • Treatment data
  • Disease progression
  • Outcomes
  • Treatment context: physician, ward, etc.

Current tools and limitations

  • Windows-based enterprise computing
  • Some familiarity with DBMS and data management principles
  • Knowledge and use of statistical software (e.g. SAS, SPSS, S+, RapidMiner), but time required to extract and format data is substantial.
  • Routine access to PHI clinical text for work, able to interpret clinical text reports, but in-depth review is too error-prone and time-consuming.

Information Broker

Background

  • Unfamiliar with NLP concepts
  • Unfamiliar with NLP tools and resources
  • Unfamiliar with OO programming languages
  • Unfamiliar with text manipulation languages (e.g. Python, Perl, Ruby)
  • Familiar with DBMS and data management principles
  • Limited or no ability to interpret clinical text reports

Premise/story

This class comprises users employed at medical research institutions, educational institutions, and software companies interested in the healthcare domain. Their daily job involves oversight of data storage solutions. They work with other user groups to identify data requirements and to design data storage formats, media, and access. They may be open to using existing tools and methods as part of their solutions, but prefer to stick to their established tools for, and types of, filesystems, shares, databases, etc., as well as their currently used methods for storage and retrieval. Their integration would be limited to accepting NLP output and storing it, possibly in a modified form, or providing mechanisms for NLP output providers to do so themselves. The Information Broker may act as a middleman between NLP output providers and NLP output end users. Success, by the Information Broker's definition, is a simple and consistent tool or method for migrating NLP output data to a store from which that data is easily accessible by end users.

Expectations

  • Documentation about the types and formats of NLP output.
  • NLP output must be easy to integrate into their own data storage formats.
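
To make the integration expectation concrete, here is a minimal sketch of loading NLP output into a relational store, assuming the output arrives as a JSON array of flat records. The record fields (patient_id, concept, value, source_document), the table name nlp_findings, and the function load_findings are all hypothetical; the real types and formats would come from the documentation the Information Broker expects to receive.

```python
import json
import sqlite3

# Hypothetical flat record shape for NLP output; not an actual output schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nlp_findings (
    patient_id      TEXT NOT NULL,
    concept         TEXT NOT NULL,
    value           TEXT,
    source_document TEXT
)
"""


def load_findings(conn, json_text):
    """Insert a JSON array of finding records; return the number inserted."""
    records = json.loads(json_text)
    conn.execute(SCHEMA)
    rows = [
        (r["patient_id"], r["concept"], r.get("value"), r.get("source_document"))
        for r in records
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO nlp_findings VALUES (?, ?, ?, ?)", rows)
    return len(rows)


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    payload = json.dumps(
        [{"patient_id": "P001", "concept": "stage", "value": "II",
          "source_document": "report_17.txt"}]
    )
    connection = sqlite3.connect(":memory:")
    print(load_findings(connection, payload), "finding(s) stored")
```

SQLite is used only to keep the sketch self-contained; the same pattern would apply to whichever DBMS the Information Broker already operates.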

Information needs

  • Information needs are dictated by the end users.

Information constraints

  • The Information Broker should be able to work with a wide variety of standard data types and formats.
  • The Information Broker should not be expected to work with any data type and format not of their choosing.

Current tools and limitations

  • May not have consistent or in-depth communication with end users.
  • May not have a complete understanding of what types of data end users may desire.
  • Wide array of system environments, databases and tools may be used.
  • Lack of standard NLP output data types makes it hard to create a universal storage format.
  • Lack of standard NLP output data formats makes it hard to identify or create universal tools and methods for NLP data consumption.

Informatics Researcher

NLP Developers

Domain Specific Application Developers

Integrative Cancer Biologists and Modelers