University of Warwick
Publications service & WRAP

A comparison and user-based evaluation of models of textual information structure in the context of cancer risk assessment

Guo, Yufan, Korhonen, Anna, Liakata, Maria, Silins, Ilona, Hogberg, Johan and Stenius, Ulla (2011) A comparison and user-based evaluation of models of textual information structure in the context of cancer risk assessment. BMC Bioinformatics, Volume 12 (Number 1). Article 69. doi:10.1186/1471-2105-12-69

WRAP_Liakata_1471-2105-12-69.pdf - Published Version
Available under License Creative Commons Attribution 2.0.

Official URL: http://dx.doi.org/10.1186/1471-2105-12-69

Abstract

Background:
Many practical tasks in biomedicine require accessing specific types of information in scientific literature; e.g. information about the results or conclusions of the study in question. Several schemes have been developed to characterize such information in scientific journal articles. For example, a simple section-based scheme assigns individual sentences in abstracts under sections such as Objective, Methods, Results and Conclusions. Some schemes of textual information structure have proved useful for biomedical text mining (BIO-TM) tasks (e.g. automatic summarization). However, user-centered evaluation in the context of real-life tasks has been lacking.
Methods:
We take three schemes of different type and granularity - those based on section names, Argumentative Zones (AZ) and Core Scientific Concepts (CoreSC) - and evaluate their usefulness for a real-life task which focuses on biomedical abstracts: Cancer Risk Assessment (CRA). We annotate a corpus of CRA abstracts according to each scheme, develop classifiers for automatic identification of the schemes in abstracts, and evaluate both the manual and automatic classifications directly as well as in the context of CRA.
Results:
Our results show that for each scheme, the majority of categories appear in abstracts, although two of the schemes (AZ and CoreSC) were originally developed for full journal articles. All the schemes can be identified in abstracts relatively reliably using machine learning. Moreover, when cancer risk assessors are presented with scheme-annotated abstracts, they find relevant information significantly faster than when presented with unannotated abstracts, even when the annotations are produced by an automatic classifier. Interestingly, in this user-based evaluation the coarse-grained scheme based on section names proved nearly as useful for CRA as the finest-grained CoreSC scheme.
Conclusions:
We have shown that existing schemes aimed at capturing information structure of scientific documents can be applied to biomedical abstracts and can be identified in them automatically with an accuracy which is high enough to benefit a real-life task in biomedicine.
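
As a rough illustration of what assigning abstract sentences to section names can look like in code, the sketch below trains a tiny naive Bayes classifier on a handful of made-up sentences. Everything in it is an assumption for illustration: the abstract does not say which learning method or features the authors used, the training sentences are invented rather than taken from the CRA corpus, and the names `TRAIN`, `tokenize`, `train` and `classify` are hypothetical. Only the four labels (Objective, Methods, Results, Conclusions) come from the abstract.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training data (invented, NOT from the paper's corpus).
# Labels follow the section-based scheme named in the abstract.
TRAIN = [
    ("We aimed to investigate whether the compound causes tumors", "Objective"),
    ("The aim of this study was to assess carcinogenic risk", "Objective"),
    ("Rats were exposed to the compound for two years", "Methods"),
    ("Cells were treated and analysed using flow cytometry", "Methods"),
    ("Tumor incidence increased significantly in exposed animals", "Results"),
    ("We observed a significant increase in DNA damage", "Results"),
    ("These findings suggest that the compound is carcinogenic", "Conclusions"),
    ("We conclude that exposure may increase cancer risk", "Conclusions"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Collect per-label word counts, label counts and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for sentence, label in examples:
        tokens = tokenize(sentence)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(sentence, model):
    """Return the most probable label under multinomial naive Bayes
    with add-one (Laplace) smoothing. Unknown words are ignored."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(sentence):
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TRAIN)

if __name__ == "__main__":
    for s in ["Mice were exposed to the chemical",
              "We conclude that the compound may increase risk"]:
        print(classify(s, MODEL), "<-", s)
```

A classifier of this kind would label each sentence of an abstract so that a reader can jump straight to, say, the Results sentences. A realistic system would need a properly annotated corpus and richer features than bag-of-words counts.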

Item Type: Journal Article
Subjects: R Medicine > RC Internal medicine > RC0254 Neoplasms. Tumors. Oncology (including Cancer)
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Tumors -- Classification, Scientific literature, Abstracts, Tumors -- Diagnosis -- Research
Journal or Publication Title: BMC Bioinformatics
Publisher: BioMed Central Ltd.
ISSN: 1471-2105
Official Date: 8 March 2011
Dates: 8 March 2011 (Published)
Volume: 12
Number: 1
Page Range: Article 69
DOI: 10.1186/1471-2105-12-69
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Funder: Engineering and Physical Sciences Research Council (EPSRC), Royal Society (Great Britain), Sweden. Vetenskapsrådet [Research Council], Forskningsrådet för arbetsliv och socialvetenskap (Sweden) [Swedish Council for Working Life and Social Research], Joint Information Systems Committee (JISC), Cambridge International Scholarship Scheme (CISS)
Grant number: EP/G051070/1 (EPSRC)
