University of Warwick
Publications service & WRAP

Illumina error profiles : resolving fine-scale variation in metagenomic sequencing data

Schirmer, Melanie, D’Amore, Rosalinda, Ijaz, Umer Z., Hall, Neil and Quince, Christopher (2016) Illumina error profiles : resolving fine-scale variation in metagenomic sequencing data. BMC Bioinformatics, 17 (1). pp. 1-15. 125. doi:10.1186/s12859-016-0976-y

WRAP_art%3A10.1186%2Fs12859-016-0976-y.pdf - Published Version - Requires a PDF viewer.
Available under License Creative Commons Attribution 4.0.

Official URL: http://dx.doi.org/10.1186/s12859-016-0976-y

Abstract

Background:
Illumina’s sequencing platforms are currently the most utilised sequencing systems worldwide. The technology has rapidly evolved over recent years and provides high throughput at low costs with increasing read-lengths and true paired-end reads. However, data from any sequencing technology contains noise and our understanding of the peculiarities and sequencing errors encountered in Illumina data has lagged behind this rapid development.

Results:
We conducted a systematic investigation of errors and biases in Illumina data based on the largest collection of in vitro metagenomic data sets to date. We evaluated the Genome Analyzer II, HiSeq and MiSeq and tested state-of-the-art low-input library preparation methods. Analysing in vitro metagenomic sequencing data allowed us to determine biases directly associated with the actual sequencing process. The position- and nucleotide-specific analysis revealed a substantial bias related to motifs (3-mers preceding errors) ending in “GG”. On average the top three motifs were linked to 16 % of all substitution errors. Furthermore, a preferential incorporation of ddGTPs was recorded. We hypothesise that all of these biases are related to the engineered polymerase and ddNTPs, which are intrinsic to any sequencing-by-synthesis method. We show that quality-score-based error removal strategies can on average remove 69 % of the substitution errors; however, the motif bias remains.
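
As an illustration of what a motif analysis of this kind might look like, the following minimal sketch counts the 3-mers that immediately precede substitution errors and reports the share of errors linked to the most frequent motifs. It is not the authors' pipeline: the function names are hypothetical, the read is assumed to be already aligned to the reference with no indels, and taking the motif from the reference rather than from the read is an assumption made here for simplicity.

```python
from collections import Counter


def motifs_preceding_errors(read, reference, k=3):
    """Count the k-mers immediately preceding each substitution error.

    Assumption: `read` and `reference` are equal-length, already aligned
    strings without indels. Mismatches in the first k positions are
    skipped because they have no complete preceding k-mer.
    """
    if len(read) != len(reference):
        raise ValueError("read and reference must have equal length")
    counts = Counter()
    for i in range(k, len(read)):
        if read[i] != reference[i]:
            # Motif taken from the reference (an illustrative choice).
            counts[reference[i - k:i]] += 1
    return counts


def top_motif_share(counts, n=3):
    """Fraction of counted errors preceded by one of the n most common motifs."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total


# Toy example: both substitutions follow the motif "CGG".
reference = "ACGGTACGGAAT"
read = "ACGGAACGGTAT"
counts = motifs_preceding_errors(read, reference)
share = top_motif_share(counts)
```

In practice the counts would be accumulated over many aligned reads, and the share for the top three motifs could then be compared with the 16 % figure reported in the abstract.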

Conclusion:
Single-nucleotide polymorphism changes in bacterial genomes can cause significant changes in phenotype, including antibiotic resistance and virulence; detecting them within metagenomes is therefore vital. Current error removal techniques are not designed to target the peculiarities encountered in Illumina sequencing data and other sequencing-by-synthesis methods, causing biases to persist and potentially affect any conclusions drawn from the data. In order to develop effective diagnostic and therapeutic approaches we need to be able to identify systematic sequencing errors and distinguish these errors from true genetic variation.

Item Type: Journal Article
Subjects: Q Science > QP Physiology
Divisions: Faculty of Medicine > Warwick Medical School
Library of Congress Subject Headings (LCSH): Metagenomics, Nucleotide sequence
Journal or Publication Title: BMC Bioinformatics
Publisher: BioMed Central Ltd.
ISSN: 1471-2105
Official Date: 11 March 2016
Dates:
  • 11 March 2016: Available
  • 2 March 2016: Accepted
  • 16 October 2015: Submitted
Volume: 17
Number: 1
Number of Pages: 15
Page Range: pp. 1-15
Article Number: 125
DOI: 10.1186/s12859-016-0976-y
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access
Funder: Engineering and Physical Sciences Research Council (EPSRC), Medical Research Council (Great Britain) (MRC), Natural Environment Research Council (Great Britain) (NERC)
Grant number: EP/H003851/1 (EPSRC), MR/M50161X/1 (MRC), MR/L015080/1 (MRC), NE/L011956/1 (NERC)
