Accurate reconstruction of microbial strains using representative reference genomes
Zhou, Zhemin, Luhmann, Nina, Alikhan, Nabil-Fareed, Quince, Christopher and Achtman, Mark (2017) Accurate reconstruction of microbial strains using representative reference genomes. Working Paper. bioRxiv: Cold Spring Harbor.
PDF: WRAP-accurate-reconstruction-microbial-strains-metagenomic-genomes-Alikhan-2017.pdf - Accepted Version (2356Kb). Available under License Creative Commons: Attribution-Noncommercial 4.0.
Official URL: http://dx.doi.org/10.1101/215707
Abstract
Exploring the genetic diversity of microbes within the environment through metagenomic sequencing first requires classifying these reads into taxonomic groups. Current methods compare these sequencing data with existing biased and limited reference databases. Several recent evaluation studies demonstrate that current methods either lack sufficient sensitivity for species-level assignments or suffer from false positives, overestimating the number of species in the metagenome. Both are especially problematic for the identification of low-abundance microbial species, e.g. detecting pathogens in ancient metagenomic samples. We present a new method, SPARSE, which improves taxonomic assignments of metagenomic reads. SPARSE balances existing biased reference databases by grouping reference genomes into similarity-based hierarchical clusters, implemented as an efficient incremental data structure. SPARSE assigns reads to these clusters using a probabilistic model, which specifically penalizes non-specific mappings of reads from unknown sources and hence reduces false-positive assignments. Our evaluation on simulated datasets from two recent evaluation studies demonstrated the improved precision of SPARSE in comparison to other methods for species-level classification. In a third simulation, our method successfully differentiated multiple co-existing Escherichia coli strains from the same sample. In real archaeological datasets, SPARSE identified ancient pathogens with ≤0.02% abundance, consistent with published findings that required additional sequencing data. In these datasets, other methods either missed targeted pathogens or reported non-existent ones.
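The abstract describes grouping reference genomes into similarity-based hierarchical clusters that can be updated incrementally. The following is a minimal sketch of that general idea, not the SPARSE implementation: the k-mer Jaccard similarity, the three cut-off values in `LEVELS`, and the names `ClusterIndex`, `kmers` and `jaccard` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical similarity cut-offs, loosest to tightest (not taken from the paper).
LEVELS = (0.5, 0.8, 0.95)


def kmers(seq, k=4):
    """Return the set of k-mers in a sequence (a stand-in for a real genome sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


@dataclass
class ClusterIndex:
    # Maps a path of parent cluster ids to the representatives one level below it.
    children: dict = field(default_factory=dict)
    # Maps each genome name to its cluster id at every level.
    assignments: dict = field(default_factory=dict)

    def add(self, name, seq):
        """Insert one genome without re-clustering the genomes already present."""
        sketch = kmers(seq)
        path = ()
        for threshold in LEVELS:
            siblings = self.children.setdefault(path, [])
            # Only compare against representatives inside the parent cluster,
            # so the clusters stay nested from level to level.
            best = max(siblings, key=lambda rep: jaccard(sketch, rep[1]), default=None)
            if best is not None and jaccard(sketch, best[1]) >= threshold:
                cluster_id = best[0]
            else:
                # No sufficiently similar representative: found a new cluster.
                cluster_id = name
                siblings.append((name, sketch))
            path += (cluster_id,)
        self.assignments[name] = path
        return path


index = ClusterIndex()
for genome, sequence in [("A", "ACGTACGTACGT"), ("D", "ACGTACGTTTTT")]:
    print(genome, index.add(genome, sequence))
```

Because each new genome is compared only with cluster representatives, a database that is heavily biased towards a few well-sequenced species contributes one representative per cluster rather than thousands of near-identical entries. The probabilistic read-assignment model mentioned in the abstract is not covered by this sketch.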
Item Type: | Working or Discussion Paper (Working Paper)
---|---
Subjects: | Q Science > QR Microbiology
Divisions: | Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School > Biomedical Sciences > Microbiology & Infection; Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School
Library of Congress Subject Headings (LCSH): | Bacteria -- Genome mapping -- Databases
Journal or Publication Title: | bioRxiv
Publisher: | Cold Spring Harbor
Place of Publication: | bioRxiv
Book Title: | Accurate Reconstruction of Microbial Strains from Metagenomic Sequencing Using Representative Reference Genomes
Official Date: | 7 November 2017
Number: | 216788
DOI: | 10.1101/215707
Institution: | University of Warwick
Status: | Not Peer Reviewed
Publication Status: | Published
Access rights to Published version: | Open Access (Creative Commons)
Date of first compliant deposit: | 7 December 2017
Date of first compliant Open Access: | 7 December 2017