
Accurate reconstruction of bacterial pan- and core genomes with PEPPAN
Zhou, Zhemin, Charlesworth, Jane and Achtman, Mark (2020) Accurate reconstruction of bacterial pan- and core genomes with PEPPAN. Genome Research, 30. pp. 1667-1679. doi:10.1101/gr.260828.120 ISSN 1088-9051.
PDF: WRAP-Accurate-reconstruction-bacterial-pan-core-genomes-PEPPAN-Zhou-2020.pdf - Accepted Version (1491Kb)
Official URL: https://doi.org/10.1101/gr.260828.120
Abstract
Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications and horizontal gene transfer. In order to reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus. PEPPAN outperforms existing pangenome methods by providing consistent gene and pseudogene annotations extended by similarity-based gene predictions, and identifying and excluding paralogs by combining tree- and synteny-based approaches. The PEPPAN package additionally includes PEPPAN_parser, which implements additional downstream analyses including the calculation of trees based on accessory gene content or allelic differences between core genes. In order to test the accuracy of PEPPAN, we implemented SimPan, a novel pipeline for simulating the evolution of bacterial pangenomes. We compared the accuracy and speed of PEPPAN with four state-of-the-art pangenome pipelines using both empirical and simulated datasets. PEPPAN was more accurate and more specific than any of the other pipelines and was almost as fast as any of them. As a case study, we used PEPPAN to construct a pangenome of ~40,000 genes from 3052 representative genomes spanning at least 80 species of Streptococcus. The resulting gene and allelic trees provide an unprecedented overview of the genomic diversity of the entire Streptococcus genus.
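The terms used in the abstract can be illustrated with a minimal Python sketch. It is not PEPPAN's method: it assumes orthologous genes have already been grouped and labelled with shared IDs, which is the hard step PEPPAN addresses. It shows only how a pangenome (union of all genes), a core genome (genes present in every genome) and a distance based on accessory gene content could then be derived. The function names, genome names and gene IDs are hypothetical, and the Jaccard distance is one common choice among several.

```python
from itertools import combinations


def pan_and_core(genomes):
    """Return (pangenome, core genome, accessory genome) as sets of gene IDs.

    `genomes` maps a genome name to the set of gene IDs it carries.
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # every gene seen in any genome
    core = set.intersection(*gene_sets)  # genes shared by all genomes
    return pan, core, pan - core


def accessory_distances(genomes):
    """Pairwise Jaccard distance between genomes over accessory genes only."""
    _, core, _ = pan_and_core(genomes)
    dist = {}
    for a, b in combinations(sorted(genomes), 2):
        acc_a, acc_b = genomes[a] - core, genomes[b] - core
        union = acc_a | acc_b
        # Two genomes with no accessory genes at all are treated as identical.
        dist[(a, b)] = 1 - len(acc_a & acc_b) / len(union) if union else 0.0
    return dist


# Toy input: hypothetical genome names and gene IDs, for illustration only.
toy_genomes = {
    "genome_A": {"g1", "g2", "g3"},
    "genome_B": {"g1", "g2", "g4"},
    "genome_C": {"g1", "g2", "g3", "g5"},
}

pan, core, accessory = pan_and_core(toy_genomes)
distances = accessory_distances(toy_genomes)
```

A distance matrix of this kind is the sort of input from which a tree based on accessory gene content could be built, for example by neighbour joining; the tree-building step itself is omitted here.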
Item Type: | Journal Article |
---|---|
Subjects: | Q Science > QH Natural history > QH301 Biology |
Divisions: | Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School > Biomedical Sciences; Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School > Biomedical Sciences > Microbiology & Infection; Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School |
Library of Congress Subject Headings (LCSH): | Genomics, Bacterial genomes, Streptococcus, Bacteria -- Genome mapping |
Journal or Publication Title: | Genome Research |
Publisher: | Cold Spring Harbor Lab Press |
ISSN: | 1088-9051 |
Official Date: | 14 October 2020 |
Volume: | 30 |
Page Range: | pp. 1667-1679 |
DOI: | 10.1101/gr.260828.120 |
Status: | Peer Reviewed |
Publication Status: | Published |
Access rights to Published version: | Restricted or Subscription Access |
Copyright Holders: | © 2020 Zhou et al.; Published by Cold Spring Harbor Laboratory Press |
Date of first compliant deposit: | 9 September 2020 |
Date of first compliant Open Access: | 10 September 2020 |