University of Warwick
Publications service & WRAP

Accurate reconstruction of bacterial pan- and core genomes with PEPPAN

Zhou, Zhemin, Charlesworth, Jane and Achtman, Mark (2020) Accurate reconstruction of bacterial pan- and core genomes with PEPPAN. Genome Research, 30. pp. 1667-1679. doi:10.1101/gr.260828.120 ISSN 1088-9051.

PDF: WRAP-Accurate-reconstruction-bacterial-pan-core-genomes-PEPPAN-Zhou-2020.pdf - Accepted Version (1491Kb)
Official URL: https://doi.org/10.1101/gr.260828.120


Abstract

Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications and horizontal gene transfer. In order to reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus. PEPPAN outperforms existing pangenome methods by providing consistent gene and pseudogene annotations extended by similarity-based gene predictions, and identifying and excluding paralogs by combining tree- and synteny-based approaches. The PEPPAN package additionally includes PEPPAN_parser, which implements additional downstream analyses including the calculation of trees based on accessory gene content or allelic differences between core genes. In order to test the accuracy of PEPPAN, we implemented SimPan, a novel pipeline for simulating the evolution of bacterial pangenomes. We compared the accuracy and speed of PEPPAN with four state-of-the-art pangenome pipelines using both empirical and simulated datasets. PEPPAN was more accurate and more specific than any of the other pipelines and was almost as fast as any of them. As a case study, we used PEPPAN to construct a pangenome of ~40,000 genes from 3052 representative genomes spanning at least 80 species of Streptococcus. The resulting gene and allelic trees provide an unprecedented overview of the genomic diversity of the entire Streptococcus genus.
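The abstract defines the pangenome as the composite of all the genes in a set of genomes, and refers to core genes and accessory gene content. As a toy illustration of those definitions only, the sketch below treats each genome as a set of gene-family labels. It is not PEPPAN's method: it assumes orthologous genes have already been grouped under shared labels, which is the difficult step the pipeline addresses, and the function name, genome names and gene labels are hypothetical.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pangenome, core genome) for a mapping of genome name -> gene labels.

    Assumption: labels already identify orthologous gene families, so plain
    set operations are enough. Real pipelines must infer this grouping.
    """
    if not genomes:
        return set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # every gene seen in any genome
    core = set.intersection(*gene_sets)  # genes present in every genome
    return pan, core


# Hypothetical toy input: three genomes with made-up gene content.
toy = {
    "genome_A": {"dnaA", "gyrB", "recA", "tetM"},
    "genome_B": {"dnaA", "gyrB", "recA"},
    "genome_C": {"dnaA", "gyrB", "ermB"},
}

pan, core = pan_and_core(toy)
accessory = pan - core  # genes missing from at least one genome
```

In this toy input the core genome is the two genes shared by all three genomes, and everything else in the pangenome is accessory.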

Item Type: Journal Article
Subjects: Q Science > QH Natural history > QH301 Biology
Divisions: Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School > Biomedical Sciences
Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School > Biomedical Sciences > Microbiology & Infection
Faculty of Science, Engineering and Medicine > Medicine > Warwick Medical School
Library of Congress Subject Headings (LCSH): Genomics, Bacterial genomes, Streptococcus, Bacteria -- Genome mapping
Journal or Publication Title: Genome Research
Publisher: Cold Spring Harbor Lab Press
ISSN: 1088-9051
Official Date: 14 October 2020
Dates:
  • 14 October 2020: Published
  • 2020: UNSPECIFIED
  • 1 September 2020: Accepted
Volume: 30
Page Range: pp. 1667-1679
DOI: 10.1101/gr.260828.120
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Copyright Holders: © 2020 Zhou et al.; Published by Cold Spring Harbor Laboratory Press
Date of first compliant deposit: 9 September 2020
Date of first compliant Open Access: 10 September 2020
RIOXX Funder/Project Grant (Project/Grant ID, RIOXX Funder Name, Funder ID):
  • 202792/Z/16/Z, Wellcome Trust, http://dx.doi.org/10.13039/100010269
  • BB/L020319/1, [BBSRC] Biotechnology and Biological Sciences Research Council, http://dx.doi.org/10.13039/501100000268
Related URLs:
  • Publisher
Open Access Version:
  • bioRxiv
