University of Warwick

Publications service & WRAP


Bayesian modeling of recombination events in bacterial populations


Marttinen, Pekka, Baldwin, Adam, Hanage, William P., Dowson, Christopher G. and Mahenthiralingam, Eshwar. (2008) Bayesian modeling of recombination events in bacterial populations. BMC Bioinformatics, Vol.9 (No.421). ISSN 1471-2105

PDF: WRAP_baldwin_Bayesian_1471-2105-9-421.pdf (1106Kb)
Official URL: http://dx.doi.org/10.1186/1471-2105-9-421

Abstract

Background: We consider the joint discovery of recombinant segments and their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available recombination detection methods that can characterize uncertainty probabilistically have limited practical applicability as the number of strains in a data set increases.

Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites.

Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at http://web.abo.fi/fak/mnf//mate/jc/software/brat.html.

Item Type: Journal Article
Subjects: Q Science > QR Microbiology
Divisions: Faculty of Science > Life Sciences (2010- ) > Biological Sciences ( -2010)
Library of Congress Subject Headings (LCSH): Bayesian statistical decision theory, Bacteria, Evolution (biology)
Journal or Publication Title: BMC Bioinformatics
Publisher: BioMed Central Ltd.
ISSN: 1471-2105
Date: 7 October 2008
Volume: Vol.9
Number: No.421
Identification Number: 10.1186/1471-2105-9-421
Status: Peer Reviewed
Access rights to Published version: Open Access
Funder: ComMIT graduate school, Suomen Akatemia [Academy of Finland]
URI: http://wrap.warwick.ac.uk/id/eprint/515

Data sourced from Thomson Reuters' Web of Knowledge
