Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics
Sirinukunwattana, Korsuk, Savage, Richard S., Bari, Muhammad F., Snead, David R. J. and Rajpoot, Nasir M. (2013) Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics. PLoS One, 8 (10). e75748. doi:10.1371/journal.pone.0075748 ISSN 1932-6203.
PDF: WRAP_journal.pone.0075748.pdf - Published Version (998Kb). Available under License Creative Commons Attribution 4.0.
Official URL: http://dx.doi.org/10.1371/journal.pone.0075748
Abstract
Clustering analysis is an important tool in studying gene expression data. The Bayesian hierarchical clustering (BHC) algorithm can automatically infer the number of clusters and uses Bayesian model selection to improve clustering quality. In this paper, we present an extension of the BHC algorithm. Our Gaussian BHC (GBHC) algorithm represents data as a mixture of Gaussian distributions. It uses a normal-gamma distribution as a conjugate prior on the mean and precision of each of the Gaussian components. We tested GBHC on 11 cancer and 3 synthetic datasets. The results on cancer datasets show that in sample clustering, GBHC on average produces a clustering partition that is more concordant with the ground truth than those obtained from other commonly used algorithms. Furthermore, GBHC frequently infers a number of clusters close to the ground truth. In gene clustering, GBHC also produces a clustering partition that is more biologically plausible than several other state-of-the-art methods. This suggests GBHC as an alternative tool for studying gene expression data. The implementation of GBHC is available at https://sites.google.com/site/gaussianbhc/
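As an illustration of the normal-gamma conjugate prior mentioned in the abstract, the following minimal Python sketch computes the standard closed-form posterior update and log marginal likelihood for one-dimensional Gaussian data with unknown mean and precision. It is not the authors' GBHC implementation: the function names, the hyperparameter names (`mu0`, `kappa0`, `alpha0`, `beta0`) and the example values are assumptions chosen for illustration, and the sketch omits the tree construction and the prior weighting of merges that a full BHC procedure uses.

```python
import math


def normal_gamma_posterior(data, mu0, kappa0, alpha0, beta0):
    """Posterior hyperparameters of a normal-gamma prior after observing 1-D data.

    Hypothetical helper for illustration; hyperparameter names are assumptions.
    """
    n = len(data)
    xbar = sum(data) / n
    sum_sq = sum((x - xbar) ** 2 for x in data)
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = (beta0 + 0.5 * sum_sq
              + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n))
    return mu_n, kappa_n, alpha_n, beta_n


def log_marginal_likelihood(data, mu0, kappa0, alpha0, beta0):
    """Log evidence of the data with mean and precision integrated out."""
    n = len(data)
    _, kappa_n, alpha_n, beta_n = normal_gamma_posterior(
        data, mu0, kappa0, alpha0, beta0)
    return (math.lgamma(alpha_n) - math.lgamma(alpha0)
            + alpha0 * math.log(beta0) - alpha_n * math.log(beta_n)
            + 0.5 * (math.log(kappa0) - math.log(kappa_n))
            - 0.5 * n * math.log(2.0 * math.pi))


# Illustrative (made-up) values: two tight, well-separated groups.
prior = dict(mu0=5.0, kappa0=0.01, alpha0=1.0, beta0=1.0)
group_a = [0.0, 0.1, -0.1]
group_b = [10.0, 10.1, 9.9]

# Evidence for keeping the groups apart versus modelling them as one Gaussian.
log_ml_separate = (log_marginal_likelihood(group_a, **prior)
                   + log_marginal_likelihood(group_b, **prior))
log_ml_merged = log_marginal_likelihood(group_a + group_b, **prior)
print("separate:", log_ml_separate, "merged:", log_ml_merged)
```

Comparing the evidence for a merged cluster against the evidence for its parts is the kind of Bayesian model selection the abstract refers to; conjugacy is what makes each term available in closed form.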
Item Type: | Journal Article
---|---
Subjects: | Q Science > QH Natural history > QH426 Genetics; R Medicine > RC Internal medicine > RC0254 Neoplasms. Tumors. Oncology (including Cancer)
Divisions: | Faculty of Science, Engineering and Medicine > Research Centres > Warwick Systems Biology Centre
Library of Congress Subject Headings (LCSH): | Gene expression, Cancer -- Genetic aspects, Bayesian statistical decision theory
Journal or Publication Title: | PLoS One
Publisher: | Public Library of Science
ISSN: | 1932-6203
Official Date: | 23 October 2013
Volume: | 8
Number: | 10
Article Number: | e75748
DOI: | 10.1371/journal.pone.0075748
Status: | Peer Reviewed
Publication Status: | Published
Access rights to Published version: | Open Access (Creative Commons)
Date of first compliant deposit: | 25 December 2015
Date of first compliant Open Access: | 25 December 2015
Funder: | Qatar National Research Fund (QNRF), University of Warwick. Department of Computer Science, Medical Research Council (Great Britain). Biostatistics Unit (MRC), Pakistan. Higher Education Commission, Dow University of Health Sciences, West Midlands Lung Tissue Consortium
Grant number: | NPRPS-1345-1-228 (QNRF), G0902104 (MRC)