The Library

Modeling and visualizing uncertainty in gene expression clusters using Dirichlet process mixtures

Rasmussen, Carl Edward, De la Cruz, Bernard J., Ghahramani, Zoubin and Wild, David L. (2009) Modeling and visualizing uncertainty in gene expression clusters using Dirichlet process mixtures. IEEE - ACM Transactions on Computational Biology and Bioinformatics, Vol.6 (No.4). pp. 615-628. doi:10.1109/TCBB.2007.70269 ISSN 1545-5963.

Official URL: http://dx.doi.org/10.1109/TCBB.2007.70269

Abstract

Although clustering has rapidly become one of the standard computational approaches in the literature on microarray gene expression data, little attention has been paid to uncertainty in the results obtained. Dirichlet process mixture (DPM) models provide a nonparametric Bayesian alternative to the bootstrap approach to modeling uncertainty in gene expression clustering. Most previously published applications of Bayesian model-based clustering methods have been to short time series data. In this paper, we present a case study of the application of nonparametric Bayesian clustering methods to the clustering of high-dimensional non-time-series gene expression data using full Gaussian covariances. We use the probability that two genes belong to the same cluster in a DPM model as a measure of the similarity of their gene expression profiles. Conversely, this probability can be used to define a dissimilarity measure, which, for the purposes of visualization, can be input to one of the standard linkage algorithms used for hierarchical clustering. Biologically plausible results, which extend previously published cluster analyses of these data, are obtained from the Rosetta compendium of expression profiles.
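The similarity-to-dendrogram idea in the abstract can be illustrated with a minimal sketch. It assumes the DPM posterior is available as a set of sampled cluster assignments, estimates the co-clustering probability as the fraction of samples in which two genes share a label, takes one minus that probability as the dissimilarity, and feeds the result to a naive average-linkage agglomeration. The function names, the toy `draws`, the "one minus probability" form, and the choice of average linkage are all assumptions made for illustration; the abstract says only that a dissimilarity derived from this probability is passed to one of the standard linkage algorithms.

```python
from itertools import combinations


def coclustering_probability(samples):
    """Fraction of sampled partitions in which each pair of items shares a cluster.

    `samples` is a list of label vectors, one per (hypothetical) posterior draw.
    """
    n = len(samples[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in samples:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / len(samples) for c in row] for row in counts]


def dissimilarity(prob):
    """Assumed dissimilarity: one minus the co-clustering probability."""
    return [[1.0 - p for p in row] for row in prob]


def average_linkage(dist):
    """Naive average-linkage agglomeration.

    Returns the merges in order as (members_a, members_b, height) tuples.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average pairwise dissimilarity between the two groups.
            height = sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
            if best is None or height < best[0]:
                best = (height, a, b)
        height, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height))
    return merges


# Made-up cluster assignments for four genes across four posterior draws.
draws = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 0, 0, 1],
    [0, 1, 2, 2],
]

prob = coclustering_probability(draws)
dist = dissimilarity(prob)
merges = average_linkage(dist)
```

In this toy input, genes that co-cluster in most draws merge low in the tree, and the merge height itself reflects how uncertain the grouping is. For real data, an optimized routine such as SciPy's hierarchical clustering would be the natural replacement for the naive loop.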

Item Type:

Journal Article

Subjects:

Q Science > QA Mathematics
Q Science > QH Natural history > QH426 Genetics

Divisions:

Faculty of Science, Engineering and Medicine > Research Centres > Warwick Systems Biology Centre

Library of Congress Subject Headings (LCSH):

Bioinformatics, Gaussian distribution, Stochastic processes, Statistics -- Data processing, Monte Carlo method, Probability measures, Bayesian statistical decision theory

Journal or Publication Title:

IEEE - ACM Transactions on Computational Biology and Bioinformatics

Publisher:

IEEE

ISSN:

1545-5963

Official Date:

October 2009

Dates:

Date	Event
October 2009	Published

Volume:

Vol.6

Number:

No.4

Page Range:

pp. 615-628

DOI:

10.1109/TCBB.2007.70269

Status:

Peer Reviewed

Access rights to Published version:

Open Access (Creative Commons)

University of Warwick
Publications service & WRAP
