Guided conjugate Bayesian clustering for uncovering rhythmically expressed genes
Anderson, Paul E., Smith, J. Q., 1953-, Edwards, Kieron D. and Millar, A. J. (Andrew J.) (2006) Guided conjugate Bayesian clustering for uncovering rhythmically expressed genes. Working Paper. Coventry: University of Warwick. Centre for Research in Statistical Methodology. Working papers, Vol.2006 (No.7).
WRAP_Anderson_06-07w.pdf - Published Version
Official URL: http://www2.warwick.ac.uk/fac/sci/statistics/crism...
Background: An increasing number of microarray experiments produce time series of expression levels for many
genes. Some recent clustering algorithms respect the time ordering of the data and are, importantly, extremely
fast. The focus of this paper is the development of such an algorithm on a microarray data set consisting of
22,810 genes of the plant Arabidopsis thaliana measured at 13 time points over two days. Circadian rhythms
control the timing of various physiological and metabolic processes and are regulated by genes acting in
feedback loops. The aim is to cluster and classify the expression profiles in order to identify genes potentially
involved in, and regulated by, the circadian clock.
Results: A greedy search over time series of expression levels (in which series are compared pairwise, the two most
similar are placed in the same cluster, and so forth) produces a result quickly but explores only a very limited number of
the possible partitions of the profiles. We propose an improved, deterministic method based on a multi-step
application of a conjugate Bayesian clustering algorithm. It allows the space of partitions to be searched more fully and
intelligently. The values of the summary statistics are used not only to score clusters of genes but also to guide
the search of the vast partition space. By following this procedure, we are able to cluster genes that are known
to be rhythmically expressed with genes of previously unknown function, thus suggesting potentially interesting
targets for future experiments.
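
The baseline greedy pairwise search described in the Results can be sketched as follows. This is only an illustration, not the paper's algorithm: it assumes a toy conjugate model in which each time point is independent, the noise variance is known, and the cluster mean has a zero-mean normal prior. The names `log_marginal`, `greedy_cluster`, `noise_var` and `prior_var` are hypothetical. Each cluster is scored from its summary statistics (count, sum and sum of squares per time point), and the pair of clusters whose merge most increases the total log marginal likelihood is merged until no merge helps.

```python
import math
from itertools import combinations


def log_marginal(profiles, noise_var=1.0, prior_var=10.0):
    """Log marginal likelihood of one cluster under a toy conjugate model.

    Assumption (not from the paper): at each time point the n observations
    are N(mu, noise_var) with mu ~ N(0, prior_var), independently over time.
    """
    n = len(profiles)
    total = 0.0
    for t in range(len(profiles[0])):
        # Summary statistics of the cluster at time point t.
        s = sum(p[t] for p in profiles)
        ss = sum(p[t] ** 2 for p in profiles)
        total += (
            -0.5 * n * math.log(2 * math.pi * noise_var)
            - 0.5 * math.log(1 + n * prior_var / noise_var)
            - (ss - prior_var * s * s / (noise_var + n * prior_var))
            / (2 * noise_var)
        )
    return total


def greedy_cluster(profiles):
    """Greedy agglomerative search: repeatedly merge the best-scoring pair."""
    clusters = [[i] for i in range(len(profiles))]

    def score(cluster):
        return log_marginal([profiles[i] for i in cluster])

    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(range(len(clusters)), 2):
            gain = (
                score(clusters[a] + clusters[b])
                - score(clusters[a])
                - score(clusters[b])
            )
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no merge improves the total score
        a, b = best_pair
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Made-up toy profiles: two pairs with opposite phase.
toy_profiles = [
    [3.0, 0.0, -3.0, 0.0],
    [3.1, 0.1, -2.9, 0.0],
    [-3.0, 0.0, 3.0, 0.0],
    [-3.1, -0.1, 2.9, 0.1],
]
print(greedy_cluster(toy_profiles))
```

On this toy input the two phase groups end up in separate clusters. Because each step commits to a single best merge, the search visits only one path through the space of partitions, which is the limitation the proposed multi-step method is meant to address.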
|Item Type:||Working or Discussion Paper (Working Paper)|
|Subjects:||Q Science > QA Mathematics|
|Divisions:||Faculty of Science > Statistics|
|Library of Congress Subject Headings (LCSH):||Arabidopsis thaliana -- Genetics -- Mathematical models, Time-series analysis, Cluster analysis, Circadian rhythms -- Mathematical models|
|Series Name:||Working papers|
|Publisher:||University of Warwick. Centre for Research in Statistical Methodology|
|Place of Publication:||Coventry|
|Number of Pages:||37|
|Status:||Not Peer Reviewed|
|Access rights to Published version:||Open Access|
|Funder:||Engineering and Physical Sciences Research Council (EPSRC), Biotechnology and Biological Sciences Research Council (Great Britain) (BBSRC), BioSim|
|Grant number:||G19886 (BBSRC)|