Guided conjugate Bayesian clustering for uncovering rhythmically expressed genes
Anderson, Paul E., Smith, J. Q., 1953-, Edwards, Kieron D. and Millar, A. J. (Andrew J.) (2006) Guided conjugate Bayesian clustering for uncovering rhythmically expressed genes. Working Paper. University of Warwick. Centre for Research in Statistical Methodology, Coventry.
Official URL: http://www2.warwick.ac.uk/fac/sci/statistics/crism...
Abstract
Background: An increasing number of microarray experiments produce time series of expression levels for many genes. Some recent clustering algorithms respect the time ordering of the data and are, importantly, extremely fast. This paper develops such an algorithm on a microarray data set of 22,810 genes of the plant Arabidopsis thaliana, measured at 13 time points over two days. Circadian rhythms control the timing of various physiological and metabolic processes and are regulated by genes acting in feedback loops. The aim is to cluster and classify the expression profiles in order to identify genes potentially involved in, and regulated by, the circadian clock.

Results: A greedy search over time series of expression levels (in which series are compared pairwise, the two most similar are placed in the same cluster, and so forth) produces a result quickly but explores only a very limited number of the possible partitions of the profiles. We propose an improved, deterministic method based on a multi-step application of a conjugate Bayesian clustering algorithm, which allows the space of partitions to be searched more fully and intelligently. The values of the summary statistics are used not only to score clusters of genes but also to guide the search of the vast partition space. By following this procedure, we are able to cluster genes that are known to be rhythmically expressed with genes of previously unknown function, thus suggesting potentially interesting targets for future experiments.
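
To make the baseline in the abstract concrete, the following is a minimal sketch of a greedy pairwise search scored by a conjugate Bayesian model. It is not the authors' guided multi-step method, and the abstract does not specify any of the details used here: the Fourier basis with a 24-hour period, the Normal-Inverse-Gamma prior and its hyperparameters (`v`, `a`, `b`), the 4-hour sampling grid, and all function names are assumptions chosen for illustration. Each cluster is scored by its log marginal likelihood, which depends on the data only through three summary statistics (profile count, element-wise sum, and total sum of squares).

```python
import math
import numpy as np


def fourier_design(times, period=24.0):
    """Intercept plus one sine/cosine pair (an assumed basis, not from the paper)."""
    w = 2.0 * np.pi * np.asarray(times, dtype=float) / period
    return np.column_stack([np.ones_like(w), np.sin(w), np.cos(w)])


def log_marginal(n, s, q, X, v=10.0, a=1.0, b=1.0):
    """Log marginal likelihood of one cluster under a conjugate regression model.

    n: number of profiles, s: their element-wise sum, q: total sum of squares.
    Assumed prior: beta | sigma2 ~ N(0, sigma2 * v * I), sigma2 ~ InvGamma(a, b).
    """
    T, p = X.shape
    N = n * T                               # total number of observations
    prec0 = np.eye(p) / v                   # prior precision (up to sigma2)
    prec_n = prec0 + n * (X.T @ X)          # posterior precision
    xty = X.T @ s
    m_n = np.linalg.solve(prec_n, xty)      # posterior mean of beta
    a_n = a + 0.5 * N
    b_n = b + 0.5 * (q - xty @ m_n)
    _, logdet0 = np.linalg.slogdet(prec0)
    _, logdet_n = np.linalg.slogdet(prec_n)
    return (-0.5 * N * math.log(2.0 * math.pi)
            + 0.5 * (logdet0 - logdet_n)
            + a * math.log(b) - a_n * math.log(b_n)
            + math.lgamma(a_n) - math.lgamma(a))


def greedy_cluster(Y, X):
    """Repeatedly merge the pair with the largest score gain; stop when none helps."""
    clusters = []
    for i, y in enumerate(Y):
        c = {"members": [i], "n": 1, "s": y.copy(), "q": float(y @ y)}
        c["score"] = log_marginal(c["n"], c["s"], c["q"], X)
        clusters.append(c)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci, cj = clusters[i], clusters[j]
                # Summary statistics of a merged cluster are simple sums.
                n, s, q = ci["n"] + cj["n"], ci["s"] + cj["s"], ci["q"] + cj["q"]
                merged_score = log_marginal(n, s, q, X)
                gain = merged_score - ci["score"] - cj["score"]
                if best is None or gain > best[0]:
                    best = (gain, i, j, n, s, q, merged_score)
        gain, i, j, n, s, q, merged_score = best
        if gain <= 0:
            break
        merged = {"members": clusters[i]["members"] + clusters[j]["members"],
                  "n": n, "s": s, "q": q, "score": merged_score}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c["members"]) for c in clusters]


if __name__ == "__main__":
    # Synthetic profiles only: 13 time points over 48 hours (assumed 4-hour grid).
    times = np.arange(0, 52, 4)
    rng = np.random.default_rng(0)
    phases = [np.sin, np.sin, np.sin, np.cos, np.cos, np.cos]
    Y = np.vstack([f(2 * np.pi * times / 24) + 0.05 * rng.standard_normal(13)
                   for f in phases])
    print(greedy_cluster(Y, fourier_design(times)))
```

Because every merge is final, this kind of search visits only one path through the partition space, which is the limitation the abstract describes. Scoring each candidate merge needs only the summed statistics, so it stays cheap even though the pairwise loop here is quadratic in the number of clusters.
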
| Item Type: | Working or Discussion Paper (Working Paper) |
|---|---|
| Subjects: | Q Science > QA Mathematics |
| Divisions: | Faculty of Science > Statistics |
| Library of Congress Subject Headings (LCSH): | Arabidopsis thaliana -- Genetics -- Mathematical models, Time-series analysis, Cluster analysis, Circadian rhythms -- Mathematical models |
| Series Name: | Working papers |
| Publisher: | University of Warwick. Centre for Research in Statistical Methodology |
| Place of Publication: | Coventry |
| Date: | 2006 |
| Volume: | Vol.2006 |
| Number: | No.7 |
| Number of Pages: | 37 |
| Status: | Not Peer Reviewed |
| Access rights to Published version: | Open Access |
| Funder: | Engineering and Physical Sciences Research Council (EPSRC), Biotechnology and Biological Sciences Research Council (Great Britain) (BBSRC), BioSim |
| Grant number: | G19886 (BBSRC) |
| URI: | http://wrap.warwick.ac.uk/id/eprint/35566 |

