University of Warwick
Publications service & WRAP

Guided conjugate Bayesian clustering for uncovering rhythmically expressed genes

Anderson, Paul E., Smith, J. Q., 1953-, Edwards, Kieron D. and Millar, A. J. (Andrew J.) (2006) Guided conjugate Bayesian clustering for uncovering rhythmically expressed genes. Working Paper. University of Warwick. Centre for Research in Statistical Methodology, Coventry.

PDF: WRAP_Anderson_06-07w.pdf - Published Version (9Mb)
Official URL: http://www2.warwick.ac.uk/fac/sci/statistics/crism...

Abstract

Background: An increasing number of microarray experiments produce time series of expression levels for many genes. Some recent clustering algorithms respect the time ordering of the data and are, importantly, extremely fast. This paper develops such an algorithm for a microarray data set consisting of 22,810 genes of the plant Arabidopsis thaliana measured at 13 time points over two days. Circadian rhythms control the timing of various physiological and metabolic processes and are regulated by genes acting in feedback loops. The aim is to cluster and classify the expression profiles in order to identify genes potentially involved in, and regulated by, the circadian clock.

Results: A greedy search over time series of expression levels (in which series are compared pairwise, the two most similar are placed in the same cluster, and so on) gives a fast result but explores only a very limited number of the possible partitions of the profiles. We propose an improved, deterministic method based on a multi-step application of a conjugate Bayesian clustering algorithm. It allows the entire space of partitions to be searched more fully and intelligently. The values of the summary statistics are used not only to score clusters of genes but also to guide the search of the vast partition space. By following this procedure, we are able to cluster genes that are known to be rhythmically expressed with genes of previously unknown function, thus suggesting potentially interesting targets for future experiments.
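
The baseline greedy search mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not implement their multi-step guided method. It only shows how a conjugate model lets each cluster be scored in closed form from a few summary statistics, and how a greedy pairwise merge uses those scores. Everything specific here is an assumption made for illustration: the 24-hour Fourier basis, the Normal-Inverse-Gamma prior with hyperparameters `v`, `a` and `b`, the stopping rule, and the function names.

```python
import numpy as np
from math import lgamma, log, pi


def fourier_design(times, period=24.0):
    """Design matrix with intercept, cosine and sine columns (assumed basis)."""
    t = np.asarray(times, dtype=float)
    w = 2.0 * np.pi * t / period
    return np.column_stack([np.ones_like(t), np.cos(w), np.sin(w)])


def log_marginal(n, sum_y, sum_yy, X, v=10.0, a=1.0, b=1.0):
    """Log marginal likelihood of a cluster of n profiles sharing one curve.

    Assumed model: y_i = X beta + noise, beta | s2 ~ N(0, s2 * v * I),
    s2 ~ Inverse-Gamma(a, b). Only the summary statistics n, sum of the
    profiles and sum of squared values are needed.
    """
    T, p = X.shape
    N = n * T
    prec_prior = np.eye(p) / v
    prec_post = prec_prior + n * (X.T @ X)
    xty = X.T @ sum_y
    m_post = np.linalg.solve(prec_post, xty)
    a_post = a + 0.5 * N
    b_post = b + 0.5 * (sum_yy - xty @ m_post)
    _, logdet_prior = np.linalg.slogdet(prec_prior)
    _, logdet_post = np.linalg.slogdet(prec_post)
    return (-0.5 * N * log(2.0 * pi)
            + 0.5 * (logdet_prior - logdet_post)
            + a * log(b) - a_post * log(b_post)
            + lgamma(a_post) - lgamma(a))


def _make_cluster(members, n, sum_y, sum_yy, X):
    score = log_marginal(n, sum_y, sum_yy, X)
    return {"members": members, "n": n, "sum_y": sum_y,
            "sum_yy": sum_yy, "score": score}


def greedy_cluster(Y, X):
    """Greedy agglomeration: repeatedly merge the pair with the largest gain.

    Stops when no merge increases the total log marginal likelihood
    (an assumed stopping rule). Returns sorted lists of row indices of Y.
    """
    clusters = [_make_cluster([i], 1, y.copy(), float(y @ y), X)
                for i, y in enumerate(np.asarray(Y, dtype=float))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci, cj = clusters[i], clusters[j]
                # Merging clusters only requires adding their statistics.
                merged = _make_cluster(ci["members"] + cj["members"],
                                       ci["n"] + cj["n"],
                                       ci["sum_y"] + cj["sum_y"],
                                       ci["sum_yy"] + cj["sum_yy"], X)
                gain = merged["score"] - ci["score"] - cj["score"]
                if best is None or gain > best[0]:
                    best = (gain, i, j, merged)
        gain, i, j, merged = best
        if gain <= 0:
            break
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c["members"]) for c in clusters)
```

As a usage example, one could build `X = fourier_design(np.arange(13) * 4.0)` for 13 samples at an assumed 4-hour spacing and pass a genes-by-time array `Y` to `greedy_cluster(Y, X)`. Because every pair is rescored at every step, this naive version is only practical for small gene sets, and it still visits only one path through the partition space, which is the limitation the abstract says the proposed method addresses.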

Item Type: Working or Discussion Paper (Working Paper)
Subjects: Q Science > QA Mathematics
Divisions: Faculty of Science > Statistics
Library of Congress Subject Headings (LCSH): Arabidopsis thaliana -- Genetics -- Mathematical models, Time-series analysis, Cluster analysis, Circadian rhythms -- Mathematical models
Series Name: Working papers
Publisher: University of Warwick. Centre for Research in Statistical Methodology
Place of Publication: Coventry
Date: 2006
Volume: Vol.2006
Number: No.7
Number of Pages: 37
Status: Not Peer Reviewed
Access rights to Published version: Open Access
Funder: Engineering and Physical Sciences Research Council (EPSRC), Biotechnology and Biological Sciences Research Council (Great Britain) (BBSRC), BioSim
Grant number: G19886 (BBSRC)
URI: http://wrap.warwick.ac.uk/id/eprint/35566
