University of Warwick
Publications service & WRAP

Efficient utility-based clustering over high dimensional partition spaces

Liverani, Silvia, Anderson, Paul E., Edwards, Kieron D., Millar, A. J. (Andrew J.) and Smith, J. Q., 1953-. (2009) Efficient utility-based clustering over high dimensional partition spaces. Bayesian analysis, Vol.4 (No.3). pp. 539-572. ISSN 1931-6690

PDF: WRAP_Liverani_Efficient_utility_based.pdf (Published Version, 6 MB)
Official URL: http://dx.doi.org/10.1214/09-BA420

Abstract

Because even a moderately sized dataset admits a huge number of partitions, a comprehensive search for the highest-scoring (MAP) partition in model-based clustering is usually impossible, even when Bayes factors have a closed form. However, when each cluster in a partition has a signature, and it is known that some signatures are of scientific interest whilst others are not, it is possible, within a Bayesian framework, to develop search algorithms that are guided by these cluster signatures. Such algorithms can be expected to find better partitions more quickly. In this paper we develop a framework within which these ideas can be formalized. We then briefly illustrate the efficacy of the proposed guided search on a microarray time course data set, where the clustering objective is to identify clusters of genes with different types of circadian expression profiles.
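
To give a sense of scale for the "huge number of partitions" mentioned in the abstract: the number of ways to partition n items into non-empty clusters is the Bell number B(n). The short Python sketch below is illustrative only and is not taken from the paper; the function name `bell_number` is a hypothetical helper. It computes B(n) with the Bell triangle recurrence.

```python
def bell_number(n: int) -> int:
    """Return the Bell number B(n): the number of partitions of an n-element set.

    Illustrative helper (not from the paper). Uses the Bell triangle:
    each new row starts with the last entry of the previous row, and every
    further entry is the sum of its left neighbour and the entry above that
    neighbour. The first entry of row n is B(n).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    row = [1]  # row 0 of the Bell triangle, so row[0] == B(0)
    for _ in range(n):
        new_row = [row[-1]]  # next row begins with the previous row's last entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    # Print how quickly the partition space grows with the number of items.
    for n in (5, 10, 20, 50):
        print(f"B({n}) = {bell_number(n)}")
```

Even for a few dozen items the count is far beyond what can be enumerated, which is why the abstract describes a comprehensive search for the MAP partition as usually impossible and motivates a guided search instead.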

Item Type: Journal Article
Subjects: Q Science > QA Mathematics
Divisions: Faculty of Science > Statistics
Library of Congress Subject Headings (LCSH): Cluster analysis, Partitions (Mathematics), Genes -- Mathematical models
Journal or Publication Title: Bayesian analysis
Publisher: International Society for Bayesian Analysis
ISSN: 1931-6690
Date: 2009
Volume: 4
Number: 3
Number of Pages: 34
Page Range: pp. 539-572
Identification Number: 10.1214/09-BA420
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access
Funder: Engineering and Physical Sciences Research Council (EPSRC), Biotechnology and Biological Sciences Research Council (Great Britain) (BBSRC), France. Agence nationale de la recherche (ANR)
Grant number: G19886 (BBSRC), BBF0054661 (ANR/BBSRC), BB/D019621/1 (EPSRC)
URI: http://wrap.warwick.ac.uk/id/eprint/16603
