Searching a multivariate partition space using weighted MAX-SAT
Liverani, Silvia, Cussens, James and Smith, J. Q., 1953- (2009) Searching a multivariate partition space using weighted MAX-SAT. UNSPECIFIED. Coventry: University of Warwick. Centre for Research in Statistical Methodology. (Working papers).
WRAP_liverani_09-18w.pdf - Published Version - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Official URL: http://www2.warwick.ac.uk/fac/sci/statistics/crism...
Because of the huge number of partitions of even a moderately sized dataset, even when Bayes factors have a closed form, a comprehensive search for the highest scoring (MAP) partition is usually impossible. Therefore, both deterministic or random search algorithms traverse only a small proportion of partitions of a large dataset. The main contribution of this paper is to encode the formal Bayes factor search on partitions as a weighted MAX-SAT problem and use well-known solvers for that problem to search for partitions. We demonstrate how, with the appropriate priors over the partition space, this method can be used to fully search the space of partitions in smaller problems and how it can be used to enhance the performance of more familiar algorithms in large problems. We illustrate our method on clustering of time-course microarray experiments.
|Item Type:||Working or Discussion Paper (UNSPECIFIED)|
|Subjects:||Q Science > QA Mathematics|
|Divisions:||Faculty of Science > Statistics|
|Library of Congress Subject Headings (LCSH):||Partitions (Mathematics), Bayesian statistical decision theory|
|Series Name:||Working papers|
|Publisher:||University of Warwick. Centre for Research in Statistical Methodology|
|Place of Publication:||Coventry|
|Number of Pages:||8|
|Status:||Not Peer Reviewed|
|Access rights to Published version:||Open Access|
|References:||Anderson, P. E., J. Q. Smith, K. D. Edwards, and A. J. Millar (2006). Guided conjugate Bayesian clustering for uncovering rhytmically expressed genes. CRiSM Working Paper (07). Ben-Dor, A., R. Shamir, and Z. Yakhini (1999). Clustering gene expression patterns. Journal of Computational Biology 6(3–4), 281–297. Booth, J. G., G. Casella, and J. P. Hobert (2008). Clustering using objective functions and stochastic search. Journal of the Royal Statistical Society, Series B 70(1), 119–139. Chipman, H. A., E. I. George, and R. E. McCulloch (2002). Bayesian treed models. Machine Learning 48(1–3), 299–320. Crowley, E. M. (1997). Product partition models for normal means. Journal of the American Statistical Association 92(437), 192–198. Cussens, J. (2008). Bayesian network learning by compiling to weighted MAX-SAT. In Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI 2008). Denison, D. G. T., C. C. Holmes, B. K. Mallick, and A. F. M. Smith (2002). Bayesian Methods for Nonlinear Classification and Regression. Wiley Series in Probability and Statistics. John Wiley and Sons. Edwards, K. D., P. E. Anderson, A. Hall, N. S. Salathia, J. C.W. Locke, J. R. Lynn, M. Straume, J. Q. Smith, and A. J. Millar (2006). FLOWERING LOCUS C Mediates Natural Variation in the High-Temperature Response of the Arabidopsis Circadian Clock. The Plant Cell 18, 639–650. Heard, N. A., C. C. Holmes, and D. A. Stephens (2006). A quantitative study of gene regulation involved in the immune response of anopheline mosquitoes: An application of Bayesian hierarchical clustering of curves. Journal of the American Statistical Association 101(473), 18–29. Lau, J. W. and P. J. Green (2007). Bayesian Model-Based Clustering Procedures. Journal of Computational and Graphical Statistics 16(3), 526. Liverani, S., P. E. Anderson, K. D. Edwards, A. J. Millar, and J. Q. Smith (2008). Efficient Utility-based Clustering over High Dimensional Partition Spaces. CRiSM Research Report (22). McCullagh, P. and J. Yang (2006). Stochastic classification models. In Proceedings International Congress of Mathematicians, Volume III, pp. 669–686. O’Hagan, A. and J. Forster (2004). Bayesian Inference: Kendall’s Advanced Theory of Statistics (Second ed.). Arnold. Quintana, F. A. and P. L. Iglesias (2003). Bayesian clustering and product partition models. Journal of the Royal Statistical Society Series B 65(2), 557–574. Ray, S. and B. Mallick (2006). Functional clustering by Bayesian wavelet methods. J. Royal Statist. Soc.: Series B 68(2), 305–332. Smith, J. Q., P. E. Anderson, and S. Liverani (2008). Separation measures and the geometry of Bayes factor selection for classification. Journal of the Royal Statistical Society, Series B 70(5), 957–980. Tompkins, D. A. D. and H. H. Hoos (2005). UBCSAT: An implementation and experimentation environment for SLS algorithms for SAT and MAX-SAT. In H. H. Hoos and D. G. Mitchell (Eds.), Theory and Applications of Satisfiability Testing: Revised Selected Papers of the Seventh International Conference (SAT 2004, Vancouver, BC, Canada, May 10–13, 2004), Volume 3542 of Lecture Notes in Computer Science, Berlin, Germany, pp. 306–320. Springer Verlag.|
Actions (login required)