University of Warwick
Publications service & WRAP

Cross-validation prior choice in Bayesian probit regression with many covariates

Lamnisos, Demetris, Griffin, Jim E. and Steel, Mark F. J. (2012) Cross-validation prior choice in Bayesian probit regression with many covariates. Statistics and Computing, Vol.22 (No.2). pp. 359-373. ISSN 0960-3174

PDF: WRAP_Steel_150911-cvpriorchoice_rev.pdf - Accepted Version (542Kb)

Official URL: http://dx.doi.org/10.1007/s11222-011-9228-1

Abstract

This paper examines prior choice in probit regression through a predictive cross-validation criterion. In particular, we focus on situations where the number of potential covariates is far larger than the number of observations, such as in gene expression data. Cross-validation avoids the tendency of such models to fit perfectly. We choose the scale parameter c in the standard variable selection prior as the minimizer of the log predictive score. Naive evaluation of the log predictive score requires substantial computational effort, and we investigate computationally cheaper methods using importance sampling. We find that K-fold importance densities perform best, in combination with either mixing over different values of c or with integrating over c through an auxiliary distribution.
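
As a rough illustration of the selection criterion described in the abstract, the sketch below computes a log predictive score for binary responses and picks the value of c that minimizes it. It is not the authors' implementation: the function names, the grid of c values, and the held-out predictive probabilities are all hypothetical, and the score is assumed to be the average negative log predictive probability of the held-out labels. In practice the probabilities would come from fitting the probit model on the training folds for each c, which is the expensive step the paper addresses with importance sampling.

```python
import math


def log_predictive_score(y, prob_one):
    """Average negative log predictive probability of the observed labels.

    y        : list of 0/1 held-out responses
    prob_one : predictive probabilities P(y_i = 1 | training data), same length
    (Assumed form of the score; lower values mean better predictions.)
    """
    total = 0.0
    for yi, pi in zip(y, prob_one):
        # Probability the model assigned to the label that was actually observed.
        total += math.log(pi if yi == 1 else 1.0 - pi)
    return -total / len(y)


def choose_scale(y, prob_by_c):
    """Return the c with the smallest log predictive score, plus all scores."""
    scores = {c: log_predictive_score(y, p) for c, p in prob_by_c.items()}
    best_c = min(scores, key=scores.get)
    return best_c, scores


# Hypothetical held-out labels and predictive probabilities for three values of c.
y_held_out = [1, 0, 1, 1, 0]
prob_by_c = {
    1.0: [0.6, 0.4, 0.6, 0.6, 0.4],          # cautious predictions
    10.0: [0.8, 0.2, 0.7, 0.9, 0.3],         # sharper and well calibrated
    100.0: [0.99, 0.01, 0.99, 0.40, 0.90],   # overconfident, two poor calls
}

best_c, scores = choose_scale(y_held_out, prob_by_c)
print(best_c, scores)
```

With these made-up numbers the middle value of c is selected: the most diffuse setting is penalized heavily for its confident mistakes, which mirrors the abstract's point that scoring held-out data guards against models that fit the training data perfectly.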

Item Type: Journal Article
Subjects: Q Science > QA Mathematics
Divisions: Faculty of Science > Statistics
Library of Congress Subject Headings (LCSH): Probits, Bayesian statistical decision theory, Regression analysis, Gene expression -- Data processing
Journal or Publication Title: Statistics and Computing
Publisher: Springer
ISSN: 0960-3174
Date: March 2012
Volume: Vol.22
Number: No.2
Page Range: pp. 359-373
Identification Number: 10.1007/s11222-011-9228-1
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
URI: http://wrap.warwick.ac.uk/id/eprint/37681
