Likelihood-free estimation of model evidence
Didelot, Xavier, Everitt, Richard G., Johansen, Adam M. and Lawson, Daniel J. (2010) Likelihood-free estimation of model evidence. Working Paper. Coventry: University of Warwick. Centre for Research in Statistical Methodology. (Working papers).
WRAP_Didelot_10-12w.pdf - Published Version
Official URL: http://www2.warwick.ac.uk/fac/sci/statistics/crism...
Statistical methods of inference typically require the likelihood function to be computable in a reasonable amount of time. The class of "likelihood-free" methods termed Approximate Bayesian Computation (ABC) is able to eliminate this requirement, replacing the evaluation of the likelihood with simulation from it. Likelihood-free methods have gained in efficiency and popularity in the past few years, following their integration with Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) in order to better explore the parameter space. They have been applied primarily to the estimation of the parameters of a given model, but can also be used to compare models. Here we present novel likelihood-free approaches to model comparison, based upon the independent estimation of the evidence of each model under study. Key advantages of these approaches over previous techniques are that they allow the exploitation of MCMC or SMC algorithms for exploring the parameter space, and that they do not require a sampler able to mix between models. We validate the proposed methods using a simple exponential family problem before providing a realistic problem from population genetics: the comparison of different growth models based upon observations of human Y chromosome data from the terminal generation.
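The idea of estimating each model's evidence independently, using simulation in place of likelihood evaluation, can be illustrated with a minimal sketch. This is plain ABC rejection, not the MCMC- or SMC-based estimators the paper proposes, and the toy problem is an assumption chosen so that the exact evidence is known: a single count out of 10 trials, with a hypothetical model M1 (success probability drawn from a uniform prior) and a hypothetical model M2 (success probability fixed at 0.5). Because the data are discrete and matched exactly (tolerance zero), the acceptance rate of prior-predictive simulations is an unbiased estimate of the evidence. All function and variable names below are illustrative.

```python
import math
import random

N_TRIALS = 10  # assumed size of the toy experiment


def abc_evidence(simulate, y_obs, n_sims, rng):
    """Estimate the evidence as the fraction of prior-predictive
    simulations that reproduce the observed data exactly."""
    hits = sum(simulate(rng) == y_obs for _ in range(n_sims))
    return hits / n_sims


def simulate_m1(rng):
    """Hypothetical model M1: theta ~ Uniform(0, 1), y ~ Binomial(N_TRIALS, theta)."""
    theta = rng.random()
    return sum(rng.random() < theta for _ in range(N_TRIALS))


def simulate_m2(rng):
    """Hypothetical model M2: y ~ Binomial(N_TRIALS, 0.5), no free parameter."""
    return sum(rng.random() < 0.5 for _ in range(N_TRIALS))


rng = random.Random(0)
y_obs = 5  # assumed observation

# Each evidence is estimated on its own; no sampler moves between models.
z1 = abc_evidence(simulate_m1, y_obs, 50_000, rng)
z2 = abc_evidence(simulate_m2, y_obs, 50_000, rng)

# Closed-form values for this toy problem, used only as a check.
exact_z1 = 1 / (N_TRIALS + 1)
exact_z2 = math.comb(N_TRIALS, y_obs) / 2 ** N_TRIALS

bayes_factor = z2 / z1  # evidence ratio of M2 to M1
print(f"M1: estimate {z1:.4f}, exact {exact_z1:.4f}")
print(f"M2: estimate {z2:.4f}, exact {exact_z2:.4f}")
print(f"Bayes factor M2/M1: {bayes_factor:.2f}")
```

With continuous data or summary statistics, exact matching is replaced by a tolerance, and the acceptance rate then estimates the evidence only approximately. Rejection from the prior also becomes inefficient when the posterior is concentrated, which is the setting where the MCMC and SMC approaches described in the abstract are intended to help.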
|Item Type:||Working or Discussion Paper (Working Paper)|
|Subjects:||Q Science > QA Mathematics|
|Divisions:||Faculty of Science > Statistics|
|Library of Congress Subject Headings (LCSH):||Mathematical statistics, Bayesian statistical decision theory, Mathematical models|
|Series Name:||Working papers|
|Publisher:||University of Warwick. Centre for Research in Statistical Methodology|
|Place of Publication:||Coventry|
|Number of Pages:||35|
|Status:||Not Peer Reviewed|
|Access rights to Published version:||Open Access|
|Version or Related Resource:||Didelot, X., et al. (2011). Likelihood-free estimation of model evidence. Bayesian Analysis, 6(1), pp. 49-76. http://wrap.warwick.ac.uk/id/eprint/41188|