
The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data
Bierkens, Joris, Fearnhead, Paul and Roberts, Gareth O. (2019) The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data. Annals of statistics, 47 (3). pp. 1288-1320. doi:10.1214/18-AOS1715 ISSN 0090-5364.
Official URL: https://doi.org/10.1214/18-AOS1715
Abstract
Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multi-dimensional version of the Zig-Zag process of Bierkens and Roberts (2017), a continuous time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible non-reversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, i.e. the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial pre-processing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
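The claim that the process can be simulated without discretisation error can be illustrated in the simplest possible setting. The sketch below is an assumption-laden toy, not the authors' code: it takes a one-dimensional standard Gaussian target (chosen here purely for illustration), for which the potential is U(x) = x^2/2 and the canonical switching rate max(0, theta * U'(x)) can be integrated in closed form, so event times are drawn exactly by inversion. The function names `zigzag_gaussian` and `time_averages` are hypothetical, and the sub-sampling and control-variate variants described in the abstract are not shown.

```python
import math
import random


def zigzag_gaussian(total_time, x0=0.0, theta0=1, seed=0):
    """Simulate a 1D Zig-Zag process targeting N(0, 1) up to total_time.

    Assumption: Gaussian target, so the rate along a trajectory is
    max(0, theta * x + s) after s time units and can be inverted exactly.
    Returns the skeleton as a list of (time, position, velocity) tuples.
    """
    rng = random.Random(seed)
    t, x, theta = 0.0, x0, theta0
    skeleton = [(t, x, theta)]
    while True:
        a = theta * x
        e = rng.expovariate(1.0)
        # Solve integral_0^tau max(0, a + s) ds = e for tau.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        if t + tau >= total_time:
            # No switch before the horizon: move to the end and stop.
            x += theta * (total_time - t)
            skeleton.append((total_time, x, theta))
            return skeleton
        t += tau
        x += theta * tau  # deterministic motion at constant velocity
        theta = -theta    # velocity flips at each event
        skeleton.append((t, x, theta))


def time_averages(skeleton):
    """Exact time averages of x and x^2 along the piecewise-linear path."""
    total = skeleton[-1][0] - skeleton[0][0]
    m1 = m2 = 0.0
    for (t0, x0, _), (t1, x1, _) in zip(skeleton, skeleton[1:]):
        dt = t1 - t0
        m1 += dt * (x0 + x1) / 2.0
        m2 += dt * (x0 * x0 + x0 * x1 + x1 * x1) / 3.0
    return m1 / total, m2 / total


if __name__ == "__main__":
    path = zigzag_gaussian(20000.0, seed=1)
    mean, second_moment = time_averages(path)
    print(len(path), mean, second_moment)
```

Expectations under the target are estimated by integrating along the continuous path rather than by averaging the event points, since the positions at switching times are not themselves distributed according to the target.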
Item Type: | Journal Article
---|---
Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: | Faculty of Science, Engineering and Medicine > Science > Statistics
Library of Congress Subject Headings (LCSH): | Big data, Bayesian statistical decision theory
Journal or Publication Title: | Annals of statistics
Publisher: | Inst Mathematical Statistics
ISSN: | 0090-5364
Official Date: | 13 February 2019
Volume: | 47
Number: | 3
Page Range: | pp. 1288-1320
DOI: | 10.1214/18-AOS1715
Status: | Peer Reviewed
Publication Status: | Published
Reuse Statement (publisher, data, author rights): | "The right to place the final version of this article (exactly as published in the journal) on their own homepage or in a public digital repository, provided there is a link to the official journal site."
Access rights to Published version: | Restricted or Subscription Access
Date of first compliant deposit: | 24 May 2018
Date of first compliant Open Access: | 18 April 2019