Posterior asymptotics for boosted Hierarchical Dirichlet Process mixtures
Catalano, Marta, De Blasi, Pierpaolo, Lijoi, Antonio and Prünster, Igor (2022) Posterior asymptotics for boosted Hierarchical Dirichlet Process mixtures. Journal of Machine Learning Research, 23 (80). pp. 1-23. ISSN 1532-4435.
PDF: WRAP-posterior-asymptotics-boosted-HDP-mixtures-Catalano-2022.pdf (Published Version, 368Kb). Available under License Creative Commons Attribution 4.0.
Official URL: http://jmlr.org/papers/v23/20-1474.html
Abstract
Bayesian hierarchical models are powerful tools for learning common latent features across multiple data sources. The Hierarchical Dirichlet Process (HDP) is invoked when the number of latent components is a priori unknown. While there is a rich literature on finite sample properties and performance of hierarchical processes, the analysis of their frequentist posterior asymptotic properties is still at an early stage. Here we establish theoretical guarantees for recovering the true data generating process when the data are modeled as mixtures over the HDP or a generalization of the HDP, which we term boosted because of the faster growth in the number of discovered latent features. By extending Schwartz's theory to partially exchangeable sequences we show that posterior contraction rates are crucially affected by the relationship between the sample sizes corresponding to the different groups. The effect varies according to the smoothness level of the true data distributions. In the supersmooth case, when the generating densities are Gaussian mixtures, we recover the parametric rate up to a logarithmic factor, provided that the sample sizes are related in a polynomial fashion. Under ordinary smoothness assumptions more caution is needed as a polynomial deviation in the sample sizes could drastically deteriorate the convergence to the truth.
Item Type: | Journal Article
---|---
Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: | Faculty of Science, Engineering and Medicine > Science > Statistics
Library of Congress Subject Headings (LCSH): | Machine learning, Bayesian statistical decision theory -- Data processing, Estimation theory -- Computer programs, Nonparametric statistics
Journal or Publication Title: | Journal of Machine Learning Research
Publisher: | JMLR
ISSN: | 1532-4435
Official Date: | March 2022
Volume: | 23
Number: | 80
Page Range: | pp. 1-23
Status: | Peer Reviewed
Publication Status: | Published
Access rights to Published version: | Open Access (Creative Commons)
Date of first compliant deposit: | 17 May 2022
Date of first compliant Open Access: | 18 May 2022