
Particle Monte Carlo methods for the integrative cluster analysis of multiple genomic datasets
Cunningham, Nathan (2019) Particle Monte Carlo methods for the integrative cluster analysis of multiple genomic datasets. PhD thesis, University of Warwick.
PDF: WRAP_Theses_Cunningham_2019.pdf - Submitted Version (5Mb). Embargoed item: access restricted to Repository staff only until 16 June 2022. Contact the author directly, specifying your needs.
Official URL: http://webcat.warwick.ac.uk/record=b3492837~S15
Abstract
In the forthcoming era of genomic medicine, high-throughput data such as gene expression, DNA methylation, and copy number alterations will be routinely measured for large numbers of people, and used as an input in deciding on their clinical care. These data provide different, and often complementary, views of the underlying biological mechanisms. A common research aim is to infer risk cohorts from patients using such data. However, standard approaches to cluster analysis are not equipped to model such heterogeneous data.
This thesis presents ParticleMDI, a novel, nonparametric Bayesian approach for performing integrative cluster analysis. ParticleMDI builds upon the multiple dataset integration (MDI) framework of Kirk et al. (2012), in which cluster allocations are updated one-at-a-time using a Gibbs sampler. Such methods are known to potentially exhibit slow mixing of the MCMC chain, so our approach uses a particle Gibbs sampler to update the cluster allocations jointly. The model can accommodate a wide range of data types and facilitates sharing of information between datasets via a reweighting of the particle system.
Several novel techniques are presented to ease the computational burden of ParticleMDI. One approach is the development of a block-updating particle Gibbs sampler which updates cluster allocations conditional on a fixed subset of allocations from a previous scan. The other approach aims to minimise the evaluation of redundant calculations inherent in particle filter methods. Greater than an order-of-magnitude decrease in computation time over a standard implementation is demonstrated with no impact on the MCMC chain. These techniques are implemented in 'ParticleMDI.jl', a Julia package for practitioners to apply the algorithm to their own data.
ParticleMDI is evaluated on a number of synthetic and real datasets. In the case of the real datasets, the ability of the algorithm to identify clinically meaningful subgroups of cancer patients is demonstrated, as evidenced by significant differences in survival outcomes for the identified clusters.
Item Type: | Thesis or Dissertation (PhD)
---|---
Subjects: | Q Science > QA Mathematics; Q Science > QH Natural history > QH301 Biology
Library of Congress Subject Headings (LCSH): | Monte Carlo method, Cluster analysis, Genomics -- Statistical methods, Data sets -- Statistical methods
Official Date: | September 2019
Dates: |
Institution: | University of Warwick
Theses Department: | Department of Statistics
Thesis Type: | PhD
Publication Status: | Unpublished
Supervisor(s)/Advisor: | Wild, David L. ; Griffin, Jim E.
Sponsors: | Engineering and Physical Sciences Research Council
Format of File: |
Extent: | xx, 168 leaves : illustrations (some colour)
Language: | eng