
Getting the most from medical VOC data using Bayesian feature learning
Skinner, J. R. (2019) Getting the most from medical VOC data using Bayesian feature learning. PhD thesis, University of Warwick.
PDF: WRAP_Theses_Skinner_2019.pdf - Submitted Version (22Mb)
Official URL: http://webcat.warwick.ac.uk/record=b3441592~S15
Abstract
The metabolic processes in the body naturally produce a diverse set of Volatile Organic Compounds (VOCs), which are excreted in breath, urine, stool and other biological samples. The VOCs produced are odorous and influenced by disease, meaning olfaction can provide information on a person’s disease state.
A variety of instruments exist for performing “artificial olfaction”: measuring a sample, such as patient breath, and producing a high dimensional output representing the odour. Such instruments may be paired with machine learning techniques to identify properties of interest, such as the presence of a given disease. Research shows good disease-predictive ability of artificial olfaction instrumentation. However, the statistical methods employed are typically off-the-shelf, and do not take advantage of prior knowledge of the structure of the high dimensional data. Since sample sizes are also typically small, this can lead to suboptimal results due to a poorly-learned model.
In this thesis we explore ways to get more out of artificial olfaction data. We perform statistical analyses in a medical setting, investigating disease diagnosis from breath, urine and vaginal swab measurements, and illustrating both successful identification and failure cases. We then introduce two new latent variable models constructed for dimension reduction of artificial olfaction data, but which are widely applicable. These models place a Gaussian Process (GP) prior on the mapping from latent variables to observations. Specifying a covariance function for the GP prior is an intuitive way for a user to describe their prior knowledge of the data covariance structure. We also enable an approximate posterior and marginal likelihood to be computed, and introduce a sparse variant. Both models have been made available in the R package stpca hosted at https://github.com/JimSkinner/stpca. In experiments with artificial olfaction data, these models outperform standard feature learning methods in a predictive pipeline.
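The idea of encoding prior knowledge through a covariance function can be illustrated with a minimal sketch. It is not the stpca API and not the thesis's actual model: it assumes a linear mapping whose loading vectors are drawn from a zero-mean GP with a squared-exponential covariance over neighbouring instrument outputs. The function name `se_covariance`, the dimensions, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def se_covariance(locations, variance=1.0, lengthscale=5.0):
    """Squared-exponential covariance between 1-D feature locations.

    Nearby features get high prior covariance, distant ones get low covariance.
    """
    diffs = locations[:, None] - locations[None, :]
    return variance * np.exp(-0.5 * (diffs / lengthscale) ** 2)


rng = np.random.default_rng(0)

n_features = 50  # hypothetical: points along one instrument output axis
n_latent = 2     # hypothetical latent dimension
n_samples = 20   # small sample size, as the abstract says is typical

locations = np.arange(n_features, dtype=float)
K = se_covariance(locations)

# Draw each loading vector (column of W) from GP(0, K).
# A small jitter on the diagonal keeps K numerically positive definite.
jitter = 1e-6 * np.eye(n_features)
W = rng.multivariate_normal(np.zeros(n_features), K + jitter, size=n_latent).T

# Generate synthetic observations: latent variables mapped through W, plus noise.
Z = rng.standard_normal((n_samples, n_latent))
noise_sd = 0.1
X = Z @ W.T + noise_sd * rng.standard_normal((n_samples, n_features))
```

Under these assumptions, a longer `lengthscale` expresses a belief that neighbouring outputs vary together more strongly, so the sampled loading vectors become smoother.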
Item Type: | Thesis (PhD)
---|---
Subjects: | B Philosophy. Psychology. Religion > BF Psychology; Q Science > Q Science (General)
Library of Congress Subject Headings (LCSH): | Metabolism, Bayesian statistical decision theory, Volatile organic compounds, Gaussian processes
Official Date: | July 2019
Dates: |
Institution: | University of Warwick
Theses Department: | Centre for Complexity Science
Thesis Type: | PhD
Publication Status: | Unpublished
Supervisor(s)/Advisor: | Savage, Richard S.; Covington, James A.
Format of File: |
Extent: | xv, 215 leaves: illustrations, charts
Language: | eng