Machine learning techniques for the early detection of cancer using volatile organic compounds
Neal, Matthew (2019) Machine learning techniques for the early detection of cancer using volatile organic compounds. PhD thesis, University of Warwick.
PDF: WRAP_Theses_Neal_2019.pdf (Submitted Version, 30 MB)
Official URL: http://webcat.warwick.ac.uk/record=b3452905~S15
Item Type: | Thesis (PhD) |
---|---|
Subjects: | Q Science > QD Chemistry; Q Science > QP Physiology; R Medicine > RC Internal medicine |
Library of Congress Subject Headings (LCSH): | Cancer -- Early detection, Volatile organic compounds, Ion mobility spectroscopy |
Official Date: | January 2019 |
Institution: | University of Warwick |
Theses Department: | Department of Statistics |
Thesis Type: | PhD |
Publication Status: | Unpublished |
Supervisor(s)/Advisor: | Savage, Richard S. |
Description: | Early cancer detection can change lives. If cancer can be detected early in its development, while a patient is still asymptomatic, it is easier and less expensive to treat. Volatile organic compounds, as measured by field asymmetric ion mobility spectrometry (FAIMS), are a novel class of biomarkers which have shown promise as a low-cost early screening test for a range of cancers. However, FAIMS data is high-dimensional, difficult to interpret, and can be subject to a range of subtle data quality problems. Additionally, in the current literature many researchers use linear methods for analysing FAIMS data. We believe improved results could be achieved by applying modern machine learning techniques to the problem. In this thesis, we investigate using modern machine learning techniques and best practices for FAIMS analysis, and develop corresponding software. We found that FAIMS has a moderate ability to detect cancer in an at-risk population, and that FAIMS could be used for the pre-symptomatic detection of other diseases with excellent results, achieving an AUC of 0.91 for the pre-symptomatic detection of anastomotic leakage after surgical resection to treat cancer. We present a novel Bayesian dimensionality reduction technique, the structured Gaussian process latent variable model (SGPLVM), which extends GPLVM to exploit structured correlations between variables, as are seen in FAIMS data. We also present a stochastic optimization algorithm and a number of extensions based on stochastic gradient descent variants. We explore the properties of SGPLVM, which we found to outperform GPLVM at recovering a latent representation of data which meet the model assumptions. We also demonstrate SGPLVM’s robustness to partially observed data. Finally, we present software packages for GPLVM, SGPLVM, and GP regression with a novel method of specifying and automatically selecting compound kernel functions. |
Extent: | xv, 210 leaves: illustration, charts, maps |
Language: | eng |