The Library

Statistical inference from large-scale genomic data

Yuan, Yinyin (2009) Statistical inference from large-scale genomic data. PhD thesis, University of Warwick.

PDF
WRAP_THESIS_Yuan_2009.pdf - Requires a PDF viewer.
Download (1786Kb)

Official URL: http://webcat.warwick.ac.uk/record=b2260449~S15

Abstract

This thesis explores the potential of statistical inference methodologies in functional genomics. It summarises algorithmic findings in this field and provides step-by-step analytical methodologies for extracting biological knowledge from large-scale genomic data, mainly microarray gene expression time series.
This thesis covers a range of topics in the investigation of complex multivariate genomic data. One focus is clustering as a method of inference; another is cluster validation, used to extract meaningful biological information from the data. Information gained from these techniques can then be used jointly to elucidate gene regulatory networks, the ultimate goal of this type of analysis. First, a new tight clustering method for gene expression data is proposed to obtain tighter and potentially more informative gene clusters. Next, to make full use of biological knowledge in cluster validation, a validity index is defined based on Gene Ontology, one of the most important ontologies in the bioinformatics community. The method bridges a gap in the current literature: it takes into account not only how Gene Ontology categories vary in biological specificity and in their significance to the gene clusters, but also the complex structure of the Gene Ontology itself. Finally, Bayesian probability is applied to inference from heterogeneous genomic data, integrated with the earlier contributions of this thesis, with the aim of large-scale gene network inference. The proposed system incorporates a stochastic process to achieve robustness to noise, yet remains efficient enough for large-scale analysis.
Ultimately, the solutions presented in this thesis serve as building blocks of an intelligent system for interpreting large-scale genomic data and understanding the functional organisation of the genome.

Item Type:

Thesis (PhD)

Subjects:

Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software

Library of Congress Subject Headings (LCSH):

Genomics -- Data processing, Genomics -- Statistical methods, Mathematical statistics -- Data processing, Bioinformatics -- Methodology, Ontology

Official Date:

March 2009

Dates:

Date	Event
March 2009	Submitted

Institution:

University of Warwick

Theses Department:

Department of Computer Science

Thesis Type:

PhD

Publication Status:

Unpublished

Supervisor(s)/Advisor:

Li, Chang-Tsun ; Wilson, Roland, 1949-

Format of File:

pdf

Extent:

205 leaves : col. ill., charts

Language:

eng

University of Warwick
Publications service & WRAP