Reconstructing regulatory networks from high-throughput post-genomic data using MCMC methods
Sharma, Sapna (2013) Reconstructing regulatory networks from high-throughput post-genomic data using MCMC methods. PhD thesis, University of Warwick.
Official URL: http://webcat.warwick.ac.uk/record=b2691616~S1
Abstract
Modern biological research aims to understand when genes are expressed and
how certain genes influence the expression of other genes. Gene regulatory
networks are used to organize and visualize gene expression activity. The architecture
of these networks holds great importance, as they enable us to identify inconsistencies
between hypotheses and observations, and to predict the behavior of biological
processes in yet untested conditions.
Data from gene expression measurements are used to construct gene regulatory
networks. Along with the advance of high-throughput technologies for measuring
gene expression, statistical methods to predict regulatory networks have also
been evolving. This thesis presents a computational framework based on a Bayesian
modeling technique using state space models (SSM) for the inference of gene regulatory
networks from time-series measurements.
A linear SSM consists of observation and hidden state equations. The hidden
variables can unfold effects that cannot be directly measured in an experiment, such
as missing gene expression. We have used a Bayesian MCMC approach based on
Gibbs sampling for the inference of parameters. However, the task of determining
the dimension of the hidden state space variables remains crucial for the accuracy
of network inference. For this we have used the Bayesian evidence (or marginal
likelihood) as a yardstick. In addition, the Bayesian approach also provides the
possibility of incorporating prior information, based on literature knowledge.
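
As an illustration only, the two equations of a linear SSM can be sketched as a small simulation. This is not the GBSSM implementation from the thesis: the simplest form is assumed here (hidden state x_t = A x_{t-1} + w_t, observation y_t = C x_t + v_t, with Gaussian noise), and the matrices `A` and `C`, the noise levels, and all function names are hypothetical choices made for the example.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate_linear_ssm(A, C, x0, n_steps, state_sd=0.1, obs_sd=0.1, seed=0):
    """Simulate a linear state space model (illustrative assumption only).

    Hidden state equation:  x_t = A x_{t-1} + w_t,  w_t ~ N(0, state_sd^2)
    Observation equation:   y_t = C x_t + v_t,      v_t ~ N(0, obs_sd^2)
    """
    rng = random.Random(seed)
    x = list(x0)
    states, observations = [], []
    for _ in range(n_steps):
        # Propagate the hidden state and add state noise.
        x = [xi + rng.gauss(0.0, state_sd) for xi in matvec(A, x)]
        # Map the hidden state to observed expression levels and add noise.
        y = [yi + rng.gauss(0.0, obs_sd) for yi in matvec(C, x)]
        states.append(x)
        observations.append(y)
    return states, observations


# Hypothetical example: 2 hidden variables driving 3 observed genes.
A = [[0.9, 0.0], [0.2, 0.7]]
C = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
states, observations = simulate_linear_ssm(A, C, x0=[1.0, 0.0], n_steps=10)
```

In this sketch only `observations` would be available to an inference procedure; the hidden `states`, the matrices, and the number of hidden variables (here fixed at 2) are the quantities that would have to be inferred.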
We compare marginal likelihoods calculated from the Gibbs sampler output
to the lower bound calculated by a variational approximation. Before using the
algorithm for the analysis of real biological experimental datasets we perform validation
tests using numerical experiments based on simulated time series datasets
generated by in-silico networks. The robustness of our algorithm can be measured
by its ability to recapture the input data and the generating networks using the inferred
parameters.
Our developed algorithm, GBSSM, was used to infer a gene network using
E. coli data sets from the different stress conditions of temperature shift and acid
stress. The resulting model for the gene expression response under temperature shift
captures the effects of global transcription factors, such as fnr, that control the regulation
of hundreds of other genes. Interestingly, we also observe the stress-inducible membrane protein OsmC regulating transcriptional activity involved in the adaptation
mechanism under both temperature shift and acid stress conditions. In the case
of acid stress, integration of metabolomic and transcriptome data suggests that the
observed rapid decrease in the concentration of glycine betaine is the result of the
activation of osmoregulators which may play a key role in acid stress adaptation.
Item Type: | Thesis (PhD) |
---|---|
Subjects: | Q Science > QA Mathematics; Q Science > QH Natural history > QH426 Genetics |
Library of Congress Subject Headings (LCSH): | Gene regulatory networks, Gene expression, State-space methods, Bayesian statistical decision theory |
Official Date: | April 2013 |
Institution: | University of Warwick |
Theses Department: | Systems Biology Doctoral Training Centre |
Thesis Type: | PhD |
Publication Status: | Unpublished |
Supervisor(s)/Advisor: | Wild, David L. |
Extent: | xxiii, 193 leaves. |
Language: | eng |