Statistical modelling of transcript profiles of differentially regulated genes
Eastwood, Daniel C., Mead, A. (Andrew), Sergeant, Martin J. and Burton, Kerry S. (2008) Statistical modelling of transcript profiles of differentially regulated genes. BMC Molecular Biology, Vol.9 (No.66). ISSN 1471-2199
Official URL: http://dx.doi.org/10.1186/1471-2199-9-66
Background: The vast quantities of gene expression profiling data produced in microarray studies, and
the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous
studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of
variance (ANOVA) and the clustering of genes based on simple models fitted to their expression profiles
over time. We report the novel application of statistical non-linear regression modelling techniques to
describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E.
coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models
provides a more precise description of expression profiles, reducing the "noise" of the raw data to
produce a clear "signal" given by the fitted curve, and describing each profile with a small number of
biologically interpretable parameters. This approach then allows the direct comparison and clustering of
the shapes of response patterns between genes and potentially enables a greater exploration and
interpretation of the biological processes driving gene expression.
Results: Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Splitline"
or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification
of genes into those with primary and secondary responses. Five-day profiles were modelled using the
biologically-oriented, critical exponential curve, y(t) = A + (B + Ct)R^t + ε. This non-linear regression
approach allowed the expression patterns for different genes to be compared in terms of curve shape,
time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory
patterns were identified for the five genes studied. Applying the regression modelling approach to
microarray-derived time course data allowed 11% of the Escherichia coli features to be fitted by an
exponential function, and 25% of the Rattus norvegicus features could be described by the critical
exponential model, all with statistical significance of p < 0.05.
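The two model forms named in the Results can be illustrated with a minimal sketch. It is not the authors' fitting procedure, which the abstract does not describe. The function names are hypothetical, the split-line form (a flat baseline followed by a linear rise after a breakpoint tau) is an assumption, and the breakpoint is found by a simple grid search rather than by any method stated in the paper. For the critical exponential curve, the time of the stationary point follows from setting the derivative to zero, giving t = -1/ln(R) - B/C, and for 0 < R < 1 the curve tends to the asymptote A.

```python
import math


def critical_exponential(t, A, B, C, R):
    """Mean curve y(t) = A + (B + C*t) * R**t (noise term omitted)."""
    return A + (B + C * t) * R ** t


def peak_time(B, C, R):
    """Stationary point of the curve: solve C + (B + C*t) * ln(R) = 0 for t.

    This is a maximum when C > 0 and 0 < R < 1.
    """
    return -1.0 / math.log(R) - B / C


def fit_split_line(times, values, steps=200):
    """Grid-search a breakpoint tau for y = a + b * max(0, t - tau).

    Assumed form: flat baseline a up to tau, then a linear rise with slope b.
    For each candidate tau the model is linear in a and b, so ordinary least
    squares has a closed form; the tau with the smallest residual sum of
    squares is returned together with its a and b.
    """
    lo, hi = min(times), max(times)
    n = len(times)
    mean_y = sum(values) / n
    best = None
    for i in range(1, steps):
        tau = lo + (hi - lo) * i / steps
        x = [max(0.0, t - tau) for t in times]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # no variation in x, slope is undefined
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, values))
        b = sxy / sxx
        a = mean_y - b * mean_x
        rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, values))
        if best is None or rss < best[0]:
            best = (rss, tau, a, b)
    _, tau, a, b = best
    return tau, a, b


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    t_star = peak_time(B=1.0, C=2.0, R=0.5)
    print("time of maximal level:", t_star)
    print("level at that time:", critical_exponential(t_star, 0.5, 1.0, 2.0, 0.5))

    # Synthetic noise-free profile: flat at 2.0 until t = 2, then slope 3.
    times = [0.5 * i for i in range(11)]
    values = [2.0 + 3.0 * max(0.0, t - 2.0) for t in times]
    print("estimated (tau, a, b):", fit_split_line(times, values))
```

In this sketch the breakpoint tau plays the role of the initial time of up-regulation, while A, the peak time and R summarise the asymptotic level, the timing of maximal transcript level and the rate of decline. A real analysis would estimate A, B, C and R from replicated data with a non-linear least-squares routine and report standard errors.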
Conclusion: The statistical non-linear regression approaches presented in this study provide detailed
biologically oriented descriptions of individual gene expression profiles, using biologically variable data to
generate a set of defining parameters. These approaches have application to the modelling and greater
interpretation of profiles obtained across a wide range of platforms, such as microarrays. Through careful
choice of appropriate model forms, such statistical regression approaches allow an improved comparison
of gene expression profiles, and may provide an approach for the greater understanding of common
regulatory mechanisms between genes.
Item Type: Journal Article
Subjects: Q Science > QH Natural history > QH426 Genetics
Divisions: Faculty of Science > Life Sciences (2010- ) > Warwick HRI (2004-2010)
Library of Congress Subject Headings (LCSH): Gene mapping, Regression analysis
Journal or Publication Title: BMC Molecular Biology
Official Date: 23 July 2008
Access rights to Published version: Open Access
Final version (published as open access).