University of Warwick
Publications service & WRAP

Discovering transcriptional modules by Bayesian data integration

Savage, Richard S., Ghahramani, Zoubin, Griffin, Jim E., De la Cruz, Bernard J. and Wild, David L. (2010) Discovering transcriptional modules by Bayesian data integration. Bioinformatics, Vol.26 (No.12). pp. 158-167. ISSN 1367-4803

Official URL: http://dx.doi.org/10.1093/bioinformatics/btq210

Abstract

Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets.

Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.
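
For readers unfamiliar with Dirichlet process mixtures, the sketch below illustrates only the generic clustering prior that such models build on, the Chinese restaurant process, in which the number of clusters is not fixed in advance. It is not the authors' hierarchical model and does not include their gene-by-gene data fusion step; the function name `crp_assignments` and the parameters `n_items`, `alpha` and `seed` are illustrative assumptions, not names from the paper.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values make
    new clusters more likely. This is a generic prior draw only,
    not the model described in the abstract.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items already assigned to cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_items=20, alpha=1.0)
    print(labels)
    print("number of clusters:", len(set(labels)))
```

In a mixture model, each label would index a cluster-specific likelihood for a gene's data; the abstract's model additionally decides, gene by gene, whether the expression and ChIP-chip datasets share the same cluster structure.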

Item Type: Journal Article
Subjects: Q Science > QA Mathematics
Q Science > QH Natural history
Divisions: Faculty of Science > Centre for Systems Biology
Library of Congress Subject Headings (LCSH): Genes -- Research, Gene expression, Transcription factors, Dirichlet principle
Journal or Publication Title: Bioinformatics
Publisher: Oxford University Press
ISSN: 1367-4803
Date: 15 June 2010
Volume: Vol.26
Number: No.12
Page Range: pp. 158-167
Identification Number: 10.1093/bioinformatics/btq210
Status: Peer Reviewed
Access rights to Published version: Open Access
Funder: Engineering and Physical Sciences Research Council (EPSRC)
Grant number: EP/F027400/1 (EPSRC)
URI: http://wrap.warwick.ac.uk/id/eprint/3277
