Discovering transcriptional modules by Bayesian data integration
Savage, Richard S., Ghahramani, Zoubin, Griffin, Jim E., De la Cruz, Bernard J. and Wild, David L. (2010) Discovering transcriptional modules by Bayesian data integration. Bioinformatics, Vol.26 (No.12). pp. 158-167. ISSN 1367-4803
Official URL: http://dx.doi.org/10.1093/bioinformatics/btq210
Abstract
Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets.

Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.
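
Dirichlet process mixture models, which the abstract's hierarchical model builds on, induce a prior over partitions commonly described as a Chinese restaurant process. As a minimal illustration only, and not the authors' hierarchical model or inference code, the Python sketch below draws one partition of a set of items (standing in for genes) from that prior. The function name `sample_crp` and the parameters `alpha` and `seed` are assumptions introduced here for illustration.

```python
import random


def sample_crp(n_items, alpha, seed=0):
    """Draw one partition of n_items from a Chinese restaurant process.

    Illustrative sketch only: this samples from the clustering prior,
    it does not fit a model to expression or ChIP-chip data.
    """
    rng = random.Random(seed)
    assignments = []  # cluster label of each item, in arrival order
    sizes = []        # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a new cluster is opened with weight alpha (must be > 0).
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments


# Example: partition 20 hypothetical genes; larger alpha tends to give more clusters.
labels = sample_crp(20, alpha=1.0, seed=42)
print(labels)
```

The number of clusters is not fixed in advance; it grows with the number of items at a rate governed by `alpha`, which is the property that makes this family of priors attractive for clustering genes.
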
| Item Type: | Journal Article |
|---|---|
| Subjects: | Q Science > QA Mathematics; Q Science > QH Natural history |
| Divisions: | Faculty of Science > Centre for Systems Biology |
| Library of Congress Subject Headings (LCSH): | Genes -- Research, Gene expression, Transcription factors, Dirichlet principle |
| Journal or Publication Title: | Bioinformatics |
| Publisher: | Oxford University Press |
| ISSN: | 1367-4803 |
| Date: | 15 June 2010 |
| Volume: | Vol.26 |
| Number: | No.12 |
| Page Range: | pp. 158-167 |
| Identification Number: | 10.1093/bioinformatics/btq210 |
| Status: | Peer Reviewed |
| Access rights to Published version: | Open Access |
| Funder: | Engineering and Physical Sciences Research Council (EPSRC) |
| Grant number: | EP/F027400/1 (EPSRC) |
| URI: | http://wrap.warwick.ac.uk/id/eprint/3277 |