University of Warwick
Publications service & WRAP

Patient-specific data fusion defines prognostic cancer subtypes


Yuan, Yinyin, Savage, Richard S. and Markowetz, Florian. (2011) Patient-specific data fusion defines prognostic cancer subtypes. PLoS Computational Biology, Vol.7 (No.10). e1002227. ISSN 1553-7358

Official URL: http://dx.doi.org/10.1371/journal.pcbi.1002227

Abstract

Different data types can offer complementary perspectives on the same biological phenomenon. In cancer studies, for example, data on copy number alterations indicate losses and amplifications of genomic regions in tumours, while transcriptomic data point to the impact of genomic and environmental events on the internal wiring of the cell. Fusing different data provides a more comprehensive model of the cancer cell than that offered by any single type. However, biological signals in different patients exhibit diverse degrees of concordance due to cancer heterogeneity and inherent noise in the measurements. This is a particularly important issue in cancer subtype discovery, where personalised strategies to guide therapy are of vital importance. We present a nonparametric Bayesian model for discovering prognostic cancer subtypes by integrating gene expression and copy number variation data. Our model is constructed from a hierarchy of Dirichlet Processes and addresses three key challenges in data fusion: (i) to separate concordant from discordant signals, (ii) to select informative features, (iii) to estimate the number of disease subtypes. Concordance of signals is assessed individually for each patient, giving us an additional level of insight into the underlying disease structure. We exemplify the power of our model in prostate cancer and breast cancer and show that it outperforms competing methods. In the prostate cancer data, we identify an entirely new subtype with extremely poor survival outcome and show how other analyses fail to detect it. In the breast cancer data, we find subtypes with superior prognostic value by using the concordant results. These discoveries were crucially dependent on our model’s ability to distinguish concordant and discordant signals within each patient sample, and would otherwise have been missed. We therefore demonstrate the importance of taking a patient-specific approach, using highly flexible nonparametric Bayesian methods.
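
The abstract's point (iii), estimating the number of subtypes rather than fixing it in advance, is a general property of Dirichlet Process priors. The following is a minimal sketch of that idea only, not the authors' hierarchical model: it draws one partition of patients from a Chinese restaurant process, a standard representation of the Dirichlet Process prior over clusterings. The function name `crp_partition`, the concentration value `alpha`, and the patient count are all assumptions chosen for illustration.

```python
import random


def crp_partition(n_items, alpha, rng):
    """Draw one partition of n_items from a Chinese restaurant process.

    Illustrative only: this samples from the prior over clusterings and
    does not look at any data. alpha (> 0) is the concentration parameter;
    larger values tend to produce more clusters.
    """
    if n_items < 0 or alpha <= 0:
        raise ValueError("n_items must be >= 0 and alpha must be > 0")

    assignments = []  # assignments[i] = cluster label of item i
    counts = []       # counts[k] = number of items currently in cluster k

    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)

    return assignments


if __name__ == "__main__":
    # Hypothetical cohort size; the number of clusters is not specified
    # up front and varies from draw to draw.
    labels = crp_partition(n_items=100, alpha=1.0, rng=random.Random(0))
    print("clusters in this draw:", len(set(labels)))
```

In a full model, each cluster would also carry a likelihood for the observed data, and inference would condition these assignments on the measurements; the sketch shows only why the number of clusters need not be chosen beforehand.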

Item Type: Journal Article
Subjects: Q Science > QA Mathematics
R Medicine > R Medicine (General)
Divisions: Faculty of Science > Centre for Systems Biology
Library of Congress Subject Headings (LCSH): Cancer -- Prognosis -- Mathematical models, Cancer -- Prognosis -- Data processing
Journal or Publication Title: PLoS Computational Biology
Publisher: Public Library of Science
ISSN: 1553-7358
Date: 20 October 2011
Volume: Vol.7
Number: No.10
Page Range: e1002227
Identification Number: 10.1371/journal.pcbi.1002227
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access
Funder: University of Cambridge, Cancer Research UK (CRUK), Hutchison Whampoa Ltd., Medical Research Council (Great Britain) (MRC)
URI: http://wrap.warwick.ac.uk/id/eprint/39088
