University of Warwick
Publications service & WRAP

Error, reproducibility and sensitivity : a pipeline for data processing of Agilent oligonucleotide expression arrays

Chain, B. M., Bowen, Helen C., Hammond, John P., Posch, Wilfried, Rasaiyaah, Jane, Tsang, Jhen and Noursadeghi, Mahdad. (2010) Error, reproducibility and sensitivity : a pipeline for data processing of Agilent oligonucleotide expression arrays. BMC Bioinformatics, Vol.11 (No.344). ISSN 1471-2105

Official URL: http://dx.doi.org/10.1186/1471-2105-11-344

Abstract

Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, design experiments and analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples.

Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of the mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators.

Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and of a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
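The steps named in the abstract (log transformation, normalisation, estimation of inter-array SD from replicates, and multidimensional scaling) can be illustrated with a minimal Python sketch. This is not the authors' R function. The abstract does not say which normalisation method was used, so quantile normalisation is an assumption here, as are all function names, the intensity floor, and the synthetic data, which are simulated with a noise SD of 0.5 log2 units only to echo the figure quoted above.

```python
import numpy as np


def log2_transform(raw, floor=1.0):
    """Clip raw intensities at a floor (assumed value) and take log2."""
    return np.log2(np.maximum(raw, floor))


def quantile_normalise(x):
    """Give every array (column) the same intensity distribution.

    Assumed method: each value is replaced by the mean of the values
    holding the same rank across arrays. Ties are broken arbitrarily.
    """
    order = np.argsort(x, axis=0)
    ranks = np.argsort(order, axis=0)
    mean_sorted = np.sort(x, axis=0).mean(axis=1)
    return mean_sorted[ranks]


def interarray_sd(x):
    """Median over probes of the SD across replicate arrays (columns)."""
    return float(np.median(x.std(axis=1, ddof=1)))


def classical_mds(x, k=2):
    """Classical (Torgerson) MDS of arrays from Euclidean distances."""
    samples = x.T  # one row per array
    diff = samples[:, None, :] - samples[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    n = sq_dist.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ sq_dist @ centring
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:k]
    # Negative eigenvalues from rounding are clipped to zero.
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


# Synthetic replicate data: 500 probes on 6 arrays (illustrative only).
rng = np.random.default_rng(0)
true_log_signal = rng.uniform(4.0, 14.0, size=(500, 1))
noise = rng.normal(0.0, 0.5, size=(500, 6))
raw = 2.0 ** (true_log_signal + noise)

normalised = quantile_normalise(log2_transform(raw))
print("inter-array SD (log2 units):", interarray_sd(normalised))
print("MDS coordinates shape:", classical_mds(normalised).shape)
```

With real data, each column would hold one array's extracted probe intensities, and the MDS coordinates could be plotted to look for grouping by cell type or treatment.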

Item Type: Journal Article
Subjects: Q Science > QP Physiology
Divisions: Faculty of Science > Life Sciences (2010- ) > Warwick HRI (2004-2010)
Library of Congress Subject Headings (LCSH): DNA microarrays, Human genome -- Data processing, Oligonucleotides -- Data processing
Journal or Publication Title: BMC Bioinformatics
Publisher: BioMed Central Ltd.
ISSN: 1471-2105
Date: 24 June 2010
Volume: Vol.11
Number: No.344
Identification Number: 10.1186/1471-2105-11-344
Status: Peer Reviewed
Access rights to Published version: Open Access
Funder: Biotechnology and Biological Sciences Research Council (Great Britain) (BBSRC), Wellcome Trust (London, England), National Institute for Health Research (Great Britain) (NIHR)
URI: http://wrap.warwick.ac.uk/id/eprint/3339
