University of Warwick
Publications service & WRAP

Informative sequence-based models for fragment distributions in ChIP-seq, RNA-seq and ChIP-chip data

Dyer, Nigel (2011) Informative sequence-based models for fragment distributions in ChIP-seq, RNA-seq and ChIP-chip data. PhD thesis, University of Warwick.

WRAP_THESIS_Dyer_2011.pdf - Submitted Version
Restricted to Repository staff only until 30 June 2013.
Official URL: http://webcat.warwick.ac.uk/record=b2582571~S1

Abstract

Many high-throughput sequencing protocols for RNA and DNA require the polynucleic acid to be fragmented so that a limited number of nucleotides at one or both ends of each fragment can be identified by sequencing. The resulting sequence allows the fragment to be located within the genome, and the fragment distribution can then be used for a variety of purposes. For DNA, these include identifying the locations where specific proteins are bound to the genome. For RNA, they include quantifying the expression levels of different gene variants or transcripts. If the locations of the fragments are partly determined by the underlying nucleic acid sequence, this could bias any results derived from the data. Such sequence dependencies have already been observed in the distributions of both RNA and DNA fragments. Previous analyses aimed at reducing the bias have examined regional characteristics such as GC bias, or the bias towards a specific sequence at the start of the fragments. This thesis introduces a new method for modelling the bias, which considers the degree to which the nucleotide sequence affects the likelihood of a fragment originating at a given location. The method shows that there is often not a single bias characteristic but multiple, alternative sequence biases that coexist within a single dataset. It also shows that the nucleotide sequence immediately proximal to the fragment has a significant effect on the fragment likelihood. This new approach highlights characteristics that were previously hidden and provides a more powerful basis for correcting such bias.
Multiple alternative sequence biases are observed in the fragmentation of both RNA and DNA, but the more detailed information provided by the new technique shows how the characteristics differ between RNA and DNA and indicates that very different molecular mechanisms are responsible for the biases in the two processes. This thesis also shows how removing the effect of this bias in ChIP-seq experiments can reveal more subtle features of the fragment distribution. These features can provide information on the nature of the binding between proteins and DNA with per-nucleotide precision, revealed through the change in the likelihood of the DNA fragmenting at each position in the binding site. The model-fitting technique developed to analyse sequence bias is also shown to yield additional information from the results of ChIP-chip experiments. The approach is used to find the nucleotide sequence preferences of DNA-binding proteins, as well as the cooperative effects associated with binding at multiple binding sites in close proximity.

Item Type: Thesis or Dissertation (PhD)
Subjects: Q Science > QP Physiology
Library of Congress Subject Headings (LCSH): Nucleotide sequence
Date: September 2011
Institution: University of Warwick
Theses Department: Molecular Organisation and Assembly in Cells
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Ott, Sascha ; Beynon, Jim, 1956-
Sponsors: Engineering and Physical Sciences Research Council (EPSRC)
Extent: xvi, 197 leaves : charts
Language: eng
URI: http://wrap.warwick.ac.uk/id/eprint/49963
