Informative sequence-based models for fragment distributions in ChIP-seq, RNA-seq and ChIP-chip data
Dyer, Nigel (2011) Informative sequence-based models for fragment distributions in ChIP-seq, RNA-seq and ChIP-chip data. PhD thesis, University of Warwick.
Full text: WRAP_THESIS_Dyer_2011.pdf - Submitted Version (13Mb)
Official URL: http://webcat.warwick.ac.uk/record=b2582571~S1
Abstract
Many high-throughput sequencing protocols for RNA and DNA require that the
polynucleic acid is fragmented so that a limited number of nucleotides at one
or both ends of each fragment can be identified by sequencing. The nucleic acid
sequence allows the fragment to be located within the genome, and the fragment distribution
can then be used for a variety of different purposes. In the case of DNA this includes
identifying the locations where specific proteins are bound to the genome. In the case of RNA
this includes quantifying the expression levels of different gene variants or transcripts. If the
locations of the polynucleic acid fragments are partly determined by the underlying nucleic
acid sequence this could bias any results derived from the data. Unfortunately, such sequence
dependencies have already been observed in the distribution of both RNA and DNA
fragments. Previous attempts to reduce this bias have examined the role
of regional characteristics such as GC bias, or the bias towards a specific sequence at the start
of the fragments.
This thesis introduces a new method for modelling the bias which considers the degree
to which the nucleotide sequence affects the likelihood of a fragment originating at that
location. The method shows that there is often not a single bias characteristic, but multiple,
alternative sequence biases that coexist within a single dataset. It also shows that the
nucleotide sequence immediately proximal to the fragment has a significant effect on the
fragment likelihood. This new approach highlights characteristics that were previously hidden
and provides a more powerful basis for correcting such bias.
Multiple alternative sequence biases are observed when both RNA and DNA are
fragmented, but the more detailed information provided by the new technique shows
how the characteristics differ between RNA and DNA and indicates that very different
molecular mechanisms are responsible for the biases in the two processes.
This thesis also shows how removing the effect of this bias in ChIP-seq experiments can
reveal more subtle features of the distribution of the fragments. This can provide information
on the nature of the binding between proteins and the DNA with per-nucleotide precision,
revealed through the change in likelihood of the DNA fragmenting at each position in the
binding site.
The model fitting technique developed to analyse sequence bias can
also be used to obtain additional information from the results of ChIP-chip experiments. The
approach is used to find the nucleotide sequence preference of DNA binding proteins, and
the cooperative effects associated with binding at multiple binding sites in close
proximity.
| Item Type: | Thesis (PhD) |
|---|---|
| Subjects: | Q Science > QP Physiology |
| Library of Congress Subject Headings (LCSH): | Nucleotide sequence |
| Official Date: | September 2011 |
| Institution: | University of Warwick |
| Theses Department: | Molecular Organisation and Assembly in Cells |
| Thesis Type: | PhD |
| Publication Status: | Unpublished |
| Supervisor(s)/Advisor: | Ott, Sascha ; Beynon, Jim, 1956- |
| Sponsors: | Engineering and Physical Sciences Research Council (EPSRC) |
| Extent: | xvi, 197 leaves : charts |
| Language: | eng |