The Library

Computational prediction of functional similarity of CRMs

Koohy, Hashem (2010) Computational prediction of functional similarity of CRMs. PhD thesis, University of Warwick.

Full text: WRAP_Thesis_Koohy_2010.pdf - Submitted Version (PDF, 14Mb)

Official URL: http://webcat.warwick.ac.uk/record=b2717156~S1

Abstract

Transcriptional regulation of genes is fundamental to all living organisms. The spatial, temporal and condition-specific expression levels of genes are determined in part by inherited regulatory codes in non-coding regions of the DNA. Many methods have been proposed to detect conserved regions of regulatory DNA by means of sequence alignment. However, it has become clear that some regulatory regions do not show statistically significant alignments even when their function is conserved. Detecting and characterising these elusive regulatory codes therefore remains a challenging problem.
In this thesis we develop and validate a novel computational, alignment-free model for detecting functional similarity of regulatory sequences. We show that our model can detect functional links between pairs of sequences that do not align with a significant score. We apply the model a) to detect enhancers within the same genome that are likely to have similar functions and b) to detect functionally conserved enhancer regions in orthologous genomes. Our method finds regulatory codes that are common to groups of similar enhancers and consistent with previous biological knowledge.
The inputs to our model are two sequences whose functional similarity we wish to compare, together with a set of transcription factor motifs. The mathematical framework of our model is built on two main components. In the first model component, each sequence is mapped to a vector of estimated occupancy levels for all motifs. These vectors represent which motifs are present in each sequence, and at what multiplicity and specificity.
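The first model component can be illustrated with a minimal sketch. It is not the thesis's occupancy estimator, whose exact definition is not given in this abstract. The sketch assumes motifs are given as position weight matrices, a uniform background, and a hypothetical `scale` parameter standing in for factor concentration; the function names and toy motifs are made up for illustration.

```python
from math import prod

# Assumption: uniform background base frequencies; sequences contain only A, C, G, T.
BACKGROUND = 0.25

def site_ratio(pwm, site):
    """Likelihood ratio of one site under the motif versus the background."""
    return prod(column[base] / BACKGROUND for column, base in zip(pwm, site))

def occupancy(pwm, seq, scale=0.01):
    """Sum of per-position binding probabilities along seq.

    `scale` is a hypothetical stand-in for factor concentration. Every
    position contributes, so several weak sites can add up.
    """
    width = len(pwm)
    total = 0.0
    for start in range(len(seq) - width + 1):
        r = scale * site_ratio(pwm, seq[start:start + width])
        total += r / (1.0 + r)
    return total

def occupancy_vector(motifs, seq):
    """Map a sequence to one occupancy estimate per motif, in sorted name order."""
    return [occupancy(motifs[name], seq) for name in sorted(motifs)]

# Toy motifs (made up for illustration): one prefers "AG", the other "CT".
motifs = {
    "motif_ag": [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
                 {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1}],
    "motif_ct": [{"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
                 {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7}],
}
vec = occupancy_vector(motifs, "AGAGAGTT")
print(vec)  # the first entry (motif_ag) is the larger one for this sequence
```

Summing over every position, rather than thresholding for strong hits only, is one simple way to let multiplicity and weak sites both register in the vector.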
In the second model component, we establish a statistical approach in which we first estimate a probability distribution of motif occupancy levels for sequences that function similarly to the template sequence. We then compute a statistical similarity score to evaluate whether the two sequences are more similar to each other than to random background sequences.
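The idea of scoring a pair against random background sequences can be sketched as follows. This swaps in a plain empirical p-value on Euclidean distances between occupancy vectors, which is a simple stand-in and not the distribution-based score developed in the thesis. The function name and the toy vectors are assumptions made for illustration.

```python
from math import dist

def empirical_similarity_pvalue(template_vec, query_vec, background_vecs):
    """Fraction of background vectors at least as close to the template as the query.

    A small value suggests the query resembles the template more than
    random background sequences do. The +1 terms avoid a p-value of zero.
    """
    query_distance = dist(template_vec, query_vec)
    as_close = sum(
        1 for vec in background_vecs if dist(template_vec, vec) <= query_distance
    )
    return (1 + as_close) / (1 + len(background_vecs))

# Toy occupancy vectors (made up): two motifs, three background sequences.
template = [1.0, 2.0]
query = [1.0, 2.5]
background = [[5.0, 5.0], [0.0, 0.0], [3.0, 3.0]]
p_value = empirical_similarity_pvalue(template, query, background)
print(p_value)  # 0.25: no background vector is as close as the query
```

In practice the background set would need to be much larger than three vectors, since the smallest attainable p-value is 1 / (1 + number of background vectors).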
Two applications of this model are presented. First, it is applied to a set of experimentally validated non-alignable enhancers from D. melanogaster. We show that:
• Our model can detect statistical links between these enhancers,
• Weak binding sites can make a strong contribution to sequence similarity,
• Our model treats statistically significant presence and absence of motifs symmetrically. Similarity of sequences, therefore, can be based on a combination of the two. We show examples of motifs making contributions to sequence similarity through their absence.
• Using our model, we can create a network of similarities among the fly enhancers. Groups of enhancers in this network show common regulatory codes. One of these regulatory codes is strongly supported by existing experimental data.
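The last point, building a network of similarities and reading off groups of enhancers, can be sketched under stated assumptions. The sketch assumes pairwise significance values are already available, links two enhancers when their value falls at or below a hypothetical `threshold`, and takes connected components as groups. The abstract does not say how the thesis defines groups, so this is only one plausible reading; the enhancer names and values are made up.

```python
def similarity_groups(nodes, pvalues, threshold=0.05):
    """Group nodes into connected components of the thresholded similarity network."""
    neighbours = {node: set() for node in nodes}
    for (a, b), p in pvalues.items():
        if p <= threshold:  # keep only significant links
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for node in nodes:
        if node in seen:
            continue
        # Depth-first walk collects everything reachable from this node.
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Toy input (made up): four enhancers and three pairwise scores.
enhancers = ["e1", "e2", "e3", "e4"]
pairwise = {("e1", "e2"): 0.01, ("e2", "e3"): 0.02, ("e3", "e4"): 0.5}
groups = similarity_groups(enhancers, pairwise)
print(groups)  # [['e1', 'e2', 'e3'], ['e4']]
```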
In the second application of our model, we predict functional subregions of a known D. melanogaster enhancer. To achieve this, we first show that the model can detect the orthology of this enhancer across 10 Drosophila species. We then demonstrate how this statistical link can be used to predict functional subregions within the enhancer.

Item Type: Thesis (PhD)
Subjects: Q Science > QH Natural history > QH426 Genetics
Library of Congress Subject Headings (LCSH): Genetic regulation, Nucleotide sequence -- Mathematical models
Official Date: October 2010
Dates: October 2010 (Submitted)
Institution: University of Warwick
Theses Department: Systems Biology Doctoral Training Centre
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Ott, Sascha ; Koentges, Georgy
Sponsors: Warwick Systems Biology Centre ; Human Frontier Science Program
Extent: xii, 127 leaves : ill., charts
Language: eng

University of Warwick
Publications service & WRAP
