The Library
Learning and approximation algorithms for problems motivated by evolutionary trees
Tools
Cryan, Mary (1999) Learning and approximation algorithms for problems motivated by evolutionary trees. PhD thesis, University of Warwick.

PDF
WRAP_THESIS_Cryan_1999.pdf  Submitted Version  Requires a PDF viewer. Download (1056Kb)  Preview 
Official URL: http://webcat.warwick.ac.uk/record=b1368032~S1
Abstract
In this thesis we consider some computational problems motivated by the biological problem of reconstructing evolutionary trees. In this thesis, we are concerned with the design and analysis of efficient algorithms for clearly defined combinatorial problems motived by this application area. We present results for two different kinds of problem. Our first problem is motivated by models of evolution that describe the evolution of biological species in terms of a stochastic process that alters the DNA of species. The particular stochastic model that we considered is called the TwoState General Markov Model. In this model, an evolutionary tree can be associated with a distribution on the different "patterns" that may appear among the sequences for all the species in the evolutionary tree. Then the data for a collection of species whose evolutionary tree is unknown can be viewed as samples from this (unknown) distribution. An interesting problem asks whether we can use samples from an unknown evolutionary tree M to find another tree M*for those species, so that the distribution of M* is similar to that of M. This is essentially a PAClearning problem ("Probably Approximately Correct") in the sense of Valiant and Kearns et al.. Our results show that evolutionary trees in the TwoState General Markov can be efficiently PAClearned in the variation distance metric using a "reasonable" number of samples. The two other problems that we consider are combinatorial problems that are also motivated by evolutionary tree construction. The input to each of these problems consists of a fixed tree topology whose leaves are bijectively labelled by the elements of a species set, as well as data for those species. Both problems involve labelling the internal nodes in the fixed topology in order to minimize some function on that tree (both functions that we consider are assumed to test the quality of the tree topology in some way). The two problems that we consider are known to be NPhard. Our contribution is to present efficient approximation algorithms for both problems.
Item Type:  Thesis or Dissertation (PhD)  

Subjects:  Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software  
Divisions:  Faculty of Science > Computer Science  
Library of Congress Subject Headings (LCSH):  Evolution (Biology)  Mathematical models  
Publisher:  Department of Computer Science  
Place of Publication:  Coventry, UK  
Official Date:  October 1999  
Dates: 


DOI:  CSRR373  
Institution:  University of Warwick  
Theses Department:  Department of Computer Science  
Thesis Type:  PhD  
Publication Status:  Published  
Supervisor(s)/Advisor:  Goldberg, Leslie Ann  
Sponsors:  University of Warwick. Department of Computer Science  
Extent:  vi, 169 p.  
Language:  eng 
Request changes or add full text files to a record
Repository staff actions (login required)
View Item 
Downloads
Downloads per month over past year