The Library
Learning and approximation algorithms for problems motivated by evolutionary trees
Tools
Cryan, Mary (1999) Learning and approximation algorithms for problems motivated by evolutionary trees. PhD thesis, University of Warwick.
|
PDF
WRAP_THESIS_Cryan_1999.pdf - Submitted Version - Requires a PDF viewer. Download (1056Kb) | Preview |
Official URL: http://webcat.warwick.ac.uk/record=b1368032~S1
Abstract
In this thesis we consider some computational problems motivated by the biological problem of reconstructing evolutionary trees. In this thesis, we are concerned with the design and analysis of efficient algorithms for clearly defined combinatorial problems motived by this application area. We present results for two different kinds of problem. Our first problem is motivated by models of evolution that describe the evolution of biological species in terms of a stochastic process that alters the DNA of species. The particular stochastic model that we considered is called the Two-State General Markov Model. In this model, an evolutionary tree can be associated with a distribution on the different "patterns" that may appear among the sequences for all the species in the evolutionary tree. Then the data for a collection of species whose evolutionary tree is unknown can be viewed as samples from this (unknown) distribution. An interesting problem asks whether we can use samples from an unknown evolutionary tree M to find another tree M*for those species, so that the distribution of M* is similar to that of M. This is essentially a PAC-learning problem ("Probably Approximately Correct") in the sense of Valiant and Kearns et al.. Our results show that evolutionary trees in the Two-State General Markov can be efficiently PAC-learned in the variation distance metric using a "reasonable" number of samples. The two other problems that we consider are combinatorial problems that are also motivated by evolutionary tree construction. The input to each of these problems consists of a fixed tree topology whose leaves are bijectively labelled by the elements of a species set, as well as data for those species. Both problems involve labelling the internal nodes in the fixed topology in order to minimize some function on that tree (both functions that we consider are assumed to test the quality of the tree topology in some way). The two problems that we consider are known to be NP-hard. Our contribution is to present efficient approximation algorithms for both problems.
Item Type: | Thesis (PhD) | ||||
---|---|---|---|---|---|
Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software | ||||
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science | ||||
Library of Congress Subject Headings (LCSH): | Evolution (Biology) -- Mathematical models | ||||
Publisher: | Department of Computer Science | ||||
Place of Publication: | Coventry, UK | ||||
Official Date: | October 1999 | ||||
Dates: |
|
||||
DOI: | CS-RR-373 | ||||
Institution: | University of Warwick | ||||
Theses Department: | Department of Computer Science | ||||
Thesis Type: | PhD | ||||
Publication Status: | Published | ||||
Supervisor(s)/Advisor: | Goldberg, Leslie Ann | ||||
Sponsors: | University of Warwick. Department of Computer Science | ||||
Extent: | vi, 169 p. | ||||
Language: | eng |
Request changes or add full text files to a record
Repository staff actions (login required)
View Item |
Downloads
Downloads per month over past year