Skip to content Skip to navigation
University of Warwick
  • Study
  • |
  • Research
  • |
  • Business
  • |
  • Alumni
  • |
  • News
  • |
  • About

University of Warwick
Publications service & WRAP

Highlight your research

  • WRAP
    • Home
    • Search WRAP
    • Browse by Warwick Author
    • Browse WRAP by Year
    • Browse WRAP by Subject
    • Browse WRAP by Department
    • Browse WRAP by Funder
    • Browse Theses by Department
  • Publications Service
    • Home
    • Search Publications Service
    • Browse by Warwick Author
    • Browse Publications service by Year
    • Browse Publications service by Subject
    • Browse Publications service by Department
    • Browse Publications service by Funder
  • Help & Advice
University of Warwick

The Library

  • Login
  • Admin

Strategy iteration algorithms for games and Markov decision processes

Tools
- Tools
+ Tools

Fearnley, John (2010) Strategy iteration algorithms for games and Markov decision processes. PhD thesis, University of Warwick.

[img]
Preview
PDF
WRAP_THESIS_Fearnley_2010.pdf - Requires a PDF viewer.

Download (1166Kb)
Official URL: http://webcat.warwick.ac.uk/record=b2482615~S15

Request Changes to record.

Abstract

In this thesis, we consider the problem of solving two player infinite games,
such as parity games, mean-payoff games, and discounted games, the problem of
solving Markov decision processes. We study a specific type of algorithm for solving
these problems that we call strategy iteration algorithms. Strategy improvement
algorithms are an example of a type of algorithm that falls under this classification.
We also study Lemke’s algorithm and the Cottle-Dantzig algorithm, which
are classical pivoting algorithms for solving the linear complementarity problem.
The reduction of Jurdzinski and Savani from discounted games to LCPs allows these
algorithms to be applied to infinite games [JS08]. We show that, when they are
applied to games, these algorithms can be viewed as strategy iteration algorithms.
We also resolve the question of their running time on these games by providing a
family of examples upon which these algorithm take exponential time.
Greedy strategy improvement is a natural variation of strategy improvement,
and Friedmann has recently shown an exponential lower bound for this algorithm
when it is applied to infinite games [Fri09]. However, these lower bounds do not
apply for Markov decision processes. We extend Friedmann’s work in order to prove
an exponential lower bound for greedy strategy improvement in the MDP setting.
We also study variations on strategy improvement for infinite games. We
show that there are structures in these games that current strategy improvement
algorithms do not take advantage of. We also show that lower bounds given by
Friedmann [Fri09], and those that are based on his work [FHZ10], work because they
exploit this ignorance. We use our insight to design strategy improvement algorithms
that avoid poor performance caused by the structures that these examples use.

Item Type: Thesis or Dissertation (PhD)
Subjects: Q Science > QA Mathematics
Library of Congress Subject Headings (LCSH): Game theory, Markov processes, Iterative methods (Mathematics)
Official Date: August 2010
Dates:
DateEvent
August 2010Submitted
Institution: University of Warwick
Theses Department: Department of Computer Science
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Jurdziński, Marcin ; Lazić, Ranko
Extent: ix, 215 leaves : ill.
Language: eng

Request changes or add full text files to a record

Repository staff actions (login required)

View Item View Item

Downloads

Downloads per month over past year

View more statistics

twitter

Email us: wrap@warwick.ac.uk
Contact Details
About Us