Skip to content Skip to navigation
University of Warwick
  • Study
  • |
  • Research
  • |
  • Business
  • |
  • Alumni
  • |
  • News
  • |
  • About

University of Warwick
Publications service & WRAP

Highlight your research

  • WRAP
    • Home
    • Search WRAP
    • Browse by Warwick Author
    • Browse WRAP by Year
    • Browse WRAP by Subject
    • Browse WRAP by Department
    • Browse WRAP by Funder
    • Browse Theses by Department
  • Publications Service
    • Home
    • Search Publications Service
    • Browse by Warwick Author
    • Browse Publications service by Year
    • Browse Publications service by Subject
    • Browse Publications service by Department
    • Browse Publications service by Funder
  • Help & Advice
University of Warwick

The Library

  • Login
  • Admin

Monitoring, analysis and optimisation of I/O in parallel applications

Tools
- Tools
+ Tools

Wright, Steven A. (2014) Monitoring, analysis and optimisation of I/O in parallel applications. PhD thesis, University of Warwick.

[img]
Preview
PDF
WRAP_THESIS_Wright_2014.pdf - Submitted Version - Requires a PDF viewer.

Download (3873Kb) | Preview
Official URL: http://webcat.warwick.ac.uk/record=b2750471~S1

Request Changes to record.

Abstract

High performance computing (HPC) is changing the way science is performed in the 21st Century; experiments that once took enormous amounts of time, were dangerous and often produced inaccurate results can now be performed and refined in a fraction of the time in a simulation environment. Current generation supercomputers are running in excess of 1016 floating point operations per second, and the push towards exascale will see this increase by two orders of magnitude. To achieve this level of performance it is thought that applications may have to scale to potentially billions of simultaneous threads, pushing hardware to its limits and severely impacting failure rates.

To reduce the cost of these failures, many applications use checkpointing to periodically save their state to persistent storage, such that, in the event of a failure, computation can be restarted without significant data loss. As computational power has grown by approximately 2x every 18 − 24 months, persistent storage has lagged behind; checkpointing is fast becoming a bottleneck to performance.

Several software and hardware solutions have been presented to solve the current I/O problem being experienced in the HPC community and this thesis examines some of these. Specifically, this thesis presents a tool designed for analysing and optimising the I/O behaviour of scientific applications, as well as a tool designed to allow the rapid analysis of one software solution to the problem of parallel I/O, namely the parallel log-structured file system (PLFS). This thesis ends with an analysis of a modern Lustre file system under contention from multiple applications and multiple compute nodes running the same problem through PLFS. The results and analysis presented outline a framework through which application settings and procurement decisions can be made.

Item Type: Thesis or Dissertation (PhD)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Library of Congress Subject Headings (LCSH): High performance computing, Parallel file systems (Computer science)
Official Date: 24 November 2014
Dates:
DateEvent
24 November 2014Published
28 October 2014Accepted
25 July 2014Submitted
Institution: University of Warwick
Theses Department: Department of Computer Science
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Jarvis, Stephen A., 1970-
Sponsors: University of Warwick. Department of Computer Science
Extent: xi, 144 leaves : charts
Language: eng

Request changes or add full text files to a record

Repository staff actions (login required)

View Item View Item

Downloads

Downloads per month over past year

View more statistics

twitter

Email us: wrap@warwick.ac.uk
Contact Details
About Us