Towards scalable adaptive mesh refinement on future parallel architectures

Abstract

In the march towards exascale, supercomputer architectures are undergoing a significant change. Limited by power consumption and heat dissipation, future supercomputers are likely to be built around a lower-power many-core model. This shift in supercomputer design will require sweeping code changes in order to take advantage of the highly-parallel architectures. Evolving or rewriting legacy applications to perform well on these machines is a significant challenge.

Mini-applications, small computer programs that represent the performance characteristics of some larger application, can be used to investigate new programming models and improve the performance of the legacy application by proxy. These applications, being both easy to modify and representative, are essential for establishing a path to move legacy applications into the exascale era.

The focus of the work presented in this thesis is the design, development and employment of a new mini-application, CleverLeaf, for shock hydrodynamics with block-structured adaptive mesh refinement (AMR). We report on the development of CleverLeaf, and show how the fresh start provided by a mini-application can be used to develop an application that is flexible, accurate, and easy to employ in the investigation of exascale architectures.

We also detail the development of the first reported resident parallel block-structured AMR library for Graphics Processing Units (GPUs). Extending the SAMRAI library using the CUDA programming model, we develop datatypes that store data only in GPU memory, as well as the necessary operators for moving and interpolating data on an adaptive mesh. We show that executing AMR simulations on a GPU is up to 4.8× faster than a CPU, and demonstrate scalability on over 4,000 nodes using a combination of CUDA and MPI.

Finally, we show how mini-applications can be employed to improve the performance of production applications on existing parallel architectures by selecting the optimal application configuration. Using CleverLeaf, we identify the most appropriate configurations on three contemporary supercomputer architectures. Selecting the best parameters for our application can reduce run-time by up to 82% and reduce memory usage by up to 32%.

Item Type: Thesis [via Doctoral College] (PhD)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Library of Congress Subject Headings (LCSH): High performance computing, Supercomputers, Parallel processing (Electronic computers), Software engineering
Official Date: February 2015
Dates: February 2015 (Submitted)
Institution: University of Warwick
Theses Department: Department of Computer Science
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Jarvis, Stephen A., 1970-
Extent: xxv, 209 leaves : illustrations
Language: eng
URI: https://wrap.warwick.ac.uk/72739/
