Performance engineering unstructured mesh, geometric multigrid codes.

WRAP_Theses_Bunt_2016.pdf - Submitted Version - Requires a PDF viewer.

Abstract

High Performance Computing (HPC) is a vital tool for scientific simulations; it allows the recreation of conditions which are too expensive to produce in situ or which occur over too vast a time scale. However, in order to achieve the increasing levels of performance demanded by these applications, the architecture of computers has shifted several times since the 1970s. Engineering applications to leverage the performance which comes with past and future shifts is an ongoing challenge. This work focuses on addressing this challenge for unstructured mesh, geometric multigrid applications through three existing performance engineering methodologies: instrumentation, performance modelling and mini-applications.

First, an auto-instrumentation tool is developed which enables the collection of performance data over several versions of a code base, with only a single definition of the data to collect. This information allows the comparison of prospective optimisations (e.g. reduced synchronisation), and an assessment of competing hardware (e.g. Intel Haswell/Ivy Bridge).

Second, this work details the development and use of a runtime performance model of unstructured mesh, geometric multigrid behaviour. The power of the model is demonstrated by i) exposing a synchronisation issue which degrades total application runtime by 1.41x on machines which have poor support for overlapping communication with computation; and, ii) accurately predicting the negative impact of the geometric partitioning algorithm on executions using 512 partitions.

Third, a mini-application is developed to provide a vehicle for optimising and porting activities, where it would be prohibitively time-consuming to use a large, legacy application. The use of the mini-application is demonstrated by examining the impact of Intel Haswell's fused multiply and advanced vector extension instructions on performance. It is found that significant code modifications would be required to benefit from these instructions, but the architecture shows promise from an energy perspective.

Item Type: Thesis [via Doctoral College] (PhD)
Subjects: Q Science > QA Mathematics
Library of Congress Subject Headings (LCSH): High performance computing -- Research, Science -- Computer simulation, Parallel computers, Parallel processing (Electronic computers)
Official Date: September 2016
Dates: September 2016 (Submitted)
Institution: University of Warwick
Theses Department: Department of Computer Science
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Jarvis, Stephen A., 1970-
Sponsors: University of Warwick ; Rolls-Royce Ltd. ; Royal Society (Great Britain)
Format of File: pdf
Extent: xix, 152 leaves : illustrations, charts
Language: eng
URI: https://wrap.warwick.ac.uk/89503/
