
The Library
Towards the use of mini-applications in performance prediction and optimisation of production codes
Tools
Owenson, A. M. B (2020) Towards the use of mini-applications in performance prediction and optimisation of production codes. PhD thesis, University of Warwick.
|
PDF
WRAP_Theses_Owenson_2020.pdf - Submitted Version - Requires a PDF viewer. Download (1324Kb) | Preview |
Abstract
Maintaining the performance of large scientific codes is a difficult task. To aid in this task a number of mini-applications have been developed that are more tract able to analyse than large-scale production codes, while retaining the performance characteristics of them. These “mini-apps” also enable faster hardware evaluation, and for sensitive commercial codes allow evaluation of code and system changes outside of access approval processes.
Techniques for validating the representativeness of a mini-application to a target code are ultimately qualitative, requiring the researcher to decide whether the similarity is strong enough for the mini-application to be trusted to provide accurate predictions of the target performance. Little consideration is given to the sensitivity of those predictions to the few differences between the mini-application and its target, how those potentially-minor static differences may lead to each code responding very differently to a change in the computing environment.
An existing mini-application, ‘Mini-HYDRA’, of a production CFD simulation code is reviewed. Arithmetic differences lead to divergence in intra-node performance scaling, so the developers had removed some arithmetic from Mini-HYDRA, but this breaks the simulation so limits numerical research. This work restores the arithmetic, repeating validation for similar performance scaling, achieving similar intra-node scaling performance whilst neither are memory-bound. MPI strong scaling functionality is also added, achieving very similar multi-node scaling performance.
The arithmetic restoration inevitably leads to different memory-bounds, and also different and varied responses to changes in processor architecture or instruction set. A performance model is developed that predicts this difference in response, in terms of the arithmetic differences. It is supplemented by a new benchmark that measures the memory-bound of CFD loops. Together, they predict the strong scaling performance of a production ‘target’ code, with a mean error of 8.8% (s = 5.2%). Finally, the model is used to investigate limited speedup from vectorisation despite not being memory-bound. It identifies that instruction throughput is significantly reduced relative to serial counterparts, independent of data ordering in memory, indicating a bottleneck within the processor core.
Item Type: | Thesis or Dissertation (PhD) | ||||
---|---|---|---|---|---|
Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software | ||||
Library of Congress Subject Headings (LCSH): | Application software, Computer simulation, Computer programming | ||||
Official Date: | March 2020 | ||||
Dates: |
|
||||
Institution: | University of Warwick | ||||
Theses Department: | Department of Computer Science | ||||
Thesis Type: | PhD | ||||
Publication Status: | Unpublished | ||||
Supervisor(s)/Advisor: | Jarvis, Stephen A., 1970- | ||||
Sponsors: | Rolls-Royce plc ; Horizon 2020 (Programme) ; Engineering and Physical Sciences Research Council ; Intel Corporation | ||||
Format of File: | |||||
Extent: | xiv, 78 leaves : illustrations (some colour) | ||||
Language: | eng |
Request changes or add full text files to a record
Repository staff actions (login required)
![]() |
View Item |
Downloads
Downloads per month over past year