Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark
Pennycook, Simon J., Hammond, Simon D., Mudalige, Gihan R. and Jarvis, Stephen A., 1970- (2010) Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark. In: 1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 10), New Orleans, LA, USA, 13-19 Nov 2010.
Full text not available from this repository.
The emergence of Graphics Processing Units (GPUs) as a potential alternative to conventional general-purpose processors has led to significant interest in these architectures by both the academic community and the High Performance Computing (HPC) industry. While GPUs look likely to deliver unparalleled levels of performance, published studies claiming performance improvements in excess of 30,000x are misleading. Significant on-node performance improvements have been demonstrated for code kernels and algorithms amenable to GPU acceleration, but studies demonstrating comparable results for full scientific applications requiring multiple-GPU architectures are rare. In this paper we present an analysis of a port of the NAS LU benchmark to NVIDIA's Compute Unified Device Architecture (CUDA) - the most stable GPU programming model currently available. Our solution is also extended to multiple nodes and multiple GPU devices. Runtime performance on several GPUs is presented, ranging from low-end, consumer-grade cards such as the 8400GS to NVIDIA's flagship Fermi HPC processor found in the recently released C2050. We compare the runtimes of these devices to those of several processors, including processors from Intel, AMD and IBM. In addition, we utilise a recently developed performance model of LU to predict the runtime performance of LU on large-scale distributed GPU clusters, which are predicted to become commonplace in future high-end HPC architectural solutions.
|Item Type:||Conference Item (Paper)|
|Subjects:||Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software; QA76.73|
|Divisions:||Faculty of Science > Computer Science|
|Status:||Not Peer Reviewed|
|Description:||The 1st International Workshop on Performance Modeling, Benchmarking and Simulation of High-Performance Computing Systems (PMBS 10) was held as part of the ACM/IEEE International Conference for High Performance, Networking, Storage and Analysis (SC 10), in New Orleans, Louisiana, USA|
|Conference Paper Type:||Paper|
|Title of Event:||1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 10)|
|Type of Event:||Workshop|
|Location of Event:||New Orleans, LA, USA|
|Date(s) of Event:||13-19 Nov 2010|