Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark
Pennycook, Simon J., Hammond, Simon D., Mudalige, Gihan R. and Jarvis, Stephen A., 1970- (2010) Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark. In: 1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 10), New Orleans, LA, USA, 13-19 Nov 2010. Published in: ACM SIGMETRICS Performance Evaluation Review, Volume 38 (Number 4). pp. 23-29.
Official URL: http://dx.doi.org/10.1145/1964218.1964223
We present the performance analysis of a port of the LU benchmark from the NAS Parallel Benchmark (NPB) suite to NVIDIA's Compute Unified Device Architecture (CUDA), and report on the optimisation efforts employed to take advantage of this platform. Execution times are reported for several different GPUs, ranging from low-end consumer-grade products to high-end HPC-grade devices, including the Tesla C2050 built on NVIDIA's Fermi processor.
We also utilise recently developed performance models of LU to facilitate a comparison between future large-scale distributed clusters of GPU devices and existing clusters built on traditional CPU architectures, including a quad-socket, quad-core AMD Opteron cluster and an IBM BlueGene/P.
|Item Type:||Conference Item (Paper)|
|Subjects:||Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software|
|Divisions:||Faculty of Science > Computer Science|
|Library of Congress Subject Headings (LCSH):||Graphics processing units, High performance computing|
|Journal or Publication Title:||ACM SIGMETRICS Performance Evaluation Review|
|Official Date:||19 November 2010|
|Page Range:||pp. 23-29|
|Status:||Not Peer Reviewed|
|Access rights to Published version:||Restricted or Subscription Access|
|Description:||The 1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 10) was held as part of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC 10), in New Orleans, Louisiana, USA|
|Funder:||Royal Society (Great Britain)|
|Conference Paper Type:||Paper|
|Title of Event:||1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 10)|
|Type of Event:||Workshop|
|Location of Event:||New Orleans, LA, USA|
|Date(s) of Event:||13-19 Nov 2010|