TurboMGNN: improving concurrent GNN training tasks on GPU with fine-grained kernel fusion
Wu, Wenchao, Shi, Xuanhua, He, Ligang and Jin, Hai (2023) TurboMGNN: improving concurrent GNN training tasks on GPU with fine-grained kernel fusion. IEEE Transactions on Parallel and Distributed Systems, 36 (6). pp. 1968-1981. doi:10.1109/tpds.2023.3267943. ISSN 1045-9219.
PDF: WRAP-TurboMGNN-improving-concurrent-GNN-training-tasks-GPU-fine-grained-kernel-fusion-He-2023.pdf (Published Version, 2644 KB). Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.
Official URL: https://doi.org/10.1109/tpds.2023.3267943
Abstract
Graph Neural Networks (GNNs) have evolved into powerful models for graph representation learning. Many works have been proposed to support efficient GNN training on GPUs. However, these works focus only on a single GNN training task, addressing aspects such as operator optimization, task scheduling, and programming models. Concurrent GNN training, which is needed in applications such as neural architecture search, has not yet been explored. This work aims to improve the training efficiency of concurrent GNN training tasks on a GPU by developing fine-grained methods to fuse kernels from different tasks. Specifically, we propose a fine-grained Sparse Matrix Multiplication (SpMM)-based kernel fusion method that eliminates redundant accesses to shared graph data. To increase fusion opportunities and reduce synchronization cost, we further propose a novel technique that enables the fusion of kernels across forward and backward propagation. Finally, to reduce the resource contention caused by the increased number of concurrent, heterogeneous GNN training tasks, we propose an adaptive strategy that groups the tasks and matches their operators according to resource contention. We have conducted extensive experiments, including kernel- and model-level benchmarks. The results show that the proposed methods achieve up to 2.6X performance speedup.
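The core observation behind SpMM-based fusion can be illustrated outside the paper's CUDA kernels. The sketch below (a NumPy/SciPy analogy, not the authors' implementation) shows why fusing the aggregation of two concurrent tasks pays off: both tasks aggregate over the same sparse adjacency matrix, so computing them in a single SpMM over concatenated feature matrices traverses the graph structure once instead of twice.

```python
import numpy as np
import scipy.sparse as sp

# Shared graph: one sparse adjacency matrix A (N x N) used by every task.
rng = np.random.default_rng(0)
N, F = 6, 4
A = sp.random(N, N, density=0.4, random_state=0, format="csr")

# Two concurrent GNN training tasks, each with its own node feature matrix.
H1 = rng.standard_normal((N, F))
H2 = rng.standard_normal((N, F))

# Unfused: each task runs its own SpMM, so the sparse graph structure
# (A.indptr / A.indices / A.data) is read from memory once per task.
out1 = A @ H1
out2 = A @ H2

# "Fused" SpMM: concatenate the feature matrices along the column axis
# and aggregate both tasks in a single pass that reads the graph once.
fused = A @ np.hstack([H1, H2])

# The fused result contains both tasks' aggregations side by side.
assert np.allclose(fused[:, :F], out1)
assert np.allclose(fused[:, F:], out2)
```

On a GPU, the redundant reads eliminated here are the dominant cost of the sparse aggregation, which is why fusing kernels from different tasks, rather than within one task, is the paper's focus.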
Item Type: Journal Article
Subjects: Q Science > QA Mathematics; Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
SWORD Depositor: Library Publications Router
Library of Congress Subject Headings (LCSH): Neural networks (Computer science), Graph theory, Deep learning (Machine learning), Kernel functions
Journal or Publication Title: IEEE Transactions on Parallel and Distributed Systems
Publisher: IEEE
ISSN: 1045-9219
Official Date: June 2023
Volume: 36
Number: 6
Page Range: pp. 1968-1981
DOI: 10.1109/tpds.2023.3267943
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access (Creative Commons)
Date of first compliant deposit: 1 June 2023
Date of first compliant Open Access: 2 June 2023