
Scalable graph convolutional network training on distributed-memory systems
Demirci, Gunduz Vehbi, Haldar, Aparajita and Ferhatosmanoglu, Hakan (2022) Scalable graph convolutional network training on distributed-memory systems. Proceedings of the VLDB Endowment, 16 (4). pp. 711-724. doi:10.14778/3574245.3574256. ISSN 2150-8097.
PDF: WRAP-scalable-graph-convolutional-network-training-distributed-memory-systems-Demirci-2022.pdf - Published Version (1697KB). Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.
Official URL: https://doi.org/10.14778/3574245.3574256
Abstract
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed-memory systems necessary. Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges. We propose a highly parallel training algorithm that scales to large processor counts. In our solution, the large adjacency and vertex-feature matrices are partitioned among processors. We exploit the vertex-partitioning of the graph to use non-blocking point-to-point communication operations between processors for better scalability. To further minimize parallelization overheads, we introduce a sparse matrix partitioning scheme based on a hypergraph partitioning model for full-batch training. We also propose a novel stochastic hypergraph model that encodes the expected communication volume in mini-batch training. We show the merits of the hypergraph model, previously unexplored for GCN training, over the standard graph-partitioning model, which does not accurately encode communication costs. Experiments on real-world graph datasets demonstrate that the proposed algorithms achieve considerable speedups over alternative solutions. The communication-cost optimizations become even more pronounced at higher processor counts. The performance benefits are preserved in deeper GCNs with more layers as well as on billion-scale graphs.
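The parallelization scheme summarized above - row-partitioned adjacency and feature matrices with non-blocking point-to-point exchange of only the remote feature rows each processor needs - can be illustrated with a short sketch. The following mpi4py code is a minimal illustration under assumed data structures, not the authors' implementation: the `send_map` and `recv_index` partition maps are hypothetical stand-ins for what the paper's hypergraph partitioner would produce.

```python
# A minimal sketch of the communication pattern described in the abstract:
# the adjacency and feature matrices are row-partitioned across ranks, and
# each rank fetches only the remote feature rows ("halo") that its local
# adjacency rows reference, via non-blocking point-to-point messages.
# All helper names and partition maps are hypothetical, not the paper's API.
from mpi4py import MPI
import numpy as np
import scipy.sparse as sp


def gcn_layer_forward(comm, A_local, H_local, W, send_map, recv_index):
    """Compute ReLU(A @ H @ W) for this rank's rows of A.

    A_local    : sparse CSR block of local adjacency rows; its columns are
                 reindexed over [locally owned vertices | halo vertices].
    H_local    : (n_own, f) features of locally owned vertices.
    W          : (f, f_out) layer weights, replicated on every rank.
    send_map   : {dest rank: local row indices that rank needs}.
    recv_index : {source rank: (lo, hi) slice of the halo block it fills}.
    """
    f = H_local.shape[1]
    n_halo = max((hi for _, hi in recv_index.values()), default=0)
    H_halo = np.empty((n_halo, f), dtype=H_local.dtype)

    # Post non-blocking receives first, then sends, so packing of the
    # outgoing buffers overlaps with incoming transfers.
    reqs = [comm.Irecv(H_halo[lo:hi], source=src, tag=0)
            for src, (lo, hi) in recv_index.items()]
    send_bufs = {dst: np.ascontiguousarray(H_local[rows])
                 for dst, rows in send_map.items()}
    reqs += [comm.Isend(buf, dest=dst, tag=0)
             for dst, buf in send_bufs.items()]
    MPI.Request.Waitall(reqs)

    # Local sparse-dense multiply over [owned | halo] features, then the
    # dense weight multiply and activation.
    Z = A_local @ np.vstack([H_local, H_halo]) @ W
    return np.maximum(Z, 0.0)
```

The point the abstract makes is that the contents of `send_map` and the size of the halo block are exactly the communication volume that the proposed hypergraph partitioning model encodes and minimizes, whereas a standard graph-partitioning model only approximates this volume.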
Item Type: Journal Article
Subjects: Q Science > Q Science (General); Q Science > QA Mathematics; Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Neural networks (Computer science); Graph theory; Deep learning (Machine learning)
Journal or Publication Title: Proceedings of the VLDB Endowment
Publisher: ACM
ISSN: 2150-8097
Official Date: December 2022
Volume: 16
Number: 4
Page Range: pp. 711-724
DOI: 10.14778/3574245.3574256
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access (Creative Commons)
Date of first compliant deposit: 5 January 2023
Date of first compliant Open Access: 5 January 2023