Partitioning sparse deep neural networks for scalable training and inference
Demirci, Gunduz Vehbi and Ferhatosmanoglu, Hakan (2021) Partitioning sparse deep neural networks for scalable training and inference. In: 2021 International Conference on Supercomputing (ICS ’21), Virtual conference - USA, 14-17 Jun 2021. Published in: ICS '21: Proceedings of the ACM International Conference on Supercomputing, pp. 254-265. doi:10.1145/3447818.3460372
Files:
- WRAP-Partitioning-sparse-deep-neural-networks-scalable-training-2021.pdf (PDF, Accepted Version, 1437Kb)
- WRAP-partitioning-sparse-deep-neural-networks-scalable-training-arXiv-2021.pdf (PDF, Submitted Version, 886Kb)
- 151073_permissions.txt (Plain Text, Permissions Correspondence, 1932b; embargoed, restricted to repository staff only)
Official URL: https://doi.org/10.1145/3447818.3460372
Abstract
State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements, and the size of both training data and models continues to increase. Sparsification and pruning methods have been shown to be effective in removing a large fraction of connections in DNNs. The resulting sparse networks present unique challenges to further improve the computational efficiency of training and inference in deep learning. Both the feedforward (inference) and backpropagation steps of the stochastic gradient descent (SGD) algorithm for training sparse DNNs involve consecutive sparse matrix-vector multiplications (SpMVs). We first introduce a distributed-memory parallel SpMV-based solution for the SGD algorithm to improve its scalability. The parallelization approach is based on row-wise partitioning of the weight matrices that represent neuron connections between consecutive layers. We then propose a novel hypergraph model for partitioning the weight matrices to reduce the total communication volume and to ensure computational load balance among processors. Experiments performed on sparse DNNs demonstrate that the proposed solution is highly efficient and scalable, and that the proposed matrix partitioning scheme further improves its performance significantly.
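To illustrate the SpMV view of sparse DNN computation described in the abstract, below is a minimal sketch (not the authors' implementation): it runs a feedforward pass over sparse layer weight matrices as consecutive SpMVs and shows a naive contiguous row-wise split of a weight matrix across processors. The layer widths, sparsity density, ReLU activation, and the feedforward / row_wise_partition helper names are illustrative assumptions; the paper assigns rows via a hypergraph partitioning model to reduce communication volume, not via contiguous blocks.

# Illustrative sketch only (assumptions noted above), using SciPy sparse matrices.
import numpy as np
import scipy.sparse as sp

def feedforward(weight_mats, x):
    """Run x through consecutive sparse layers: each layer is one SpMV plus ReLU."""
    h = x
    for W in weight_mats:          # W: sparse (out_dim x in_dim) weight matrix
        h = W @ h                  # sparse matrix-vector multiplication (SpMV)
        h = np.maximum(h, 0.0)     # ReLU activation (illustrative choice)
    return h

def row_wise_partition(W, num_parts):
    """Split a sparse matrix into contiguous row blocks, one per processor.
    (Contiguous blocks are for illustration; the paper's hypergraph model
    chooses the row-to-processor assignment instead.)"""
    W = W.tocsr()
    bounds = np.linspace(0, W.shape[0], num_parts + 1, dtype=int)
    return [W[bounds[p]:bounds[p + 1], :] for p in range(num_parts)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dims = [1024, 512, 256, 10]    # illustrative layer widths
    layers = [sp.random(dims[i + 1], dims[i], density=0.05,
                        random_state=rng, format="csr")
              for i in range(len(dims) - 1)]
    x = rng.standard_normal(dims[0])
    y = feedforward(layers, x)
    parts = row_wise_partition(layers[0], num_parts=4)
    print(y.shape, [p.shape for p in parts])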
Item Type: Conference Item (Paper)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Neural networks (Computer science), Machine learning, Parallel algorithms
Journal or Publication Title: ICS '21: Proceedings of the ACM International Conference on Supercomputing
Publisher: ACM
Official Date: 3 June 2021
Page Range: pp. 254-265
DOI: 10.1145/3447818.3460372
Status: Peer Reviewed
Publication Status: Published
Reuse Statement (publisher, data, author rights): © ACM, 2021. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ICS '21: Proceedings of the ACM International Conference on Supercomputing, http://doi.acm.org/10.1145/3447818.3460372
Access rights to Published version: Restricted or Subscription Access
Date of first compliant deposit: 14 April 2021
Date of first compliant Open Access: 1 September 2021
Conference Paper Type: Paper
Title of Event: 2021 International Conference on Supercomputing (ICS ’21)
Type of Event: Conference
Location of Event: Virtual conference - USA
Date(s) of Event: 14-17 Jun 2021