
An efficient task-based all-reduce for machine learning applications
Li, Zhenyu, Davis, James A. and Jarvis, Stephen A. (2017) An efficient task-based all-reduce for machine learning applications. In: Machine Learning on HPC Environments, 12-17 Nov 2017. Published in: Proceedings of the Machine Learning on HPC Environments (MLHPC'17). New York, NY: ACM. ISBN 9781450351379. doi:10.1145/3146347.3146350
PDF: WRAP-efficient-task-based-all-reduce-machine-learning-applications-Li-2017.pdf (Accepted Version, 1402Kb)
Official URL: http://dx.doi.org/10.1145/3146347.3146350
Abstract
All-Reduce is a collective-combine operation frequently utilised in synchronous parameter updates in parallel machine learning algorithms. The performance of this operation - and subsequently of the algorithm itself - is heavily dependent on its implementation, configuration and on the supporting hardware on which it is run. Given the pivotal role of all-reduce, a failure in any of these regards will significantly impact the resulting scientific output.
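For readers unfamiliar with the operation, the following is a minimal sketch (in Scala, and not taken from the paper) of the baseline reduce-broadcast pattern referred to above: each worker contributes a local vector, a single root combines them element-wise, and the combined result is then copied back so that every worker holds the same value.

```scala
// Illustrative sketch only (not the paper's code): all-reduce semantics
// expressed as reduce-then-broadcast. Each worker contributes a local
// gradient vector; a root sums them element-wise and the result is copied
// back, so every worker ends up with the same combined vector.
object ReduceBroadcastSketch {
  def allReduce(gradients: Seq[Array[Double]]): Seq[Array[Double]] = {
    val sum = gradients.reduce { (a, b) =>
      a.zip(b).map { case (x, y) => x + y }   // element-wise combine at the root
    }
    gradients.map(_ => sum.clone)             // "broadcast": every worker receives a copy
  }

  def main(args: Array[String]): Unit = {
    val perWorker = Seq(Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0))
    allReduce(perWorker).foreach(v => println(v.mkString("[", ", ", "]"))) // all print [9.0, 12.0]
  }
}
```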
In this research we explore the performance of alternative all-reduce algorithms in data-flow graphs and compare these to the commonly used reduce-broadcast approach. We present an architecture and interface for all-reduce in task-based frameworks, and a parallelization scheme for object-serialization and computation. We present a concrete, novel application of a butterfly all-reduce algorithm on the Apache Spark framework on a high-performance compute cluster, and demonstrate that the new butterfly algorithm achieves a logarithmic speed-up with respect to the vector length compared with the original reduce-broadcast method - a 9x speed-up is observed for vector lengths in the order of 10⁸. This improvement comprises both algorithmic changes (65%) and parallel-processing optimization (35%).
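The butterfly (recursive-doubling) pattern mentioned above can be sketched as follows. This is an illustrative local simulation assuming a power-of-two worker count, not the paper's Spark implementation or its interface: in round r each worker combines its partial vector with the worker whose rank differs in bit r, so all p workers hold the combined result after log2(p) rounds.

```scala
// A minimal sketch, assuming 2^k workers, of a butterfly all-reduce simulated
// locally (illustration only; not the paper's Spark code). In each round a
// worker exchanges and sums its partial vector with the partner whose rank
// differs in the current bit; after log2(p) rounds every worker holds the
// full element-wise sum.
object ButterflyAllReduceSketch {
  def allReduce(vectors: Array[Array[Double]]): Array[Array[Double]] = {
    val p = vectors.length
    require(p > 0 && (p & (p - 1)) == 0, "worker count must be a power of two")
    var current = vectors.map(_.clone)
    var step = 1
    while (step < p) {
      val next = Array.tabulate(p) { rank =>
        val partner = rank ^ step               // exchange partner for this round
        current(rank).zip(current(partner)).map { case (a, b) => a + b }
      }
      current = next
      step *= 2                                 // double the exchange distance
    }
    current
  }

  def main(args: Array[String]): Unit = {
    val workers = Array(Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0), Array(7.0, 8.0))
    allReduce(workers).foreach(v => println(v.mkString("[", ", ", "]"))) // all print [16.0, 20.0]
  }
}
```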
The effectiveness of the new butterfly all-reduce is demonstrated using real-world neural network applications with the Spark framework. For the model-update operation we observe significant speed-ups using the new butterfly algorithm compared with the original reduce-broadcast, for both smaller (CIFAR and MNIST) and larger (ImageNet) datasets.
| Field | Value |
|---|---|
| Item Type | Conference Item (Paper) |
| Subjects | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software |
| Divisions | Faculty of Science, Engineering and Medicine > Science > Computer Science |
| Library of Congress Subject Headings (LCSH) | Machine learning, Computer algorithms, Parallel programming (Computer science), Parallel processing (Electronic computers), Parallel algorithms, Electronic data processing -- Distributed processing |
| Journal or Publication Title | Proceedings of the Machine Learning on HPC Environments (MLHPC'17) |
| Publisher | ACM |
| ISBN | 9781450351379 |
| Book Title | Proceedings of the Machine Learning on HPC Environments - MLHPC'17 |
| Official Date | 12 November 2017 |
| DOI | 10.1145/3146347.3146350 |
| Status | Peer Reviewed |
| Publication Status | Published |
| Access rights to Published version | Restricted or Subscription Access |
| Date of first compliant deposit | 5 December 2017 |
| Date of first compliant Open Access | 6 December 2017 |
| Funder | Atos |
| Conference Paper Type | Paper |
| Title of Event | Machine Learning on HPC Environments |
| Type of Event | Conference |
| Location of Event | ACM New York, NY, USA |
| Date(s) of Event | 12-17 Nov 2017 |