Autoencoder-Based Gradient Compression for Distributed Training

Lusine Abrahamyan, Nikos Deligiannis, Ioannis Bekoulis, Yiming Chen

Research output: Contribution to journal › Conference paper

1 Citation (Scopus)

Abstract

Large-scale distributed training has recently been proposed as a solution to speed up the training of deep neural networks on huge datasets. Distributed training, however, entails high communication rates for gradient exchange among computing nodes and requires expensive high-bandwidth network infrastructure. Various gradient compression methods have been proposed to overcome this limitation, including sparsification, quantization, and entropy encoding of the gradients. However, most existing methods exploit only the intra-node information redundancy, that is, they compress the gradients at each node independently. In contrast, we advocate that the gradients across the nodes are correlated and propose a method to leverage this inter-node redundancy to obtain higher compression rates. In this work, we propose the Learned Gradient Compression (LGC) framework to reduce the communication rate in distributed training with a parameter-server communication protocol. Our framework leverages an autoencoder to capture the common information in the gradients of the distributed nodes and to eliminate the transmission of redundant information. Our experiments show that the proposed approach achieves significantly higher gradient compression ratios than state-of-the-art approaches such as DGC and ScaleCom.
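
To make the idea of autoencoder-based gradient compression concrete, below is a minimal, hypothetical sketch in Python/PyTorch. It is not the authors' LGC implementation: the module names (GradientAutoencoder, compress, decompress), the chunk size of 1024, the code size of 64, and the worker/parameter-server split are illustrative assumptions only. The sketch shows the general mechanism the abstract describes: a worker encodes flattened gradient chunks into low-dimensional codes, transmits the codes, and the parameter server decodes them back into a gradient.

    # Hypothetical sketch of autoencoder-based gradient compression.
    # NOT the authors' LGC code; names, sizes, and the worker/server
    # split are illustrative assumptions.

    import torch
    import torch.nn as nn


    class GradientAutoencoder(nn.Module):
        """Maps a flattened gradient chunk to a low-dimensional code and back."""

        def __init__(self, chunk_size: int = 1024, code_size: int = 64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(chunk_size, 256), nn.ReLU(),
                nn.Linear(256, code_size),
            )
            self.decoder = nn.Sequential(
                nn.Linear(code_size, 256), nn.ReLU(),
                nn.Linear(256, chunk_size),
            )


    def compress(ae: GradientAutoencoder, grad: torch.Tensor,
                 chunk_size: int = 1024) -> torch.Tensor:
        """Worker side: flatten and pad the gradient, encode it chunk by chunk,
        and return only the codes, which is all that would be transmitted."""
        flat = grad.flatten()
        pad = (-flat.numel()) % chunk_size
        flat = torch.cat([flat, flat.new_zeros(pad)])
        chunks = flat.view(-1, chunk_size)
        with torch.no_grad():
            codes = ae.encoder(chunks)
        return codes


    def decompress(ae: GradientAutoencoder, codes: torch.Tensor,
                   numel: int) -> torch.Tensor:
        """Parameter-server side: decode the received codes and trim padding."""
        with torch.no_grad():
            chunks = ae.decoder(codes)
        return chunks.flatten()[:numel]


    if __name__ == "__main__":
        ae = GradientAutoencoder()
        grad = torch.randn(5000)  # stand-in for one layer's gradient
        codes = compress(ae, grad)
        recon = decompress(ae, codes, grad.numel())
        ratio = grad.numel() / codes.numel()
        # With an untrained autoencoder the reconstruction is poor; in practice
        # the autoencoder would be trained on gradient statistics first.
        print(f"compression ratio ~ {ratio:.1f}x, "
              f"reconstruction error = {(grad - recon).norm():.3f}")

In this toy setup, 5000 gradient values are reduced to 5 codes of 64 values each, roughly a 15x reduction in transmitted floats; the actual compression ratios and architecture reported in the paper differ and should be taken from the published results.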

Original language: English
Pages (from-to): 2179-2183
Number of pages: 5
Journal: Proceedings of EUSIPCO
Volume: 29
DOIs
Publication status: Published - 2021
