Hybrid Consensus Alternating Direction Method of Multipliers (H-CADMM)
This work introduces the hybrid consensus alternating direction method of multipliers (H-CADMM), a novel framework for optimization over networks that unifies existing distributed optimization approaches, including centralized consensus ADMM (C-CADMM) and decentralized consensus ADMM (D-CADMM). H-CADMM provides a flexible tool that leverages the underlying graph topology to strike a desirable balance between node-to-node communication overhead and rate of convergence, thereby alleviating known limitations of both C-CADMM and D-CADMM. A rigorous analysis of the novel method establishes a linear convergence rate and guides the choice of parameters that optimize it. The hybrid update rules of H-CADMM lend themselves to 'in-network acceleration', which is shown to deliver a considerable, essentially free-of-charge performance boost over fully decentralized ADMM. Comprehensive numerical tests validate the analysis and showcase the potential of the method in efficiently tackling widely useful learning tasks.
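To make the consensus setting concrete, here is a minimal NumPy sketch of the fully decentralized baseline (D-CADMM) solving a distributed averaging problem, where each node holds a private scalar and the network agrees on the global mean. The ring topology, quadratic local costs, penalty rho, and iteration count are illustrative assumptions; these are the standard decentralized consensus ADMM updates, not the paper's hybrid H-CADMM rules.

import numpy as np

# Hypothetical setup: 5 nodes on a ring, node i holds private datum a[i];
# the consensus optimum of sum_i 0.5*(x_i - a_i)^2 is the global average.
np.random.seed(0)
n = 5
a = np.random.randn(n)
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # ring graph

rho = 1.0               # ADMM penalty parameter (illustrative choice)
x = np.zeros(n)         # primal variable at each node
alpha = np.zeros(n)     # dual variable at each node

for _ in range(200):
    x_old = x.copy()
    for i in range(n):
        d = len(neighbors[i])
        # Closed-form primal update for f_i(x) = 0.5*(x - a[i])**2, from
        # (x - a[i]) + alpha[i] + 2*rho*d*x - rho*sum_j(x_old[i] + x_old[j]) = 0
        s = sum(x_old[i] + x_old[j] for j in neighbors[i])
        x[i] = (a[i] - alpha[i] + rho * s) / (1.0 + 2.0 * rho * d)
    for i in range(n):
        # Dual ascent on the edge consensus constraints x_i = x_j
        alpha[i] += rho * sum(x[i] - x[j] for j in neighbors[i])

print(x, a.mean())      # every x[i] approaches the network-wide average

Note that each iteration requires only neighbor-to-neighbor exchanges; the hybrid scheme described in the abstract interpolates between this fully decentralized pattern and the centralized one to trade communication for convergence speed.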
Internal Node Bagging
We introduce a novel view of dropout as an implicit ensemble learning method, one that does not specify how many nodes, or which ones, learn a given feature. We propose a new training method named internal node bagging, which explicitly forces a group of nodes to learn a given feature at training time and combines those nodes into a single node at inference time. This means many more parameters can be used to improve the model's fitting ability during training while keeping the model small at inference. We test our method on several benchmark datasets and find it significantly more efficient than dropout on small models.
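A minimal NumPy sketch of the idea as we read the abstract: during training, each logical unit is backed by a group of k internal nodes whose pre-activations are dropped independently and summed before one shared nonlinearity; at inference, the group collapses exactly into a single node by summing its incoming weights, rescaled by the keep probability. The group size, layer shapes, sum-before-nonlinearity design, and p_keep rescaling are our assumptions for illustration, not the paper's exact formulation.

import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# Hypothetical layer: 4 output features, each backed by k = 3 internal nodes.
n_in, n_out, k = 8, 4, 3
p_keep = 0.5
W = rng.normal(scale=0.1, size=(n_out, k, n_in))  # one weight vector per internal node

def forward_train(x):
    # Each internal node's pre-activation is kept or dropped independently;
    # the group's surviving pre-activations are summed, then activated once.
    mask = rng.random((n_out, k)) < p_keep
    pre = np.einsum('gki,i->gk', W, x) * mask
    return relu(pre.sum(axis=1))

def collapse():
    # Inference-time model: fold each group into one node by summing its
    # incoming weights, scaled by p_keep (the expected dropout mask).
    return (W * p_keep).sum(axis=1)            # shape (n_out, n_in)

W_small = collapse()

def forward_infer(x):
    return relu(W_small @ x)

x = rng.normal(size=n_in)
print(forward_infer(x))   # k times fewer parameters than during training

Training uses k * n_out * n_in parameters while the deployed model uses only n_out * n_in, which matches the abstract's claim of a larger training-time capacity with a small inference-time model.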
Deep Learning Approximation
Neural networks offer high-accuracy solutions to a range of problems, but are costly to run in production systems because of the computational and memory requirements of a forward pass. Given a trained network, we propose a technique called Deep Learning Approximation that builds a faster network in a tiny fraction of the time required for training, by manipulating only the network structure and coefficients, without re-training or access to the training data. Speedup is achieved by applying a sequential series of independent optimizations that reduce the floating-point operations (FLOPs) required to perform a forward pass. First, lossless optimizations are applied, followed by lossy approximations using singular value decomposition (SVD) and low-rank matrix decomposition. The optimal approximation is chosen by weighing the relative accuracy loss against the FLOP reduction according to a single user-specified parameter. On PASCAL VOC 2007 with the YOLO network, we show an end-to-end 2x speedup in the network forward pass with a 5% drop in mAP that can be regained by fine-tuning.
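The SVD step can be illustrated directly: a dense m-by-n layer is replaced by two thinner layers of rank r, cutting multiply-accumulates from m*n to r*(m+n) at the price of approximation error. The shapes, the rank choice, and the random weights below are illustrative assumptions, not the paper's exact procedure (trained weight matrices are typically far closer to low rank than a random one).

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fully connected layer: y = W @ x, with W of shape (m, n).
m, n, r = 256, 512, 32            # rank r is the lossy accuracy/FLOP knob
W = rng.normal(size=(m, n))
x = rng.normal(size=n)

# Factor W into two thinner matrices via truncated SVD.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]              # shape (m, r)
B = Vt[:r, :]                     # shape (r, n)

y_exact = W @ x                   # costs m*n multiply-accumulates
y_approx = A @ (B @ x)            # costs r*(m + n) multiply-accumulates

flop_ratio = (r * (m + n)) / (m * n)
rel_err = np.linalg.norm(y_exact - y_approx) / np.linalg.norm(y_exact)
print(f"FLOP ratio {flop_ratio:.2f}, relative output error {rel_err:.3f}")

Sweeping r and scoring each candidate by a weighted combination of accuracy loss and FLOP reduction would mirror the single-parameter trade-off the abstract describes.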