## A Sharp Convergence Rate Analysis for Distributed Accelerated Gradient Methods

Huan Li, Cong Fang, Wotao Yin, Zhouchen Lin

Submitted

## Overview

The underlying problem is defined on a network of nodes. When minimizing a convex function, the performance of gradient descent is governed by the smoothness constant $L$ and the strong-convexity constant $\mu$ of the function, as well as the target accuracy $\epsilon$. When we solve the problem over a network, the connectivity of the network also directly affects the performance, because each node shares its local information with its adjacent nodes (network neighbors) through the network. This information exchange is described by a "gossip" matrix $W$. Its second-largest singular value $\sigma_2(W)$ is a good indicator of the network connectivity ($\sigma_1(W)$ always equals 1). As in minimizing a convex function, there is a minimal number of gradient
evaluations that any gradient algorithm must perform to solve every instance
in a problem class (described by $L$, $\mu$, and $\sigma_2(W)$) to accuracy $\epsilon$.
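To make the role of $\sigma_2(W)$ concrete, here is a small illustrative sketch (not the paper's algorithms): it builds a common gossip matrix for a ring of $n$ nodes, checks that $\sigma_1(W) = 1$ while $\sigma_2(W)$ creeps toward 1 as the ring grows (worse connectivity), and runs plain decentralized gradient descent on a toy quadratic. The helper name `ring_gossip_matrix`, the equal 1/3 neighbor weights, and the step size are our own choices for illustration.

```python
import numpy as np

def ring_gossip_matrix(n):
    """Symmetric, doubly stochastic gossip matrix for a ring of n nodes:
    each node averages equally with itself and its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

# sigma_1(W) is always 1; sigma_2(W) approaches 1 as the ring grows,
# i.e. larger rings are more poorly connected.
sigma2 = {}
for n in (5, 20, 80):
    s = np.linalg.svd(ring_gossip_matrix(n), compute_uv=False)  # descending
    sigma2[n] = s[1]
    print(f"n={n:3d}  sigma_1={s[0]:.6f}  sigma_2={s[1]:.6f}")

# A minimal decentralized gradient iteration, x <- W x - alpha * grad:
# node i holds f_i(x) = 0.5 * (x - b_i)^2, so the global minimizer is mean(b).
rng = np.random.default_rng(0)
n, alpha = 5, 0.05
W = ring_gossip_matrix(n)
b = rng.normal(size=n)
x = np.zeros(n)
for _ in range(2000):
    x = W @ x - alpha * (x - b)  # gossip with neighbors, then a local gradient step
print("max deviation from mean(b):", np.max(np.abs(x - b.mean())))
```

The printed deviation shrinks as $\sigma_2(W)$ moves away from 1, which is exactly why $\sigma_2(W)$ enters the communication lower bound.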
This minimal number is called the lower bound. In short, this paper presents two new algorithms that achieve both the computation and the communication lower bounds up to an extra logarithmic factor. The references in these tables can be found in our report: [2] is Scaman et al. '17, [3] is Uribe et al. '18, [24] is Qu-Li '17, and [30] is Nesterov '13. We implemented our methods and tested them on common decentralized problems; the acceleration predicted by the theory was clearly reflected in the numerical results, so theory and practice match well here.

## Citation