The paper studies a distributed gradient descent (DGD) process and considers the problem of showing that in nonconvex optimization problems, DGD typically converges to local minima rather than saddle points. The paper considers unconstrained minimization of a smooth objective function. In centralized settings, the problem of demonstrating nonconvergence to saddle points of gradient descent (and variants) is typically handled by way of the stable-manifold theorem from classical dynamical systems theory. However, the classical stable-manifold theorem is not applicable in distributed settings. The paper develops an appropriate stable-manifold theorem for DGD showing that convergence to saddle points may only occur from a low-dimensional stable manifold. Under appropriate assumptions (e.g., coercivity), this result implies that DGD typically converges to local minima and not to saddle points.
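As a rough illustration of the behavior described above, the following minimal sketch runs one common textbook form of DGD (consensus averaging plus a local gradient step with a constant step size), which may differ from the exact process analyzed in the paper. Everything in it is an assumption made for illustration: the five-agent ring network, the mixing matrix `W`, the step size, and the hypothetical local costs f_i(x, y) = a_i (x^4/4 - x^2/2) + b_i y^2/2, whose sum is coercive with a saddle point at (0, 0) and local minima at (±1, 0).

```python
import numpy as np


def local_grad(z, a, b):
    """Gradient of the hypothetical local cost a*(x**4/4 - x**2/2) + b*y**2/2."""
    x, y = z
    return np.array([a * (x**3 - x), b * y])


def run_dgd(z0, W, a, b, step=0.05, iters=2000):
    """Iterate z_i <- sum_j W[i, j] * z_j - step * grad f_i(z_i) for every agent i."""
    z = np.array(z0, dtype=float)
    for _ in range(iters):
        grads = np.array([local_grad(z[i], a[i], b[i]) for i in range(len(z))])
        z = W @ z - step * grads
    return z


n = 5
# Assumed ring network: each agent averages itself and its two neighbours,
# which gives a symmetric, doubly stochastic mixing matrix.
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        W[i, j % n] = 1.0 / 3.0

# Assumed per-agent weights; the local costs differ but share the same saddle.
a = np.array([0.5, 0.75, 1.0, 1.25, 1.5])
b = np.array([1.5, 1.25, 1.0, 0.75, 0.5])

rng = np.random.default_rng(0)

# Generic random start: the agents are expected to agree on one local minimum.
z_generic = run_dgd(rng.uniform(-1.0, 1.0, size=(n, 2)), W, a, b)

# Special start with every agent exactly on the axis x = 0, which lies inside
# the saddle's stable set for this toy problem.
z0_axis = np.column_stack([np.zeros(n), rng.uniform(-1.0, 1.0, size=n)])
z_on_axis = run_dgd(z0_axis, W, a, b)

print("generic start:", z_generic.round(4).tolist())
print("start on x = 0:", z_on_axis.round(4).tolist())
```

In this toy setting, reaching the saddle requires a specially chosen initialization, whereas a randomly drawn one ends at a local minimum. This mirrors, in a single example only, the claim that convergence to saddle points can occur only from a lower-dimensional set of initial conditions.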