Department of Mathematics
 Search | Help | Login | pdf version | printable version

Math @ Duke





.......................

.......................


Publications [#349561] of Vahid Tarokh

Papers Published

  1. Zhou, Y; Yang, J; Zhang, H; Liang, Y; Tarokh, V, SGD converges to global minimum in deep learning via star-convex path, 7th International Conference on Learning Representations, Iclr 2019 (January, 2019)
    (last updated on 2023/06/01)

    Abstract:
    Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks. However, there is still a lack of understanding on how and why SGD can train these complex networks towards a global minimum. In this study, we establish the convergence of SGD to a global minimum for nonconvex optimization problems that are commonly encountered in neural network training. Our argument exploits the following two important properties: 1) the training loss can achieve zero value (approximately), which has been widely observed in deep learning; 2) SGD follows a star-convex path, which is verified by various experiments in this paper. In such a context, our analysis shows that SGD, although has long been considered as a randomized algorithm, converges in an intrinsically deterministic manner to a global minimum.

 

dept@math.duke.edu
ph: 919.660.2800
fax: 919.660.2821

Mathematics Department
Duke University, Box 90320
Durham, NC 27708-0320