Advances of Momentum in Optimization Algorithm and Neural Architecture Design

Dr. Bao Wang

Abstract: We will present several recent results on leveraging momentum techniques to improve stochastic optimization and neural architecture design. First, designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural networks (RNNs), we establish a connection between the hidden state dynamics in an RNN and gradient descent (GD). We then integrate momentum into this framework and propose a new family of RNNs, called MomentumRNNs. We theoretically prove and numerically demonstrate that MomentumRNNs alleviate the vanishing gradient issue in training RNNs. We also show the empirical advantage of the momentum-enhanced RNNs over the baseline models. Second, we will present recent advances in adaptive momentum for accelerating stochastic gradient descent (SGD). SGD with adaptive momentum remarkably improves deep neural network training, both accelerating it and improving generalization, and significantly reduces the effort required for hyperparameter tuning.

Speaker’s Bio: Bao Wang is an Assistant Professor in the Mathematics Department at the University of Utah and is affiliated with the Scientific Computing and Imaging Institute. He received his Ph.D. in Computational Mathematics from Michigan State University in 2016. He began research in deep learning after joining UCLA as a postdoc. He has published many refereed journal and conference papers and has expertise in adversarial defense for deep learning and other areas of data science, including optimization, privacy and data security, and spatio-temporal event modeling and prediction. He is a recipient of the 2020 Chancellor's Award for postdoc research at the University of California.

Last Updated: July 2, 2021 - 8:32 am