Title: Stochastic Optimization for Large-scale Learning
Speaker: Dr. Lijun Zhang (张利军), Nanjing University
Abstract: Learning from big data is challenging due to the high computational cost. In this talk, I will discuss how stochastic optimization can be used to reduce both the time complexity and the space complexity of learning algorithms. First, I focus on the empirical risk minimization problem in machine learning and present a variant of stochastic gradient descent, named mixed gradient descent (MGD), which controls the variance of the stochastic gradients. Under the assumption that the loss function is smooth, MGD achieves a linear convergence rate, reducing the time complexity dramatically. Second, I study convex optimization with a nuclear-norm regularizer and develop an algorithm based on stochastic proximal gradient descent (SPGD). I will show that during the optimization process, the space complexity is linear in the sum of the matrix dimensions rather than in their product, reducing the space complexity significantly. As a theoretical contribution, the convergence rate of the last iterate of SPGD is established for nuclear-norm regularization.
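The variance-control idea behind mixing full and stochastic gradients can be sketched as follows. The abstract does not give MGD's exact update, so this minimal sketch follows the closely related SVRG-style correction; the function name, objective (regularized least squares), and all parameters are illustrative, not the talk's algorithm.

```python
import numpy as np

def variance_reduced_sgd(X, y, lam=0.5, lr=0.05, n_epochs=30, seed=0):
    """SVRG-style sketch of mixing full and stochastic gradients.

    Objective: F(w) = (1/2n) * ||X w - y||^2 + (lam/2) * ||w||^2.
    Each epoch computes one full gradient at a reference point; the cheap
    per-sample gradients in the inner loop are corrected with it, so the
    variance of the search direction vanishes as the iterate approaches
    the optimum. (Illustrative only -- not the exact MGD update.)
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)

    def grad_i(w, i):  # gradient of the i-th loss term plus regularizer
        return (X[i] @ w - y[i]) * X[i] + lam * w

    def full_grad(w):  # exact gradient over all n samples
        return X.T @ (X @ w - y) / n + lam * w

    for _ in range(n_epochs):
        w_ref = w.copy()
        g_ref = full_grad(w_ref)  # one full pass per epoch
        for _ in range(n):
            i = rng.integers(n)
            # mixed direction: unbiased, with variance that shrinks
            # as both w and w_ref approach the optimum
            g = grad_i(w, i) - grad_i(w_ref, i) + g_ref
            w -= lr * g
    return w
```

Because the variance of the direction vanishes, a constant step size (on the order of one over the smoothness constant) yields linear convergence on smooth, strongly convex problems, whereas plain SGD needs a decaying step size and converges only sublinearly.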
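The space saving claimed for SPGD comes from the proximal step of the nuclear norm, which soft-thresholds singular values and therefore returns a low-rank matrix that can be stored in factored form. A minimal sketch of that prox step, with an assumed function name (the abstract does not specify the implementation):

```python
import numpy as np

def prox_nuclear(W, tau):
    """Proximal operator of tau * ||.||_* (singular-value soft-thresholding).

    Returns factors (U, s, Vt) of the result rather than a dense matrix:
    after thresholding, the iterate has some rank r, so it can be kept
    with O((m + n) * r) numbers instead of m * n -- the memory saving
    behind SPGD for nuclear-norm regularization.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.maximum(s - tau, 0.0)   # shrink singular values toward zero
    keep = s > 0.0                 # drop the components thresholded to zero
    return U[:, keep], s[keep], Vt[keep]
```

In an SPGD iteration one would take a stochastic gradient step on the data-fitting term and then apply this prox; since the output is already factored, the iterate never needs to be materialized at full m-by-n size except transiently.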