Advanced Optimization (2023 Fall)
Course Information
This course aims to provide an overview of modern optimization theory designed for machine learning, particularly focusing on the gradientbased firstorder optimization methods (with offline and online deployments), which serve as the foundational optimization tools for modern largescale learning tasks.
 Instructor: Peng Zhao (zhaop@lamda.nju.edu.cn)
 TA: Jing Wang (wangjing@lamda.nju.edu.cn) & YanFeng Xie (xieyf@lamda.nju.edu.cn)
 Location: Room 212, Yifu Building B, Xianlin Campus (逸B212)
 Office hour: appointment by email
Announcement
 [New! 2023.12.29] The thirteenth (which is the last!) lecture will take place on 12.29 (14:0016:30) onsite.
 [2023.12.13] The twelfth lecture will take place on 12.15 (14:0016:30) onsite (TA will discuss hw & video lecture).
 [2023.12.08] The eleventh lecture will take place on 12.08 (14:0016:30) onsite.
 [Important! 2023.11.28] Our homework2 is finally out, see details in the next section!
 [2023.11.25] The tenth lecture will take place on 12.01 (14:0016:30) onsite.
 [2023.11.18] Our homework1 ddl is approaching, with only one week left! Please do it on time. Note that plagiarism is strictly prohibited in this course!
 [2023.11.18] The ninth lecture will take place on 11.24 (14:0016:30) onsite.
 [2023.11.13] The eighth lecture will take place on 11.17 (14:0016:30) onsite.
 [2023.11.10] The seventh lecture will take place on 11.10 (14:0016:30) onsite.
 [2023.11.05] The sixth lecture will take place on 11.07 (18:3021:30) onsite. The room location will be at Yifu Building B207.
 [2023.10.25] Our homework1 is finally out, see details in the next section!
 [2023.10.23] The fifth lecture will take place on 10.27 (14:0016:30) onsite.
 [2023.10.19] The forth lecture will take place on 10.20 (14:0016:30) onsite.
 [2023.10.12] The third lecture will take place on 10.13 (14:0016:30) onsite.
 [2023.10.08] The second lecture will take place on 10.08 (14:0016:30) onsite.
 [2023.09.21] Due to an important task nessitating a trip out of Nanjing, the first lecture has to be held online and delayed for one hour to start, i.e., the first lecture will take place on 09.22 via Tencent meeting (15:0017:30). The meeting ID will be shared in QQ.
 Since this course is designed for both graduate fresh students and undergraduate returned students, we will start from the third week, i.e.,
the first lecture will take place on 09.22 (14:0016:30).  [2023.08.29] We now have a course website!
The first lecture will take place on 09.08 (14:0016:30).
Homework

Homework 1:
 Homework1 file download: homework1_file.zip [update history, last update@11.07]
 Instruction: submission rules.html
 ID List of who already submitted: list.txt
DDL: 11.25 (Saturday) 23:59 Beijing time

Homework 2:
 Homework2 file download: homework2_file.zip [update history, last update@12.05]
 Instruction: submission rules.html
 ID List of who already submitted: list.txt
DDL: 12.28 (Thursday) 23:59 Beijing time
Course agenda
Week  Date  Topic  Slides 
Lecture Notes/ Readings 

1  09.22  Introduction; Mathematical Background  Lecture 1  on matrix norm 
09.29  no lecture (due to National Holiday)  
2  10.08  Convex Optimization Basics; Function Properties  Lecture 2  Chapter 3.2 of Boyd and Vandenberghe’s book (on operations that preserve convexity) Chapter 3 of Amir Beck’s book (on subgradients) Chapter 5 of Amir Beck’s book (on smoothness and strong convexity) 
3  10.13  GD Methods I: GD method, Lipschitz optimization  Lecture 3  Chapter 8.2 of Amir Beck’s book (on GD methods for convex and strongly convex functions) 
4  10.20  GD Methods II: GD method, smooth optimization, Nesterov’s AGD, composite optimization  Lecture 4  Chapter 3.2 of Bubeck’s book (on GD methods for smooth functions) Chapter 14 & 15 of Ashok Cutkosky’s lecture note (on momentum and acceleration) Chapter 10 of Amir Beck’s book (on composite optimization and proximal gradient) 
5  10.27  Online Convex Optimization: OGD, convex functions, strongly convex functions, online Newton step, expconcave functions  Lecture 5  Chapter 3 of Hazan’s book (on OGD for convex and strongly convex functions) Chapter 4 of Hazan’s book (on ONS for expconcave functions) 
11.03  no lecture (due to Sports Day)  
6  11.07  Prediction with Expert Advice: Hedge, minimax bound, lower bound; mirror descent (motivation and preliminary)  Lecture 6  Lecture Note 2 of Luo’s course (on PEA problem) 
7  11.10  Online Mirror Descent: OMD framework, regret analysis, primaldual view, mirror map, FTRL, dual averaging  Lecture 7  Chapter 6 of Orabona’s note (on OMD) Chapter 7 of Orabona’s note (on FTRL) Chapter 4 of Bubeck’s book (on mirror map, dual averaging) 
8  11.17  Adaptive Online Convex Optimization: problemdependent guarantee, smallloss bound, selfconfident tuning, smallloss OCO, selfbounding property bound  Lecture 8  Lecture Note 4 of Luo’s course (on smallloss PEA) Chapter 4.2 of Orabona’s note (on smallloss OCO) 
9  11.24  Optimistic Online Mirror Descent: optimistic online learning, predictable sequence, smallloss bound, gradientvariance bound, gradientvariation bound  Lecture 9 
Chapter 7.12 of Orabona’s note (on variance/variation bound of OCO) 
10  12.01  Online Learning in Games: twoplayer zerosum games, repeated play, minimax theorem, fast convergence  Lecture 10 
Chapter 8 of Hazan’s book (on games and repeated play) Lecture Note 7 of Luo’s course (on fastconvergence games ) 
11  12.08  Adversarial Bandits: MAB, IW estimator, Exp3, lower bound, BCO, gradient estimator, selfconcordant barrier  Lecture 11 
Lecture Note 12 of Luo’s course (on adversarial MAB) Chapter 12 of Lattimore and Szepesvári’s book (on Exp3.IX algorithm for highprobability regret) 
12  12.15  Stochastic Bandits: MAB, UCB, linear bandits, selfnormalized concentration, generalized linear bandits  Lecture 12  Lecture Note 14 of Luo’s course (on stochastic MAB) Lecture Note 15 of Luo’s course (on stochastic linear bandits) Chapter 6 & 7 of Lattimore and Szepesvári’s book (on ETC and UCB) 
12.22  no lecture  
13  12.29  Advanced Topics: nonstationary online learning, universal online learning, online ensemble, base algorithm, meta algorithm  Lecture 13  see the references in the slides 
Prerequisites
Familiar with calculus, probability, and linear algebra. Basic knowledge in convex optimization and machine learning.
Reading
Unfortunately, we don’t have a specific textbook for this course. In addition to the course slides and lecture notes (will write if time permits), the following books are very good materials for extra readings.
 Amir Beck. FirstOrder Methods in Optimization. MOSSIAM Series on Optimization, 2017.
 Sébastien Bubeck. Convex Optimization: Algorithms and Complexity. Foundation and Trends in Machine Learning, 2015.
 Elad Hazan. Introduction to Online Convex Optimization (second edition). MIT Press, 2022.
 Francesco Orabona. A Modern Introduction to Online Learning. Lecture Notes, 2022.
 Tor Lattimore and Csaba Szepesvári. Bandit Algorithms. Cambridge University Press, 2021.
Some related courses:
 CSCI 659: Introduction to Online Optimization/Learning, Fall 2022. University of South California, Haipeng Luo.
 EC525: Optimization for Machine Learning, Fall 2022. Boston University, Ashok Cutkosky.
Last modified: 20231229 by Peng Zhao