## Math 285J: A First Course on Large-Scale Optimization Methods

Instructors: Ernest Ryu and Wotao Yin

Every Monday, Wednesday, and Friday, 3-3:50pm in MS 5118

Lecture notes on piazza.com
## Course description

In this course, we use monotone operator theory to derive and analyze a wide range of classical and modern convex optimization algorithms, including stochastic (randomized), parallel, distributed, and decentralized methods that are well suited to large-scale and big-data problems. The covered algorithms include gradient descent, the proximal-gradient method, the method of multipliers, ADMM, linearized ADMM, PDHG (Chambolle-Pock), Condat-Vu, SGD, Finito, SAGA, SVRG, distributed ADMM, distributed gradient descent, and more.

This course strives for intellectual economy: we analyze a large number of methods in a unified and streamlined manner using a handful of mathematical concepts from monotone operator theory.

## Intended audience and prerequisites

This course should benefit students who already find convex optimization useful and wish to learn more about algorithms. It assumes familiarity with calculus and linear algebra at the level of Math 32A and Math 115A. Prior knowledge of optimization is helpful but not necessary; basic convex optimization will be reviewed in the first week. The course also assumes proficiency in basic programming, as assignments will involve coding in Python.

## What this course is not

This course is not a course on monotone operator theory, although that is an interesting branch of mathematics in its own right; we use monotone operator theory only as a tool for analyzing convex optimization algorithms. Nor is this course a comprehensive study of the strongest theoretical results. It presents a unified and streamlined approach to developing and analyzing algorithms, and we discuss only the results we can obtain through this approach. These results are often not the strongest, but they are the simplest.

## List of topics

(Note: this is not a weekly plan.)
- Preliminaries
- Monotone, nonexpansive, and averaged operators
- Fixed-point iteration, resolvent, proximal-point method
- Operator splitting schemes
- Variable metric methods
- Duality and dualization techniques
- Primal-dual splitting methods
- Stochastic coordinate fixed-point iteration
- Asynchronous coordinate fixed-point iteration
- Stochastic finite-sum zero finding
- Distributed and decentralized optimization
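As a small taste of the fixed-point viewpoint running through these topics, here is an illustrative sketch (not taken from the course notes): gradient descent on a quadratic, written as the fixed-point iteration x_{k+1} = T(x_k) with T = I - alpha * grad f. The matrix A, vector b, and step size alpha below are made-up example values.

```python
import numpy as np

# Example problem (values chosen for illustration):
# f(x) = 0.5 * x^T A x - b^T x, with A symmetric positive definite.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def T(x, alpha=0.2):
    """One step of the forward operator T = I - alpha * grad f.

    For 0 < alpha < 2 / lambda_max(A), T is an averaged operator,
    so the fixed-point iteration x <- T(x) converges.
    """
    return x - alpha * (A @ x - b)

x = np.zeros(2)
for _ in range(500):
    x = T(x)

# A fixed point of T satisfies grad f(x) = 0, i.e., A x = b.
print(np.allclose(x, np.linalg.solve(A, b)))
```

The point of the operator-theoretic framing is that the same convergence argument (averagedness of T) covers many of the methods listed above, not just gradient descent.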
## Possible additional topics

- Maximal and cyclic monotone operators
- Scaled relative graphs
- Quasi-properties
- Subgradient and stochastic subgradient methods
## Workload

- About eight homework sets; one every week except for two short weeks
- No midterm, no final exam, no final project