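Title: A Hessian-Aware Stochastic Differential Equation Modelling of SGD
Time: Tuesday, May 14, 2024, 16:00
Venue: Tencent Meeting, ID 314104875
Organizers: School of Mathematics and Statistics; Key Laboratory of Analytical Mathematics and Applications (Ministry of Education); Fujian Key Laboratory of Analytical Mathematics and Applications; Fujian Provincial University Key Laboratory of Statistics and Artificial Intelligence; Fujian Center for Applied Mathematics (Fujian Normal University)
Audience: Faculty and students of the Department of Probability and Statistics, and anyone else interested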
Abstract: Continuous-time approximation of Stochastic Gradient Descent (SGD) is a crucial tool for studying how it escapes from stationary points. However, existing stochastic differential equation (SDE) models fail to fully capture this behavior, even for simple quadratic objectives. Building on a novel stochastic backward error analysis framework, we derive the Hessian-Aware Stochastic Modified Equation (HA-SME), an SDE that incorporates Hessian information of the objective function into both its drift and diffusion terms. Our analysis shows that HA-SME matches the order-best approximation error guarantee among existing SDE models in the literature, while achieving a significantly reduced dependence on the smoothness parameter. Further, for quadratic objectives, under mild conditions, HA-SME is proved to be the first SDE model that exactly recovers the SGD dynamics in the distributional sense.
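The gap the abstract mentions for quadratic objectives can be illustrated with a small sketch. It is not taken from the talk and does not implement HA-SME. It assumes a one-dimensional objective f(x) = a x^2 / 2, additive Gaussian gradient noise with constant standard deviation sigma, and the commonly used first-order SDE dX = -a X dt + sqrt(eta) sigma dW, with k SGD steps matched to time t = k eta. The function names and parameters are hypothetical. Under these assumptions the variance of both processes follows an exact recursion, so no sampling is needed.

```python
import math


def sgd_variance(a, eta, sigma, steps, v0=0.0):
    """Variance of x_k under x_{k+1} = (1 - eta*a) x_k - eta*eps_k,
    with eps_k ~ N(0, sigma^2) (assumed noise model)."""
    v = v0
    for _ in range(steps):
        v = (1.0 - eta * a) ** 2 * v + (eta * sigma) ** 2
    return v


def sde_variance(a, eta, sigma, steps, v0=0.0):
    """Variance of the Ornstein-Uhlenbeck process
    dX = -a X dt + sqrt(eta) * sigma dW, sampled at times t = k*eta."""
    decay = math.exp(-2.0 * a * eta)            # exact decay over one interval of length eta
    stationary = eta * sigma ** 2 / (2.0 * a)   # long-run variance of the SDE
    v = v0
    for _ in range(steps):
        v = decay * v + stationary * (1.0 - decay)
    return v


if __name__ == "__main__":
    a, eta, sigma = 1.0, 0.5, 1.0
    print("SGD variance:", sgd_variance(a, eta, sigma, steps=200))
    print("SDE variance:", sde_variance(a, eta, sigma, steps=200))
```

In this toy setting the stationary variance of SGD is eta sigma^2 / (a (2 - eta a)), while that of the first-order SDE is eta sigma^2 / (2a). The two agree only as eta a tends to 0, which is one concrete sense in which such a model does not reproduce SGD on a quadratic.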
About the speaker: Xiang Li is a second-year PhD student at ETH Zurich, specializing in theoretical analysis and guarantees for optimization techniques in machine learning. His research focuses on continuous-time modeling of optimization algorithms and on adaptive gradient methods, aiming to provide a deeper understanding of their behavior and performance.