Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

Mignacco, Francesca and Urbani, Pierfrancesco and Zdeborová, Lenka (2021) Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem. Machine Learning: Science and Technology, 2 (3). 035029. ISSN 2632-2153


Abstract

In this paper we investigate how gradient-based algorithms such as gradient descent (GD), (multi-pass) stochastic GD (SGD), its persistent variant, and the Langevin algorithm navigate non-convex loss landscapes, and which of them reaches the best generalization error at limited sample complexity. We consider the loss landscape of the high-dimensional phase retrieval problem as a prototypical highly non-convex example. We observe that for phase retrieval the stochastic variants of GD reach perfect generalization in regions of control parameters where GD does not. We apply dynamical mean-field theory from statistical physics to characterize analytically the full trajectories of these algorithms in their continuous-time limit, with a warm start, and for large system sizes. We further unveil several intriguing properties of the landscape and the algorithms, for example that GD can generalize better from less informed initializations.
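To make the algorithms named in the abstract concrete, the following is a minimal sketch of one update step for GD, mini-batch SGD, and a Langevin-type rule on a toy phase retrieval problem. It is not the authors' code. The Gaussian inputs, the labels y = (x · w_star / sqrt(dim))^2, the quartic loss, and every function and parameter name (`make_data`, `loss`, `grad`, `gd_step`, `sgd_step`, `langevin_step`, `batch_fraction`, `temperature`) are assumptions chosen for illustration. The persistent SGD variant and any norm constraint on the weights are omitted.

```python
import numpy as np

def make_data(n_samples, dim, rng):
    """Assumed toy model: Gaussian inputs, labels that hide the sign of the projection."""
    w_star = rng.standard_normal(dim)
    X = rng.standard_normal((n_samples, dim))
    y = (X @ w_star / np.sqrt(dim)) ** 2
    return X, y, w_star

def loss(w, X, y):
    """Assumed loss: 0.25 * mean((h^2 - y)^2) with h = x . w / sqrt(dim)."""
    h = X @ w / np.sqrt(X.shape[1])
    return 0.25 * np.mean((h ** 2 - y) ** 2)

def grad(w, X, y):
    """Gradient of `loss`: mean((h^2 - y) * h * x) / sqrt(dim)."""
    dim = X.shape[1]
    h = X @ w / np.sqrt(dim)
    return X.T @ ((h ** 2 - y) * h) / (len(y) * np.sqrt(dim))

def gd_step(w, X, y, lr):
    """Full-batch gradient descent step."""
    return w - lr * grad(w, X, y)

def sgd_step(w, X, y, lr, batch_fraction, rng):
    """Mini-batch step: each sample joins the batch independently with
    probability `batch_fraction`; an empty batch leaves w unchanged."""
    mask = rng.random(len(y)) < batch_fraction
    if not mask.any():
        return w
    return w - lr * grad(w, X[mask], y[mask])

def langevin_step(w, X, y, lr, temperature, rng):
    """Full-batch gradient step plus Gaussian noise of variance 2 * lr * temperature."""
    noise = np.sqrt(2.0 * lr * temperature) * rng.standard_normal(w.shape)
    return w - lr * grad(w, X, y) + noise
```

In this sketch both `w_star` and `-w_star` give zero loss, which is the sign ambiguity that makes the landscape non-convex. The three step functions differ only in where the randomness enters: none for GD, sample selection for SGD, and additive noise for the Langevin-type rule.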

Item Type: Article
Subjects: South Asian Library > Multidisciplinary
Depositing User: Unnamed user with email support@southasianlibrary.com
Date Deposited: 05 Jul 2023 04:28
Last Modified: 08 Jun 2024 08:53
URI: http://journal.repositoryarticle.com/id/eprint/1270
