A proximal modified quasi-Newton method for nonsmooth regularized optimization

Diouane, Youssef; Laghdaf Habiboullah, Mohamed; Orban, Dominique

We develop R2N, a modified quasi-Newton method for minimizing the sum of a $\mathcal{C}^1$ function $f$ and a lower semi-continuous prox-bounded $h$ . Both $f$ and $h$ may be nonconvex. At each iteration, our method computes a step by minimizing the sum of a quadratic model of $f$ , a model of $h$ , and an adaptive quadratic regularization term. A step may be computed by way of a variant of the proximal-gradient method. An advantage of R2N over competing trust-region methods is that proximal operators do not involve an extra trust-region indicator. We also develop the variant R2DH, in which the model Hessian is diagonal, which allows us to compute a step without relying on a subproblem solver when $h$ is separable. R2DH can be used as standalone solver, but also as subproblem solver inside R2N. We describe non-monotone variants of both R2N and R2DH. Global convergence of a first-order stationarity measure to zero holds without relying on local Lipschitz continuity of $\nabla f$ , while allowing model Hessians to grow unbounded, an assumption particularly relevant to quasi-Newton models. Under Lipschitz-continuity of $\nabla f$ , we establish a tight worst-case evaluation complexity bound of $O(1 / \epsilon^{2/(1 - p)})$ to bring said measure below $\epsilon > 0$ , where $0 \leq p < 1$ controls the growth of model Hessians. Specifically, the latter must not diverge faster than $|\mathcal{S}_k|^p$ , where $\mathcal{S}_k$ is the set of successful iterations up to iteration $k$ . When $p = 1$ , we establish the tight exponential complexity bound $O(\exp(c \epsilon^{-2}))$ where $c > 0$ is a constant. We describe our Julia implementation and report numerical experience on a classic basis-pursuit problem, an image denoising problem, a minimum-rank matrix completion problem, and a nonlinear support vector machine. In particular, the minimum-rank problem cannot be solved directly at this time by a trust-region approach as corresponding proximal operators are not known analytically.

Published September 2024 , 24 pages

Research Axis

Axis 2: Decision support made in complex systems

Research application

Engineering (engineering design, digital design)

Document

G2464.pdf (600 KB)

GERAD

G-2024-64

A proximal modified quasi-Newton method for nonsmooth regularized optimization

Youssef Diouane, Mohamed Laghdaf Habiboullah, and Dominique Orban

Research Axis

Research application

Document