G-2024-64
A proximal modified quasi-Newton method for nonsmooth regularized optimization
, , and BibTeX reference
We develop R2N, a modified quasi-Newton method for minimizing the sum of a \(\mathcal{C}^1\)
function \(f\)
and a lower semi-continuous prox-bounded \(h\)
.
Both \(f\)
and \(h\)
may be nonconvex.
At each iteration, our method computes a step by minimizing the sum of a quadratic model of \(f\)
, a model of \(h\)
, and an adaptive quadratic regularization term.
A step may be computed by way of a variant of the proximal-gradient method.
An advantage of R2N over competing trust-region methods is that proximal operators do not involve an extra trust-region indicator.
We also develop the variant R2DH, in which the model Hessian is diagonal, which allows us to compute a step without relying on a subproblem solver when \(h\)
is separable.
R2DH can be used as standalone solver, but also as subproblem solver inside R2N.
We describe non-monotone variants of both R2N and R2DH.
Global convergence of a first-order stationarity measure to zero holds without relying on local Lipschitz continuity of \(\nabla f\)
, while allowing model Hessians to grow unbounded, an assumption particularly relevant to quasi-Newton models.
Under Lipschitz-continuity of \(\nabla f\)
, we establish a tight worst-case evaluation complexity bound of \(O(1 / \epsilon^{2/(1 - p)})\)
to bring said measure below \(\epsilon > 0\)
, where \(0 \leq p < 1\)
controls the growth of model Hessians.
Specifically, the latter must not diverge faster than \(|\mathcal{S}_k|^p\)
, where \(\mathcal{S}_k\)
is the set of successful iterations up to iteration \(k\)
.
When \(p = 1\)
, we establish the tight exponential complexity bound \(O(\exp(c \epsilon^{-2}))\)
where \(c > 0\)
is a constant.
We describe our Julia implementation and report numerical experience on a classic basis-pursuit problem, an image denoising problem, a minimum-rank matrix completion problem, and a nonlinear support vector machine.
In particular, the minimum-rank problem cannot be solved directly at this time by a trust-region approach as corresponding proximal operators are not known analytically.
Published September 2024 , 24 pages
Research Axis
Research application
Document
G2464.pdf (600 KB)