pyproximal.optimization.primal.AndersonProximalGradient
pyproximal.optimization.primal.AndersonProximalGradient(proxf, proxg, x0, epsg=1.0, tau=None, niter=10, nhistory=10, epsr=1e-10, safeguard=False, tol=None, callback=None, show=False)[source]
Proximal gradient with Anderson acceleration
Solves the following minimization problem using the Proximal gradient algorithm with Anderson acceleration:
\[\mathbf{x} = \argmin_\mathbf{x} f(\mathbf{x}) + \epsilon g(\mathbf{x})\]
where \(f(\mathbf{x})\) is a smooth convex function with a uniquely defined gradient and \(g(\mathbf{x})\) is any convex function that has a known proximal operator.
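As an illustrative instance (chosen here for exposition, not prescribed by this routine), a sparse recovery problem fits this template with
\[f(\mathbf{x}) = \frac{1}{2}\|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2^2, \qquad g(\mathbf{x}) = \|\mathbf{x}\|_1,\]
where \(\nabla f(\mathbf{x}) = \mathbf{A}^T(\mathbf{A}\mathbf{x} - \mathbf{b})\) is readily available and the proximal operator of \(g\) is soft-thresholding.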
- Parameters
  - proxf : pyproximal.ProxOperator
    Proximal operator of f function (must have grad implemented)
  - proxg : pyproximal.ProxOperator
    Proximal operator of g function
  - x0 : numpy.ndarray
    Initial vector
  - epsg : float or numpy.ndarray, optional
    Scaling factor of g function
  - tau : float or numpy.ndarray, optional
    Positive scalar weight, which should satisfy the following condition to guarantee convergence: \(\tau \in (0, 1/L]\), where L is the Lipschitz constant of \(\nabla f\). When tau=None, backtracking is used to adaptively estimate the best tau at each iteration. Finally, note that \(\tau\) can be chosen to be a vector when dealing with problems with multiple right-hand sides (see the step-size sketch after this parameter list)
  - niter : int, optional
    Number of iterations of iterative scheme
  - nhistory : int, optional
    Number of previous iterates to be kept in memory (used to compute the scaling factors \(\alpha_k\))
  - epsr : float, optional
    Scaling factor for the regularization added when inverting \(\mathbf{R}^T \mathbf{R}\)
  - safeguard : bool, optional
    Apply safeguarding strategy to the update (True) or not (False)
  - tol : float, optional
    Tolerance on change of objective function (used as stopping criterion). If tol=None, run until niter is reached
  - callback : callable, optional
    Function with signature callback(x) to call after each iteration, where x is the current model vector
  - show : bool, optional
    Display iterations log
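A minimal sketch of how a fixed step size satisfying \(\tau \in (0, 1/L]\) could be chosen when \(f(\mathbf{x}) = \frac{1}{2}\|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2^2\); the matrix A below is purely illustrative:

import numpy as np

# Illustrative operator; in practice A comes from the problem at hand
A = np.random.default_rng(0).standard_normal((50, 100))

# For f(x) = 1/2 ||Ax - b||_2^2, grad f is Lipschitz with constant
# L = ||A||_2^2 (largest singular value of A, squared)
L = np.linalg.norm(A, 2) ** 2
tau = 1.0 / L  # satisfies tau in (0, 1/L]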
- Returns
  - x : numpy.ndarray
    Inverted model
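A hedged end-to-end usage sketch, assuming pyproximal.L2 (which implements grad), pyproximal.L1, and pylops.MatrixMult with the interfaces shown; sizes and parameter values are illustrative only:

import numpy as np
import pylops
import pyproximal
from pyproximal.optimization.primal import AndersonProximalGradient

# Illustrative sparse-recovery setup: f(x) = 1/2 ||Ax - b||_2^2, g(x) = ||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
xtrue = np.zeros(100)
xtrue[rng.choice(100, 5, replace=False)] = 1.0
b = A @ xtrue

proxf = pyproximal.L2(Op=pylops.MatrixMult(A), b=b)  # smooth term, has .grad
proxg = pyproximal.L1()                              # non-smooth term

tau = 0.99 / np.linalg.norm(A, 2) ** 2               # tau <= 1/L with L = ||A||_2^2

xinv = AndersonProximalGradient(proxf, proxg, x0=np.zeros(100),
                                epsg=0.1, tau=tau, niter=200,
                                nhistory=10, show=False)

Here epsg plays the role of the regularization weight \(\epsilon\) in the objective above.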
Notes
The Proximal gradient algorithm with Anderson acceleration can be expressed by the following recursion [1]:
\[\begin{split}m_k = \min(m, k)\\ \mathbf{g}^{k} = \mathbf{x}^{k} - \tau^k \nabla f(\mathbf{x}^k)\\ \mathbf{r}^{k} = \mathbf{g}^{k} - \mathbf{y}^{k}\\ \mathbf{G}^{k} = [\mathbf{g}^{k},..., \mathbf{g}^{k-m_k}]\\ \mathbf{R}^{k} = [\mathbf{r}^{k},..., \mathbf{r}^{k-m_k}]\\ \alpha_k = (\mathbf{R}^{kT} \mathbf{R}^{k})^{-1} \mathbf{1} / \mathbf{1}^T (\mathbf{R}^{kT} \mathbf{R}^{k})^{-1} \mathbf{1}\\ \mathbf{y}^{k+1} = \mathbf{G}^{k} \alpha_k\\ \mathbf{x}^{k+1} = \prox_{\tau^{k+1} g}(\mathbf{y}^{k+1})\end{split}\]
where \(m\) equals nhistory, \(k=1,2,...,n_{iter}\), \(\mathbf{y}^{0}=\mathbf{x}^{0}\), \(\mathbf{y}^{1}=\mathbf{x}^{0} - \tau^0 \nabla f(\mathbf{x}^0)\), \(\mathbf{x}^{1}=\prox_{\tau^0 g}(\mathbf{y}^{1})\), and \(\mathbf{g}^{0}=\mathbf{y}^{1}\). Refer to [1] for the guarded version of the algorithm (when safeguard=True).
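A sketch (not the library's internal implementation) of how the extrapolation weights \(\alpha_k\) in the recursion above could be computed from the residual matrix \(\mathbf{R}^k\); the exact way epsr scales the regularization is an assumption here:

import numpy as np

def anderson_weights(R, epsr=1e-10):
    """Compute alpha = (R^T R)^{-1} 1 / (1^T (R^T R)^{-1} 1).

    R has the residuals r^k, ..., r^{k-m_k} as columns. A small
    Tikhonov term (scaled by epsr, an assumed choice) stabilizes the
    inversion of R^T R.
    """
    RtR = R.T @ R
    RtR = RtR + epsr * np.eye(RtR.shape[0])  # assumed regularization scaling
    ones = np.ones(RtR.shape[0])
    w = np.linalg.solve(RtR, ones)           # (R^T R)^{-1} 1
    return w / w.sum()                       # normalize so weights sum to 1

# Usage: stack the stored residual vectors as columns, e.g.
# R = np.column_stack(residual_history); alpha = anderson_weights(R)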
Examples using pyproximal.optimization.primal.AndersonProximalGradient
- IHT, ISTA, FISTA, AA-ISTA, and TWIST for Compressive sensing