pyproximal.optimization.primal.ADMM#
- pyproximal.optimization.primal.ADMM(proxf: ProxOperator, proxg: ProxOperator, x0: ndarray[tuple[Any, ...], dtype[_ScalarT]], tau: float, niter: int = 10, z0: Optional[ndarray[tuple[Any, ...], dtype[_ScalarT]]] = None, gfirst: bool = False, callback: Optional[Callable[[...], None]] = None, callbackz: bool = False, show: bool = False) Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]][source]#
Alternating Direction Method of Multipliers
Solves the following minimization problem using Alternating Direction Method of Multipliers:
\[\begin{split}\mathbf{x},\mathbf{z} = \argmin_{\mathbf{x},\mathbf{z}} f(\mathbf{x}) + g(\mathbf{z}) \\ s.t. \; \mathbf{x}=\mathbf{z}\end{split}\]where \(f(\mathbf{x})\) and \(g(\mathbf{z})\) are any convex function that has a known proximal operator.
ADMM can also solve the problem of the form above with a more general constraint: \(\mathbf{Ax}+\mathbf{Bz}=\mathbf{c}\). This routine implements the special case where \(\mathbf{A}=\mathbf{I}\), \(\mathbf{B}=-\mathbf{I}\), and \(\mathbf{c}=\mathbf{0}\), as a general algorithm can be obtained for any choice of \(f\) and \(g\) provided they have a known proximal operator.
On the other hand, for more general choice of \(\mathbf{A}\), \(\mathbf{B}\), and \(\mathbf{c}\), the iterations are not generalizable, i.e. they depend on the choice of the \(f\) and \(g\) functions. For this reason, we currently only provide an additional solver for the special case where \(f\) is a
pyproximal.proximal.L2operator with a linear operator \(\mathbf{G}\) and data \(\mathbf{y}\), \(\mathbf{B}=-\mathbf{I}\) and \(\mathbf{c}=\mathbf{0}\), calledpyproximal.optimization.primal.ADMML2. Note that for the very same choice of \(\mathbf{B}\) and \(\mathbf{c}\), thepyproximal.optimization.primal.LinearizedADMMcan also be used (and this does not require a specific choice of \(f\)).- Parameters
- proxf
pyproximal.ProxOperator Proximal operator of f function
- proxg
pyproximal.ProxOperator Proximal operator of g function
- x0
numpy.ndarray Initial vector (not required when
gfirst=False, can passNone)- tau
float Positive scalar weight, which should satisfy the following condition to guarantees convergence: \(\tau \in (0, 1/L]\) where
Lis the Lipschitz constant of \(\nabla f\).- niter
int Number of iterations of iterative scheme
- z0
numpy.ndarray, optional Initial z vector (not required when
gfirst=True)- gfirst
bool, optional Apply Proximal of operator
gfirst (True) or Proximal of operatorffirst (False)- callback
callable, optional Function with signature (
callback(x)) to call after each iteration wherexis the current model vector- callbackz
bool, optional Modify callback signature to (
callback(x, z)) whencallbackz=True- show
bool, optional Display iterations log
- proxf
- Returns
- x
numpy.ndarray Inverted model
- z
numpy.ndarray Inverted second model
- x
- Raises
- ValueError
If both
x0andz0are set toNone
See also
ADMML2ADMM with L2 misfit function
LinearizedADMMLinearized ADMM
Notes
The ADMM algorithm can be expressed by the following recursion [1]:
\[\begin{split}\mathbf{x}^{k+1} = \prox_{\tau f}(\mathbf{z}^{k} - \mathbf{u}^{k})\\ \mathbf{z}^{k+1} = \prox_{\tau g}(\mathbf{x}^{k+1} + \mathbf{u}^{k})\\ \mathbf{u}^{k+1} = \mathbf{u}^{k} + \mathbf{x}^{k+1} - \mathbf{z}^{k+1}\end{split}\]Note that
xandzconverge to each other, however if iterations are stopped too earlyxis guaranteed to belong to the domain offwhilezis guaranteed to belong to the domain ofg. Depending on the problem either of the two may be the best solution.- 1
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3 (1), 1-122. https://doi.org/10.1561/2200000016.