Adaptive Primal-Dual#

This tutorial compares the traditional Chambolle-Pock primal-dual algorithm with the Adaptive Primal-Dual Hybrid Gradient (APDHG) of Goldstein and co-authors.

By adaptively changing the step sizes in the primal and dual directions, this algorithm achieves faster convergence, which is of great importance for some of the problems that the Primal-Dual algorithm can solve, especially those with an expensive proximal operator.
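For reference, the fixed-step Chambolle-Pock iterations for a problem of the form $\min_x f(x) + g(Kx)$ can be sketched as follows (standard textbook form; pyproximal's internal ordering may differ):

```latex
\begin{aligned}
y^{k+1} &= \mathrm{prox}_{\mu g^*}\left(y^k + \mu K \bar{x}^k\right) \\
x^{k+1} &= \mathrm{prox}_{\tau f}\left(x^k - \tau K^H y^{k+1}\right) \\
\bar{x}^{k+1} &= x^{k+1} + \theta\left(x^{k+1} - x^k\right)
\end{aligned}
```

Convergence is guaranteed for fixed steps when $\tau \mu \|K\|^2 < 1$; the adaptive variant instead lets $\tau$ and $\mu$ change across iterations while preserving this guarantee.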

For this example, we consider a simple denoising problem.

import numpy as np
import matplotlib.pyplot as plt
import pylops
from skimage.data import camera

import pyproximal


def callback(x, f, g, K, cost, xtrue, err):
    # Track the objective f(x) + g(Kx) and the error to the clean image
    cost.append(f(x) + g(K.matvec(x)))
    err.append(np.linalg.norm(x - xtrue))

Let’s start by loading a sample image and adding some noise

# Load image
img = camera()
ny, nx = img.shape

# Add noise
sigman = 20
n = np.random.normal(0, sigman, img.shape)
noise_img = img + n
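With sigman = 20 on 8-bit data (peak value 255), the expected peak signal-to-noise ratio of the noisy image is $20 \log_{10}(255/\sigma_n)$. A quick sanity check (pure NumPy; the exact value on camera() will vary slightly with the noise realization):

```python
import numpy as np

sigman = 20  # noise standard deviation used above

# Expected PSNR for additive Gaussian noise on 8-bit data (peak = 255)
psnr = 20 * np.log10(255 / sigman)
print(f"Expected PSNR of the noisy image: {psnr:.1f} dB")  # ~22.1 dB
```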

We can now define a pylops.Gradient operator as well as the different proximal operators to be passed to our solvers

# Gradient operator
sampling = 1.
Gop = pylops.Gradient(dims=(ny, nx), sampling=sampling, edge=False,
                      kind='forward', dtype='float64')
L = 8. / sampling ** 2 # maxeig(Gop^H Gop)

# L2 data term
lamda = .04
l2 = pyproximal.L2(b=noise_img.ravel(), sigma=lamda)

# L1 regularization (isotropic TV)
l1iso = pyproximal.L21(ndim=2)
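The isotropic-TV penalty couples the two gradient components at each pixel, and the proximal operator of the L21 norm is block soft-thresholding: each 2-vector $v$ is scaled by $\max(1 - \tau/\|v\|_2,\, 0)$. A minimal NumPy illustration of this rule (a sketch of the formula, not pyproximal's actual implementation):

```python
import numpy as np

def prox_l21(v, tau):
    """Block soft-thresholding applied row-wise (each row is a 2-vector)."""
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-30), 0.0)
    return scale * v

v = np.array([[3.0, 4.0],     # norm 5 -> scaled by 1 - 1/5 = 0.8
              [0.1, 0.1]])    # norm < tau -> thresholded to zero
print(prox_l21(v, tau=1.0))   # [[2.4, 3.2], [0., 0.]]
```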

To start, we solve our denoising problem with the original Primal-Dual algorithm
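The step sizes below are chosen to satisfy the fixed-step PDHG convergence condition $\tau \mu \|K\|^2 < 1$ with a small safety margin ($0.95^2 \approx 0.90$). A quick check, using the analytic bound L = 8 for the unit-sampling gradient:

```python
import numpy as np

L = 8.0                   # upper bound on maxeig(Gop^H Gop)
tau = 0.95 / np.sqrt(L)   # primal step
mu = 0.95 / np.sqrt(L)    # dual step

assert tau * mu * L < 1   # convergence condition for fixed-step PDHG
print(tau)                # ~0.335876, matching the solver log
```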

# Primal-dual
tau = 0.95 / np.sqrt(L)
mu = 0.95 / np.sqrt(L)

cost_fixed = []
err_fixed = []
iml12_fixed = \
    pyproximal.optimization.primaldual.PrimalDual(l2, l1iso, Gop,
                                                  tau=tau, mu=mu, theta=1.,
                                                  gfirst=False, niter=300, show=True,
                                                  callback=lambda x: callback(x, l2, l1iso,
                                                                              Gop, cost_fixed,
                                                                              img.ravel(), err_fixed))
iml12_fixed = iml12_fixed.reshape(img.shape)
Primal-dual: min_x f(Ax) + x^T z + g(x)
Proximal operator (f): <class 'pyproximal.proximal.L2.L2'>
Proximal operator (g): <class 'pyproximal.proximal.L21.L21'>
Linear operator (A): <class 'pylops.basicoperators.gradient.Gradient'>
Additional vector (z): None
tau = 0.33587572106361          mu = 0.33587572106361
theta = 1.00            niter = 300

   Itn       x[0]          f           g          z^x       J = f + g + z^x
     1   2.26495e+00   1.148e+08   1.329e+05   0.000e+00       1.150e+08
     2   4.70719e+00   1.119e+08   1.386e+05   0.000e+00       1.120e+08
     3   7.26347e+00   1.090e+08   1.220e+05   0.000e+00       1.091e+08
     4   9.83797e+00   1.062e+08   1.118e+05   0.000e+00       1.063e+08
     5   1.23754e+01   1.035e+08   1.110e+05   0.000e+00       1.036e+08
     6   1.48911e+01   1.009e+08   1.145e+05   0.000e+00       1.010e+08
     7   1.73926e+01   9.828e+07   1.189e+05   0.000e+00       9.840e+07
     8   1.98782e+01   9.576e+07   1.243e+05   0.000e+00       9.589e+07
     9   2.23416e+01   9.331e+07   1.306e+05   0.000e+00       9.344e+07
    10   2.47768e+01   9.092e+07   1.376e+05   0.000e+00       9.106e+07
    31   6.82254e+01   5.304e+07   2.882e+05   0.000e+00       5.332e+07
    61   1.11156e+02   2.521e+07   4.538e+05   0.000e+00       2.567e+07
    91   1.40820e+02   1.268e+07   5.666e+05   0.000e+00       1.325e+07
   121   1.60553e+02   7.029e+06   6.424e+05   0.000e+00       7.672e+06
   151   1.73850e+02   4.475e+06   6.933e+05   0.000e+00       5.168e+06
   181   1.82651e+02   3.317e+06   7.274e+05   0.000e+00       4.044e+06
   211   1.88592e+02   2.789e+06   7.503e+05   0.000e+00       3.540e+06
   241   1.92588e+02   2.547e+06   7.656e+05   0.000e+00       3.313e+06
   271   1.95246e+02   2.435e+06   7.759e+05   0.000e+00       3.211e+06
   292   1.96549e+02   2.395e+06   7.810e+05   0.000e+00       3.176e+06
   293   1.96603e+02   2.393e+06   7.812e+05   0.000e+00       3.174e+06
   294   1.96655e+02   2.392e+06   7.814e+05   0.000e+00       3.173e+06
   295   1.96707e+02   2.390e+06   7.816e+05   0.000e+00       3.172e+06
   296   1.96759e+02   2.389e+06   7.818e+05   0.000e+00       3.171e+06
   297   1.96809e+02   2.388e+06   7.820e+05   0.000e+00       3.170e+06
   298   1.96859e+02   2.386e+06   7.822e+05   0.000e+00       3.169e+06
   299   1.96909e+02   2.385e+06   7.824e+05   0.000e+00       3.167e+06
   300   1.96957e+02   2.384e+06   7.826e+05   0.000e+00       3.166e+06

Total time (s) = 13.50

We do the same with the adaptive algorithm
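The adaptive scheme of Goldstein and co-authors monitors the primal and dual residuals of the PDHG optimality conditions and rebalances the step sizes whenever one residual dominates the other. Schematically (following the APDHG paper; the exact signs and parameterization used inside pyproximal may differ slightly):

```latex
\begin{aligned}
p^{k+1} &= \frac{x^k - x^{k+1}}{\tau_k} - K^H\left(y^k - y^{k+1}\right), \\
d^{k+1} &= \frac{y^k - y^{k+1}}{\mu_k} + K\left(x^k - x^{k+1}\right).
\end{aligned}
```

When $\|p^{k+1}\| > s\,\Delta\,\|d^{k+1}\|$ the primal step is enlarged and the dual step reduced, $\tau_{k+1} = \tau_k/(1-\alpha_k)$, $\mu_{k+1} = \mu_k (1-\alpha_k)$; the symmetric update is applied when the dual residual dominates, and after every change the adaptivity level is damped, $\alpha_{k+1} = \eta\,\alpha_k$, so the step sizes eventually settle. This is where the parameters $\alpha_0$, $\eta$, $s$, and $\Delta$ printed in the solver log come from.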

cost_ada = []
err_ada = []
iml12_ada, steps = \
    pyproximal.optimization.primaldual.AdaptivePrimalDual(l2, l1iso, Gop,
                                                          tau=tau, mu=mu,
                                                          niter=45, show=True, tol=0.05,
                                                          callback=lambda x: callback(x, l2, l1iso,
                                                                                      Gop, cost_ada,
                                                                                      img.ravel(), err_ada))
iml12_ada = iml12_ada.reshape(img.shape)
Adaptive Primal-dual: min_x f(Ax) + x^T z + g(x)
Proximal operator (f): <class 'pyproximal.proximal.L2.L2'>
Proximal operator (g): <class 'pyproximal.proximal.L21.L21'>
Linear operator (A): <class 'pylops.basicoperators.gradient.Gradient'>
Additional vector (z): None
tau0 = 3.358757e-01     mu0 = 3.358757e-01
alpha0 = 5.000000e-01   eta = 9.500000e-01
s = 1.000000e+00        delta = 1.500000e+00
niter = 45              tol = 5.000000e-02

   Itn       x[0]          f           g          z^x       J = f + g + z^x
     2   2.26495e+00   1.148e+08   1.329e+05   0.000e+00       1.150e+08
     3   7.08552e+00   1.090e+08   1.628e+05   0.000e+00       1.092e+08
     4   1.61658e+01   9.892e+07   2.032e+05   0.000e+00       9.913e+07
     5   3.15279e+01   8.319e+07   2.861e+05   0.000e+00       8.348e+07
     6   5.47617e+01   6.215e+07   4.080e+05   0.000e+00       6.255e+07
     7   8.55772e+01   3.917e+07   5.569e+05   0.000e+00       3.972e+07
     8   1.09818e+02   2.502e+07   6.657e+05   0.000e+00       2.569e+07
     9   1.28922e+02   1.631e+07   7.410e+05   0.000e+00       1.706e+07
    10   1.44003e+02   1.095e+07   7.922e+05   0.000e+00       1.174e+07
    13   1.70313e+02   4.735e+06   8.578e+05   0.000e+00       5.593e+06
    17   1.77332e+02   3.762e+06   8.466e+05   0.000e+00       4.609e+06
    21   1.81822e+02   3.311e+06   8.340e+05   0.000e+00       4.145e+06
    25   1.86603e+02   2.935e+06   8.434e+05   0.000e+00       3.779e+06
    29   1.90464e+02   2.700e+06   8.342e+05   0.000e+00       3.534e+06
    33   1.93277e+02   2.560e+06   8.168e+05   0.000e+00       3.377e+06
    37   1.94751e+02   2.474e+06   8.061e+05   0.000e+00       3.281e+06
    38   1.94975e+02   2.459e+06   8.045e+05   0.000e+00       3.263e+06
    39   1.95168e+02   2.445e+06   8.032e+05   0.000e+00       3.248e+06
    40   1.95341e+02   2.432e+06   8.021e+05   0.000e+00       3.234e+06
    41   1.95503e+02   2.421e+06   8.013e+05   0.000e+00       3.222e+06
    42   1.95660e+02   2.411e+06   8.006e+05   0.000e+00       3.212e+06
    43   1.95814e+02   2.403e+06   8.000e+05   0.000e+00       3.203e+06
    44   1.95968e+02   2.395e+06   7.996e+05   0.000e+00       3.195e+06
    45   1.96120e+02   2.389e+06   7.993e+05   0.000e+00       3.188e+06
    46   1.96269e+02   2.383e+06   7.990e+05   0.000e+00       3.182e+06

Total time (s) = 2.50

Let’s now compare the final results as well as the convergence curves of the two algorithms. We can see that the adaptive Primal-Dual produces a better estimate of the clean image in far fewer iterations.

fig, axs = plt.subplots(1, 4, figsize=(16, 4))
axs[0].imshow(img, cmap='gray', vmin=0, vmax=255)
axs[0].set_title('Original')
axs[1].imshow(noise_img, cmap='gray', vmin=0, vmax=255)
axs[1].set_title('Noisy')
axs[2].imshow(iml12_fixed, cmap='gray', vmin=0, vmax=255)
axs[2].set_title('PD')
axs[3].imshow(iml12_ada, cmap='gray', vmin=0, vmax=255)
axs[3].set_title('Adaptive PD')

fig, axs = plt.subplots(2, 1, figsize=(12, 7))
axs[0].plot(cost_fixed, 'k', label='Fixed step')
axs[0].plot(cost_ada, 'r', label='Adaptive step')
axs[0].set_title('Functional')
axs[0].legend()
axs[1].plot(err_fixed, 'k', label='Fixed step')
axs[1].plot(err_ada, 'r', label='Adaptive step')
axs[1].set_title('MSE')
axs[1].legend()

fig, axs = plt.subplots(3, 1, figsize=(12, 7))
axs[0].plot(steps[0], 'k')
axs[0].set_title(r'$\tau^k$')
axs[1].plot(steps[1], 'k')
axs[1].set_title(r'$\mu^k$')
axs[2].plot(steps[2], 'k')
axs[2].set_title(r'$\alpha^k$')
(Figures, top to bottom: the Original, Noisy, PD, and Adaptive PD images; the Functional and MSE convergence curves; the evolution of the step parameters $\tau^k$, $\mu^k$, $\alpha^k$.)

Total running time of the script: (0 minutes 17.188 seconds)
