Adaptive Primal-Dual#

This tutorial compares the traditional Chambolle-Pock Primal-dual algorithm with the Adaptive Primal-Dual Hybrid Gradient of Goldstein and co-authors.

By adaptively changing the step size in the primal and the dual directions, this algorithm shows faster convergence, which is of great importance for some of the problems that the Primal-Dual algorithm can solve - especially those with an expensive proximal operator.

For this example, we consider a simple denoising problem.

import numpy as np
import matplotlib.pyplot as plt
import pylops
from skimage.data import camera

import pyproximal

plt.close('all')

def callback(x, f, g, K, cost, xtrue, err):
    cost.append(f(x) + g(K.matvec(x)))
    err.append(np.linalg.norm(x - xtrue))

Let’s start by loading a sample image and adding some noise

# Load image
img = camera()
ny, nx = img.shape

# Add noise
sigman = 20
n = np.random.normal(0, sigman, img.shape)
noise_img = img + n

We can now define a pylops.Gradient operator as well as the different proximal operators to be passed to our solvers

# Gradient operator
sampling = 1.
Gop = pylops.Gradient(dims=(ny, nx), sampling=sampling, edge=False,
                      kind='forward', dtype='float64')
L = 8. / sampling ** 2 # maxeig(Gop^H Gop)

# L2 data term
lamda = .04
l2 = pyproximal.L2(b=noise_img.ravel(), sigma=lamda)

# L1 regularization (isotropic TV)
l1iso = pyproximal.L21(ndim=2)

To start, we solve our denoising problem with the original Primal-Dual algorithm

# Primal-dual
tau = 0.95 / np.sqrt(L)
mu = 0.95 / np.sqrt(L)

cost_fixed = []
err_fixed = []
iml12_fixed = \
    pyproximal.optimization.primaldual.PrimalDual(l2, l1iso, Gop,
                                                  tau=tau, mu=mu, theta=1.,
                                                  x0=np.zeros_like(img.ravel()),
                                                  gfirst=False, niter=300, show=True,
                                                  callback=lambda x: callback(x, l2, l1iso,
                                                                              Gop, cost_fixed,
                                                                              img.ravel(),
                                                                              err_fixed))
iml12_fixed = iml12_fixed.reshape(img.shape)

Primal-dual: min_x f(Ax) + x^T z + g(x)
---------------------------------------------------------
Proximal operator (f): <class 'pyproximal.proximal.L2.L2'>
Proximal operator (g): <class 'pyproximal.proximal.L21.L21'>
Linear operator (A): <class 'pylops.basicoperators.gradient.Gradient'>
Additional vector (z): None
tau = 0.33587572106361          mu = 0.33587572106361
theta = 1.00            niter = 300

   Itn       x[0]          f           g          z^x       J = f + g + z^x
 3.15860e+00   1.148e+08   1.329e+05   0.000e+00       1.149e+08
 5.95850e+00   1.118e+08   1.384e+05   0.000e+00       1.119e+08
 8.58035e+00   1.089e+08   1.218e+05   0.000e+00       1.090e+08
 1.11684e+01   1.061e+08   1.117e+05   0.000e+00       1.063e+08
 1.37181e+01   1.034e+08   1.110e+05   0.000e+00       1.035e+08
 1.62291e+01   1.008e+08   1.145e+05   0.000e+00       1.009e+08
 1.87041e+01   9.820e+07   1.190e+05   0.000e+00       9.832e+07
 2.11462e+01   9.568e+07   1.244e+05   0.000e+00       9.581e+07
 2.35565e+01   9.323e+07   1.308e+05   0.000e+00       9.336e+07
 2.59354e+01   9.085e+07   1.379e+05   0.000e+00       9.099e+07
 6.91910e+01   5.299e+07   2.889e+05   0.000e+00       5.328e+07
 1.13307e+02   2.519e+07   4.553e+05   0.000e+00       2.564e+07
 1.42868e+02   1.267e+07   5.689e+05   0.000e+00       1.324e+07
 1.62676e+02   7.020e+06   6.451e+05   0.000e+00       7.665e+06
 1.75948e+02   4.467e+06   6.962e+05   0.000e+00       5.163e+06
 1.84842e+02   3.310e+06   7.305e+05   0.000e+00       4.040e+06
 1.90801e+02   2.782e+06   7.535e+05   0.000e+00       3.536e+06
 1.94795e+02   2.541e+06   7.689e+05   0.000e+00       3.309e+06
 1.97470e+02   2.429e+06   7.792e+05   0.000e+00       3.208e+06
 1.98799e+02   2.388e+06   7.844e+05   0.000e+00       3.172e+06
 1.98853e+02   2.386e+06   7.846e+05   0.000e+00       3.171e+06
 1.98907e+02   2.385e+06   7.848e+05   0.000e+00       3.170e+06
 1.98960e+02   2.384e+06   7.850e+05   0.000e+00       3.169e+06
 1.99012e+02   2.382e+06   7.852e+05   0.000e+00       3.167e+06
 1.99064e+02   2.381e+06   7.854e+05   0.000e+00       3.166e+06
 1.99115e+02   2.380e+06   7.856e+05   0.000e+00       3.165e+06
 1.99165e+02   2.378e+06   7.858e+05   0.000e+00       3.164e+06
 1.99214e+02   2.377e+06   7.860e+05   0.000e+00       3.163e+06

Total time (s) = 6.89
---------------------------------------------------------

We do the same with the adaptive algorithm

cost_ada = []
err_ada = []
iml12_ada, steps = \
    pyproximal.optimization.primaldual.AdaptivePrimalDual(l2, l1iso, Gop,
                                                          tau=tau, mu=mu,
                                                          x0=np.zeros_like(img.ravel()),
                                                          niter=45, show=True, tol=0.05,
                                                          callback=lambda x: callback(x, l2, l1iso,
                                                                                      Gop, cost_ada,
                                                                                      img.ravel(),
                                                                                      err_ada))
iml12_ada = iml12_ada.reshape(img.shape)

Adaptive Primal-dual: min_x f(Ax) + x^T z + g(x)
---------------------------------------------------------
Proximal operator (f): <class 'pyproximal.proximal.L2.L2'>
Proximal operator (g): <class 'pyproximal.proximal.L21.L21'>
Linear operator (A): <class 'pylops.basicoperators.gradient.Gradient'>
Additional vector (z): None
tau0 = 3.358757e-01     mu0 = 3.358757e-01
alpha0 = 5.000000e-01   eta = 9.500000e-01
s = 1.000000e+00        delta = 1.500000e+00
niter = 45              tol = 5.000000e-02

   Itn       x[0]          f           g          z^x       J = f + g + z^x
     2   3.15860e+00   1.148e+08   1.329e+05   0.000e+00       1.149e+08
     3   8.68514e+00   1.089e+08   1.625e+05   0.000e+00       1.091e+08
     4   1.82426e+01   9.884e+07   2.033e+05   0.000e+00       9.904e+07
     5   3.40762e+01   8.312e+07   2.863e+05   0.000e+00       8.341e+07
     6   5.78726e+01   6.209e+07   4.082e+05   0.000e+00       6.250e+07
     7   8.92975e+01   3.913e+07   5.573e+05   0.000e+00       3.969e+07
     8   1.13902e+02   2.500e+07   6.664e+05   0.000e+00       2.567e+07
     9   1.33167e+02   1.630e+07   7.419e+05   0.000e+00       1.704e+07
    10   1.48255e+02   1.094e+07   7.932e+05   0.000e+00       1.173e+07
    13   1.74190e+02   4.728e+06   8.594e+05   0.000e+00       5.588e+06
    17   1.80754e+02   3.756e+06   8.482e+05   0.000e+00       4.604e+06
    21   1.84435e+02   3.305e+06   8.354e+05   0.000e+00       4.141e+06
    25   1.88410e+02   2.930e+06   8.450e+05   0.000e+00       3.775e+06
    29   1.91757e+02   2.695e+06   8.357e+05   0.000e+00       3.530e+06
    33   1.94331e+02   2.555e+06   8.183e+05   0.000e+00       3.373e+06
    37   1.96311e+02   2.469e+06   8.077e+05   0.000e+00       3.277e+06
    38   1.96730e+02   2.453e+06   8.062e+05   0.000e+00       3.260e+06
    39   1.97123e+02   2.439e+06   8.049e+05   0.000e+00       3.244e+06
    40   1.97491e+02   2.427e+06   8.040e+05   0.000e+00       3.231e+06
    41   1.97835e+02   2.416e+06   8.033e+05   0.000e+00       3.219e+06
    42   1.98157e+02   2.406e+06   8.028e+05   0.000e+00       3.209e+06
    43   1.98459e+02   2.397e+06   8.024e+05   0.000e+00       3.200e+06
    44   1.98742e+02   2.390e+06   8.021e+05   0.000e+00       3.192e+06
    45   1.99007e+02   2.383e+06   8.018e+05   0.000e+00       3.185e+06
    46   1.99254e+02   2.377e+06   8.016e+05   0.000e+00       3.178e+06

Total time (s) = 1.04

Let’s now compare the final results as well as the convergence curves of the two algorithms. We can see how the adaptive Primal-Dual produces a better estimate of the clean image in a much smaller number of iterations

fig, axs = plt.subplots(1, 4, figsize=(16, 4))
axs[0].imshow(img, cmap='gray', vmin=0, vmax=255)
axs[0].set_title('Original')
axs[0].axis('off')
axs[0].axis('tight')
axs[1].imshow(noise_img, cmap='gray', vmin=0, vmax=255)
axs[1].set_title('Noisy')
axs[1].axis('off')
axs[1].axis('tight')
axs[2].imshow(iml12_fixed, cmap='gray', vmin=0, vmax=255)
axs[2].set_title('PD')
axs[2].axis('off')
axs[2].axis('tight')
axs[3].imshow(iml12_ada, cmap='gray', vmin=0, vmax=255)
axs[3].set_title('Adaptive PD')
axs[3].axis('off')
axs[3].axis('tight')

fig, axs = plt.subplots(2, 1, figsize=(12, 7))
axs[0].plot(cost_fixed, 'k', label='Fixed step')
axs[0].plot(cost_ada, 'r', label='Adaptive step')
axs[0].legend()
axs[0].set_title('Functional')
axs[1].plot(err_fixed, 'k', label='Fixed step')
axs[1].plot(err_ada, 'r', label='Adaptive step')
axs[1].set_title('MSE')
axs[1].legend()
plt.tight_layout()

fig, axs = plt.subplots(3, 1, figsize=(12, 7))
axs[0].plot(steps[0], 'k')
axs[0].set_title(r'$\tau^k$')
axs[1].plot(steps[1], 'k')
axs[1].set_title(r'$\mu^k$')
axs[2].plot(steps[2], 'k')
axs[2].set_title(r'$\alpha^k$')
plt.tight_layout();

$\tau^k$, $\mu^k$, $\alpha^k$

Total running time of the script: (0 minutes 8.664 seconds)

Gallery generated by Sphinx-Gallery