Article Contents
Article Contents
Early Access

Early Access articles are published articles within a journal that have not yet been assigned to a formal issue. This means they do not yet have a volume number, issue number, or page numbers assigned to them, however, they can still be found and cited using their DOI (Digital Object Identifier). Early Access publication benefits the research community by making new scientific discoveries known as quickly as possible.

Readers can access Early Access articles via the “Early Access” tab for the selected journal.

# An efficient augmented Lagrangian method with semismooth Newton solver for total generalized variation

• Total generalization variation (TGV) is a very powerful and important regularization for various inverse problems and computer vision tasks. In this paper, we propose a semismooth Newton based augmented Lagrangian method for solving this problem. The augmented Lagrangian method (also called as method of multipliers) is widely used for lots of smooth or nonsmooth variational problems. However, its efficiency heavily depends on solving the corresponding coupled and nonlinear system together and simultaneously. With efficient primal-dual semismooth Newton methods for the challenging and highly coupled nonlinear subproblems involving total generalized variation, we develop a highly efficient and competitive augmented Lagrangian method compared with some fast first-order method. With the analysis of the metric subregularities of the corresponding functions, we give both the global convergence and local linear convergence rate for the proposed augmented Lagrangian methods.

Mathematics Subject Classification: Primary: 65K10, 49J52; Secondary: 49M15.

 Citation:

• Figure 1.  Original and corrupted images for PSNR test with TGV regularization. The sizes of the original images are Train: 512 $\times$ 357; Sails: 512 $\times$ 768; Baboon: 512 $\times$ 512; Man: 1024 $\times$ 1024. The corresponding noise level of the corrupted images are Train 1, Baboon 1, Sails 1, Man 2: with 10% Gaussian noise; Man 1: with 20% Gaussian noise; Train 2, Baboon 2, Sails 2: with 5% Gaussian noise

Figure 2.  Images (a), (d), and (g) show the original Turtle, Cameraman and Two macaws images (It is taken from http://r0k.us/graphics/kodak/kodim23.html authored by Kelly S.). (b) and (h) are noisy versions corrupted by 10% Gaussian noise; (e) is corrupted by 5% Gaussian noise. (c), (f), and (i) show the denoised images with ALM-PDP with $\mathfrak{U}^{k+1} <10^{-6}$. Their sizes are: Turle: $128 \times 128$; Cameraman: $256 \times 256$; Two macaws: $768 \times 512$

Table 1.  PSNR results of TGV regularized image denoising. Train 1, Baboon 1, Sails 1, Man 2: with 10% gaussian noise, $\alpha = [0.2, 0.1]$; Man 1: with 20% gaussian noise, $\alpha = [0.2, 0.1]$; Train 2, Baboon 2, Sails 2: with 5% gaussian noise, $\alpha = [0.1, 0.05]$

 $a = 1.0$ $a = 10^{-8}$ $a = 0$ $\text{PSNR}$ $\text{RMSE}$ $\text{SSIM}$ $\text{PSNR}$ $\text{RMSE}$ $\text{SSIM}$ $\text{PSNR}$ $\text{RMSE}$ $\text{SSIM}$ Train 1 26.604 2.186e-3 7.691e-1 26.595 2.190e-3 7.688e-1 26.596 2.190e-3 7.688e-1 Train 2 29.854 1.034e-3 8.504e-1 29.835 1.039e-3 8.500e-1 29.835 1.039e-3 8.500e-1 Man 1 13.886 4.087e-2 5.590e-1 13.886 4.087e-2 5.589e-1 13.886 4.087e-2 5.590e-1 Man 2 27.472 1.790e-3 6.768e-1 27.469 1.791e-3 6.769e-1 27.470 1.790e-3 6.769e-1 Baboon 1 22.961 5.057e-3 5.768e-1 22.952 5.067e-3 5.757e-1 22.952 5.067e-3 5.758e-1 Baboon 2 24.368 3.658e-3 6.990e-1 24.371 3.655e-3 6.993e-1 24.371 3.655e-3 6.993e-1 Sails 1 19.027 1.251e-2 6.147e-1 19.023 1.252-2 6.133e-1 19.023 1.252-2 6.133e-1 Sails 2 26.168 2.416e-3 6.915e-1 26.173 2.414-3 6.918e-1 26.172 2.414-3 6.918e-1

Table 2.  Image Cameraman denoised by TGV with algorithm ALM-PDP. $N_{SSN}$ denotes the number of Newton iterations. $N_{ABCG}$ denotes average number of BiCGSTAB iterations. Here $\sigma_0 = 4$, $\sigma_{k+1} = 4\sigma_k$

 $k=1$ $k=2$ $k=3$ $k=4$ $k=5$ $k=6$ $k=7$ res($u$) 1.04e-3 5.67e-4 2.48e-4 2.53e-5 1.09e-6 4.70e-7 9.91e-7 res($w$) 3.92e-4 2.47e-4 1.29e-4 8.02e-6 2.14e-7 1.43e-7 2.53e-7 res($\lambda$) 5.55 1.15 2.77e-1 6.89e-2 1.65e-2 3.51e-3 6.20e-4 res($\mu$) 1.09e1 3.02 8.03e-1 2.15e-1 5.19e-2 1.02e-2 1.88e-3 Gap 4.52e-4 7.82e-5 1.83e-5 4.22e-6 8.48e-7 1.42e-7 2.28e-8 $N_{SSN}$ 6 5 6 9 11 14 24 $N_{ABCG}$ 10 13 27 43 75 97 122

Table 3.  For $n(t)$ of the first line of each algorithm, $n$ presents the iteration number for the primal dual gap less than the stopping value; $t$ denotes the CPU time

 TGV: $\alpha = [0.2, 0.1]$ Turtle: $128 \times 128$ $n(t)$ $\text{res}(u)$ $\text{res}(w)$ $\text{res}(\lambda)$ $\text{res}(\mu)$ Gap $\text{PSNR}$ $\mathfrak{U}$ ALM-PDP 7(20.72s) 2.40e-3 1.42e-3 4.35e-4 1.80e-3 3.60e-8 24.93 1e-4 ALG2 3176(33.36s) 6.38e-3 5.77e-4 9.37e-6 2.20e-5 1.53e-9 24.93 1e-4 ALM-PDP 9(114.03.s) 8.97e-6 5.63e-6 1.38e-5 3.02e-5 5.93e-10 24.93 1e-6 ALG2 10808(1033.93s) 6.33e-5 4.43e-6 4.42e-9 8.32e-9 2.07e-13 24.93 1e-6

Table 4.  For $n(t)$ of the first line of each algorithm, $n$ presents the iteration number for the primal-dual gap less than the stopping value, $t$ denoting the CPU time

 TGV: $\alpha = [0.1, 0.05]$ Cameraman: $256 \times 256$ $n(t)$ $\text{res}(u)$ $\text{res}(w)$ $\text{res}(\lambda)$ $\text{res}(\mu)$ Gap $\text{PSNR}$ $\mathfrak{U}$ ALM-PDP 7(66.15s) 5.71e-3 3.84e-3 6.49e-4 1.90e-3 2.32e-8 30.16 1e-4 ALG2 3115(77.45s) 1.27e-2 8.31e-4 1.02e-5 4.65e-5 1.62e-9 30.16 1e-4 ALM-PDP 9(369.39s) 2.25e-5 1.27e-5 1.49e-5 6.64e-5 5.58e-10 30.16 1e-6 ALG2 81706(2000.26s) 1.26e-4 9.11e-6 1.60e-8 3.06e-8 2.85e-13 30.16 1e-6

Table 5.  For $n(t)$ of the first line of each algorithm, $n$ presents the iteration number for the primal-dual gap less than the stopping value, $t$ denoting the CPU time. The notation "$<$eps" denotes the corresponding quality less than the machine precision in Matlab "eps"

 TGV: $\alpha = [0.2, 0.1]$ Two macaws: $768 \times 512$ $n(t)$ $\text{res}(u)$ $\text{res}(w)$ $\text{res}(\lambda)$ $\text{res}(\mu)$ Gap $\text{PSNR}$ $\mathfrak{U}$ ALM-PDP 3(238.01s) 2.43e-2 1.39e-2 6.55e-2 2.28e-5 1.70e-6 31.53 1e-3 ALG2 683(99.84s) 2.57e-1 3.01e-2 3.14e-3 7.98e-3 1.63e-7 31.52 1e-3 ALM-PDP 8(1499.23s) 2.79e-4 1.02e-4 4.52e-4 1.66e-3 9.09e-9 31.53 1e-5 ALG2 15638(2332.49s) 2.82e-3 1.55e-4 7.40e-7 2.48e-6 1.66e-11 31.53 1e-5
•  [1] F. J. Aragón Artacho and M. H. Geoffory, Metric subregularity of the convex subdifferential in Banach spaces, J. Nonlinear Convex Anal., 15 (2014), 35-47. [2] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, New York, 2011. doi: 10.1007/978-1-4419-9467-7. [3] D. P. Bertsekas,  Constrained Optimization and Lagrange Multiplier Methods, Academic Press, Paris, 1982. [4] K. Bredies, K. Kunisch and T. Pock, Total generalized variation, SIAM J. Imaging Sci., 3 (2010), 492-526.  doi: 10.1137/090769521. [5] K. Bredies, R. Nuster and R. Watschinger, TGV-regularized inversion of the Radon transform for photoacoustic tomography, Biomedical Optics Express, 11 (2020), 994-1019.  doi: 10.1364/BOE.379941. [6] K. Bredies and H. P. Sun, Preconditioned Douglas–Rachford algorithms for TV- and TGV-regularized variational imaging problems, J. Math. Imaging and Vis., 52 (2015), 317-344.  doi: 10.1007/s10851-015-0564-1. [7] K. Bredies and T. Valkonen, Inverse problems with second-order total generalized variation constraints, Proceedings of SampTA 2011 - 9th International Conference on Sampling Theory and Applications, Singapore, 2011. [8] A. Chambolle and T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging and Vis., 40 (2011), 120-145.  doi: 10.1007/s10851-010-0251-1. [9] F. H. Clarke, Optimization and Nonsmooth Analysis, Vol. 5, Classics Appl. Math., SIAM, Philadelphia, 1990. doi: 10.1137/1.9781611971309. [10] C. Clason, Nonsmooth Analysis and Optimization, Lecture Notes, 2018. doi: 10.48550/arXiv.1708.04180. [11] Y. Cui, D. Sun and K.-C. Toh, On the R-superlinear convergence of the KKT residues generated by the augmented Lagrangian method for convex composite conic programming, Math. Program., Ser. A, 178 (2019), 381-415.  doi: 10.1007/s10107-018-1300-6. [12] A. L. Dontchev and R. T. Rockafellar, Functions and Solution Mappings: A View from Variational Analysis, Second Edition, Springer Series in Operations Research and Financial Engineering, New York, 2014. doi: 10.1007/978-1-4939-1037-3. [13] W. J. Duncan, Some devices for the solution of large sets of simultaneous linear equations, Philos. Mag. Ser, 35 (1944), 660-670.  doi: 10.1080/14786444408520897. [14] F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Volume I, Springer-Verlag New York, Inc, 2003. [15] M. Fortin and R. Glowinski (eds.), Augmented Lagrangian Methods: Applications to the Solution of Boundary Value Problems, North-Holland, Amsterdam, 1983. [16] R. Glowinski, S. J. Osher and W. Yin (eds.), Splitting Methods in Communication, Imaging, Science, and Engineering, Springer, 2016. doi: 10.1007/978-3-319-41589-5. [17] L. Guttman, Enlargement methods for computing the inverse matrix, Ann. Math. Statist., 17 (1946), 336-343.  doi: 10.1214/aoms/1177730946. [18] W. W. Hager, Updating the inverse of a matrix, SIAM Review., 31 (1989), 221-239.  doi: 10.1137/1031049. [19] H. V. Henderson and S. R. Searle, On deriving the inverse of a sum of matrices, SIAM Review., 23 (1981), 53-60.  doi: 10.1137/1023004. [20] M. R. Hestenes, Multiplier and gradient methods, J. Optim. Theory Appl., 4 (1969), 303-320.  doi: 10.1007/BF00927673. [21] M. Hintermüller and K. Kunisch, Total bounded variation regularization as a bilaterally constrained optimization problem, SIAM J. Appl. Math., 64 (2004), 1311-1333.  doi: 10.1137/S0036139903422784. [22] M. Hintermüller, K. Papafitsoros, C. N. Rautenberg and H. Sun, Dualization and automatic distributed parameter selection of total generalized variation via bilevel optimization, Numerical Functional Analysis and Optimization, 43 (2022), 887-932.  doi: 10.1080/01630563.2022.2069812. [23] M. Hintermüller and G. Stadler, An infeasible primal-dual algorithm for total bounded variaton-based inf-convolution-type image restoration, SIAM J. Sci. Comput., 28 (2006), 1-23.  doi: 10.1137/040613263. [24] A. J. Hoffman, On approximate solutions of systems of linear inequalities, J. Research Nat. Bur. Standards, 49 (1952), 263-265.  doi: 10.6028/jres.049.027. [25] K. Ito and K. Kunisch, Lagrange Multiplier Approach to Variational Problems and Applications, Advances in Design and Control, 15, Philadelphia, SIAM, 2008. doi: 10.1137/1.9780898718614. [26] K. Ito and K. Kunisch, An active set strategy based on the augmented Lagrangian formulation for image restoration, RAIRO, Math. Mod. and Num. Analysis, 33 (1999), 1-21.  doi: 10.1051/m2an:1999102. [27] D. Klatte and B. Kummer, Constrained minima and Lipschitzian penalties in metric spaces, SIAM J. Optim., 13 (2002), 619-633.  doi: 10.1137/S105262340139625X. [28] D. Klatte and B. Kummer, Nonsmooth Equations in Optimization. Regularity, Calculus, Methods and Applications, (Nonconvex Optimization and Its Applications, 60), Springer, Boston, MA, 2002. [29] F. Knoll, K. Bredies, T. Pock and R. Stollberger, Second order total generalized variation (TGV) for MRI, Magnetic Resonance in Medicine, 65 (2011), 480-491.  doi: 10.1002/mrm.22595. [30] D. Leventhal, Metric subregularity and the proximal point method, J. Math. Anal. Appl., 360 (2009), 681-688.  doi: 10.1016/j.jmaa.2009.07.012. [31] X. Li, D. Sun and K.-C. Toh, A highly efficient semismooth Newton augmented Lagrangian method for solving lasso problems, SIAM J. Optim., 28 (2018), 433-458.  doi: 10.1137/16M1097572. [32] F. J. Luque, Asymptotic convergence analysis of the proximal point algorithm, SIAM J. Control Optim., 22 (1984), 277-293.  doi: 10.1137/0322019. [33] R. Mifflin, Semismooth and semiconvex functions in constrained optimization, SIAM J. Control Optim., 15 (1977), 959-972.  doi: 10.1137/0315061. [34] M. J. D. Powell, A method for nonlinear constraints in minimization problems, in Optimization, R. Fletcher, ed., Academic Press, New York, 1968, 283-298. [35] R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim., 14 (1976), 877-898.  doi: 10.1137/0314056. [36] R. T. Rockafellar, Augmented Lagrangians and applications of the proximal point algorithm in convex programming, Math. Oper. Res., 1 (1976), 97-116.  doi: 10.1287/moor.1.2.97. [37] S. Scholtes, Introduction to Piecewise Differentiable Equations, Springer Briefs in Optimization, Springer, New York, 2012. doi: 10.1007/978-1-4614-4340-7. [38] G. Stadler, Semismooth Newton and augmented Lagrangian methods for a simplified friction problem, SIAM J. Optim., 15 (2004), 39-62.  doi: 10.1137/S1052623403420833. [39] G. Stadler, Infinite-Dimensional Semi-Smooth Newton and Augmented Lagrangian Methods for Friction and Contact Problems in Elasticity, PhD thesis, University of Graz, 2004. [40] D. Sun and J. Han, Newton and quasi-Newton methods for a class of nonsmooth equations and related problems, SIAM J. Optim., 7 (1997), 463-480.  doi: 10.1137/S1052623494274970. [41] H. Sun, An investigation on semismooth Newton based augmented Lagrangian method for image restoration, J. Sci. Comput., 92 (2022), Paper No. 82, arXiv: 1911.10968, 2019, submitted. doi: 10.1007/s10915-022-01907-7. [42] M. Ulbrich, Semismooth Newton Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces, MOS-SIAM Series on Optimization, 2011. doi: 10.1137/1.9781611970692. [43] H. A. Van der Vorst,  Iterative Krylov Methods for Large Linear Systems, Cambridge University Press, Cambridge, 2003.  doi: 10.1017/CBO9780511615115. [44] J. Ye, X. Yuan, S. Zeng and J. Zhang, Variational analysis perspective on linear convergence of some first order methods for nonsmooth convex optimization problems, Set-Valued and Variational Analysis, 29 (2021), 803-837.  doi: 10.1007/s11228-021-00591-3. [45] F. Zhang (eds.), The Schur Complement and Its Applications, Numerical Methods and Algorithms, 4, Springer US, 2005. doi: 10.1007/b105056. [46] Y. Zhang, N. Zhang, D. Sun and K.-C. Toh, An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems, Mathematical Programming A, 179 (2020), 223–263. doi: 10.1007/s10107-018-1329-6. [47] X.-Y. Zhao, D. Sun and K.-C. Toh, A Newton-CG augmented Lagrangian method for semidefinite programming, SIAM J. Optim., 20 (2010), 1737-1765.  doi: 10.1137/080718206. [48] Z. Zhou and A. Man-Cho So, A unified approach to error bounds for structured convex optimization problems, Math. Program., Ser. A, 165 (2007), 689-728.  doi: 10.1007/s10107-016-1100-9.

Figures(2)

Tables(5)