
# A proximal ADMM with the Broyden family for convex optimization problems

This work is supported by Grant-in-Aid for Scientific Research (C) (17K00032) from Japan Society for the Promotion of Science.

Alternating direction methods of multipliers (ADMM) have been well studied and effectively used in various application fields. The classical ADMM must solve two subproblems exactly at each iteration. To avoid the difficulty of computing exact solutions of the subproblems, proximal terms can be added to the subproblems. Recently, a special proximal ADMM was studied for a structured quadratic optimization problem, in which the regularization matrix in the proximal term is generated by the BFGS update (or limited-memory BFGS) at every iteration. Numerical experiments showed that its iteration counts were almost the same as those of the exact ADMM. In this paper, we propose such a proximal ADMM for more general convex optimization problems, and we extend the proximal term to the Broyden family of updates. We also prove convergence of the proposed method under standard assumptions.
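In standard quasi-Newton notation, the Broyden family referred to in the abstract updates a matrix $B_k$ with step $s_k$ and difference $y_k$ as
$$B_{k+1} = B_k - \frac{B_k s_k s_k^\top B_k}{s_k^\top B_k s_k} + \frac{y_k y_k^\top}{y_k^\top s_k} + t\,(s_k^\top B_k s_k)\, v_k v_k^\top, \qquad v_k = \frac{y_k}{y_k^\top s_k} - \frac{B_k s_k}{s_k^\top B_k s_k},$$
where $t = 0$ recovers BFGS and $t = 1$ recovers DFP; this $t$ is the parameter varied for ADM-Broyden in Table 1. To make the proximal-ADMM structure concrete, below is a minimal runnable sketch on the lasso problem $\min_x \tfrac12\|Ax-b\|^2 + \mu\|z\|_1$ subject to $x - z = 0$. For simplicity it uses the fixed proximal matrix $P = \tau I - (A^\top A + \beta I)$ (a standard linearized ADMM), not the paper's Broyden-family update; all names and parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def proximal_admm_lasso(A, b, mu, beta=1.0, iters=2000):
    """Proximal ADMM for  min 0.5*||Ax - b||^2 + mu*||z||_1  s.t.  x - z = 0.

    The x-subproblem is regularized with P = tau*I - (A^T A + beta*I),
    which makes its exact solution a single gradient-like step
    (linearized ADMM). The paper's method instead generates the
    regularization matrix by a Broyden-family (e.g. BFGS) update.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = np.zeros(n)
    lam = np.zeros(n)  # multiplier for the constraint x - z = 0
    # tau must dominate the largest eigenvalue of A^T A + beta*I
    # so that P is positive semidefinite.
    tau = np.linalg.norm(A, 2) ** 2 + beta
    for _ in range(iters):
        # x-step: exact minimizer of the P-regularized subproblem
        grad = A.T @ (A @ x - b) + lam + beta * (x - z)
        x = x - grad / tau
        # z-step: soft-thresholding, the prox of mu*||.||_1
        w = x + lam / beta
        z = np.sign(w) * np.maximum(np.abs(w) - mu / beta, 0.0)
        # multiplier (dual ascent) step
        lam = lam + beta * (x - z)
    return x, z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 10))
    b = rng.standard_normal(20)
    x, z = proximal_admm_lasso(A, b, mu=0.1)
    print(np.linalg.norm(x - z))  # constraint residual, small at convergence
```

The point of the quasi-Newton choice of $P_k$ is the same as in this sketch: the regularized x-subproblem becomes cheap to solve exactly, while the iteration counts (per the abstract) stay close to those of the exact ADMM.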

Mathematics Subject Classification: Primary: 90C25, 90C53, 65K05.


Figure 1.  Evolution of the objective function values with respect to CPU time for small problems

Table 1.  Comparison on iteration steps and CPU time (seconds) for different $t$ of ADM-Broyden

| $m$ | $n$ | $p$ | $\beta$ ($t=-0.1$) | Iter. | Int.Iter. | Time | $\beta$ ($t=0$) | Iter. | Int.Iter. | Time | $\beta$ ($t=0.1$) | Iter. | Int.Iter. | Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 500 | 200 | 0.1 | 2.4 | 72.0 | 216.0 | 0.16 | 2.3 | 74.0 | 222.0 | 0.16 | 2.4 | 72.0 | 216.0 | 0.16 |
| 500 | 500 | 0.1 | 3.3 | 81.0 | 243.0 | 0.46 | 3.5 | 83.0 | 249.0 | 0.49 | 2.2 | 82.0 | 246.0 | 0.44 |
| 1000 | 500 | 0.1 | 2.3 | 158.0 | 474.0 | 1.14 | 2.2 | 155.0 | 465.0 | 1.11 | 2.2 | 157.0 | 471.0 | 1.12 |

Table 2.  Comparison on iteration steps and CPU time (seconds) among the four methods

| $m$ | $n$ | $p$ | Method | $\beta$ | Iter. | Int.Iter. | Time |
|---|---|---|---|---|---|---|---|
| 500 | 200 | 0.1 | ADMM-1 | 4.0 | 15.0 | 60.0 | 0.35 |
| 500 | 200 | 0.1 | ADMM-2 | 0.6 | 77.0 | 306.0 | 0.14 |
| 500 | 200 | 0.1 | ADM-PRO | 0.3 | 131.0 | 524.0 | 0.23 |
| 500 | 200 | 0.1 | ADM-BFGS | 2.3 | 74.0 | 222.0 | 0.16 |
| 500 | 500 | 0.1 | ADMM-1 | 4.0 | 20.0 | 100.0 | 2.50 |
| 500 | 500 | 0.1 | ADMM-2 | 0.6 | 105.0 | 418.0 | 0.26 |
| 500 | 500 | 0.1 | ADM-PRO | 0.3 | 390.0 | 1559.0 | 0.75 |
| 500 | 500 | 0.1 | ADM-BFGS | 3.5 | 83.0 | 249.0 | 0.49 |
| 1000 | 500 | 0.1 | ADMM-1 | 9.0 | 19.0 | 85.0 | 4.59 |
| 1000 | 500 | 0.1 | ADMM-2 | 0.8 | 128.0 | 510.0 | 0.51 |
| 1000 | 500 | 0.1 | ADM-PRO | 0.3 | 233.0 | 931.0 | 0.88 |
| 1000 | 500 | 0.1 | ADM-BFGS | 2.2 | 155.0 | 465.0 | 1.11 |

Table 3.  Comparison on iteration steps and CPU time (seconds) among the five methods

(a) $m=1000$, $p=0.1$, varying $n$:

| Algorithm | $n$ | $\beta$ | Iter. | Int.Iter. | Time | T-A | T-M |
|---|---|---|---|---|---|---|---|
| ADMM-1 | 500 | 9.0 | 17.0 | 85.0 | 4.59 | $-$ | $-$ |
| ADMM-1 | 1000 | 10.0 | 19.0 | 95.0 | 13.46 | $-$ | $-$ |
| ADMM-1 | 2000 | 9.0 | 23.0 | 115.0 | 53.47 | $-$ | $-$ |
| ADMM-2 | 500 | 0.8 | 128.0 | 510.0 | 0.51 | 0.46 | 0.05 |
| ADMM-2 | 1000 | 0.8 | 159.0 | 633.0 | 1.18 | 0.99 | 0.19 |
| ADMM-2 | 2000 | 0.6 | 223.0 | 891.0 | 3.13 | 2.93 | 0.20 |
| ADM-PRO | 500 | 0.3 | 233.0 | 931.0 | 0.86 | 0.83 | 0.03 |
| ADM-PRO | 1000 | 0.4 | 550.0 | 2199.0 | 2.64 | 2.59 | 0.05 |
| ADM-PRO | 2000 | 0.2 | 941.0 | 3764.0 | 10.68 | 10.61 | 0.07 |
| ADM-LBFGS | 500 | 0.7 | 143.0 | 570.0 | 0.55 | 0.52 | 0.03 |
| ADM-LBFGS | 1000 | 0.8 | 184.0 | 733.0 | 0.94 | 0.89 | 0.05 |
| ADM-LBFGS | 2000 | 0.8 | 305.0 | 1217.0 | 3.75 | 3.68 | 0.07 |
| ADM-ILBFGS | 500 | 0.7 | 139.0 | 554.0 | 0.55 | 0.52 | 0.03 |
| ADM-ILBFGS | 1000 | 1.0 | 185.0 | 736.0 | 0.95 | 0.90 | 0.05 |
| ADM-ILBFGS | 2000 | 0.9 | 294.0 | 1173.0 | 3.58 | 3.51 | 0.07 |

(b) $m=5000$, $n=1000$, varying $p$:

| Algorithm | $p$ | $\beta$ | Iter. | Int.Iter. | Time | T-A | T-M |
|---|---|---|---|---|---|---|---|
| ADMM-1 | 0.1 | 40 | 16.0 | 80.0 | 153.79 | $-$ | $-$ |
| ADMM-1 | 0.5 | 161 | 15.0 | 74.0 | 612.32 | $-$ | $-$ |
| ADMM-1 | 1.0 | 250 | 15.0 | 74.0 | 984.23 | $-$ | $-$ |
| ADMM-2 | 0.1 | 1.7 | 266.0 | 1057.0 | 11.67 | 11.41 | 0.26 |
| ADMM-2 | 0.5 | 3.9 | 526.0 | 1578.0 | 81.18 | 79.56 | 1.62 |
| ADMM-2 | 1.0 | 4.6 | 665.0 | 1995.0 | 142.19 | 138.47 | 3.72 |
| ADM-PRO | 0.1 | 1.1 | 344.0 | 1373.0 | 14.40 | 14.19 | 0.12 |
| ADM-PRO | 0.5 | 2.1 | 719.0 | 2859.0 | 111.96 | 111.41 | 0.55 |
| ADM-PRO | 1.0 | 2.6 | 920.0 | 3625.0 | 196.81 | 195.91 | 0.90 |
| ADM-LBFGS | 0.1 | 1.7 | 268.0 | 1064.0 | 11.22 | 11.10 | 0.12 |
| ADM-LBFGS | 0.5 | 3.7 | 531.0 | 1593.0 | 80.30 | 79.75 | 0.55 |
| ADM-LBFGS | 1.0 | 4.6 | 669.0 | 2007.0 | 142.71 | 141.81 | 0.90 |
| ADM-ILBFGS | 0.1 | 1.7 | 267.0 | 1061.0 | 11.02 | 10.90 | 0.12 |
| ADM-ILBFGS | 0.5 | 3.8 | 528.0 | 1584.0 | 79.80 | 79.25 | 0.55 |
| ADM-ILBFGS | 1.0 | 4.6 | 667.0 | 2001.0 | 141.63 | 140.73 | 0.90 |

(c) $m=10000$, $n=5000$, varying $p$ (ADMM-1 not reported at this scale):

| Algorithm | $p$ | $\beta$ | Iter. | Int.Iter. | Time | T-A | T-M |
|---|---|---|---|---|---|---|---|
| ADMM-2 | 0.1 | 2.2 | 475.0 | 1892.0 | 193.10 | 181.67 | 11.43 |
| ADMM-2 | 0.5 | 4.3 | 1005.0 | 3015.0 | 1407.36 | 1316.69 | 90.67 |
| ADMM-2 | 1.0 | 5.3 | 1258.0 | 3774.0 | 2769.45 | 2554.13 | 215.32 |
| ADM-PRO | 0.1 | 0.6 | 736.0 | 2944.0 | 262.77 | 259.87 | 2.90 |
| ADM-PRO | 0.5 | 0.9 | 1405.0 | 5618.0 | 1843.21 | 1834.80 | 8.41 |
| ADM-PRO | 1.0 | 1.0 | 1780.0 | 7118.0 | 3664.39 | 3653.84 | 10.55 |
| ADM-LBFGS | 0.1 | 2.2 | 486.0 | 1935.0 | 178.95 | 176.05 | 2.90 |
| ADM-LBFGS | 0.5 | 4.0 | 1038.0 | 3114.0 | 1370.20 | 1361.79 | 8.41 |
| ADM-LBFGS | 1.0 | 4.5 | 1319.0 | 3957.0 | 2712.25 | 2701.70 | 10.55 |
| ADM-ILBFGS | 0.1 | 2.3 | 477.0 | 1898.0 | 175.98 | 173.08 | 2.90 |
| ADM-ILBFGS | 0.5 | 4.2 | 1024.0 | 3072.0 | 1359.35 | 1350.94 | 8.41 |
| ADM-ILBFGS | 1.0 | 5.0 | 1278.0 | 3834.0 | 2629.89 | 2619.34 | 10.55 |
