doi: 10.3934/jimo.2021084

Variable metric proximal stochastic variance reduced gradient methods for nonconvex nonsmooth optimization

1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
2. Institute of Mathematics, Hebei University of Technology, Tianjin 300401, China
3. LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
4. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
5. School of Business, National University of Singapore, Singapore 119245, Singapore

* Corresponding author: Xin-Wei Liu

Received: January 2021. Revised: February 2021. Published: April 2021.

We study the problem of minimizing the sum of two functions: the first is the average of a large number of nonconvex component functions, and the second is a convex (possibly nonsmooth) function that admits a simple proximal mapping. Using a diagonal Barzilai-Borwein stepsize to update the metric, we propose a variable metric proximal stochastic variance reduced gradient method in the mini-batch setting, named VM-SVRG. We prove that VM-SVRG converges sublinearly to a stationary point in expectation. We further propose a variant of VM-SVRG that achieves a linear convergence rate in expectation for nonconvex problems satisfying the proximal Polyak-Łojasiewicz inequality. The complexity of VM-SVRG is lower than that of the proximal gradient and proximal stochastic gradient methods, and matches that of the proximal stochastic variance reduced gradient method. Numerical experiments on standard data sets, including comparisons with other state-of-the-art proximal stochastic gradient methods, demonstrate the efficiency of the proposed method.
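For reference, the composite problem described above and the scaled (variable-metric) proximal mapping it relies on can be written as follows; these are standard definitions consistent with the abstract, with $ f_i $ the component functions, $ h $ the convex nonsmooth term, and $ U $ a positive definite (here diagonal) metric:

$$ \min_{w \in \mathbb{R}^d} \; F(w) := \frac{1}{n}\sum_{i=1}^{n} f_i(w) + h(w), \qquad \mathrm{prox}_h^{U}(x) := \arg\min_{y \in \mathbb{R}^d} \Big\{ h(y) + \tfrac{1}{2}\|y - x\|_U^2 \Big\}, \quad \|z\|_U^2 = z^{\top} U z. $$

When $ U = \mathrm{diag}(u_1,\ldots,u_d) $ and, for instance, $ h(w) = \lambda\|w\|_1 $, the scaled proximal mapping reduces to componentwise soft-thresholding with thresholds $ \lambda / u_i $, which is what keeps the diagonal variable metric computationally cheap.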

Citation: Tengteng Yu, Xin-Wei Liu, Yu-Hong Dai, Jie Sun. Variable metric proximal stochastic variance reduced gradient methods for nonconvex nonsmooth optimization. Journal of Industrial & Management Optimization, doi: 10.3934/jimo.2021084


Figure 1.  Comparison of VM-SVRG with different mini-batch sizes
Figure 2.  Comparison of VM-SVRG and other modern methods for solving the LR problem
Figure 3.  Comparison of VM-SVRG with different mini-batch sizes
Figure 4.  Comparison of VM-SVRG and mS2GD for solving the SVM problem
Algorithm 2 PL-VM-SVRG($ w^0,m,b,U_0 $)
Input: Number of inner iterations $ m $, initial point $ \tilde{w}_0=w^0 \in \mathbb{R}^d $, initial metric $ U_0 $, mini-batch size $ b\in \{1,2,\ldots,n\} $;
1: for $ s=0,1,\ldots,S-1 $ do
2:    $ w^{s+1} = $ VM-SVRG($ w^s,m,b,U_0 $).
3: end for
Output: $ w^S $.
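Since Algorithm 1 (the VM-SVRG inner routine) is not reproduced on this page, the following is only a minimal Python sketch of the structure suggested by the abstract and by Algorithm 2: an outer restart loop around a mini-batch proximal SVRG stage whose metric is a safeguarded diagonal Barzilai-Borwein estimate, taking $ h $ to be an $ \ell_1 $ regularizer for concreteness. The helper names, safeguard constants, and per-iteration metric update rule are illustrative assumptions, not the authors' exact algorithm.

import numpy as np

def prox_l1_diag(x, lam, d):
    # Scaled l1 proximal map: argmin_y lam*||y||_1 + 0.5*||y - x||_D^2 with D = diag(d).
    # For a diagonal metric this is componentwise soft-thresholding with thresholds lam / d_i.
    return np.sign(x) * np.maximum(np.abs(x) - lam / d, 0.0)

def diag_bb_metric(s, y, d_min=1e-3, d_max=1e3):
    # Safeguarded diagonal Barzilai-Borwein estimate: fit diag(d) to the secant pair (s, y)
    # componentwise and clip to [d_min, d_max] so the metric stays positive definite.
    safe_s = np.where(np.abs(s) > 1e-12, s, 1.0)
    d = np.where(np.abs(s) > 1e-12, y / safe_s, 1.0)
    return np.clip(np.abs(d), d_min, d_max)

def vm_svrg(grad_avg, w0, n, m, b, lam, d0, rng):
    # One outer stage of a variable-metric proximal SVRG sketch (illustrative only).
    # grad_avg(w, idx) must return the average gradient of the components f_i, i in idx, at w.
    w_tilde = w0.copy()
    full_grad = grad_avg(w_tilde, np.arange(n))          # full gradient at the snapshot
    d = d0.copy()
    w, w_prev, g_prev = w_tilde.copy(), w_tilde.copy(), full_grad.copy()
    for _ in range(m):
        idx = rng.choice(n, size=b, replace=False)       # mini-batch of component indices
        v = grad_avg(w, idx) - grad_avg(w_tilde, idx) + full_grad   # variance-reduced gradient
        w_new = prox_l1_diag(w - v / d, lam, d)          # scaled proximal step with metric diag(d)
        s, y = w_new - w_prev, v - g_prev                # secant pair for the metric (assumed rule)
        if np.linalg.norm(s) > 1e-12:
            d = diag_bb_metric(s, y)
        w_prev, g_prev, w = w_new, v, w_new
    return w

def pl_vm_svrg(grad_avg, w0, n, m, b, lam, S, U0=None, seed=0):
    # Algorithm 2 (PL-VM-SVRG) as a restart wrapper: w^{s+1} = VM-SVRG(w^s, m, b, U_0).
    rng = np.random.default_rng(seed)
    d0 = np.ones_like(w0) if U0 is None else U0          # diagonal of the initial metric U_0
    w = w0.copy()
    for _ in range(S):
        w = vm_svrg(grad_avg, w, n, m, b, lam, d0, rng)
    return w

As in Algorithm 2, the wrapper simply restarts the inner routine from the previous output; this restart structure is what underlies the linear-rate result under the proximal Polyak-Łojasiewicz inequality.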
Table 1.  Comparison of the SFO (stochastic first-order oracle) and PO (proximal oracle) complexities; rows labeled (PL) assume the proximal Polyak-Łojasiewicz inequality.
Complexity Prox-GD Prox-SGD Prox-SVRG VM-SVRG
SFO $ \mathcal{O}(n/\epsilon) $ $ \mathcal{O}(1/\epsilon^2) $ $ \mathcal{O}(n+(n^{2/3}/\epsilon)) $ $ \mathcal{O}(n+(n^{2/3}/\epsilon)) $
PO $ \mathcal{O}(1/\epsilon) $ $ \mathcal{O}(1/\epsilon) $ $ \mathcal{O}(1/\epsilon) $ $ \mathcal{O}(1/\epsilon) $
SFO(PL) $ \mathcal{O}(n\kappa\log(1/\epsilon)) $ $ \mathcal{O}(1/\epsilon^2) $ $ \mathcal{O}((n+\kappa n^{2/3})\log(1/\epsilon)) $ $ \mathcal{O}((n+\kappa n^{2/3})\log(1/\epsilon)) $
PO(PL) $ \mathcal{O}(\kappa \log(1/\epsilon)) $ $ \mathcal{O}(1/\epsilon) $ $ \mathcal{O}(\kappa \log(1/\epsilon)) $ $ \mathcal{O}(\kappa \log(1/\epsilon)) $
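For a rough sense of scale, ignoring constants and taking the SFO bounds in Table 1 at face value, suppose $ n = 10^6 $ and $ \epsilon = 10^{-6} $. Then $ n/\epsilon = 10^{12} $ for Prox-GD and $ 1/\epsilon^2 = 10^{12} $ for Prox-SGD, whereas $ n + n^{2/3}/\epsilon = 10^{6} + 10^{4}\cdot 10^{6} \approx 10^{10} $ for Prox-SVRG and VM-SVRG, i.e., roughly two orders of magnitude fewer stochastic gradient evaluations in this regime.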
Table 2.  Information on the data sets
Data sets $ n $ $ d $
ijcnn1 49,990 22
rcv1 20,242 47,236
real-sim 72,309 20,958
covtype 581,012 54
Table 3.  Best choices of parameters for the methods
Data sets mS2GD$ (m,\eta) $ mS2GD-BB mSARAH$ (m,\eta) $ mSARAH-BB VM-SVRG
ijcnn1 $ (0.02n,\frac{4}{L}) $ $ 0.04n $ $ (0.05n,\frac{1.8}{L}) $ $ 0.04n $ $ 0.04n $
rcv1 $ (0.1n,\frac{4}{L}) $ $ 0.11n $ $ (0.1n,\frac{3.5}{L}) $ $ 0.09n $ $ 0.25n $
real-sim $ (0.12n,\frac{0.6}{L}) $ $ 0.15n $ $ (0.07n,\frac{2}{L}) $ $ 0.06n $ $ 0.11n $
covtype $ (0.07n,\frac{21}{L}) $ $ 0.03n $ $ (0.07n,\frac{25}{L}) $ $ 0.008n $ $ 0.01n $