doi: 10.3934/jimo.2022050
Online First


A primal and dual active set algorithm for truncated $L_1$ regularized logistic regression

1. School of Mathematics and Statistics, Wuhan University, Wuhan, Hubei 430072, China
2. Center for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857
3. Department of Anesthesiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, China

*Corresponding author: Chang Zhu

Received: September 2021. Revised: January 2022. Early access: April 2022.

Truncated $ L_1 $ regularization [2] is one approximation to the original $ L_0 $ regularization, and it admits the hard thresholding operator. We therefore consider truncated $ L_1 $ regularization for variable selection and estimation in high-dimensional, sparse logistic regression models. Computationally, motivated by the KKT conditions of the truncated $ L_1 $ regularized problem, we propose a primal and dual active set algorithm (PDAS). In each iteration, PDAS first identifies a small active set from the primal and dual variables of the previous iteration; the primal variable is then updated by maximum likelihood estimation restricted to the active set, and the dual variable is updated explicitly from the gradient information. We further consider a sequential PDAS (SPDAS) with a warm-start and continuation strategy. Extensive simulation studies illustrate the effectiveness of the proposed method, and its application is demonstrated by analyzing several binary classification data sets.
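To make the iteration described above concrete, the following Python sketch shows one primal and dual active set update for sparse logistic regression. It is a minimal illustration under stated assumptions, not the authors' implementation: the active-set rule $|\beta_j + d_j| > \tau$, the threshold `tau`, and the helper names (`hard_threshold`, `restricted_mle`, `pdas_step`) are introduced here for exposition only, whereas the paper derives its rule directly from the KKT conditions of the truncated $ L_1 $ regularized problem.

```python
# Illustrative sketch of one PDAS-style update for sparse logistic regression.
# NOT the authors' code: the active-set rule, `tau`, and all helper names are
# assumptions made only to mirror the verbal description in the abstract.
import numpy as np
from scipy.optimize import minimize


def hard_threshold(z, tau):
    """Hard thresholding operator: keep entries with magnitude above tau."""
    return np.where(np.abs(z) > tau, z, 0.0)


def restricted_mle(X_active, y, beta_init):
    """Logistic-regression MLE restricted to the active coordinates."""
    def negloglik(b):
        eta = X_active @ b
        # log(1 + exp(eta)) - y * eta, computed stably.
        return np.sum(np.logaddexp(0.0, eta) - y * eta)
    return minimize(negloglik, beta_init, method="BFGS").x


def pdas_step(X, y, beta, tau):
    """One primal and dual active set update (illustrative)."""
    n, p = X.shape
    # Dual variable: (scaled) negative gradient of the logistic loss at beta.
    prob = 1.0 / (1.0 + np.exp(-X @ beta))
    d = X.T @ (y - prob) / n
    # Active set: support of the hard-thresholded primal + dual information.
    active = np.flatnonzero(hard_threshold(beta + d, tau))
    beta_new = np.zeros(p)
    if active.size > 0:
        # Primal update: maximum likelihood limited to the active set.
        beta_new[active] = restricted_mle(X[:, active], y, beta[active])
    # Dual update: gradient on the inactive set, zero on the active set.
    prob_new = 1.0 / (1.0 + np.exp(-X @ beta_new))
    d_new = X.T @ (y - prob_new) / n
    d_new[active] = 0.0
    return beta_new, d_new, active
```

A sequential wrapper in the spirit of SPDAS would run `pdas_step` to convergence for each value on a decreasing grid of thresholds (or an increasing sequence of model sizes), warm-starting each run from the previous solution; this corresponds to the warm-start and continuation strategy mentioned above.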

Citation: Lican Kang, Yuan Luo, Jerry Zhijian Yang, Chang Zhu. A primal and dual active set algorithm for truncated $L_1$ regularized logistic regression. Journal of Industrial and Management Optimization, doi: 10.3934/jimo.2022050
References:
[1]

P. Breheny and J. Huang, Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection, Ann. Appl. Stat., 5 (2011), 232-253.  doi: 10.1214/10-AOAS388.

[2]

J. Fan, Comments on "Wavelets in statistics: A review" by A. Antoniadis, Journal of the Italian Statistical Society, 6 (1997), 131-138. 

[3]

J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Amer. Statist. Assoc., 96 (2001), 1348-1360.  doi: 10.1198/016214501753382273.

[4]

J. Friedman, T. Hastie and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, Journal of Statistical Software, 33 (2010), 1. 

[5]

G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press, Baltimore, MD, 1996.

[6]

D. W. Hosmer, S. Lemeshow and R. X. Sturdivant, Applied Logistic Regression, John Wiley & Sons, 2013.

[7]

J. Huang, Y. Jiao, B. Jin, J. Liu, X. Lu and C. Yang, A unified primal dual active set algorithm for nonconvex sparse recovery, Statist. Sci., 36 (2021), 215-238.  doi: 10.1214/19-sts758.

[8]

J. Huang, Y. Jiao, L. Kang, J. Liu, Y. Liu and X. Lu, GSDAR: A fast Newton algorithm for $\ell_0 $ regularized generalized linear models with statistical guarantee, Comput. Statist., 37 (2022), 507-533.  doi: 10.1007/s00180-021-01098-z.

[9]

J. Huang, Y. Jiao, L. Kang, J. Liu, Y. Liu, X. Lu and Y. Yang, On Newton screening, preprint, arXiv:2001.10616, 2020.

[10]

Y. Kim, S. Kwon and H. Choi, Consistent model selection criteria on high dimensions, J. Mach. Learn. Res., 13 (2012), 1037-1057. 

[11]

D. G. Kleinbaum, K. Dietz, M. Gail and M. Klein, Logistic Regression, Springer-Verlag, New York, 2002.

[12]

X. Li, L. Yang, J. Ge, J. Haupt, T. Zhang and T. Zhao, On quadratic convergence of DC proximal Newton algorithm in nonconvex sparse learning, Advances in Neural Information Processing Systems, 30 (2017).

[13]

P. L. Loh and M. J. Wainwright, Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima, J. Mach. Learn. Res., 16 (2015), 559-616. 

[14]

S. Luo and Z. Chen, Sequential lasso cum EBIC for feature selection with ultra-high dimensional feature space, J. Amer. Statist. Assoc., 109 (2014), 1229-1240.  doi: 10.1080/01621459.2013.877275.

[15]

P. McCullagh and J. A. Nelder, Generalized Linear Models, 2nd edition, Monographs on Statistics and Applied Probability, Chapman & Hall, London, 1989.

[16]

J. A. Nelder and R. W. Wedderburn, Generalized linear models, Journal of the Royal Statistical Society: Series A (General), 135 (1972), 370-384.  doi: 10.2307/2344614.

[17]

Y. Nesterov, Gradient methods for minimizing composite functions, Math. Program., 140 (2013), 125-161.  doi: 10.1007/s10107-012-0629-5.

[18]

M. Y. Park and T. Hastie, $L_1$-regularization path algorithm for generalized linear models, J. R. Stat. Soc. Ser. B Stat. Methodol., 69 (2007), 659-677.  doi: 10.1111/j.1467-9868.2007.00607.x.

[19]

R. T. Rockafellar and R. J. B. Wets, Variational Analysis, Springer Science & Business Media, 1998. doi: 10.1007/978-3-642-02431-3.

[20]

J. Shen and P. Li, On the iteration complexity of support recovery via hard thresholding pursuit, International Conference on Machine Learning, (2017), 3115–3124.

[21]

Y. Shi, J. Huang, Y. Jiao and Q. Yang, A semismooth Newton algorithm for high-dimensional nonconvex sparse learning, IEEE Trans. Neural Netw. Learn. Syst., 31 (2020), 2993-3006.  doi: 10.1109/TNNLS.2019.2935001.

[22]

R. Tibshirani, Regression shrinkage and selection via the lasso, J. Roy. Statist. Soc. Ser. B, 58 (1996), 267-288.  doi: 10.1111/j.2517-6161.1996.tb02080.x.

[23]

S. A. Van de Geer, High-dimensional generalized linear models and the lasso, Ann. Statist., 36 (2008), 614-645.  doi: 10.1214/009053607000000929.

[24]

L. Wang, Y. Kim and R. Li, Calibrating non-convex penalized regression in ultra-high dimension, Ann. Statist., 41 (2013), 2505-2536.  doi: 10.1214/13-AOS1159.

[25]

R. Wang, N. Xiu and S. Zhou, Fast Newton method for sparse logistic regression, preprint, arXiv: 1901.02768, 2019.

[26]

Z. Wang, H. Liu and T. Zhang, Optimal computational and statistical rates of convergence for sparse nonconvex learning problems, Ann. Statist., 42 (2014), 2164-2201.  doi: 10.1214/14-AOS1238.

[27]

X. T. Yuan, P. Li and T. Zhang, Gradient hard thresholding pursuit, J. Mach. Learn. Res., 18 (2017), Paper No. 166, 43 pp.

[28]

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, J. R. Stat. Soc. Ser. B Stat. Methodol., 67 (2005), 301-320.  doi: 10.1111/j.1467-9868.2005.00503.x.

Table 1.  Numerical results with $ p = 10000 $, $ T = 20 $, $ R = 5 $, $ \rho = 0.5 $, and $ n $ from 800 to 1400 in steps of 100
$ n $ Method RE Time(s) APDR AFDR ADR MSES ACRP
800 Lasso 0.83 32.11 0.94 0.92 1.02 234.82 87.33%
MCP 0.40 93.74 0.94 0.34 1.60 29.45 93.18%
SCAD 0.37 104.31 0.97 0.65 1.32 58.03 92.97%
SPDAS 0.31 7.97 0.90 0.08 1.82 19.31 92.96%
900 Lasso 0.82 36.08 0.96 0.92 1.04 249.44 88.2%
MCP 0.32 99.57 0.97 0.30 1.67 28.42 94.22%
SCAD 0.31 93.22 0.98 0.61 1.37 54.05 94.05%
SPDAS 0.24 8.92 0.95 0.06 1.89 20.24 94.08%
1000 Lasso 0.81 38.91 0.97 0.92 1.05 258.63 89.04%
MCP 0.28 104.73 0.97 0.28 1.69 27.85 94.27%
SCAD 0.28 84.31 0.99 0.62 1.37 54.64 94.06%
SPDAS 0.20 11.13 0.97 0.06 1.91 20.53 94.71%
1100 Lasso 0.80 43.45 0.98 0.93 1.05 271.73 89.97%
MCP 0.23 107.16 0.99 0.27 1.72 28.03 95.02%
SCAD 0.22 87.03 0.99 0.61 1.38 53.44 94.75%
SPDAS 0.20 12.84 0.98 0.04 1.94 20.43 95.07%
1200 Lasso 0.79 46.66 0.98 0.92 1.06 280.43 90.38%
MCP 0.22 109.53 0.99 0.27 1.72 27.79 95.10%
SCAD 0.22 81.74 0.99 0.62 1.37 55.5 95.02%
SPDAS 0.18 16.00 0.99 0.04 1.95 20.47 95.24%
1300 Lasso 0.79 50.37 0.99 0.93 1.06 284.61 90.95%
MCP 0.17 105.95 0.99 0.25 1.74 27.3 95.4%
SCAD 0.19 82.91 1 0.61 1.39 53.28 95.35%
SPDAS 0.15 13.95 0.99 0.03 1.96 20.51 95.47%
1400 Lasso 0.78 52.82 0.99 0.93 1.06 285.5 91.25%
MCP 0.17 97.25 0.99 0.23 1.76 26.47 95.42%
SCAD 0.17 81.00 1 0.58 1.42 52.23 95.21%
SPDAS 0.13 13.82 0.99 0.02 1.97 20.28 95.73%
Table 2.  Numerical results with $ n = 1500 $, $ T = 20 $, $ R = 5 $, $ \rho = 0.5 $, and $ p $ from 18000 to 30000 in steps of 2000
$ p $ Method RE Time(s) APDR AFDR ADR MSES ACRP
18000 Lasso 0.79 81.13 1 0.95 1.05 329.51 91.04%
MCP 0.16 124.80 1 0.27 1.73 28.19 95.57%
SCAD 0.17 106.38 1 0.64 1.36 59.47 95.39%
SPDAS 0.13 28.37 1 0.02 1.98 20.41 95.58%
20000 Lasso 0.79 87.31 1 0.94 1.06 339.38 91.39%
MCP 0.17 133.90 1 0.28 1.72 28.58 95.76%
SCAD 0.18 113.40 1 0.65 1.35 61.63 95.69%
SPDAS 0.11 40.37 1 0.02 1.98 20.35 95.51%
22000 Lasso 0.79 91.23 1 0.94 1.06 339.04 91.21%
MCP 0.15 130.43 1 0.27 1.73 28 95.41%
SCAD 0.17 116.26 1 0.63 1.37 57.93 95.25%
SPDAS 0.12 45.11 1 0.02 1.98 20.36 95.58%
24000 Lasso 0.79 98.64 1 0.94 1.06 343.51 90.78%
MCP 0.17 137.09 1 0.28 1.72 28.87 95.23%
SCAD 0.19 120.02 1 0.66 1.34 61.97 95.13%
SPDAS 0.12 42.89 1 0.02 1.98 20.32 95.58%
26000 Lasso 0.79 103.52 1 0.94 1.06 352.07 91.04%
MCP 0.18 145.16 1 0.30 1.70 29.19 95.62%
SCAD 0.20 129.01 1 0.66 1.34 62.43 95.43%
SPDAS 0.12 41.18 1 0.02 1.98 20.29 95.65%
28000 Lasso 0.79 107.16 1 0.94 1.06 352.43 90.98%
MCP 0.17 149.79 1 0.31 1.69 29.7 95.62%
SCAD 0.18 132.48 1 0.66 1.34 62.7 95.55%
SPDAS 0.11 81.59 1 0.02 1.98 20.28 95.86%
30000 Lasso 0.80 111.74 1 0.94 1.06 356.71 91.00%
MCP 0.16 152.27 1 0.29 1.71 29.45 95.52%
SCAD 0.17 138.80 1 0.66 1.34 63.48 95.51%
SPDAS 0.12 57.00 1 0.02 1.98 20.33 95.67%
Table 3.  Numerical results with $ n = 1000 $, $ p = 10000 $, $ T = 20 $, $ R = 5 $, and $ \rho $ from 0.1 to 0.9 in steps of 0.1
$ \rho $ Method RE Time(s) APDR AFDR ADR MSES ACRP
0.1 Lasso 0.77 37.18 0.97 0.92 1.05 269.54 88.54%
MCP 0.25 113.53 0.98 0.39 1.59 33 93.83%
SCAD 0.26 75.32 0.99 0.69 1.30 66.58 93.60%
SPDAS 0.28 14.38 0.97 0.05 1.92 20.34 94%
0.2 Lasso 0.78 36.06 0.98 0.92 1.06 265.9 88.62%
MCP 0.26 108.64 0.98 0.35 1.63 31.42 93.54%
SCAD 0.26 75.77 0.99 0.68 1.31 65.23 93.36%
SPDAS 0.27 13.51 0.97 0.03 1.94 20.1 94.17%
0.3 Lasso 0.79 37.04 0.98 0.92 1.06 262.75 88.74%
MCP 0.26 103.542 0.98 0.33 1.65 30.13 93.81%
SCAD 0.26 77.27 0.99 0.66 1.33 62.27 93.47%
SPDAS 0.24 12.80 0.97 0.03 1.94 20.06 94.01%
0.4 Lasso 0.80 37.46 0.98 0.92 1.06 260.41 88.66%
MCP 0.27 103.60 0.98 0.32 1.66 29.7 94.14%
SCAD 0.30 76.73 0.99 0.64 1.35 58.87 93.91%
SPDAS 0.24 12.14 0.97 0.04 1.93 20.13 94.35%
0.5 Lasso 0.81 38.91 0.97 0.92 1.05 258.63 89.04%
MCP 0.28 104.73 0.97 0.28 1.69 27.85 94.27%
SCAD 0.28 84.31 0.99 0.62 1.37 54.64 94.06%
SPDAS 0.20 11.13 0.97 0.06 1.91 20.53 94.71%
0.6 Lasso 0.82 38.94 0.98 0.92 1.06 254.06 89.46%
MCP 0.30 95.63 0.97 0.27 1.70 27.24 94.52%
SCAD 0.27 78.83 0.99 0.58 1.41 50 94.42%
SPDAS 0.21 17.14 0.97 0.06 1.91 20.63 95.07%
0.7 Lasso 0.83 39.55 0.97 0.92 1.05 252.8 89.63%
MCP 0.26 90.63 0.98 0.23 1.75 26.27 95.27%
SCAD 0.25 85.02 0.99 0.55 1.44 47.05 95.25%
SPDAS 0.22 11.27 0.96 0.06 1.90 20.56 95.13%
0.8 Lasso 0.84 39.13 0.97 0.92 1.05 256.16 89.28%
MCP 0.28 91.57 0.98 0.21 1.77 25.82 95.54%
SCAD 0.25 85.04 0.99 0.53 1.46 45.73 95.3%
SPDAS 0.23 13.86 0.97 0.05 1.92 20.46 95.37%
0.9 Lasso 0.85 39.05 0.97 0.92 1.05 254.88 89.72%
MCP 0.27 89.77 0.98 0.20 1.78 25.11 95.48%
SCAD 0.29 86.91 0.99 0.50 1.49 42.69 95.31%
SPDAS 0.22 13.01 0.98 0.05 1.93 20.66 95.78%
Table 4.  Description of four real data sets
Data name   samples $ n $   features $ p $   training size $ n_1 $   testing size $ n_2 $
duke breast-cancer 42 7129 38 4
gisette 7000 5000 6000 1000
leukemia 72 7129 38 34
madelon 2600 500 2000 600
splice 3175 60 1000 2175
Table 5.  Classification accuracy rate
Data name SPDAS Lasso MCP SCAD
duke breast-cancer 75% 100% 25% 75%
gisette 54.70% 51.30% 59.90% 57.10%
leukemia 94.12% 91.17% 94.11% 91.17%
madelon 60.33% 61.50% 61.50% 62%
splice 84.18% 85.70% 84.91% 85.01%
Table 6.  The number of selected variables ($ \hat{T} $)
Data name SPDAS Lasso MCP SCAD
duke breast-cancer 14 23 5 17
gisette 47 507 49 121
leukemia 14 13 4 11
madelon 3 4 2 8
splice 22 40 26 33