
# A smoothing iterative method for quantile regression with nonconvex $\ell_p$ penalty

• *Corresponding author: Lingchen Kong
The first author is supported by the National Natural Science Foundation of China (11431002, 11171018).
• The high-dimensional linear regression model has attracted much attention in areas such as information technology, biology, chemometrics, economics, finance, and other scientific fields. In this paper, we use smoothing techniques to handle high-dimensional sparse models via quantile regression with the nonconvex $\ell_p$ penalty ($0<p<1$). We introduce two kinds of smoothing functions and estimate the approximation error for each. By smoothing the quantile loss function, we derive two types of lower bounds for the nonzero entries of any local solution of the smoothed quantile regression with the nonconvex $\ell_p$ penalty. Then, with the help of $\ell_1$ regularization, we propose a smoothing iterative method for the smoothed quantile regression with the weighted $\ell_1$ penalty and establish its global convergence; its efficient performance is illustrated by numerical experiments.
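The smoothing idea described above can be illustrated with a small numerical sketch. The block below implements the quantile check function $\rho_\tau(\theta)=\theta\,(\tau-1_{\{\theta<0\}})$ together with a generic $C^1$ quadratic smoothing on $[-\delta,\delta]$ that matches the check function in value and slope at $\pm\delta$. This particular construction is an assumption for illustration only and need not coincide with the paper's $\rho_{\tau,1}$ or $\rho_{\tau,2}$.

```python
import numpy as np

def check_loss(theta, tau):
    """Quantile check function rho_tau(theta) = theta * (tau - 1{theta < 0})."""
    theta = np.asarray(theta, dtype=float)
    return theta * (tau - (theta < 0))

def smoothed_check_loss(theta, tau, delta):
    """C^1 quadratic smoothing of the check function on [-delta, delta].

    The quadratic theta^2/(4 delta) + (tau - 1/2) theta + delta/4 matches
    rho_tau in value and slope at theta = +/-delta, so the maximum
    approximation gap is delta/4 (attained at theta = 0).  This is a
    generic construction, not necessarily the paper's rho_{tau,1}.
    """
    theta = np.asarray(theta, dtype=float)
    quad = theta**2 / (4 * delta) + (tau - 0.5) * theta + delta / 4
    return np.where(np.abs(theta) <= delta, quad, check_loss(theta, tau))
```

Outside $[-\delta,\delta]$ the smoothed loss equals the check function exactly, so the smoothing error is uniformly bounded by a quantity of order $\delta$, which is the kind of approximation estimate the paper derives for its smoothing functions.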

Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.


• Figure 1.  Function $\rho_\tau(\theta)$ and its smoothing relaxation $\rho_{\tau, 1}(\theta, \delta)$

Figure 2.  Function $\varrho_\tau(\theta, \delta)$ and its smoothing relaxation $\rho_{\tau, 2}(\theta, \delta)$

Figure 3.  Errors for different τ, n, and noise in Example 1

Figure 4.  Errors for different τ, n, and noise in Example 2

Table 1.  The framework of MIRL1

Modified iterative reweighted $\ell_1$ minimization (MIRL1)
Initialize $\tau, \gamma\in(0, 1), \delta^1>0, \epsilon>0, \beta^{0}, w^{1}, M, \mu^{1}$;
For $t=1:M$
&nbsp;&nbsp;Initialize $\beta^{t, 1}=\beta^{t-1}$;
&nbsp;&nbsp;While $\|\beta^{t, k+1}-\beta^{t, k}\|_2\geq\mu^t \max\left\{1, \|\beta^{t, k}\|_2\right\}$
&nbsp;&nbsp;&nbsp;&nbsp;Compute $L_{t, k}\geq \frac{1}{2\delta^t }\min\Big\{\lambda_{\max}\Big(\sum_{i\in\Omega\left(\beta^{t, k}, \delta^t \right)}x_{i}x_{i}^\top \Big), \lambda_{\max}(XX^\top )\Big\}$;
&nbsp;&nbsp;&nbsp;&nbsp;Compute $\widetilde{\beta}^{t, k}=\beta^{t, k}-\nabla S_\tau(\beta^{t, k}, \delta^t )/(L_{t, k}+\epsilon)$;
&nbsp;&nbsp;&nbsp;&nbsp;Compute $\beta^{t, k+1}=\text{sign}\left(\widetilde{\beta}^{t, k}\right)\circ\max\left\{|\widetilde{\beta}^{t, k}|-\frac{\mu^t }{L_{t, k}+\epsilon}w^t, 0\right\}$.
&nbsp;&nbsp;End
&nbsp;&nbsp;Update $\delta^{t+1}=\gamma\delta^t$ and $\beta^t =\beta^{t, k+1}$;
&nbsp;&nbsp;Update $w^{t+1}$ from $\beta^{t-1}$ and $\beta^t$ based on (46), (47) and (48);
End
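The core inner update of Table 1 is a gradient step on the smoothed quantile loss followed by weighted $\ell_1$ soft-thresholding. A minimal sketch of that one step is given below; the gradient $\nabla S_\tau(\beta,\delta)$ depends on the paper's particular smoothing function, so here it is passed in as an argument, and the function names are illustrative rather than taken from any reference implementation.

```python
import numpy as np

def weighted_soft_threshold(z, thresh):
    """Componentwise sign(z) * max(|z| - thresh, 0): the proximal map of
    the weighted l1 penalty (thresh may be a vector of per-coordinate
    thresholds mu * w / (L + eps))."""
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

def mirl1_inner_step(beta, grad, L, eps, mu, w):
    """One inner update of the MIRL1 loop in Table 1.

    `grad` stands in for nabla S_tau(beta, delta); computing it requires
    the paper's smoothing function, so it is supplied by the caller.
    """
    # Gradient step on the smoothed quantile loss.
    beta_tilde = beta - grad / (L + eps)
    # Weighted soft-thresholding with per-coordinate weights w.
    return weighted_soft_threshold(beta_tilde, (mu / (L + eps)) * w)
```

The outer loop then shrinks the smoothing parameter ($\delta^{t+1}=\gamma\delta^t$) and refreshes the weights $w^{t+1}$ from the current iterate, which is what distinguishes this reweighted scheme from a plain $\ell_1$ proximal-gradient method.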

Table 2.  $n=512, m=128$

| $\tau$ | Noise | $\|\widehat{\beta}-\beta^*\|_2$ | $\frac{1}{\sqrt{m}}\|X\widehat{\beta}-X\beta^*\|_2$ | FPR | TPR | Time(s) |
|---|---|---|---|---|---|---|
| 0.1 | $N(0, \sigma^2)$ | 0.0063 | 0.0078 | 0 | 1.0000 | 2.7474 |
| 0.1 | $LN(0, \sigma^2)$ | 0.0146 | 0.0177 | 0 | 0.9429 | 3.0230 |
| 0.3 | $N(0, \sigma^2)$ | 0.0067 | 0.0091 | 0 | 1.0000 | 2.4950 |
| 0.3 | $LN(0, \sigma^2)$ | 0.0167 | 0.0209 | 0 | 0.8762 | 2.2424 |
| 0.5 | $N(0, \sigma^2)$ | 0.0082 | 0.0097 | 0 | 1.0000 | 2.2793 |
| 0.5 | $LN(0, \sigma^2)$ | 0.0175 | 0.0201 | 0 | 0.8563 | 2.1709 |
| 0.7 | $N(0, \sigma^2)$ | 0.0069 | 0.0093 | 0 | 1.0000 | 2.4489 |
| 0.7 | $LN(0, \sigma^2)$ | 0.0208 | 0.0258 | 0 | 0.8028 | 2.4478 |
| 0.9 | $N(0, \sigma^2)$ | 0.0062 | 0.0077 | 0 | 1.0000 | 2.6616 |
| 0.9 | $LN(0, \sigma^2)$ | 0.0243 | 0.0299 | 0 | 0.6254 | 2.7624 |

Table 3.  $n=1024, m=256$

| $\tau$ | Noise | $\|\widehat{\beta}-\beta^*\|_2$ | $\frac{1}{\sqrt{m}}\|X\widehat{\beta}-X\beta^*\|_2$ | FPR | TPR | Time(s) |
|---|---|---|---|---|---|---|
| 0.1 | $N(0, \sigma^2)$ | 0.0069 | 0.0091 | 0 | 1.0000 | 13.3403 |
| 0.1 | $LN(0, \sigma^2)$ | 0.0194 | 0.0257 | 0 | 0.9818 | 15.4523 |
| 0.3 | $N(0, \sigma^2)$ | 0.0070 | 0.0093 | 0 | 1.0000 | 10.1645 |
| 0.3 | $LN(0, \sigma^2)$ | 0.0193 | 0.0244 | 0 | 1.0000 | 11.2862 |
| 0.5 | $N(0, \sigma^2)$ | 0.0076 | 0.0096 | 0 | 1.0000 | 11.4844 |
| 0.5 | $LN(0, \sigma^2)$ | 0.0201 | 0.0252 | 0 | 1.0000 | 11.0627 |
| 0.7 | $N(0, \sigma^2)$ | 0.0074 | 0.0097 | 0 | 1.0000 | 12.2611 |
| 0.7 | $LN(0, \sigma^2)$ | 0.0206 | 0.0264 | 0 | 0.9818 | 12.3306 |
| 0.9 | $N(0, \sigma^2)$ | 0.0070 | 0.0093 | 0 | 1.0000 | 13.6169 |
| 0.9 | $LN(0, \sigma^2)$ | 0.0226 | 0.0286 | 0 | 0.9538 | 14.1217 |

Table 4.  $n=512, m=128$

| $\tau$ | Noise | $\|\widehat{\beta}-\beta^*\|_2$ | $\frac{1}{\sqrt{m}}\|X\widehat{\beta}-X\beta^*\|_2$ | FPR | TPR | Time(s) |
|---|---|---|---|---|---|---|
| 0.1 | $N(0, \sigma^2)$ | 0.0071 | 0.0096 | 0 | 1.0000 | 3.3895 |
| 0.1 | $LN(0, \sigma^2)$ | 0.0185 | 0.0242 | 0 | 1.0000 | 2.5908 |
| 0.3 | $N(0, \sigma^2)$ | 0.0071 | 0.0098 | 0 | 1.0000 | 2.6341 |
| 0.3 | $LN(0, \sigma^2)$ | 0.0189 | 0.0246 | 0 | 1.0000 | 2.1094 |
| 0.5 | $N(0, \sigma^2)$ | 0.0071 | 0.0094 | 0 | 1.0000 | 1.7425 |
| 0.5 | $LN(0, \sigma^2)$ | 0.0184 | 0.0224 | 0 | 1.0000 | 1.3525 |
| 0.7 | $N(0, \sigma^2)$ | 0.0070 | 0.0095 | 0 | 1.0000 | 1.6148 |
| 0.7 | $LN(0, \sigma^2)$ | 0.0196 | 0.0249 | 0 | 0.9667 | 1.3640 |
| 0.9 | $N(0, \sigma^2)$ | 0.0072 | 0.0095 | 0 | 1.0000 | 2.3751 |
| 0.9 | $LN(0, \sigma^2)$ | 0.0186 | 0.0227 | 0 | 1.0000 | 2.8742 |

Table 5.  $n=1024, m=256$

| $\tau$ | Noise | $\|\widehat{\beta}-\beta^*\|_2$ | $\frac{1}{\sqrt{m}}\|X\widehat{\beta}-X\beta^*\|_2$ | FPR | TPR | Time(s) |
|---|---|---|---|---|---|---|
| 0.1 | $N(0, \sigma^2)$ | 0.0071 | 0.0096 | 0 | 1.0000 | 17.5535 |
| 0.1 | $LN(0, \sigma^2)$ | 0.0186 | 0.0246 | 0 | 1.0000 | 13.5658 |
| 0.3 | $N(0, \sigma^2)$ | 0.0074 | 0.0093 | 0 | 1.0000 | 13.3480 |
| 0.3 | $LN(0, \sigma^2)$ | 0.0196 | 0.0261 | 0 | 1.0000 | 8.8331 |
| 0.5 | $N(0, \sigma^2)$ | 0.0076 | 0.0094 | 0 | 1.0000 | 9.8347 |
| 0.5 | $LN(0, \sigma^2)$ | 0.0183 | 0.0241 | 0 | 1.0000 | 7.6007 |
| 0.7 | $N(0, \sigma^2)$ | 0.0069 | 0.0093 | 0 | 1.0000 | 13.6823 |
| 0.7 | $LN(0, \sigma^2)$ | 0.0200 | 0.0272 | 0 | 1.0000 | 7.9791 |
| 0.9 | $N(0, \sigma^2)$ | 0.0072 | 0.0094 | 0 | 1.0000 | 18.5440 |
| 0.9 | $LN(0, \sigma^2)$ | 0.0184 | 0.0249 | 0 | 1.0000 | 14.3975 |
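The FPR and TPR columns in Tables 2–5 measure support recovery: TPR is the fraction of true nonzero coefficients of $\beta^*$ that $\widehat{\beta}$ also marks as nonzero, and FPR is the fraction of true zeros wrongly marked as nonzero. A minimal sketch of how they can be computed is below; the tolerance `tol` for deciding when an entry counts as nonzero is an implementation choice not specified by the tables.

```python
import numpy as np

def support_metrics(beta_hat, beta_star, tol=1e-6):
    """FPR and TPR of the recovered support of beta_hat vs. beta_star.

    An entry counts as nonzero when its magnitude exceeds `tol`
    (an assumed convention, not taken from the paper).
    """
    est = np.abs(np.asarray(beta_hat, dtype=float)) > tol
    tru = np.abs(np.asarray(beta_star, dtype=float)) > tol
    # TPR: true nonzeros correctly detected / all true nonzeros.
    tpr = np.sum(est & tru) / max(np.sum(tru), 1)
    # FPR: true zeros wrongly detected / all true zeros.
    fpr = np.sum(est & ~tru) / max(np.sum(~tru), 1)
    return fpr, tpr
```

Under this convention, the FPR = 0 and TPR = 1.0000 rows in the tables correspond to exact recovery of the support of $\beta^*$.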
