Communications on Pure and Applied Analysis, August 2020, 19(8): 4023-4054. doi: 10.3934/cpaa.2020178

Convergence of online pairwise regression learning with quadratic loss

Shuhua Wang, Zhenlong Chen and Baohuai Sheng

1. School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, 310018, China
2. Department of Applied Statistics, Shaoxing University, Shaoxing, 312000, China

* Corresponding author

Received: July 2019. Revised: November 2019. Published: May 2020.

Fund Project: This work is supported by the National Natural Science Foundation of China under Grant Nos. 61877039 and 11971432, and by the Humanities and Social Sciences Research Project of the Ministry of Education (18YJA910001).

Recent investigations into the error analysis of kernel regularized pairwise learning have initiated theoretical research on pairwise reproducing kernel Hilbert spaces (PRKHSs). In the present paper, we provide a method of constructing PRKHSs from classical Jacobi orthogonal polynomials. The performance of kernel regularized online pairwise regression learning algorithms based on a quadratic loss function is then investigated. Applying convex analysis and Rademacher complexity techniques, we derive explicit bounds for the generalization error. It is shown that the convergence rate can be greatly improved by adjusting the scale parameters in the loss function.

Citation: Shuhua Wang, Zhenlong Chen, Baohuai Sheng. Convergence of online pairwise regression learning with quadratic loss. Communications on Pure and Applied Analysis, 2020, 19(8): 4023-4054. doi: 10.3934/cpaa.2020178
References:
[1] S. Agarwal and P. Niyogi, Generalization bounds for ranking algorithms via algorithmic stability, J. Mach. Learn. Res., 10 (2009), 441-474.
[2] R. Aktas, A. Altin and F. Tasdelen, A note on a family of two-variable polynomials, J. Comput. Appl. Math., 235 (2011), 4825-4833. doi: 10.1016/j.cam.2010.11.005.
[3] A. Altin, R. Aktas and E. Erkus-Duman, On a multivariable extension for the extended Jacobi polynomials, J. Math. Anal. Appl., 353 (2009), 121-133. doi: 10.1016/j.jmaa.2008.11.070.
[4] P. L. Bartlett, O. Bousquet and S. Mendelson, Local Rademacher complexities, Ann. Stat., 33 (2005), 1497-1537. doi: 10.1214/009053605000000282.
[5] P. L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results, J. Mach. Learn. Res., 3 (2002), 463-482. doi: 10.1162/153244303321897690.
[6] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer-Verlag, New York, 2010. doi: 10.1007/978-1-4419-9467-7.
[7] A. Bezubik, J. Hrivnák and S. Pošta, Two dimensional symmetric and antisymmetric exponential functions and orthogonal polynomials, J. Phys. Conf. Ser., 411 (2013), Art. 012007. doi: 10.1088/1742-6596/411/1/012007.
[8] M. Boissier, S. Lyu, Y. Ying and D. X. Zhou, Fast convergence of online pairwise learning algorithms, in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, (2016), 204-212.
[9] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, 2004. doi: 10.1017/CBO9780511804441.
[10] C. Brunner, A. Fischer, K. Luig and T. Thies, Pairwise support vector machines and their application to large scale problems, J. Mach. Learn. Res., 13 (2012), 2279-2292.
[11] Q. Cao, Z. C. Guo and Y. M. Ying, Generalization bounds for metric and similarity learning, Mach. Learn., 102 (2016), 115-132. doi: 10.1007/s10994-015-5499-7.
[12] H. Chen, Z. B. Pan and L. Q. Li, Learning performance of coefficient-based regularized ranking, Neurocomputing, 133 (2014), 54-62.
[13] H. Chen and J. T. Wu, Regularized ranking with convex losses and $\ell^{1}$ penalty, Abstr. Appl. Anal., (2013), Art. 927827, 8 pp. doi: 10.1155/2013/927827.
[14] R. H. Chen and Z. M. Wu, Solving partial differential equation by using multiquadric quasi-interpolation, Appl. Math. Comput., 186 (2007), 1502-1510. doi: 10.1016/j.amc.2006.07.160.
[15] X. M. Chen and Y. W. Lei, Refined bounds for online pairwise learning algorithms, Neurocomputing, 275 (2018), 2656-2665.
[16] C. S. Ong and A. J. Smola, Hyperkernels, Conference Paper, 2002. Available from: https://www.researchgate.net/publication/221618045.
[17] C. S. Ong, A. J. Smola and R. C. Williamson, Learning the kernel with hyperkernels, J. Mach. Learn. Res., 6 (2005), 1043-1071.
[18] A. Christmann and D. X. Zhou, On the robustness of regularized pairwise learning methods based on kernels, J. Complexity, 37 (2016), 1-33. doi: 10.1016/j.jco.2016.07.001.
[19] F. Cucker and S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc., 39 (2002), 1-49. doi: 10.1090/S0273-0979-01-00923-5.
[20] F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, New York, 2007. doi: 10.1017/CBO9780511618796.
[21] S. R. Das, Kuhoo, D. Mishra and M. Rout, An optimized feature reduction based currency forecasting model exploring the online sequential extreme learning machine and krill herd strategies, Physica A, 513 (2019), 339-370.
[22] C. F. Dunkl and Y. Xu, Orthogonal Polynomials of Several Variables, Cambridge University Press, 2014. doi: 10.1017/CBO9781107786134.
[23] Q. Fang, M. Xu and Y. M. Ying, Faster convergence of a randomized coordinate descent method for linearly constrained optimization problems, Anal. Appl., 16 (2018), 741-755. doi: 10.1142/S0219530518500082.
[24] L. Fernández, T. E. Pérez and M. A. Piñar, On Koornwinder classical orthogonal polynomials in two variables, J. Comput. Appl. Math., 236 (2012), 3817-3826. doi: 10.1016/j.cam.2011.08.017.
[25] W. W. Gao and Z. G. Wang, A meshless scheme for partial differential equations based on multiquadric trigonometric B-spline quasi-interpolation, Chin. Phys. B, 23 (2014), Art. 110207.
[26] Z. C. Guo, Y. M. Ying and D. X. Zhou, Online regularized learning with pairwise loss functions, Adv. Comput. Math., 43 (2017), 127-150. doi: 10.1007/s10444-016-9479-7.
[27] T. Hu, J. Fan and D. H. Xiang, Convergence analysis of distributed multi-penalty regularized pairwise learning, Anal. Appl., 18 (2020), 109-127. doi: 10.1142/S0219530519410045.
[28] S. Karlin and J. McGregor, Determinants of orthogonal polynomials, Bull. Amer. Math. Soc., 68 (1962), 204-209. doi: 10.1090/S0002-9904-1962-10749-6.
[29] J. Kiefer and J. Wolfowitz, Stochastic estimation of the maximum of a regression function, Ann. Math. Statist., 23 (1952), 462-466. doi: 10.1214/aoms/1177729392.
[30] A. Klimyk and J. Patera, (Anti)symmetric multivariable exponential functions and corresponding Fourier transforms, J. Phys. A Math. Theor., 40 (2007), 10473-10489. doi: 10.1088/1751-8113/40/34/006.
[31] T. Koornwinder, Two-variable analogues of the classical orthogonal polynomials, in Theory and Application of Special Functions (ed. R. A. Askey), Academic Press, New York, (1975), 435-495.
[32] T. Koornwinder, Yet another proof of the addition formula for Jacobi polynomials, J. Math. Anal. Appl., 61 (1977), 136-141. doi: 10.1016/0022-247X(77)90149-4.
[33] T. Koornwinder, Orthogonal polynomials in two variables which are eigenfunctions of two algebraically independent partial differential operators, I, II, Indag. Math., 36 (1974), 48-66.
[34] H. L. Krall and I. M. Sheffer, Orthogonal polynomials in two variables, Ann. Mat. Pura Appl., 76 (1967), 325-376. doi: 10.1007/BF02412238.
[35] G. Kriukova, S. V. Pereverzyev and P. Tkachenko, On the convergence rate and some applications of regularized ranking algorithms, J. Complexity, 33 (2016), 14-29. doi: 10.1016/j.jco.2015.09.004.
[36] D. W. Lee, Partial differential equations for products of two classical orthogonal polynomials, Bull. Korean Math. Soc., 42 (2005), 179-188. doi: 10.4134/BKMS.2005.42.1.179.
[37] S. B. Lin, J. S. Zeng, J. Fang and Z. B. Xu, Learning rates of $l^q$ coefficient regularization learning with Gaussian kernel, Neural Comput., 26 (2014), 2359-2378. doi: 10.1162/NECO_a_00641.
[38] J. H. Lin and D. X. Zhou, Online learning algorithms can converge comparably fast as batch learning, IEEE Trans. Neural Netw. Learn. Syst., 29 (2017), 2367-2378. doi: 10.1109/TNNLS.2017.2677970.
[39] J. H. Lin, Y. W. Lei, B. Zhang and D. X. Zhou, Online pairwise learning algorithms with convex loss functions, Inform. Sci., 406-407 (2017), 57-70.
[40] H. X. Liu, B. H. Sheng and P. X. Ye, The improved learning rate for regularized regression with RKBSs, Int. J. Mach. Learn. Cybern., 8 (2017), 1235-1245.
[41] P. Malave, Multi-dimensional Orthogonal Polynomials, Ph.D thesis, Marathwada University, Aurangabad, 1979.
[42] L. Mason, J. Baxter, P. Bartlett and M. Frean, Functional Gradient Techniques for Combining Hypotheses, MIT Press, 1999.
[43] R. Meir and T. Zhang, Generalization error bounds for Bayesian mixture algorithms, J. Mach. Learn. Res., 4 (2003), 839-860. doi: 10.1162/1532443041424300.
[44] G. V. Milovanović, G. Öztürk and R. Aktas, Properties of some of two-variable orthogonal polynomials, Bull. Malays. Math. Sci. Soc., 43 (2020), 1403-1431. doi: 10.1007/s40840-019-00750-8.
[45] T. Pahikkala, M. Viljanen, A. Airola and W. Waegeman, Spectral analysis of symmetric and antisymmetric pairwise kernels, preprint, arXiv:1506.05950v1, (2015), 19 pp.
[46] N. Ratliff and J. Bagnell, Kernel conjugate gradient for fast kernel machines, in IJCAI, 20 (2007), 1017-1021.
[47] W. Rejchel, On ranking and generalization bounds, J. Mach. Learn. Res., 13 (2012), 1373-1392.
[48] H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statist., 22 (1951), 400-407. doi: 10.1214/aoms/1177729586.
[49] H. Rosengren, Multivariable Christoffel-Darboux kernels and characteristic polynomials of random Hermitian matrices, Symmetry Integr. Geom. Meth. Appl., 2 (2006), Art. 085, 12 pp. doi: 10.3842/SIGMA.2006.085.
[50] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.
[51] B. H. Sheng, The weighted norm for some Mercer kernel matrices, Acta Math. Sci., 33A (2013), 6-15 (in Chinese).
[52] B. H. Sheng, L. Q. Duan and P. X. Ye, Strong convex loss can increase the learning rates of online learning, J. Computers, 9 (2014), 1606-1611.
[53] B. H. Sheng and J. L. Wang, On the K-functional in learning theory, Anal. Appl., 18 (2020), 423-446. doi: 10.1142/S0219530519500192.
[54] B. H. Sheng and D. H. Xiang, The performance of semi-supervised Laplacian regularized regression with least square loss, Int. J. Wavelets Multiresolut. Inform. Process., 15 (2017), Art. 1750016, 31 pp. doi: 10.1142/S0219691317500163.
[55] B. H. Sheng and H. Z. Zhang, Performance analysis of the LapRSSLG algorithm in learning theory, Anal. Appl., 18 (2020), 79-108. doi: 10.1142/S0219530519410033.
[56] B. H. Sheng and H. C. Zhu, The convergence rate of semi-supervised regression with quadratic loss, Appl. Math. Comput., 321 (2018), 11-24. doi: 10.1016/j.amc.2017.10.033.
[57] S. Smale and D. X. Zhou, Estimating the approximation error in learning theory, Anal. Appl., 1 (2003), 1-25. doi: 10.1142/S0219530503000089.
[58] S. Smale and Y. Yao, Online learning algorithms, Found. Comput. Math., 6 (2006), 145-170. doi: 10.1007/s10208-004-0160-z.
[59] I. Sprinkhuizen-Kuyper, Orthogonal polynomials in two variables. A further analysis of the polynomials orthogonal over a region bounded by two lines and a parabola, SIAM J. Math. Anal., 7 (1976), 501-518. doi: 10.1137/0507041.
[60] I. Steinwart and C. Scovel, Mercer's theorem on general domains: on the interaction between measures, kernels, and RKHSs, Constr. Approx., 35 (2012), 363-417. doi: 10.1007/s00365-012-9153-3.
[61] H. Sun, Mercer theorem for RKHS on noncompact sets, J. Complexity, 21 (2005), 337-349. doi: 10.1016/j.jco.2004.09.002.
[62] S. H. Wang, Z. L. Chen and B. H. Sheng, The convergence rate of SVM for kernel-based robust regression, Int. J. Wavelets Multiresolut. Inform. Process., 17 (2019), Art. 1950004. doi: 10.1142/S0219691319500048.
[63] C. Wang and T. Hu, Online regularized pairwise learning with least squares loss, Anal. Appl., 18 (2020), 49-78. doi: 10.1142/S0219530519410070.
[64] Y. Wang, R. Khardon, D. Pechyony and R. Jones, Generalization bounds for online learning algorithms with pairwise loss functions, in JMLR: Workshop and Conference Proceedings, Annual Conference on Learning Theory, 23 (2012), 13.1-13.22.
[65] Q. Wu, Y. M. Ying and D. X. Zhou, Multi-kernel regularized classifiers, J. Complexity, 23 (2007), 108-134. doi: 10.1016/j.jco.2006.06.007.
[66] Z. M. Wu, Dynamical knot and shape parameter setting for simulating shock wave by using multi-quadric quasi-interpolation, Eng. Anal. Bound. Elem., 29 (2005), 354-358.
[67] Z. M. Wu and R. Schaback, Shape preserving properties and convergence of univariate multiquadric quasi-interpolation, Acta Math. Appl. Sin., 10 (1994), 441-446. doi: 10.1007/BF02016334.
[68] D. H. Xiang, Conditional quantiles with varying Gaussians, Adv. Comput. Math., 38 (2013), 723-735. doi: 10.1007/s10444-011-9257-5.
[69] G. B. Ye and D. X. Zhou, Fully online classification by regularization, Appl. Comput. Harmon. Anal., 23 (2007), 198-214. doi: 10.1016/j.acha.2006.12.001.
[70] Y. M. Ying and D. X. Zhou, Learnability of Gaussians with flexible variances, J. Mach. Learn. Res., 8 (2007), 249-276.
[71] Y. M. Ying and D. X. Zhou, Online regularized classification algorithms, IEEE Trans. Inform. Theory, 52 (2006), 4775-4788. doi: 10.1109/TIT.2006.883632.
[72] Y. M. Ying and D. X. Zhou, Online pairwise learning algorithms, Neural Comput., 28 (2016), 743-777. doi: 10.1162/neco_a_00817.
[73] Y. M. Ying and D. X. Zhou, Unregularized online learning algorithms with general loss functions, Appl. Comput. Harmon. Anal., 42 (2017), 224-244. doi: 10.1016/j.acha.2015.08.007.
[74] Y. X. Zeng and D. Klabjian, Online adaptive machine learning based algorithm for implied volatility surface modeling, Knowledge-Based Syst., 163 (2019), 376-391.
[75] L. L. Zhang and B. H. Sheng, Online regularized generalized gradient classification algorithms, Anal. Theory Appl., 26 (2010), 278-300. doi: 10.1007/s10496-010-0278-6.
[76] Y. L. Zhao, J. Fan and L. Shi, Learning rates for regularized least squares ranking algorithm, Anal. Appl., 15 (2017), 815-836. doi: 10.1142/S0219530517500063.