# American Institute of Mathematical Sciences

April  2016, 12(2): 543-563. doi: 10.3934/jimo.2016.12.543

## Regularized multidimensional scaling with radial basis functions

 1 School of Mathematics, University of Southampton, United Kingdom, United Kingdom

Received  August 2014 Revised  January 2015 Published  June 2015

The classical Multi-Dimensional Scaling (MDS) is an important method for data dimension reduction. Nonlinear variants have been developed to improve its performance. One of them is the MDS with Radial Basis Functions (RBF). A key issue that has not been well addressed in MDS-RBF is the effective selection of its centers. This paper treats this selection problem as a multi-task learning problem, which leads us to employ the $(2,1)$-norm to regularize the original MDS-RBF objective function. We then study its two reformulations: Diagonal and spectral reformulations. Both can be effectively solved through an iterative block-majorization method. Numerical experiments show that the regularized models can improve the original model significantly.
Citation: Sohana Jahan, Hou-Duo Qi. Regularized multidimensional scaling with radial basis functions. Journal of Industrial and Management Optimization, 2016, 12 (2) : 543-563. doi: 10.3934/jimo.2016.12.543
##### References:
 [1] A. Argyriou, T. Evgeniou and M. Pontil, Multi-task Feature Learning, in Advances in Neural Information Processing Systems (eds. B. Schoelkopf, J. Platt, and T. Hoffman), MIT Press, 2007. [2] A. Argyriou, T. Evgeniou and M. Pontil, Convex Multi-task Feature Learning, Machine Learning, Special Issue on Inductive Transfer Learning, 73 (2008), 243-272. doi: 10.2139/ssrn.1031158. [3] J. Bénasséni, Partial additive constant, J. Statist. Comput. Simul., 49 (1994), 179-193. [4] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling. Theory and Applications, $2^{nd}$ edition, Springer Series in Statistics, Springer, 2005. [5] F. Cailliez, The analytical solution of the additive constant problem, Psychometrika, 48 (1983), 305-308. doi: 10.1007/BF02294026. [6] H. G. Chew and C. C. Lim, On regularisation parameter transformation of support vector machines, Journal of Industrial and Management Optimization, 5 (2009), 403-415. doi: 10.3934/jimo.2009.5.403. [7] L. G. Cooper, A new solution to the additive constant problem in metric and multidimensional scaling, Psychometrika, 37 (1972), 311-321. [8] T. F. Cox and M. A. Cox, Multidimensional Scaling, $2^{nd}$ edition, Chapman and Hall/CRC, 2002. doi: 10.1007/978-3-540-33037-0_14. [9] J. de Leeuw, Applications of convex analysis to multidimensional scaling, in Recent Developments in Statistics (eds. J. Barra, F. Brodeau, G. Romier, and B. van Cutsen), North Holland Publishing Company, Amsterdem, The Netherlands, 133-145. [10] J. de Leeuw, Block relaxation algorithms in statistics, in Information Systems and Data Analysis (eds. Bock, H.H. et al.), Springer, Berlin (1994), 308-325. doi: 10.1007/978-3-642-46808-7_28. [11] W. Glunt, T. L. Hayden, S. Hong and J. Wells, An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM J. Matrix Anal. Appl., 11 (1990), 589-600. doi: 10.1137/0611042. [12] W. Glunt, T. L. Hayden and R. Raydan, Molecular conformations from distance matrices, J. Computational Chemistry, 14 (1993), 114-120. doi: 10.1002/jcc.540140115. [13] J. C. Gower, Some distance properties of latent rootand vector methods in multivariate analysis, Biometrika, 53 (1966), 315-328. doi: 10.1093/biomet/53.3-4.325. [14] Y. Hao and F. Meng, A new method on gene selection for tissue classification, Journal of Industrial and Management Optimization, 3 (2007), 739-748. doi: 10.3934/jimo.2007.3.739. [15] J. Kruskal, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika, 29 (1964), 1-27. doi: 10.1007/BF02289565. [16] K. V. Mardia, J. T. Kent and J. M. Bibby, Multivariate Analysis, $10^{th}$ printing, Academic Press, 1995. [17] S. J. Messick and R. P Abelson, The additive constant problem in multidimensional scaling, Psychometrika, 21 (1956), 1-15. [18] E. Pękalaska and R. P. W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Application, Series in Machine Perception Artificial Intelligence 64, World Scientific 2005. [19] H.-D. Qi, A semismooth Newton method for the nearest Euclidean distance matrix problem, SIAM Journal Matrix Analysis and Applications, 34 (2013), 67-93. doi: 10.1137/110849523. [20] H.-D. Qi and N. Xiu, A convex quadratic semidefinite programming approach to the partial additive constant problem in multidimensional scaling, Journal of Statistical Computation and Simulation, 82 (2012), 1317-1336. doi: 10.1080/00949655.2011.579970. [21] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826. doi: 10.1109/TSP.2013.2264814. [22] H.-D. Qi and X. M. Yuan, Computing the nearest Euclidean distance matrix with low embedding dimensions, Mathematical Programming, doi:10.1007/s10107-013-0726-0. doi: 10.1007/s10107-013-0726-0. [23] K. Schittkowski, Optimal parameter selection in support vector machines, Journal of Industrial and Management Optimization, 1 (2005), 465-476. doi: 10.3934/jimo.2005.1.465. [24] I. J. Schoenberg, Remarks to Maurice Fréchet's article "Sur la définition axiomatque d'une classe d'espaces vectoriels distanciés applicbles vectoriellement sur l'espace de Hilbet'', Ann. Math., 36 (1935), 724-732. doi: 10.2307/1968654. [25] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009. doi: 10.1016/B0-12-227240-4/00132-5. [26] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB approach, Elsevier Inc., 2010. [27] W. S. Torgerson, Theory and Methods for Scaling, Wiley, New York, 1958. [28] A. R. Webb, Multidimensional Scaling by iterative majorization using radial basis functions, Pattern Recognition, 28 (1995), 753-759. doi: 10.1016/0031-3203(94)00135-9. [29] A. R. Webb, Nonlinear feature extraction with radial basis functions using a weighted multidimensional scaling stress measure, Pattern Recognition, IEEE Conference Publications, 4 (1996), 635-639. doi: 10.1109/ICPR.1996.547642. [30] A. R. Webb, An approach to nonlinear principal component analysis using radially-symmetric kernel functions, Statistics and Computing, 6 (1996), 159-168. [31] G. Young and A. S. Householder, Discussion of a set of points in terms of their mutual distances, Psychometrika, 3 (1938), 19-22. doi: 10.1007/BF02287916. [32] Y. Yuan, W. Fan and D. Pu, Spline function smooth support vector machine for classification, Journal of Industrial and Management Optimization, 3 (2007), 529-542. doi: 10.3934/jimo.2007.3.529.

show all references

##### References:
 [1] A. Argyriou, T. Evgeniou and M. Pontil, Multi-task Feature Learning, in Advances in Neural Information Processing Systems (eds. B. Schoelkopf, J. Platt, and T. Hoffman), MIT Press, 2007. [2] A. Argyriou, T. Evgeniou and M. Pontil, Convex Multi-task Feature Learning, Machine Learning, Special Issue on Inductive Transfer Learning, 73 (2008), 243-272. doi: 10.2139/ssrn.1031158. [3] J. Bénasséni, Partial additive constant, J. Statist. Comput. Simul., 49 (1994), 179-193. [4] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling. Theory and Applications, $2^{nd}$ edition, Springer Series in Statistics, Springer, 2005. [5] F. Cailliez, The analytical solution of the additive constant problem, Psychometrika, 48 (1983), 305-308. doi: 10.1007/BF02294026. [6] H. G. Chew and C. C. Lim, On regularisation parameter transformation of support vector machines, Journal of Industrial and Management Optimization, 5 (2009), 403-415. doi: 10.3934/jimo.2009.5.403. [7] L. G. Cooper, A new solution to the additive constant problem in metric and multidimensional scaling, Psychometrika, 37 (1972), 311-321. [8] T. F. Cox and M. A. Cox, Multidimensional Scaling, $2^{nd}$ edition, Chapman and Hall/CRC, 2002. doi: 10.1007/978-3-540-33037-0_14. [9] J. de Leeuw, Applications of convex analysis to multidimensional scaling, in Recent Developments in Statistics (eds. J. Barra, F. Brodeau, G. Romier, and B. van Cutsen), North Holland Publishing Company, Amsterdem, The Netherlands, 133-145. [10] J. de Leeuw, Block relaxation algorithms in statistics, in Information Systems and Data Analysis (eds. Bock, H.H. et al.), Springer, Berlin (1994), 308-325. doi: 10.1007/978-3-642-46808-7_28. [11] W. Glunt, T. L. Hayden, S. Hong and J. Wells, An alternating projection algorithm for computing the nearest Euclidean distance matrix, SIAM J. Matrix Anal. Appl., 11 (1990), 589-600. doi: 10.1137/0611042. [12] W. Glunt, T. L. Hayden and R. Raydan, Molecular conformations from distance matrices, J. Computational Chemistry, 14 (1993), 114-120. doi: 10.1002/jcc.540140115. [13] J. C. Gower, Some distance properties of latent rootand vector methods in multivariate analysis, Biometrika, 53 (1966), 315-328. doi: 10.1093/biomet/53.3-4.325. [14] Y. Hao and F. Meng, A new method on gene selection for tissue classification, Journal of Industrial and Management Optimization, 3 (2007), 739-748. doi: 10.3934/jimo.2007.3.739. [15] J. Kruskal, Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika, 29 (1964), 1-27. doi: 10.1007/BF02289565. [16] K. V. Mardia, J. T. Kent and J. M. Bibby, Multivariate Analysis, $10^{th}$ printing, Academic Press, 1995. [17] S. J. Messick and R. P Abelson, The additive constant problem in multidimensional scaling, Psychometrika, 21 (1956), 1-15. [18] E. Pękalaska and R. P. W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Application, Series in Machine Perception Artificial Intelligence 64, World Scientific 2005. [19] H.-D. Qi, A semismooth Newton method for the nearest Euclidean distance matrix problem, SIAM Journal Matrix Analysis and Applications, 34 (2013), 67-93. doi: 10.1137/110849523. [20] H.-D. Qi and N. Xiu, A convex quadratic semidefinite programming approach to the partial additive constant problem in multidimensional scaling, Journal of Statistical Computation and Simulation, 82 (2012), 1317-1336. doi: 10.1080/00949655.2011.579970. [21] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826. doi: 10.1109/TSP.2013.2264814. [22] H.-D. Qi and X. M. Yuan, Computing the nearest Euclidean distance matrix with low embedding dimensions, Mathematical Programming, doi:10.1007/s10107-013-0726-0. doi: 10.1007/s10107-013-0726-0. [23] K. Schittkowski, Optimal parameter selection in support vector machines, Journal of Industrial and Management Optimization, 1 (2005), 465-476. doi: 10.3934/jimo.2005.1.465. [24] I. J. Schoenberg, Remarks to Maurice Fréchet's article "Sur la définition axiomatque d'une classe d'espaces vectoriels distanciés applicbles vectoriellement sur l'espace de Hilbet'', Ann. Math., 36 (1935), 724-732. doi: 10.2307/1968654. [25] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009. doi: 10.1016/B0-12-227240-4/00132-5. [26] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB approach, Elsevier Inc., 2010. [27] W. S. Torgerson, Theory and Methods for Scaling, Wiley, New York, 1958. [28] A. R. Webb, Multidimensional Scaling by iterative majorization using radial basis functions, Pattern Recognition, 28 (1995), 753-759. doi: 10.1016/0031-3203(94)00135-9. [29] A. R. Webb, Nonlinear feature extraction with radial basis functions using a weighted multidimensional scaling stress measure, Pattern Recognition, IEEE Conference Publications, 4 (1996), 635-639. doi: 10.1109/ICPR.1996.547642. [30] A. R. Webb, An approach to nonlinear principal component analysis using radially-symmetric kernel functions, Statistics and Computing, 6 (1996), 159-168. [31] G. Young and A. S. Householder, Discussion of a set of points in terms of their mutual distances, Psychometrika, 3 (1938), 19-22. doi: 10.1007/BF02287916. [32] Y. Yuan, W. Fan and D. Pu, Spline function smooth support vector machine for classification, Journal of Industrial and Management Optimization, 3 (2007), 529-542. doi: 10.3934/jimo.2007.3.529.
 [1] Yubo Yuan, Weiguo Fan, Dongmei Pu. Spline function smooth support vector machine for classification. Journal of Industrial and Management Optimization, 2007, 3 (3) : 529-542. doi: 10.3934/jimo.2007.3.529 [2] Xin Li, Ziguan Cui, Linhui Sun, Guanming Lu, Debnath Narayan. Research on iterative repair algorithm of Hyperchaotic image based on support vector machine. Discrete and Continuous Dynamical Systems - S, 2019, 12 (4&5) : 1199-1218. doi: 10.3934/dcdss.2019083 [3] Ahmad Mousavi, Zheming Gao, Lanshan Han, Alvin Lim. Quadratic surface support vector machine with L1 norm regularization. Journal of Industrial and Management Optimization, 2022, 18 (3) : 1835-1861. doi: 10.3934/jimo.2021046 [4] Martin D. Buhmann, Slawomir Dinew. Limits of radial basis function interpolants. Communications on Pure and Applied Analysis, 2007, 6 (3) : 569-585. doi: 10.3934/cpaa.2007.6.569 [5] Yubo Yuan. Canonical duality solution for alternating support vector machine. Journal of Industrial and Management Optimization, 2012, 8 (3) : 611-621. doi: 10.3934/jimo.2012.8.611 [6] Peter Giesl. Construction of a global Lyapunov function using radial basis functions with a single operator. Discrete and Continuous Dynamical Systems - B, 2007, 7 (1) : 101-124. doi: 10.3934/dcdsb.2007.7.101 [7] Ying Lin, Qi Ye. Support vector machine classifiers by non-Euclidean margins. Mathematical Foundations of Computing, 2020, 3 (4) : 279-300. doi: 10.3934/mfc.2020018 [8] Jian Luo, Shu-Cherng Fang, Yanqin Bai, Zhibin Deng. Fuzzy quadratic surface support vector machine based on fisher discriminant analysis. Journal of Industrial and Management Optimization, 2016, 12 (1) : 357-373. doi: 10.3934/jimo.2016.12.357 [9] Xin Yan, Hongmiao Zhu. A kernel-free fuzzy support vector machine with Universum. Journal of Industrial and Management Optimization, 2023, 19 (1) : 282-299. doi: 10.3934/jimo.2021184 [10] Fatemeh Bazikar, Saeed Ketabchi, Hossein Moosaei. Smooth augmented Lagrangian method for twin bounded support vector machine. Numerical Algebra, Control and Optimization, 2022, 12 (4) : 659-678. doi: 10.3934/naco.2021027 [11] Dmitry Treschev. A locally integrable multi-dimensional billiard system. Discrete and Continuous Dynamical Systems, 2017, 37 (10) : 5271-5284. doi: 10.3934/dcds.2017228 [12] Kang-Ling Liao, Chih-Wen Shih, Chi-Jer Yu. The snapback repellers for chaos in multi-dimensional maps. Journal of Computational Dynamics, 2018, 5 (1&2) : 81-92. doi: 10.3934/jcd.2018004 [13] Franz Achleitner, Anton Arnold, Eric A. Carlen. On multi-dimensional hypocoercive BGK models. Kinetic and Related Models, 2018, 11 (4) : 953-1009. doi: 10.3934/krm.2018038 [14] Anatoli F. Ivanov. On global dynamics in a multi-dimensional discrete map. Conference Publications, 2015, 2015 (special) : 652-659. doi: 10.3934/proc.2015.0652 [15] Gerald Sommer, Di Zang. Parity symmetry in multi-dimensional signals. Communications on Pure and Applied Analysis, 2007, 6 (3) : 829-852. doi: 10.3934/cpaa.2007.6.829 [16] Ning Lu, Ying Liu. Application of support vector machine model in wind power prediction based on particle swarm optimization. Discrete and Continuous Dynamical Systems - S, 2015, 8 (6) : 1267-1276. doi: 10.3934/dcdss.2015.8.1267 [17] Huiqin Zhang, JinChun Wang, Meng Wang, Xudong Chen. Integration of cuckoo search and fuzzy support vector machine for intelligent diagnosis of production process quality. Journal of Industrial and Management Optimization, 2022, 18 (1) : 195-217. doi: 10.3934/jimo.2020150 [18] Qianru Zhai, Ye Tian, Jingyue Zhou. A SMOTE-based quadratic surface support vector machine for imbalanced classification with mislabeled information. Journal of Industrial and Management Optimization, 2022  doi: 10.3934/jimo.2021230 [19] Xiaoling Sun, Xiaojin Zheng, Juan Sun. A Lagrangian dual and surrogate method for multi-dimensional quadratic knapsack problems. Journal of Industrial and Management Optimization, 2009, 5 (1) : 47-60. doi: 10.3934/jimo.2009.5.47 [20] Wen-Guei Hu, Song-Sun Lin. On spatial entropy of multi-dimensional symbolic dynamical systems. Discrete and Continuous Dynamical Systems, 2016, 36 (7) : 3705-3717. doi: 10.3934/dcds.2016.36.3705

2021 Impact Factor: 1.411