
An incremental nonsmooth optimization algorithm for clustering using $ L_1 $ and $ L_\infty $ norms

  • * Corresponding author: Burak Ordin

    Abstract
    An algorithm is developed for solving clustering problems whose similarity measure is defined using the $ L_1 $ and $ L_\infty $ norms. It is based on an incremental approach and applies nonsmooth optimization methods to find cluster centers. Computational results on 12 data sets are reported, and the proposed algorithm is compared with the $ X $-means algorithm.

    Mathematics Subject Classification: Primary: 90C90; Secondary: 90A05.
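    The two similarity measures above can be made concrete with a short sketch. This is an illustrative outline only, not the paper's incremental algorithm: the function names and the midrange center-update heuristic for the $ L_\infty $ case are assumptions introduced here.

    ```python
    import numpy as np

    def cluster_function(X, centers, norm="l1"):
        """Clustering objective: each point's distance to its nearest
        center, summed over the data set. norm is "l1" (sum of absolute
        differences) or "linf" (maximum absolute difference)."""
        ord_ = 1 if norm == "l1" else np.inf
        # pairwise distances, shape (n_points, n_centers)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], ord=ord_, axis=2)
        return d.min(axis=1).sum()

    def update_centers(X, labels, k, norm="l1"):
        """One center-update step. For the L1 norm the coordinate-wise
        median minimizes the within-cluster sum of distances; for the
        L-infinity case we use the coordinate-wise midrange, a simple
        heuristic (an assumption here, not the paper's update rule)."""
        centers = np.empty((k, X.shape[1]))
        for j in range(k):
            pts = X[labels == j]
            if norm == "l1":
                centers[j] = np.median(pts, axis=0)
            else:
                centers[j] = (pts.min(axis=0) + pts.max(axis=0)) / 2.0
        return centers
    ```

    Note that both objectives are nonsmooth (the absolute value and the max are nondifferentiable at kinks), which is why the paper resorts to nonsmooth optimization methods rather than gradient-based ones.
    
    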

    Figure 1.  Illustration of the set $ B_2(y) $

    Figure 2.  The number of distance evaluations vs the number of clusters

    Figure 3.  Visualization of clusters for TSPLIB1060 data set

    Figure 4.  Visualization of clusters for TSPLIB3038 data set

    Figure 5.  The purity vs the number of clusters

    Figure 6.  Data set with outliers
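    Figure 5 plots purity against the number of clusters. Purity is computed here under its standard definition (assumed to match the paper's usage): each cluster votes for its majority true class, and purity is the fraction of points matching their cluster's vote.

    ```python
    from collections import Counter

    def purity(cluster_labels, class_labels):
        """Cluster purity in [0, 1]: the fraction of points whose true
        class equals the majority class of their assigned cluster."""
        clusters = {}
        for c, t in zip(cluster_labels, class_labels):
            clusters.setdefault(c, []).append(t)
        correct = sum(Counter(members).most_common(1)[0][1]
                      for members in clusters.values())
        return correct / len(cluster_labels)
    ```

    Purity is only defined for the labeled data sets in Table 1 (those with a known number of classes); it generally increases with the number of clusters, which is why it is reported as a curve rather than a single value.
    
    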

    Table 1.  Brief description of the data sets

    Data sets Number of instances Number of attributes Number of classes
    Bavaria postal 1 89 3 -
    Bavaria postal 2 89 4 -
    Fisher's Iris Plant 150 4 3
    Breast Cancer 683 9 2
    TSPLIB1060 1060 2 -
    Image Segmentation 2310 19 7
    TSPLIB3038 3038 2 -
    Page Blocks 5473 10 5
    EEG Eye State 14980 14 2
    D15112 15112 2 -
    KEGG Metabolic Relation Network 53413 20 -
    Pla85900 85900 2 -

    Table 2.  Results for small and medium-size data sets

    $ k $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $
    Bavaria postal 1 Bavaria postal 2 Iris Plant
    $ \times 10^6 $ $ \times 10^{10} $ $ \times 10^6 $ $ \times 10^6 $ $ \times 10^{10} $ $ \times 10^6 $ $ \times 10^2 $ $ \times 10^2 $ $ \times 10^2 $
    2 4.0249 60.2547 3.9940 1.8600 5.2192 0.9456 2.1670 1.5235 0.9715
    3 2.8284 29.4507 2.7892 1.2607 1.7399 0.6594 1.5920 0.7885 0.7420
    4 2.1982 10.4561 2.1690 0.9633 0.7559 0.5210 1.3650 0.5726 0.6460
    5 1.7208 5.9762 1.6948 0.7872 0.5442 0.4221 1.2460 0.4645 0.5860
    6 1.3003 3.5909 1.2766 0.6736 0.3226 0.3459 1.1530 0.3904 0.5300
    7 1.0704 2.1983 1.0368 0.5659 0.2215 0.2946 1.0620 0.3430 0.4915
    8 0.8511 1.3385 0.8183 0.5097 0.1705 0.2550 1.0010 0.2999 0.4640
    9 0.7220 0.8424 0.6993 0.4688 0.1401 0.2360 0.9540 0.2779 0.4435
    10 0.6037 0.6447 0.5828 0.4340 0.1181 0.2173 0.9070 0.2583 0.4245
    Breast Cancer TSPLIB1060 Image Segmentation
    $ \times 10^4 $ $ \times 10^4 $ $ \times 10^4 $ $ \times 10^7 $ $ \times 10^9 $ $ \times 10^6 $ $ \times 10^6 $ $ \times 10^7 $ $ \times 10^5 $
    2 0.6401 1.9323 0.1831 0.3864 9.8319 2.6809 0.5192 3.5606 1.4929
    3 0.5702 1.6256 0.1607 0.3139 6.7058 2.1508 0.4160 2.7416 1.3284
    5 0.5165 1.3707 0.1460 0.2310 3.7915 1.6546 0.3400 1.7143 1.1081
    7 0.4651 1.2031 0.1372 0.1976 2.7039 1.3754 0.2892 1.3473 0.9408
    10 0.4270 1.0212 0.1278 0.1563 1.7553 1.1048 0.2575 0.9967 0.8170
    12 0.4068 0.9480 0.1234 0.1378 1.4507 1.0033 0.2401 0.8477 0.7635
    15 0.3872 0.8711 0.1172 0.1198 1.1219 0.8827 0.2188 0.6556 0.6966
    18 0.3707 0.8068 0.1112 0.1080 0.9004 0.7906 0.2021 0.5630 0.6481
    20 0.3614 0.7698 0.1081 0.1015 0.7925 0.7378 0.1942 0.5137 0.6200

    Table 3.  Results for large data sets

    $ k $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $
    TSPLIB3038 Page Blocks D15112
    $ \times 10^6 $ $ \times 10^9 $ $ \times 10^6 $ $ \times 10^7 $ $ \times 10^{10} $ $ \times 10^6 $ $ \times 10^8 $ $ \times 10^{11} $ $ \times 10^8 $
    2 3.7308 3.1688 2.5651 0.8414 5.7937 4.1746 0.8860 3.6840 0.6109
    3 3.0056 2.1763 2.1221 0.6747 3.3134 3.4309 0.6908 2.5324 0.4896
    5 2.2551 1.1982 1.5576 0.4882 1.3218 2.4671 0.4998 1.3271 0.3619
    10 1.5508 0.5634 1.0738 0.3152 0.4533 1.4446 0.3618 0.6489 0.2524
    15 1.2295 0.3560 0.8592 0.2555 0.2495 1.1784 0.2930 0.4324 0.2065
    20 1.0597 0.2672 0.7379 0.2200 0.1672 1.0160 0.2501 0.3218 0.1768
    25 0.9441 0.2145 0.6627 0.2004 0.1203 0.9020 0.2241 0.2530 0.1583
    EEG Eye State Pla85900 Relation Network
    $ \times 10^7 $ $ \times 10^8 $ $ \times 10^6 $ $ \times 10^{10} $ $ \times 10^{15} $ $ \times 10^{10} $ $ \times 10^7 $ $ \times 10^8 $ $ \times 10^6 $
    2 0.5289 8178.1381 1.5433 2.0656 3.7491 1.4533 0.3586 11.3853 1.9821
    3 0.4197 1833.8806 0.9049 1.6262 2.2806 1.1434 0.2800 4.9006 1.5112
    5 0.2944 1.3386 0.5183 1.2587 1.3397 0.8712 0.2095 1.8837 1.0549
    10 0.2191 0.4567 0.3947 0.8950 0.6829 0.6218 0.1459 0.6352 0.6667
    15 0.1965 0.3500 0.3562 0.7335 0.4625 0.5082 0.1231 0.3512 0.5114
    20 0.1827 0.2899 0.3292 0.6374 0.3517 0.4443 0.1108 0.2654 0.4440
    25 0.1736 0.2596 0.3129 0.5693 0.2827 0.3981 0.1011 0.1946 0.4013

    Table 4.  CPU time for small and medium-size data sets

    $ k $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $
    Bavaria postal 1 Bavaria postal 2 Iris Plant
    2 0.03 0.00 0.03 0.03 0.00 0.03 0.02 0.00 0.02
    3 0.05 0.02 0.05 0.03 0.00 0.03 0.12 0.00 0.27
    4 0.05 0.02 0.05 0.05 0.02 0.05 0.20 0.02 0.70
    5 0.20 0.03 0.14 0.12 0.02 0.20 0.42 0.05 1.05
    6 0.22 0.05 0.16 0.16 0.03 0.25 0.56 0.11 1.42
    7 0.25 0.05 0.19 0.20 0.05 0.42 0.70 0.19 2.12
    8 0.25 0.08 0.19 0.28 0.09 0.59 0.94 0.22 2.98
    9 0.41 0.11 0.27 0.31 0.14 1.00 1.29 0.51 4.13
    10 0.50 0.14 0.30 0.34 0.17 1.33 2.26 0.80 4.90
    Breast Cancer TSPLIB1060 Image Segmentation
    2 1.45 0.03 0.50 0.09 0.05 0.11 2.28 0.33 27.69
    3 5.93 0.22 1.95 0.30 0.08 0.39 5.38 0.89 61.67
    5 12.20 1.11 9.94 1.15 0.17 1.29 13.37 3.53 426.55
    7 15.58 2.45 14.52 3.17 0.53 4.91 27.67 8.53 638.29
    10 22.45 3.42 17.49 7.36 0.86 10.64 157.87 14.82 1144.10
    12 25.05 5.91 32.53 9.41 1.53 17.89 210.59 23.13 1539.23
    15 26.29 8.89 55.13 18.19 2.22 26.80 360.63 34.02 1903.06
    18 28.92 12.37 85.97 30.81 3.40 35.66 508.86 41.34 2284.46
    20 30.47 15.09 96.22 47.22 4.84 57.30 711.24 62.63 2866.97

    Table 5.  CPU time for large data sets

    $ k $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $ $ d_1 $ $ d_2^2 $ $ d_\infty $
    TSPLIB3038 Page Blocks D15112
    2 0.33 0.20 0.41 2.67 0.45 1.22 4.80 4.48 6.13
    3 0.90 0.47 0.95 6.51 1.34 7.52 8.80 8.50 12.62
    5 2.34 0.87 1.56 30.75 2.84 54.16 15.37 14.71 23.51
    10 14.29 2.26 14.68 113.05 14.01 193.61 39.13 32.26 54.49
    15 43.34 4.48 33.79 193.38 29.05 523.04 73.79 58.34 107.72
    20 107.33 8.31 63.73 739.82 63.68 787.13 121.87 92.51 176.70
    25 163.46 15.29 155.72 969.64 95.22 1085.21 168.96 136.61 274.31
    EEG Eye State Pla85900 Relation Network
    2 1.26 0.89 2.57 79.47 77.69 108.03 95.22 4.93 71.46
    3 2.20 1.61 3.96 156.24 154.13 216.56 208.09 18.50 119.70
    5 9.28 3.79 16.04 311.22 305.59 431.59 422.79 52.45 283.08
    10 44.80 65.86 79.89 710.71 697.09 989.53 949.52 186.80 931.51
    15 115.74 148.62 208.57 1138.17 1154.27 1566.55 1789.21 514.96 1963.96
    20 252.27 243.49 415.32 1574.80 1575.67 2177.26 2687.01 787.71 3546.57
    25 414.92 354.20 764.83 2041.79 2025.19 2830.90 4547.85 1197.28 5854.08

    Table 6.  Comparison of algorithms using $ d_1 $ and $ d_{\infty} $ functions

    $ k $ $ d_1 $ $ d_{\infty} $ $ k $ $ d_1 $ $ d_{\infty} $
    $ X $-means INCA $ X $-means INCA $ X $-means INCA $ X $-means INCA
    Bavaria postal 1 Bavaria postal 2
    2 44.21 0.00 44.58 0.00 2 13.31 0.00 15.95 0.00
    3 34.44 0.00 111.42 0.00 3 15.29 0.00 23.33 0.00
    4 48.73 0.00 63.68 0.00 4 27.51 0.00 30.53 0.00
    5 102.81 0.00 102.90 0.00 5 14.30 0.00 45.22 0.00
    6 149.68 0.00 166.86 0.00 6 10.24 0.00 72.81 0.00
    7 148.86 0.00 226.41 0.00 7 20.39 0.00 97.27 0.00
    8 188.37 0.00 312.47 0.00 8 32.67 0.00 122.49 0.00
    9 338.74 0.00 370.37 0.00 9 39.57 0.00 132.80 0.00
    10 404.43 0.00 466.69 0.00 10 42.22 0.00 149.03 0.00
    Iris Plant Breast Cancer
    2 2.05 0.00 1.01 0.00 2 14.46 0.00 5.60 0.00
    3 2.07 0.00 6.45 0.00 3 15.35 0.00 14.25 0.00
    4 12.42 0.00 12.02 0.00 5 15.11 0.00 9.26 0.00
    5 5.34 0.00 10.58 0.00 7 20.19 0.00 10.40 0.00
    6 7.15 0.00 30.88 0.00 10 18.15 0.00 14.19 0.00
    7 11.85 0.00 21.67 0.00 12 19.75 0.00 12.80 0.00
    8 17.12 0.00 26.90 0.00 15 18.75 0.00 14.41 0.00
    9 21.59 0.00 31.06 0.00 18 21.63 0.00 15.84 0.00
    10 23.72 0.00 28.98 0.00 20 22.22 0.00 15.50 0.00
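    In Table 6 the INCA columns are uniformly 0.00, which suggests the entries are percentage deviations from the best known objective value for each $ k $. That reading is an assumption here; under it, the reported error would be computed as:

    ```python
    def relative_error(f_alg, f_best):
        """Percentage deviation of an algorithm's objective value from
        the best known value (assumed convention for Table 6; the exact
        definition used in the paper may differ)."""
        return 100.0 * (f_alg - f_best) / f_best
    ```

    Under this convention, a 0.00 entry means the algorithm attained the best known value itself, while, say, 44.21 means an objective value 44.21% above it.
    
    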
