• Previous Article
    Optimization of inventory system with defects, rework failure and two types of errors under crisp and fuzzy approach
  • JIMO Home
  • This Issue
  • Next Article
    Painlevé-Kuratowski convergences of the solution sets for vector optimization problems with free disposal sets
July  2022, 18(4): 2277-2287. doi: 10.3934/jimo.2021067

Approximation algorithm for spherical $ k $-means problem with penalty

1. 

School of Business, Nankai University, Tianjin, 300071, China, College of Science, Tianjin University of Technology, Tianjin, 300384, China

2. 

Computer Science and Technology Department, Tianjin Renai College, Tianjin, 306136, China

3. 

Department of Operations Research and Information Engineering, Beijing University of Technology, Beijing, 100124, China

* Corresponding author: Wei Lv

Received  July 2020 Revised  February 2021 Published  July 2022 Early access  April 2021

The $ k $-means problem is a classical combinatorial optimization problem which has lots of applications in many fields such as machine learning, data mining, etc. We consider a variant of $ k $-means problem in the spherical space, that is, spherical $ k $-means problem with penalties. In the problem, it is allowable that some nodes in the spherical space can not be clustered by paying some penalty costs. Based on local search scheme, we propose a $ \left(4 (11+4\sqrt{7})+ \epsilon\right) $-approximation algorithm using singe-swap operation, where $ \epsilon $ is a positive constant.

Citation: Chenchen Wu, Wei Lv, Yujie Wang, Dachuan Xu. Approximation algorithm for spherical $ k $-means problem with penalty. Journal of Industrial and Management Optimization, 2022, 18 (4) : 2277-2287. doi: 10.3934/jimo.2021067
References:
[1]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.

[2]

D. AloiseA. DeshpandeP. Hansen and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Machine Learning, 75 (2009), 245-248.  doi: 10.1007/s10994-009-5103-0.

[3]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.

[4]

D. Arthur and S. Vassilvitskii, $K$-means++: The advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2007, 1027–1035. doi: 10.1145/1283383.1283494.

[5]

Y. Endo and S. Miyamoto, Spherical $k$-means++ clustering, Proceedings of International Conference on Modeling Decisions for Artificial Intelligence, (2015), 103–114. doi: 10.1007/978-3-319-23240-99.

[6]

Q. Feng, Z. Zhang, F. Shi and J. Wang, An improved approximation algorithm for the $k$-means problem with penalties, Proceedings of International Workshop on Frontiers in Algorithmics, (2019), 170–181. doi: 10.1007/978-3-030-18126-015.

[7]

Z. Friggstad, K. Khodamoradi, M. Rezapour and M. Salavatipour, Approximation schemes for clustering with outliers, ACM Transactions on Algorithms, 15 (2019), 26: 1–26: 26. doi: 10.1145/3301446.

[8]

A. Georgogiannis, Robust $k$-means: A theoretical revisit, Proceedings of 30th Conference on Neural Information Processing Systems, (2016), 2891–2899. doi: 10.5555/3157382.3157421.

[9]

J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Elsevier, 2012. doi: 10.1016/C2009-0-61819-5.

[10]

A. K. Jain, Data clustering: 50 years beyond $k$-means, Pattern Recognition Letters, 31 (2010), 651-666.  doi: 10.1016/j.patrec.2009.09.011.

[11]

S. JiD. XuL. GuoM. Li and D. Zhang, The seeding algorithm for spherical $k$-means clustering with penalties, Journal of Combinatorial Optimization, 2 (2020), 1-18.  doi: 10.1007/s10898-019-00779-w.

[12]

T. KanungoD. M. MountN. S. NetanyahuC. D. PiatkoR. Silverman and A. Y. Wu, A local search approximation algorithm for $k$-means clustering, Computational Geometry-Theory and Applications, 28 (2004), 89-112.  doi: 10.1016/j.comgeo.2004.03.003.

[13]

M. Li, The bi-criteria seeding algorithms for two variants of $k$-means problem, Journal of Combinatorial Optimization. doi: 10.1007/s10878-020-00537-9.

[14]

M. LiD. XuJ. YueD. Zhang and P. Zhang, The seeding algorithm for $k$-means problem with penalties, Journal of Combinatorial Optimization, 39 (2020), 15-32.  doi: 10.1007/s10878-019-00450-w.

[15]

S. Lloyd, Least squares quantization in PCM, IEEE Transactions on Information Theory, 28 (1982), 129-137.  doi: 10.1109/TIT.1982.1056489.

[16]

R. Ostrovsky, Y. Rabani, L. Schulman and C. Swamy, The effectiveness of Lloyd-type methods for the $k$-means problem, Journal of the ACM, 59 (2012), 28: 1–28: 22. doi: 10.1145/2395116.2395117.

[17]

D. Wei, A constant-factor bi-criteria approximation guarantee for $k$-means++, Proceedings of the Thirtieth International Conference on Neural Information Processing Systems, (2016), 604–612. doi: 10.5555/3157096.3157164.

[18]

X. WuV. KumarJ. QuinlanJ. Ross GhoshQ. YangH. MotodaG. J. McLachlanA. NgB. LiuP. S. YuZ. H. ZhouM. SteinbachD. J. Hand and D. Steinberg, Top 10 algorithms in data mining, Knowledge and Information Systems, 14 (2008), 1-37.  doi: 10.1007/s10115-007-0114-2.

[19]

D. Zhang, Y. Cheng, M. Li, Y. Wang and D. Xu, Local search approximation algorithms for the spherical $k$-means problem, Proceedings of International Conference on Algorithmic Applications in Management, Springer (2019), 341–351. doi: 10.1007/978-3-030-27195-4_31.

[20]

D. ZhangC. HaoC. WuD. Xu and Z. Zhang, Local search approximation algorithms for the $k$-means problem with penalties, Journal of Combinatorial Optimization, 37 (2019), 439-453.  doi: 10.1007/s10878-018-0278-6.

show all references

References:
[1]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.

[2]

D. AloiseA. DeshpandeP. Hansen and P. Popat, NP-hardness of Euclidean sum-of-squares clustering, Machine Learning, 75 (2009), 245-248.  doi: 10.1007/s10994-009-5103-0.

[3]

S. Ahmadian, A. Norouzi-Fard, O. Svensson and J. Ward, Better guarantees for $k$-means and Euclidean $k$-median by primal-dual algorithms, SIAM Journal on Computing, 49 (2019), FOCS17-97–FOCS17-156. doi: 10.1137/18M1171321.

[4]

D. Arthur and S. Vassilvitskii, $K$-means++: The advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2007, 1027–1035. doi: 10.1145/1283383.1283494.

[5]

Y. Endo and S. Miyamoto, Spherical $k$-means++ clustering, Proceedings of International Conference on Modeling Decisions for Artificial Intelligence, (2015), 103–114. doi: 10.1007/978-3-319-23240-99.

[6]

Q. Feng, Z. Zhang, F. Shi and J. Wang, An improved approximation algorithm for the $k$-means problem with penalties, Proceedings of International Workshop on Frontiers in Algorithmics, (2019), 170–181. doi: 10.1007/978-3-030-18126-015.

[7]

Z. Friggstad, K. Khodamoradi, M. Rezapour and M. Salavatipour, Approximation schemes for clustering with outliers, ACM Transactions on Algorithms, 15 (2019), 26: 1–26: 26. doi: 10.1145/3301446.

[8]

A. Georgogiannis, Robust $k$-means: A theoretical revisit, Proceedings of 30th Conference on Neural Information Processing Systems, (2016), 2891–2899. doi: 10.5555/3157382.3157421.

[9]

J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Elsevier, 2012. doi: 10.1016/C2009-0-61819-5.

[10]

A. K. Jain, Data clustering: 50 years beyond $k$-means, Pattern Recognition Letters, 31 (2010), 651-666.  doi: 10.1016/j.patrec.2009.09.011.

[11]

S. JiD. XuL. GuoM. Li and D. Zhang, The seeding algorithm for spherical $k$-means clustering with penalties, Journal of Combinatorial Optimization, 2 (2020), 1-18.  doi: 10.1007/s10898-019-00779-w.

[12]

T. KanungoD. M. MountN. S. NetanyahuC. D. PiatkoR. Silverman and A. Y. Wu, A local search approximation algorithm for $k$-means clustering, Computational Geometry-Theory and Applications, 28 (2004), 89-112.  doi: 10.1016/j.comgeo.2004.03.003.

[13]

M. Li, The bi-criteria seeding algorithms for two variants of $k$-means problem, Journal of Combinatorial Optimization. doi: 10.1007/s10878-020-00537-9.

[14]

M. LiD. XuJ. YueD. Zhang and P. Zhang, The seeding algorithm for $k$-means problem with penalties, Journal of Combinatorial Optimization, 39 (2020), 15-32.  doi: 10.1007/s10878-019-00450-w.

[15]

S. Lloyd, Least squares quantization in PCM, IEEE Transactions on Information Theory, 28 (1982), 129-137.  doi: 10.1109/TIT.1982.1056489.

[16]

R. Ostrovsky, Y. Rabani, L. Schulman and C. Swamy, The effectiveness of Lloyd-type methods for the $k$-means problem, Journal of the ACM, 59 (2012), 28: 1–28: 22. doi: 10.1145/2395116.2395117.

[17]

D. Wei, A constant-factor bi-criteria approximation guarantee for $k$-means++, Proceedings of the Thirtieth International Conference on Neural Information Processing Systems, (2016), 604–612. doi: 10.5555/3157096.3157164.

[18]

X. WuV. KumarJ. QuinlanJ. Ross GhoshQ. YangH. MotodaG. J. McLachlanA. NgB. LiuP. S. YuZ. H. ZhouM. SteinbachD. J. Hand and D. Steinberg, Top 10 algorithms in data mining, Knowledge and Information Systems, 14 (2008), 1-37.  doi: 10.1007/s10115-007-0114-2.

[19]

D. Zhang, Y. Cheng, M. Li, Y. Wang and D. Xu, Local search approximation algorithms for the spherical $k$-means problem, Proceedings of International Conference on Algorithmic Applications in Management, Springer (2019), 341–351. doi: 10.1007/978-3-030-27195-4_31.

[20]

D. ZhangC. HaoC. WuD. Xu and Z. Zhang, Local search approximation algorithms for the $k$-means problem with penalties, Journal of Combinatorial Optimization, 37 (2019), 439-453.  doi: 10.1007/s10878-018-0278-6.

Figure 1.  $ k $-means problem
Figure 2.  Spherical $ k $-means problem
Figure 3.  Swap operataion
Figure 4.  Compound mapping
Figure 5.  Cases for re-clustering
Figure 6.  Partition of $ \mathcal{N} $
[1]

Min Li, Yishui Wang, Dachuan Xu, Dongmei Zhang. The approximation algorithm based on seeding method for functional $ k $-means problem. Journal of Industrial and Management Optimization, 2022, 18 (1) : 411-426. doi: 10.3934/jimo.2020160

[2]

Sung Ha Kang, Berta Sandberg, Andy M. Yip. A regularized k-means and multiphase scale segmentation. Inverse Problems and Imaging, 2011, 5 (2) : 407-429. doi: 10.3934/ipi.2011.5.407

[3]

Yishui Wang, Dongmei Zhang, Peng Zhang, Yong Zhang. Local search algorithm for the squared metric $ k $-facility location problem with linear penalties. Journal of Industrial and Management Optimization, 2021, 17 (4) : 2013-2030. doi: 10.3934/jimo.2020056

[4]

Baolan Yuan, Wanjun Zhang, Yubo Yuan. A Max-Min clustering method for $k$-means algorithm of data clustering. Journal of Industrial and Management Optimization, 2012, 8 (3) : 565-575. doi: 10.3934/jimo.2012.8.565

[5]

Kien Ming Ng, Trung Hieu Tran. A parallel water flow algorithm with local search for solving the quadratic assignment problem. Journal of Industrial and Management Optimization, 2019, 15 (1) : 235-259. doi: 10.3934/jimo.2018041

[6]

Gaidi Li, Zhen Wang, Dachuan Xu. An approximation algorithm for the $k$-level facility location problem with submodular penalties. Journal of Industrial and Management Optimization, 2012, 8 (3) : 521-529. doi: 10.3934/jimo.2012.8.521

[7]

Baoli Shi, Zhi-Feng Pang, Jing Xu. Image segmentation based on the hybrid total variation model and the K-means clustering strategy. Inverse Problems and Imaging, 2016, 10 (3) : 807-828. doi: 10.3934/ipi.2016022

[8]

Fan Yuan, Dachuan Xu, Donglei Du, Min Li. An exact algorithm for stable instances of the $ k $-means problem with penalties in fixed-dimensional Euclidean space. Journal of Industrial and Management Optimization, 2021  doi: 10.3934/jimo.2021122

[9]

Behrad Erfani, Sadoullah Ebrahimnejad, Amirhossein Moosavi. An integrated dynamic facility layout and job shop scheduling problem: A hybrid NSGA-II and local search algorithm. Journal of Industrial and Management Optimization, 2020, 16 (4) : 1801-1834. doi: 10.3934/jimo.2019030

[10]

Jianjun Liu, Min Zeng, Yifan Ge, Changzhi Wu, Xiangyu Wang. Improved Cuckoo Search algorithm for numerical function optimization. Journal of Industrial and Management Optimization, 2020, 16 (1) : 103-115. doi: 10.3934/jimo.2018142

[11]

Justyna Jarczyk, Witold Jarczyk. Gaussian iterative algorithm and integrated automorphism equation for random means. Discrete and Continuous Dynamical Systems, 2020, 40 (12) : 6837-6844. doi: 10.3934/dcds.2020135

[12]

Leong-Kwan Li, Sally Shao. Convergence analysis of the weighted state space search algorithm for recurrent neural networks. Numerical Algebra, Control and Optimization, 2014, 4 (3) : 193-207. doi: 10.3934/naco.2014.4.193

[13]

Behrouz Kheirfam, Morteza Moslemi. On the extension of an arc-search interior-point algorithm for semidefinite optimization. Numerical Algebra, Control and Optimization, 2018, 8 (2) : 261-275. doi: 10.3934/naco.2018015

[14]

Mohamed A. Tawhid, Ahmed F. Ali. An effective hybrid firefly algorithm with the cuckoo search for engineering optimization problems. Mathematical Foundations of Computing, 2018, 1 (4) : 349-368. doi: 10.3934/mfc.2018017

[15]

Weihua Liu, Andrew Klapper. AFSRs synthesis with the extended Euclidean rational approximation algorithm. Advances in Mathematics of Communications, 2017, 11 (1) : 139-150. doi: 10.3934/amc.2017008

[16]

Xavier Gràcia, Xavier Rivas, Narciso Román-Roy. Constraint algorithm for singular field theories in the k-cosymplectic framework. Journal of Geometric Mechanics, 2020, 12 (1) : 1-23. doi: 10.3934/jgm.2020002

[17]

Ruiqi Yang, Dachuan Xu, Yicheng Xu, Dongmei Zhang. An adaptive probabilistic algorithm for online k-center clustering. Journal of Industrial and Management Optimization, 2019, 15 (2) : 565-576. doi: 10.3934/jimo.2018057

[18]

Ming-Jong Yao, Tien-Cheng Hsu. An efficient search algorithm for obtaining the optimal replenishment strategies in multi-stage just-in-time supply chain systems. Journal of Industrial and Management Optimization, 2009, 5 (1) : 11-32. doi: 10.3934/jimo.2009.5.11

[19]

Abdel-Rahman Hedar, Ahmed Fouad Ali, Taysir Hassan Abdel-Hamid. Genetic algorithm and Tabu search based methods for molecular 3D-structure prediction. Numerical Algebra, Control and Optimization, 2011, 1 (1) : 191-209. doi: 10.3934/naco.2011.1.191

[20]

Zheng Chang, Haoxun Chen, Farouk Yalaoui, Bo Dai. Adaptive large neighborhood search Algorithm for route planning of freight buses with pickup and delivery. Journal of Industrial and Management Optimization, 2021, 17 (4) : 1771-1793. doi: 10.3934/jimo.2020045

2021 Impact Factor: 1.411

Metrics

  • PDF downloads (456)
  • HTML views (402)
  • Cited by (0)

Other articles
by authors

[Back to Top]