• Previous Article
    An uncertain programming model for single machine scheduling problem with batch delivery
  • JIMO Home
  • This Issue
  • Next Article
    RETRACTION: Peng Zhang, Chance-constrained multiperiod mean absolute deviation uncertain portfolio selection
April  2019, 15(2): 565-576. doi: 10.3934/jimo.2018057

An adaptive probabilistic algorithm for online k-center clustering

1. 

Department of Information and Operations Research, College of Applied Sciences, Beijing University of Technology, Beijing 100124, China

2. 

School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, China

* Corresponding author: Dongmei Zhang

Received  August 2017 Revised  December 2017 Published  April 2019 Early access  April 2018

The $k$-center clustering is one of the well-studied clustering problems in computer science. We are given a set of data points $P\subseteq R^d$, where $R^d$ is $d$ dimensional Euclidean space. We need to select $k≤ |P|$ points as centers and partition the set $P$ into $k$ clusters with each point connecting to its nearest center. The goal is to minimize the maximum radius. We consider the so-called online $k$-center clustering model where the data points in $R^d$ arrive over time. We present the bi-criteria $(\frac{n}{k}, (\log\frac{U^*}{L^*})^2)$-competitive algorithm and $(\frac{n}{k}, \logγ\log\frac{nγ}{k})$-competitive algorithm for semi-online and fully-online $k$-center clustering respectively, where $U^*$ is the maximum cluster radius of optimal solution, $L^*$ is the minimum distance of two distinct points of $P$, $γ$ is the ratio of the maximum distance of two distinct points and the minimum distance of two distinct points of $P$ and $n$ is the number of points that will arrive in total.

Citation: Ruiqi Yang, Dachuan Xu, Yicheng Xu, Dongmei Zhang. An adaptive probabilistic algorithm for online k-center clustering. Journal of Industrial and Management Optimization, 2019, 15 (2) : 565-576. doi: 10.3934/jimo.2018057
References:
[1]

D. AchlioptasM. Chrobak and J. Noga, Competitive analysis of randomized paging algorithms, Theoretical Computer Science, 234 (2000), 203-218.  doi: 10.1016/S0304-3975(98)00116-9.

[2]

J. Bar-IlanG. Kortsarz and D. Peleg, How to allocate network centers, Journal of Algorithms, 15 (1993), 385-415.  doi: 10.1006/jagm.1993.1047.

[3]

M. CharikarC. ChekuriT. Feder and R. Motwani, Incremental clustering and dynamic information retrieval, SIAM Journal on Computing, 33 (2004), 1417-1440.  doi: 10.1137/S0097539702418498.

[4]

S. ChaudhuriN. Garg and R. Ravi, The p-neighbor k-center problem, Information Processing Letters, 65 (1998), 131-134.  doi: 10.1016/S0020-0190(97)00224-X.

[5]

S. Chechik and D. Peleg, The fault-tolerant capacitated k-center problem, Theoretical Computer Science, 566 (2015), 12-25.  doi: 10.1016/j.tcs.2014.11.017.

[6]

M. Cygan, M. T. Hajiaghayi and S. Khuller, LP rounding for k-centers with non-uniform hard capacities, Proceedings of the fifity-third Annual Symposium on Foundations of Computer Science (FOCS). IEEE, (2012), 273–282.

[7]

S. B. DavidA. BorodinR. KarpG. Tardos and A. Wigderson, On the power of randomization in on-line algorithms, Algorithmica, 11 (1994), 2-14.  doi: 10.1007/BF01294260.

[8]

T. Feder and D. Greene, Optimal algorithms for approximate clustering, Proceedings of the twentieth Annual ACM Symposium on Theory of Computing, ACM, (1988), 434–444. doi: 10.1145/62212.62255.

[9]

C. G. FernandesS. P. de Paula and L. L. C. Pedrosa, Improved approximation algorithms for capacitated fault-tolerant k-center, Algorithmica, 80 (2018), 1041-1072. 

[10]

T. F. Gonzalez, Clustering to minimize the maximum intercluster distance, Theoretical Computer Science, 38 (1985), 293-306.  doi: 10.1016/0304-3975(85)90224-5.

[11]

D. S. Hochbaum and D. B. Shmoys, A best possible heuristic for the k-center problem, Mathematics of Operations Research, 10 (1985), 180-184.  doi: 10.1287/moor.10.2.180.

[12]

W. L. Hsu and G. L. Nemhauser, Easy and hard bottleneck location problems, Discrete Applied Mathematics, 1 (1979), 209-215.  doi: 10.1016/0166-218X(79)90044-1.

[13]

S. KhullerR. Pless and Y. J. Sussmann, Fault tolerant k-center problems, Theoretical Computer Science, 242 (2000), 237-245.  doi: 10.1016/S0304-3975(98)00222-9.

[14]

S. Khuller and Y. J. Sussmann, The capacitated k-center problem, SIAM Journal on Discrete Mathematics, 13 (2000), 403-418.  doi: 10.1137/S0895480197329776.

[15]

S. O. Krumke, On a generalization of the p-center problem, Information Processing Letters, 56 (1995), 67-71.  doi: 10.1016/0020-0190(95)00141-X.

[16]

E. Liberty, R. Sriharsha and M. Sviridenko, An algorithm for online k-means clustering, Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments (ALENEX), Society for Industrial and Applied Mathematics, (2016), 81–89. doi: 10.1137/1.9781611974317.7.

[17]

R. G. Michael and S. J. David, Computers and intractability: A guide to the theory of NP-completeness, W. H. Freeman, San Francisco, (1979), 90-91. 

show all references

References:
[1]

D. AchlioptasM. Chrobak and J. Noga, Competitive analysis of randomized paging algorithms, Theoretical Computer Science, 234 (2000), 203-218.  doi: 10.1016/S0304-3975(98)00116-9.

[2]

J. Bar-IlanG. Kortsarz and D. Peleg, How to allocate network centers, Journal of Algorithms, 15 (1993), 385-415.  doi: 10.1006/jagm.1993.1047.

[3]

M. CharikarC. ChekuriT. Feder and R. Motwani, Incremental clustering and dynamic information retrieval, SIAM Journal on Computing, 33 (2004), 1417-1440.  doi: 10.1137/S0097539702418498.

[4]

S. ChaudhuriN. Garg and R. Ravi, The p-neighbor k-center problem, Information Processing Letters, 65 (1998), 131-134.  doi: 10.1016/S0020-0190(97)00224-X.

[5]

S. Chechik and D. Peleg, The fault-tolerant capacitated k-center problem, Theoretical Computer Science, 566 (2015), 12-25.  doi: 10.1016/j.tcs.2014.11.017.

[6]

M. Cygan, M. T. Hajiaghayi and S. Khuller, LP rounding for k-centers with non-uniform hard capacities, Proceedings of the fifity-third Annual Symposium on Foundations of Computer Science (FOCS). IEEE, (2012), 273–282.

[7]

S. B. DavidA. BorodinR. KarpG. Tardos and A. Wigderson, On the power of randomization in on-line algorithms, Algorithmica, 11 (1994), 2-14.  doi: 10.1007/BF01294260.

[8]

T. Feder and D. Greene, Optimal algorithms for approximate clustering, Proceedings of the twentieth Annual ACM Symposium on Theory of Computing, ACM, (1988), 434–444. doi: 10.1145/62212.62255.

[9]

C. G. FernandesS. P. de Paula and L. L. C. Pedrosa, Improved approximation algorithms for capacitated fault-tolerant k-center, Algorithmica, 80 (2018), 1041-1072. 

[10]

T. F. Gonzalez, Clustering to minimize the maximum intercluster distance, Theoretical Computer Science, 38 (1985), 293-306.  doi: 10.1016/0304-3975(85)90224-5.

[11]

D. S. Hochbaum and D. B. Shmoys, A best possible heuristic for the k-center problem, Mathematics of Operations Research, 10 (1985), 180-184.  doi: 10.1287/moor.10.2.180.

[12]

W. L. Hsu and G. L. Nemhauser, Easy and hard bottleneck location problems, Discrete Applied Mathematics, 1 (1979), 209-215.  doi: 10.1016/0166-218X(79)90044-1.

[13]

S. KhullerR. Pless and Y. J. Sussmann, Fault tolerant k-center problems, Theoretical Computer Science, 242 (2000), 237-245.  doi: 10.1016/S0304-3975(98)00222-9.

[14]

S. Khuller and Y. J. Sussmann, The capacitated k-center problem, SIAM Journal on Discrete Mathematics, 13 (2000), 403-418.  doi: 10.1137/S0895480197329776.

[15]

S. O. Krumke, On a generalization of the p-center problem, Information Processing Letters, 56 (1995), 67-71.  doi: 10.1016/0020-0190(95)00141-X.

[16]

E. Liberty, R. Sriharsha and M. Sviridenko, An algorithm for online k-means clustering, Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments (ALENEX), Society for Industrial and Applied Mathematics, (2016), 81–89. doi: 10.1137/1.9781611974317.7.

[17]

R. G. Michael and S. J. David, Computers and intractability: A guide to the theory of NP-completeness, W. H. Freeman, San Francisco, (1979), 90-91. 

Figure 1.  An example of a sequence of points $1, x^2, ..., x^{2t}$ coming over time. The circles with dotted lines are the clusters produced by an online algorithm. Solid ones are those produced by the optimal off-line algorithm
Figure 2.  An illustration of partition of $S^*_i$ for any given $i\in[k]$
[1]

Baolan Yuan, Wanjun Zhang, Yubo Yuan. A Max-Min clustering method for $k$-means algorithm of data clustering. Journal of Industrial and Management Optimization, 2012, 8 (3) : 565-575. doi: 10.3934/jimo.2012.8.565

[2]

Haixia Liu, Jian-Feng Cai, Yang Wang. Subspace clustering by (k,k)-sparse matrix factorization. Inverse Problems and Imaging, 2017, 11 (3) : 539-551. doi: 10.3934/ipi.2017025

[3]

Gurkan Ozturk, Mehmet Tahir Ciftci. Clustering based polyhedral conic functions algorithm in classification. Journal of Industrial and Management Optimization, 2015, 11 (3) : 921-932. doi: 10.3934/jimo.2015.11.921

[4]

Jiping Tao, Ronghuan Huang, Tundong Liu. A $2.28$-competitive algorithm for online scheduling on identical machines. Journal of Industrial and Management Optimization, 2015, 11 (1) : 185-198. doi: 10.3934/jimo.2015.11.185

[5]

Guojun Gan, Kun Chen. A soft subspace clustering algorithm with log-transformed distances. Big Data & Information Analytics, 2016, 1 (1) : 93-109. doi: 10.3934/bdia.2016.1.93

[6]

Jiangchuan Fan, Xinyu Guo, Jianjun Du, Weiliang Wen, Xianju Lu, Brahmani Louiza. Analysis of the clustering fusion algorithm for multi-band color image. Discrete and Continuous Dynamical Systems - S, 2019, 12 (4&5) : 1233-1249. doi: 10.3934/dcdss.2019085

[7]

Aude Hofleitner, Tarek Rabbani, Mohammad Rafiee, Laurent El Ghaoui, Alex Bayen. Learning and estimation applications of an online homotopy algorithm for a generalization of the LASSO. Discrete and Continuous Dynamical Systems - S, 2014, 7 (3) : 503-523. doi: 10.3934/dcdss.2014.7.503

[8]

Zongwei Chen. An online-decision algorithm for the multi-period bank clearing problem. Journal of Industrial and Management Optimization, 2022, 18 (4) : 2783-2803. doi: 10.3934/jimo.2021091

[9]

Baoli Shi, Zhi-Feng Pang, Jing Xu. Image segmentation based on the hybrid total variation model and the K-means clustering strategy. Inverse Problems and Imaging, 2016, 10 (3) : 807-828. doi: 10.3934/ipi.2016022

[10]

Xavier Gràcia, Xavier Rivas, Narciso Román-Roy. Constraint algorithm for singular field theories in the k-cosymplectic framework. Journal of Geometric Mechanics, 2020, 12 (1) : 1-23. doi: 10.3934/jgm.2020002

[11]

Gaidi Li, Zhen Wang, Dachuan Xu. An approximation algorithm for the $k$-level facility location problem with submodular penalties. Journal of Industrial and Management Optimization, 2012, 8 (3) : 521-529. doi: 10.3934/jimo.2012.8.521

[12]

Saber Shiripour, Nezam Mahdavi-Amiri. Median location problem with two probabilistic line barriers: Extending the Hook and Jeeves algorithm. Journal of Industrial and Management Optimization, 2021  doi: 10.3934/jimo.2021128

[13]

Jiping Tao, Zhijun Chao, Yugeng Xi. A semi-online algorithm and its competitive analysis for a single machine scheduling problem with bounded processing times. Journal of Industrial and Management Optimization, 2010, 6 (2) : 269-282. doi: 10.3934/jimo.2010.6.269

[14]

Ran Ma, Jiping Tao. An improved 2.11-competitive algorithm for online scheduling on parallel machines to minimize total weighted completion time. Journal of Industrial and Management Optimization, 2018, 14 (2) : 497-510. doi: 10.3934/jimo.2017057

[15]

Ran Ma, Lu Zhang, Yuzhong Zhang. A best possible algorithm for an online scheduling problem with position-based learning effect. Journal of Industrial and Management Optimization, 2021  doi: 10.3934/jimo.2021144

[16]

Jie Shen, Jian Lv, Fang-Fang Guo, Ya-Li Gao, Rui Zhao. A new proximal chebychev center cutting plane algorithm for nonsmooth optimization and its convergence. Journal of Industrial and Management Optimization, 2018, 14 (3) : 1143-1155. doi: 10.3934/jimo.2018003

[17]

Yishui Wang, Dongmei Zhang, Peng Zhang, Yong Zhang. Local search algorithm for the squared metric $ k $-facility location problem with linear penalties. Journal of Industrial and Management Optimization, 2021, 17 (4) : 2013-2030. doi: 10.3934/jimo.2020056

[18]

Xavier Gràcia, Xavier Rivas, Narciso Román-Roy. Erratum: Constraint algorithm for singular field theories in the $ k $-cosymplectic framework. Journal of Geometric Mechanics, 2021, 13 (2) : 273-275. doi: 10.3934/jgm.2021007

[19]

Min Li, Yishui Wang, Dachuan Xu, Dongmei Zhang. The approximation algorithm based on seeding method for functional $ k $-means problem. Journal of Industrial and Management Optimization, 2022, 18 (1) : 411-426. doi: 10.3934/jimo.2020160

[20]

Chenchen Wu, Wei Lv, Yujie Wang, Dachuan Xu. Approximation algorithm for spherical $ k $-means problem with penalty. Journal of Industrial and Management Optimization, 2022, 18 (4) : 2277-2287. doi: 10.3934/jimo.2021067

2020 Impact Factor: 1.801

Metrics

  • PDF downloads (309)
  • HTML views (1135)
  • Cited by (0)

Other articles
by authors

[Back to Top]