# American Institute of Mathematical Sciences

September 2019, 1(3): 307-327. doi: 10.3934/fods.2019014

## Power weighted shortest paths for clustering Euclidean data

Daniel Mckenzie 1 and Steven Damelin 2

1 Department of Mathematics, University of California, Los Angeles, Los Angeles CA 90095, USA
2 Department of Mathematics, University of Michigan, Ann Arbor MI 48109, USA

* Corresponding author: mckenzie@math.ucla.edu

Published September 2019

Fund Project: The first author gratefully acknowledges the support of the Department of Mathematics, The University of Georgia, where the first author was a graduate student while this work was completed. The second author thanks the Department of Mathematics, The University of Michigan for their support. Both authors thank the anonymous reviewer for many useful suggestions.

We study the use of power weighted shortest path metrics for clustering high dimensional Euclidean data, under the assumption that the data is drawn from a collection of disjoint low dimensional manifolds. We argue, theoretically and experimentally, that this leads to higher clustering accuracy. We also present a fast algorithm for computing these distances.
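The abstract above describes clustering with power weighted shortest path metrics: the distance between two points is the cost of the cheapest path through the data, where each hop of Euclidean length $\ell$ costs $\ell^p$. As a minimal sketch (not the paper's fast algorithm, and the function name is ours), this distance can be computed by running Dijkstra's algorithm on the complete graph with power weighted edge costs:

```python
import heapq
import math

def power_weighted_distances(points, source, p=2.0):
    """Dijkstra on the complete graph whose edge (i, j) costs ||x_i - x_j||^p.

    Returns the power weighted shortest path cost from `source` to every point.
    Illustrative sketch only: the paper computes these distances with a faster
    algorithm than this O(n^2 log n) approach.
    """
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    visited = [False] * n
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if visited[i]:
            continue
        visited[i] = True
        for j in range(n):
            if visited[j]:
                continue
            hop = math.dist(points[i], points[j]) ** p
            if d + hop < dist[j]:
                dist[j] = d + hop
                heapq.heappush(heap, (dist[j], j))
    return dist

# Three collinear points: with p = 2, hopping 0 -> 1 -> 2 costs 1 + 1 = 2,
# while the direct hop 0 -> 2 costs 2^2 = 4, so the geodesic takes small hops.
print(power_weighted_distances([(0.0,), (1.0,), (2.0,)], 0, p=2.0))  # [0.0, 1.0, 2.0]
```

For $p = 1$ this reduces to a path metric built from plain Euclidean edge lengths; larger $p$ increasingly penalizes long hops, which is what makes the metric density sensitive.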

Citation: Daniel Mckenzie, Steven Damelin. Power weighted shortest paths for clustering Euclidean data. Foundations of Data Science, 2019, 1 (3) : 307-327. doi: 10.3934/fods.2019014


**Figure 1.** Three sample geodesics in the power weighted shortest path metric with $p=2$, for the "Three Lines" data set (see §6). The geodesics consist of many small hops rather than a few large ones; the total lengths of the red and green paths are significantly smaller than that of the blue path.
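Why geodesics prefer many small hops when $p > 1$ can be checked with a line of arithmetic (an illustrative example of ours, not a computation from the paper): splitting a hop of length $2$ into two unit hops halves the cost when $p = 2$.

```python
p = 2
hops = [1.0, 1.0]   # two small hops covering a total displacement of 2
direct = 2.0        # one large hop covering the same displacement

cost_hops = sum(h ** p for h in hops)  # 1^2 + 1^2 = 2
cost_direct = direct ** p              # 2^2 = 4
print(cost_hops, cost_direct)          # 2.0 4.0
```

Since $1^p + 1^p < 2^p$ for every $p > 1$, paths that stay inside a densely sampled manifold (where short hops are available) are cheaper than paths that jump between manifolds.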
**Figure 2.** All three synthetic data sets, projected into $\mathbb{R}^{2}$. From left to right: Three Lines, Three Moons, and Three Circles.
**Figure 3.** Accuracy of spectral clustering on the Three Lines data set as $p$ varies, for three different values of the ambient dimension.
**Table 1.** Classification accuracy of spectral clustering. $A^{(1)}$ corresponds to the Euclidean metric; bold entries mark the best result for each data set.

| | $A^{(f, 1)}$ | $A^{(1)}$ | $A^{(2)}$ | $A^{(10)}$ | $A^{(\infty)}$ |
|---|---|---|---|---|---|
| 3 Lines | $66.11 \pm 0.94\%$ | $66.35 \pm 3.73\%$ | $66.87 \pm 3.37\%$ | $95.38 \pm 9.22\%$ | $\bf{95.38 \pm 9.1\%}$ |
| 3 Moons | $85.90 \pm 1.13\%$ | $94.40 \pm 1.48\%$ | $94.40 \pm 1.48\%$ | $\bf{96.20 \pm 1.76\%}$ | $94.35 \pm 3.34\%$ |
| 3 Circles | $51.87 \pm 0.00\%$ | $51.93 \pm 0.32\%$ | $51.94 \pm 0.36\%$ | $71.22 \pm 9.50\%$ | $\bf{73.61 \pm 10.47\%}$ |
| $\mathtt{DrivFace}$ | $78.88\%$ | $71.62\%$ | $71.62\%$ | $74.71\%$ | $\bf{85.38\%}$ |
| $\mathtt{COIL\text{-}20}$ | $63.24\%$ | $75.28\%$ | $\bf{78.61\%}$ | $77.45\%$ | $60.92\%$ |
| $\mathtt{OptDigits}$ | $77.73\%$ | $91.49\%$ | $\bf{91.54\%}$ | $88.39\%$ | $83.17\%$ |
| $\mathtt{USPS}$ | $48.65\%$ | $65.05\%$ | $65.02\%$ | $76.20\%$ | $\bf{77.92\%}$ |
| $\mathtt{MNIST}$ | - | $76.11\%$ | $75.63\%$ | $84.54\%$ | $\bf{86.77\%}$ |
**Table 2.** Run time of spectral clustering, in seconds, including the time required to construct the similarity matrix. $A^{(1)}$ corresponds to the Euclidean metric.

| | $A^{(f, 1)}$ | $A^{(1)}$ | $A^{(2)}$ | $A^{(10)}$ | $A^{(\infty)}$ |
|---|---|---|---|---|---|
| 3 Lines | $0.32$ | $0.16$ | $1.20$ | $1.22$ | $1.22$ |
| 3 Moons | $0.33$ | $0.17$ | $1.31$ | $1.30$ | $1.36$ |
| 3 Circles | $0.35$ | $0.16$ | $1.00$ | $1.06$ | $1.07$ |
| $\mathtt{DrivFace}$ | $0.37$ | $1.24$ | $1.55$ | $1.64$ | $1.64$ |
| $\mathtt{COIL\text{-}20}$ | $0.57$ | $0.72$ | $1.57$ | $1.82$ | $1.78$ |
| $\mathtt{OptDigits}$ | $5.40$ | $1.41$ | $5.28$ | $5.58$ | $5.67$ |
| $\mathtt{USPS}$ | $27.40$ | $17.12$ | $26.75$ | $22.78$ | $23.79$ |
| $\mathtt{MNIST}$ | - | $2060.23$ | $2031.38$ | $1554.15$ | $1613.41$ |
