
# Spectral clustering revisited: Information hidden in the Fiedler vector

* Corresponding author: Stefan Steinerberger
AD was supported by the 2019-2020 Yale University Emerging Scholars Initiative Post-Baccalaureate Research Education Program (ESI PREP). SS was supported by the NSF (DMS-2123224) and the Alfred P. Sloan Foundation.
We study the clustering problem on graphs: it is known that if there are two underlying clusters, then the signs of the eigenvector corresponding to the second largest eigenvalue of the adjacency matrix can reliably reconstruct the two clusters. We argue that the vertices for which the eigenvector has the largest and the smallest entries, respectively, are unusually strongly connected to their own cluster and more reliably classified than the rest. This can be regarded as a discrete version of the Hot Spots conjecture and should be a useful heuristic for evaluating 'strongly clustered' versus 'liminal' nodes in applications. We give a rigorous proof for the stochastic block model and discuss several explicit examples.
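The sign-based reconstruction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`sample_sbm`, `second_eigenvector`, `sign_error_rate`) are hypothetical, the two clusters are assumed to have equal size, and the parameters $(n, p, q) = (200, 0.6, 0.1)$ are chosen only to make the example run quickly.

```python
import numpy as np

def sample_sbm(n, p, q, rng):
    """Sample the adjacency matrix of a two-block stochastic block model.

    Assumption: two equal-sized clusters, edge probability p inside a
    cluster and q across clusters, no self-loops.
    """
    if n % 2 != 0:
        raise ValueError("n must be even for two equal-sized clusters")
    labels = np.repeat([1, -1], n // 2)
    same_cluster = labels[:, None] == labels[None, :]
    probs = np.where(same_cluster, p, q)
    # Draw each undirected edge once (strict upper triangle), then mirror it.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels

def second_eigenvector(adj):
    """Eigenvector for the second largest eigenvalue of a symmetric matrix."""
    # np.linalg.eigh returns eigenvalues in ascending order.
    _, vecs = np.linalg.eigh(adj)
    return vecs[:, -2]

def sign_error_rate(v2, labels):
    """Fraction of vertices misclassified by the sign of v2.

    An eigenvector is only defined up to a global sign, so the better of
    the two orientations is reported.
    """
    guess = np.where(v2 >= 0, 1, -1)
    err = float(np.mean(guess != labels))
    return min(err, 1.0 - err)

rng = np.random.default_rng(0)
adj, labels = sample_sbm(200, 0.6, 0.1, rng)
v2 = second_eigenvector(adj)
print("sign-based error rate:", sign_error_rate(v2, labels))
```

The magnitudes `np.abs(v2)` are the quantity the abstract proposes as a per-vertex reliability score; the sketch only uses the signs.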

Mathematics Subject Classification: Primary: 31E05, 35B51; Secondary: 47F99, 94A11.


Figure 1.  A graph with two communities that each have strong internal connectivity but very few edges between them

Figure 2.  A two-dimensional domain with Neumann boundary condition and the second Laplacian eigenfunction

Figure 3.  A stochastic block graph: $n = 500$, $p = 0.05$ and $q = 0.0001$. Two clusters that are barely connected; the cluster on the left has a somewhat more 'ambiguous' region; this ambiguity should be reflected in the size of the entries of the eigenvector ${\bf{v_2}}$

Figure 4.  (top) The deviation from expected in-group affinity ($c$, defined in Equation 2) for the vertices of a stochastic block model with $(n,p,q) = (2000, 0.6, 0.4)$. Vertices are plotted in increasing order of the corresponding ${\bf{v_2}}$ entry. (mid) Values of ${\bf{v_2}}$ for corresponding vertices, ordered in increasing value. (bottom) Plot showing linear relationship between $\Delta/\sqrt{n}$ and $\left|{\bf{v_2}}\right|$, in accordance with the main theorem

Figure 5.  Error rates on subsets of vertices with extremal ${\bf{v_2}}$ values, compared with the global ${\bf{v_2}}$ label-estimation error rate. Subsets were chosen by taking the $\varepsilon\cdot n$ nodes with the largest-magnitude ${\bf{v_2}}$ entries, as in Corollary 1. This figure was generated by randomly sampling 500 independent stochastic block models with $n = 200$, $p = 0.55$, and $q = 0.45$
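The subset selection in the Figure 5 caption can be sketched as follows. This is an illustration under assumptions rather than the code used for the figure: the function name `extremal_error_rate` is hypothetical, labels are assumed to be coded as $\pm 1$, and the input arrays in the usage lines are made-up toy values.

```python
import numpy as np

def extremal_error_rate(v2, labels, eps):
    """Compare the error on the eps*n largest-|v2| vertices with the global error.

    Returns (subset_error, global_error). Labels are assumed to be +1/-1.
    """
    v2 = np.asarray(v2, dtype=float)
    labels = np.asarray(labels)
    guess = np.where(v2 >= 0, 1, -1)
    # The eigenvector sign is arbitrary: flip so the guess agrees with the majority.
    if np.mean(guess != labels) > 0.5:
        guess = -guess
    k = max(1, int(round(eps * len(v2))))
    # Indices of the k entries of largest magnitude.
    top = np.argsort(-np.abs(v2))[:k]
    subset_error = float(np.mean(guess[top] != labels[top]))
    global_error = float(np.mean(guess != labels))
    return subset_error, global_error

# Toy input (made up for illustration): the two small-magnitude entries
# at indices 3 and 7 carry the wrong sign.
toy_v2 = [0.9, 0.7, 0.1, -0.05, -0.8, -0.6, -0.1, 0.02]
toy_labels = [1, 1, 1, 1, -1, -1, -1, -1]
print(extremal_error_rate(toy_v2, toy_labels, eps=0.5))
```

In the toy input the misclassified vertices are exactly those with small $\left|{\bf{v_2}}\right|$, so restricting to the extremal half removes them; real data need not behave this cleanly.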

Figure 6.  Visualization of clustering experiments performed using the MNIST dataset. Three hundred images of 3's and three hundred images of 8's were chosen at random from the original MNIST dataset. Pixel values were normalized and rounded to take binary values. A graph was constructed, with a vertex corresponding to each image, and an edge between two vertices if one of the vertices was within the 10% nearest neighbors of the other, using Euclidean distance. The vector ${\bf{v_2}}$ and values of $c$ (see Equation 3) were calculated for each vertex. The top figure was generated without noise. In the bottom figure, each pixel's binary value was reversed with independent probability $\rho = 0.5$, and the same calculations were performed
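The graph construction in the Figure 6 caption might look like the sketch below. Several details are assumptions: the function name `knn_graph` is hypothetical, "10% nearest neighbors" is read as the $k = 0.1\,(n-1)$ closest other points, and the one-dimensional points in the usage lines are placeholders rather than MNIST images.

```python
import numpy as np

def knn_graph(points, frac=0.10):
    """Symmetric nearest-neighbor graph under Euclidean distance.

    Vertices i and j are joined if either one is among the other's
    k nearest neighbors, with k = round(frac * (n - 1)) (assumed reading).
    Returns a boolean adjacency matrix without self-loops.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    k = max(1, int(round(frac * (n - 1))))
    # Squared Euclidean distances via |x|^2 + |y|^2 - 2 x.y
    sq = np.sum(points ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(dist2, np.inf)  # a point is never its own neighbor
    nearest = np.argsort(dist2, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = True
    # "Or" rule: keep the edge if either endpoint selected the other.
    return adj | adj.T

# Placeholder data: two well-separated groups on a line.
pts = np.array([[0.0], [1.0], [2.5], [10.0], [11.0], [12.5]])
print(knn_graph(pts, frac=0.2).astype(int))
```

For binarized images, each row of `points` would be a flattened pixel vector; the resulting adjacency matrix can then be passed to an eigenvector routine to obtain ${\bf{v_2}}$.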

Figure 7.  Comparing how ${\bf{v_2}}$ can inform seed set expansion methods. We replicate the method described in [5], and compare the overall error rate when using the five nodes corresponding to the five largest entries of ${\bf{v_2}}$ as our seed set (shown in blue), versus five nodes selected uniformly at random from within a single community (shown in orange). Error bars indicate one standard deviation. Using ${\bf{v_2}}$ to inform the initial seed set consistently outperforms use of a random seed set. We emphasize that the random seed set is selected from within a single community: this means not only that ${\bf{v_2}}$-extremal vertices are likely to be in their correct community, as reflected in Figure 5, but also that these vertices are "preferable" to others within the same community

[1] E. Abbe, Community detection and stochastic block models: Recent developments, J. Mach. Learn. Res., 18 (2017), 86pp.

[2] E. Abbe, A. S. Bandeira and G. Hall, Exact recovery in the stochastic block model, IEEE Trans. Inform. Theory, 62 (2016), 471–487. doi: 10.1109/TIT.2015.2490670.

[3] E. Abbe, J. Fan, K. Wang and Y. Zhong, Entrywise eigenvector analysis of random matrices with low expected rank, Ann. Statist., 48 (2020), 1452–1474. doi: 10.1214/19-AOS1854.

[4] E. Abbe and C. Sandon, Proof of the achievability conjectures for the general stochastic block model, Comm. Pure Appl. Math., 71 (2018), 1334–1406. doi: 10.1002/cpa.21719.

[5] R. Andersen and K. Lang, Communities from seed sets, in Proceedings of the 15th International Conference on World Wide Web, 2006, 223–232. doi: 10.1145/1135777.1135814.

[6] A. S. Bandeira, Random Laplacian matrices and convex relaxations, Found. Comput. Math., 18 (2018), 345–379. doi: 10.1007/s10208-016-9341-9.

[7] R. Bañuelos and K. Burdzy, On the "hot spots" conjecture of J. Rauch, J. Funct. Anal., 164 (1999), 1–33. doi: 10.1006/jfan.1999.3397.

[8] M. Belkin and P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering, Adv. Neural Info. Processing Systems, (2002), 585–591. Available from: https://papers.nips.cc/paper/2001/file/f106b7f99d2cb30c3db1c3cc0fde9ccb-Paper.pdf.

[9] A. Blum, J. Hopcroft and R. Kannan, Foundations of Data Science, Cambridge University Press, 2020. doi: 10.1017/9781108755528.

[10] R. B. Boppana, Eigenvalues and graph bisection: An average-case analysis, 28th Annual Symposium on Foundations of Computer Science, Los Angeles, CA, 1987. doi: 10.1109/SFCS.1987.22.

[11] K. Burdzy, The hot spots problem in planar domains with one hole, Duke Math. J., 129 (2005), 481–502. doi: 10.1215/S0012-7094-05-12932-5.

[12] K. Burdzy and W. Werner, A counterexample to the "hot spots" conjecture, Ann. of Math. (2), 149 (1999), 309–317. doi: 10.2307/121027.

[13] J. Cape, M. Tang and C. E. Priebe, The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics, Ann. Statist., 47 (2019), 2405–2439. doi: 10.1214/18-AOS1752.

[14] J. Cheeger, A lower bound for the smallest eigenvalue of the Laplacian, in Problems in Analysis, Princeton Univ. Press, Princeton, NJ, 1970. doi: 10.1515/9781400869312-013.

[15] X. Cheng, G. Mishne and S. Steinerberger, The geometry of nodal sets and outlier detection, J. Number Theory, 185 (2018), 48–64. doi: 10.1016/j.jnt.2017.09.021.

[16] X. Cheng, M. Rachh and S. Steinerberger, On the diffusion geometry of graph Laplacians and applications, Appl. Comput. Harmon. Anal., 46 (2019), 674–688. doi: 10.1016/j.acha.2018.04.001.

[17] F. R. K. Chung, Spectral graph theory, CBMS Regional Conference Series in Mathematics, 92, American Mathematical Society, Providence, RI, 1997.

[18] M. K. Chung, S. Seo, N. Adluru and H. K. Vorperian, Hot spots conjecture and its application to modeling tubular structures, in Machine Learning in Medical Imaging, Lecture Notes in Computer Science, 7009, Springer, 2011, 225–232. doi: 10.1007/978-3-642-24319-6_28.

[19] A. Damle and Y. Sun, Uniform bounds for invariant subspace perturbations, SIAM J. Matrix Anal. Appl., 41 (2020), 1208–1236. doi: 10.1137/19M1262760.

[20] C. Davis and W. M. Kahan, The rotation of eigenvectors by a perturbation. III, SIAM J. Numer. Anal., 7 (1970), 1–46. doi: 10.1137/0707001.

[21] W. E. Donath and A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM J. Res. Develop., 17 (1973), 420–425. doi: 10.1147/rd.175.0420.

[22] M. Fiedler, A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory, Czechoslovak Math. J., 25 (1975), 619–633. doi: 10.21136/CMJ.1975.101357.

[23] M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Math. J., 23 (1973), 298–305.

[24] M. Fiedler, Laplacian of graphs and algebraic connectivity, in Combinatorics and Graph Theory, Banach Center Publ., 25, PWN, Warsaw, 1989, 57–70.

[25] H. Gernandt and J. P. Pade, Schur reduction of trees and extremal entries of the Fiedler vector, Linear Algebra Appl., 570 (2019), 93–122. doi: 10.1016/j.laa.2019.02.008.

[26] D. K. Hammond, P. Vandergheynst and R. Gribonval, Wavelets on graphs via spectral graph theory, Appl. Comput. Harmon. Anal., 30 (2011), 129–150. doi: 10.1016/j.acha.2010.04.005.

[27] P. W. Holland, K. B. Laskey and S. Leinhardt, Stochastic blockmodels: First steps, Social Networks, 5 (1983), 109–137. doi: 10.1016/0378-8733(83)90021-7.

[28] C. Judge and S. Mondal, Euclidean triangles have no hot spots, Ann. of Math. (2), 191 (2020), 167–211. doi: 10.4007/annals.2020.191.1.3.

[29] R. Kannan, S. Vempala and A. Vetta, On clusterings: Good, bad and spectral, J. ACM, 51 (2004), 497–515. doi: 10.1145/990308.990313.

[30] T. Kato, Perturbation Theory for Linear Operators, Die Grundlehren der mathematischen Wissenschaften, Band 132, Springer-Verlag New York, Inc., New York, 1966.

[31] B. Kawohl, Rearrangements and Convexity of Level Sets in PDE, Lecture Notes in Mathematics, 1150, Springer-Verlag, Berlin, 1985. doi: 10.1007/BFb0075060.

[32] T. C. Kwok, L. C. Lau, Y. T. Lee, S. Oveis Gharan and L. Trevisan, Improved Cheeger's inequality: Analysis of spectral partitioning algorithms through higher order spectral gap, in STOC'13: Proceedings of the 2013 ACM Symposium on Theory of Computing, ACM, New York, 2013, 11–20. doi: 10.1145/2488608.2488611.

[33] R. Lederman and S. Steinerberger, Extreme values of the Fiedler vector on trees, preprint, arXiv: 1912.08327.

[34] D. A. Levin and Y. Peres, Markov Chains and Mixing Times, American Mathematical Society, Providence, RI, 2017. doi: 10.1090/mbk/107.

[35] M. W. Mahoney, L. Orecchia and N. K. Vishnoi, A local spectral method for graphs: With applications to improving graph partitions and exploring data graphs locally, J. Mach. Learn. Res., 13 (2012), 2339–2365.

[36] F. McSherry, Spectral partitioning of random graphs, 42nd IEEE Symposium on Foundations of Computer Science (Las Vegas, NV, 2001), IEEE Computer Soc., Los Alamitos, CA, 2001, 529–537. doi: 10.1109/SFCS.2001.959929.

[37] M. E. J. Newman, Modularity and community structure in networks, PNAS, 103 (2006), 8577–8582. doi: 10.1073/pnas.0601602103.

[38] M. E. J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E, 69 (2004). doi: 10.1103/PhysRevE.69.026113.

[39] A. Ng, M. Jordan and Y. Weiss, On spectral clustering: Analysis and an algorithm, NIPS'01: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, January 2001, 849–856.

[40] A. Perry, A. S. Wein, A. S. Bandeira and A. Moitra, Message-passing algorithms for synchronization problems over compact groups, Comm. Pure Appl. Math., 71 (2018), 2275–2322. doi: 10.1002/cpa.21750.

[41] M. Rachh and S. Steinerberger, On the location of maxima of solutions of Schroedinger's equation, Comm. Pure Appl. Math., 71 (2018), 1109–1122. doi: 10.1002/cpa.21753.

[42] M. F. Rios, J. Calder and G. Lerman, Algorithms for $\ell_p$-based semi-supervised learning on graphs, preprint, arXiv: 1901.05031.

[43] K. Rohe, S. Chatterjee and B. Yu, Spectral clustering and the high-dimensional stochastic blockmodel, Ann. Statist., 39 (2011), 1878–1915. doi: 10.1214/11-AOS887.

[44] J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000), 888–905. doi: 10.1109/34.868688.

[45] D. A. Spielman and S.-H. Teng, Spectral partitioning works: Planar graphs and finite element meshes, 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), IEEE Comput. Soc. Press, Los Alamitos, CA, 1996. doi: 10.1109/SFCS.1996.548468.

[46] S. Steinerberger, Hot spots in convex domains are in the tips (up to an inradius), Comm. Partial Differential Equations, 45 (2020), 641–654. doi: 10.1080/03605302.2020.1750427.

[47] L. Trevisan, Graph Partitioning and Expanders, CS359G Lecture 4, Stanford University, Palo Alto.

[48] L. Trevisan, Max cut and the smallest eigenvalue, SIAM J. Comput., 41 (2012), 1769–1786. doi: 10.1137/090773714.

[49] R. Vershynin, High-Dimensional Probability. An Introduction with Applications in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics, 47, Cambridge University Press, Cambridge, 2018. doi: 10.1017/9781108231596.

[50] U. von Luxburg, A tutorial on spectral clustering, Stat. Comput., 17 (2007), 395–416. doi: 10.1007/s11222-007-9033-z.

[51] X. Zhu, Z. Ghahramani and J. D. Lafferty, Semi-supervised learning using gaussian fields and harmonic functions, in Proceedings of the 20th International Conference on Machine Learning (ICML-03), 2003, 912–919.
