
# Comparisons of different methods for balanced data classification under the discrete non-local total variational framework

• Corresponding author: Zhenkuan Pan
Because balanced constraints can overcome the trivial solutions that arise in data classification via the minimum cut method, many techniques with different balancing strategies have been proposed to improve classification accuracy. However, their performance has not yet been compared comprehensively. In this paper, we investigate seven balanced classification methods under the discrete non-local total variational framework and compare their accuracy on graphs. We study the two-class classification problem with equality constraints, with inequality constraints, and with the Ratio Cut, Normalized Cut, and Cheeger Cut models. For the equality-constraint case, we first compare the Penalty Function Method (PFM) and the Augmented Lagrangian Method (ALM), both of which transform the constrained problem into an unconstrained one, to show the advantages of ALM. To make the comparison fair, we solve all the remaining models with ALM as well, using the same proportion of fidelity points and the same neighborhood size on the graph. Experimental results demonstrate that ALM with the equality balanced constraint achieves the best classification accuracy among the seven constraints.
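The PFM-versus-ALM distinction in the abstract can be illustrated on a toy problem. The sketch below is not the paper's classifier: it applies ALM to the quadratic problem $\min_u \|u-a\|^2$ subject to the balance-style equality constraint $\mathbf{1}^\top u = 0$, for which the inner minimization has a closed form; the function name, test data, and parameter `mu` are illustrative assumptions. Unlike a pure penalty method, the multiplier update lets the constraint be met to high accuracy without driving `mu` to infinity.

```python
import numpy as np

def alm_balanced(a, mu=1.0, iters=30):
    """Augmented Lagrangian sketch for: min ||u - a||^2  s.t.  sum(u) = 0.

    L(u, lam) = ||u - a||^2 + lam * sum(u) + (mu/2) * sum(u)^2
    """
    n = len(a)
    lam = 0.0
    u = a.copy()
    for _ in range(iters):
        # Inner minimization is quadratic, so set the gradient to zero:
        #   2(u - a) + (lam + mu * sum(u)) * 1 = 0
        # Solving for the constraint residual s = sum(u) first:
        s = (a.sum() - n * lam / 2.0) / (1.0 + n * mu / 2.0)
        u = a - (lam + mu * s) / 2.0
        # Multiplier update: the key step that a pure penalty method lacks.
        lam += mu * u.sum()
    return u

a = np.array([1.0, 2.0, 3.0, 6.0])
u = alm_balanced(a)
# The exact minimizer is the centering projection u* = a - mean(a).
```

The residual shrinks geometrically by a factor of $1/(1 + n\mu/2)$ per outer iteration at fixed `mu`, whereas a penalty method with the same fixed `mu` stalls at a nonzero constraint violation, which is the advantage of ALM the abstract refers to.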

Mathematics Subject Classification: Primary: 65K10; Secondary: 65F22, 65N20.


Figure 1.  Reference basis of three data sets

Figure 2.  Reference basis of three data sets

Figure 3.  Different results for two-moon dataset classification

Figure 4.  Different results for handwritten digits classification of 3 & 8 dataset

Figure 5.  Different results for handwritten digit classification of 4 & 9 dataset

Table 0.  Different algorithms' parameters

| Parameter | EC-PFM | EC-ALM | SDIC | DDIC | RC | CC | NC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| $\mu_0$ | 0.01 | 0.005 | $\frac{3k}{n}$ | $\frac{3k}{n}\cdot n$ | 0.01 | | |
| $\mu_1$ | 15 | 25 | 20 | 15 | 15 | 12 | 15 |

Table 1.  Accuracy comparisons of different algorithms for two-moon dataset classification

| Method | $V_1$ class | $V_2$ class | Error | Ranking |
| --- | --- | --- | --- | --- |
| Solution | 1000 | 1000 | | |
| EC-PFM | 1014 | 986 | 1.70$\%$ | 5 |
| EC-ALM | 1005 | 995 | 1.25$\%$ | 1 |
| SDIC | 1016 | 984 | 1.50$\%$ | 2 |
| DDIC | 1015 | 985 | 1.55$\%$ | 3 |
| RC | 1030 | 970 | 1.70$\%$ | 5 |
| CC | 1021 | 979 | 1.65$\%$ | 4 |
| NC | 986 | 1014 | 1.90$\%$ | 7 |

Table 2.  Accuracy comparisons of different algorithms for handwritten digit classification of 3 & 8 dataset

| Method | $V_1$ class | $V_2$ class | Error | Ranking |
| --- | --- | --- | --- | --- |
| Solution | 7141 | 6825 | | |
| EC-PFM | 7165 | 6801 | 1.2745$\%$ | 6 |
| EC-ALM | 7139 | 6827 | 1.1385$\%$ | 1 |
| SDIC | 7128 | 6838 | 1.1671$\%$ | 3 |
| DDIC | 7161 | 6805 | 1.2244$\%$ | 4 |
| RC | 7173 | 6793 | 1.2387$\%$ | 5 |
| CC | 7134 | 6832 | 1.1600$\%$ | 2 |
| NC | 7167 | 6799 | 1.5538$\%$ | 7 |

Table 3.  Accuracy comparisons of different algorithms for handwritten digit classification of 4 & 9 dataset

| Method | $V_1$ class | $V_2$ class | Error | Ranking |
| --- | --- | --- | --- | --- |
| Solution | 6824 | 6958 | | |
| EC-PFM | 6814 | 6968 | 1.2988$\%$ | 4 |
| EC-ALM | 6818 | 6964 | 1.2335$\%$ | 1 |
| SDIC | 6799 | 6983 | 1.3133$\%$ | 5 |
| DDIC | 6811 | 6971 | 1.3206$\%$ | 6 |
| RC | 6836 | 6946 | 1.2698$\%$ | 3 |
| CC | 6787 | 6995 | 1.2407$\%$ | 2 |
| NC | 6807 | 6975 | 1.4729$\%$ | 7 |

Table 4.  The dependences of error rate on fidelity set size

| Two-moons: fidelity set size (per class) | Error rate | 3 & 8: fidelity set size (per class) | Error rate |
| --- | --- | --- | --- |
| 100 | 1.20$\%$ | 350 | 1.1287$\%$ |
| 50 | 1.25$\%$ | 300 | 1.1385$\%$ |
| 40 | 1.65$\%$ | 250 | 1.2888$\%$ |
| 30 | 1.80$\%$ | 200 | 1.3604$\%$ |
| 26 | 1.85$\%$ | 150 | 1.6397$\%$ |
| 20 | 2.45$\%$ | 100 | 3.8665$\%$ |
| 16 | 2.55$\%$ | 90 | 4.1386$\%$ |
| 10 | 3.25$\%$ | 70 | 9.8597$\%$ |
| 6 | 4.20$\%$ | 50 | 11.2201$\%$ |

Table 5.  Dependence of error rate on DDIC range parameter

| $\varepsilon$ value | $V_1$ class | $V_2$ class | Error rate |
| --- | --- | --- | --- |
| 5 | 7153 | 6813 | 1.2316$\%$ |
| 10 | 7160 | 6806 | 1.2387$\%$ |
| 15 | 7121 | 6845 | 1.2459$\%$ |
| 20 | 7159 | 6807 | 1.2602$\%$ |
| 30 | 7160 | 6806 | 1.2576$\%$ |
| 50 | 7161 | 6805 | 1.2724$\%$ |
