doi: 10.3934/jimo.2021230
Online First

A SMOTE-based quadratic surface support vector machine for imbalanced classification with mislabeled information

1. School of Business Administration, Southwestern University of Finance and Economics, Sichuan, China

2. School of Business Administration and Collaborative Innovation Center of Financial Security, Southwestern University of Finance and Economics, Chengdu, 611130, China

3. School of Business Administration, Southwestern University of Finance and Economics, Chengdu, China

* Corresponding author: Ye Tian

Received February 2021; Revised September 2021; Early access February 2022

The Synthetic Minority Over-Sampling Technique (SMOTE) is widely used to handle imbalanced classification. To address the shortcomings of existing benchmark methods, we propose a novel SMOTE scheme based on K-means and intuitionistic fuzzy set theory that assigns proper weights to the existing points and generates new synthetic points from them. In addition, we adopt the state-of-the-art kernel-free fuzzy quadratic surface support vector machine (QSSVM) as the classifier. Numerical experiments on various artificial and real data sets demonstrate the validity and applicability of the proposed method, especially in the presence of mislabeled information.
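For orientation, the interpolation step of classic SMOTE (reference [2] of the paper) can be sketched as follows. This is only a minimal illustration of the baseline technique, not the weighted K-means and intuitionistic fuzzy set scheme proposed in the article; the function name `smote_sketch` and its parameters are hypothetical.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Classic SMOTE interpolation (illustrative sketch, not the paper's method).

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority points")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances; a point is never its own neighbour.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)                      # seed points
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                               # position in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])
```

Because every synthetic point is a convex combination of two existing minority points, a mislabeled point among them can propagate into the synthetic set, which is the situation the abstract highlights.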

Citation: Qianru Zhai, Ye Tian, Jingyue Zhou. A SMOTE-based quadratic surface support vector machine for imbalanced classification with mislabeled information. Journal of Industrial and Management Optimization, doi: 10.3934/jimo.2021230

Figure 1.  (a) The original distribution of the data set. (b) The synthetic examples generated by SMOTE (k = 5). (c) The synthetic examples generated by K-means. (d) The data set with mislabeled information (red dot). (e) The synthetic examples generated by SMOTE (k = 5). (f) The synthetic examples generated by K-means.
Figure 2.  (a) The original distribution of the data set. (b) The synthetic minority examples generated by SMOTE (red triangular points). (c) The synthetic minority examples generated by our method (red triangular points).
Figure 3.  $ AUC $ and $ G-mean $ values of different methods with various imbalance ratios on artificial data set 1.
Figure 4.  ROC curves of seven methods over nine data sets: (a) Abalone21vs8, (b) Abalone918, (c) Haberman, (d) Glass4, (e) Pima, (f) Glass5, (g) Glass016vs5, (h) Vehicle1, (i) Yeast6.
Table 1.  Confusion matrix for a two-class problem
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | TP | FN |
| Actual Negative | FP | TN |
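The entries of the confusion matrix feed directly into the G-mean indicator reported in the later tables. The excerpt does not reproduce the paper's metric definitions, so the sketch below assumes the common definition of G-mean as the geometric mean of the true positive rate and the true negative rate; the function names are illustrative.

```python
import math

def rates(tp, fn, fp, tn):
    """Return (true positive rate, true negative rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # sensitivity: share of actual positives found
    tnr = tn / (tn + fp)  # specificity: share of actual negatives found
    return tpr, tnr

def g_mean(tp, fn, fp, tn):
    """Geometric mean of TPR and TNR (assumed standard definition)."""
    tpr, tnr = rates(tp, fn, fp, tn)
    return math.sqrt(tpr * tnr)

# Example with made-up counts: 90 of 100 positives and 80 of 100 negatives correct.
example = g_mean(tp=90, fn=10, fp=20, tn=80)  # sqrt(0.9 * 0.8)
```

Under this definition a classifier that ignores the minority class scores zero, however high its overall accuracy, which is why the indicator suits imbalanced problems.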
Table 2.  Results of RUS-SVM, ROS-SVM, SMOTE-SVM, borderline-SMOTE1-SVM, borderline-SMOTE2-SVM, MWMOTE-SVM, WSMOTE-QSSVM on quadratic artificial datasets with RBF kernel
| Data set | Method | $ G-mean $ | std of $ G-mean $ | $ AUC $ | std of $ AUC $ | Time (s) |
|---|---|---|---|---|---|---|
| AR1: 800$ \times $3, IR 1:3 | RUS-SVM | 0.5414 | 0.3175 | 0.3838 | 0.3258 | $\textbf{2.7148}$ |
| | ROS-SVM | 0.8886 | 0.0446 | 0.7914 | 0.0783 | 4.9408 |
| | SMOTE-SVM | 0.6453 | 0.1181 | 0.429 | 0.1421 | 8.752 |
| | borderline-SMOTE1-SVM | 0.846 | 0.0652 | 0.7195 | 0.1095 | 5.7484 |
| | borderline-SMOTE2-SVM | 0.595 | 0.1774 | 0.3823 | 0.2126 | 6.378 |
| | MWMOTE-SVM | 0.6982 | 0.0526 | 0.49 | 0.072 | 10.9427 |
| | WSMOTE-QSSVM | $\textbf{0.9051}$ | $\textbf{0.0363}$ | $\textbf{0.8203}$ | $\textbf{0.0642}$ | 4.0263 |
| AR1: 2200$ \times $3, IR 1:10 | RUS-SVM | 0.516 | 0.2291 | 0.3135 | 0.203 | $\textbf{13.6355}$ |
| | ROS-SVM | 0.8508 | 0.0604 | 0.7271 | 0.0985 | 52.8877 |
| | SMOTE-SVM | 0.4311 | 0.1005 | 0.1949 | 0.0874 | 127.6243 |
| | borderline-SMOTE1-SVM | 0.7897 | 0.0479 | 0.6257 | 0.0764 | 104.4459 |
| | borderline-SMOTE2-SVM | 0.5014 | 0.1418 | 0.2695 | 0.1347 | 86.6192 |
| | MWMOTE-SVM | 0.5971 | 0.0421 | 0.3581 | $\textbf{0.0491}$ | 93.0159 |
| | WSMOTE-QSSVM | $\textbf{0.8917}$ | $\textbf{0.0301}$ | $\textbf{0.796}$ | 0.054 | 23.2509 |
| AR2: 800$ \times $3, IR 1:3 | RUS-SVM | 0.7449 | 0.2905 | 0.6309 | 0.3688 | $\textbf{3.5008}$ |
| | ROS-SVM | 0.9295 | $\textbf{0.0337}$ | 0.865 | $\textbf{0.0621}$ | 6.7627 |
| | SMOTE-SVM | 0.6407 | 0.1499 | 0.486 | 0.4307 | 10.6513 |
| | borderline-SMOTE1-SVM | 0.8778 | 0.0671 | 0.8062 | 0.7745 | 8.3038 |
| | borderline-SMOTE2-SVM | 0.6218 | 0.1317 | 0.5273 | 0.4022 | 8.1676 |
| | MWMOTE-SVM | 0.7123 | 0.0414 | 0.5626 | 0.5089 | 11.0469 |
| | WSMOTE-QSSVM | $\textbf{0.9337}$ | 0.0345 | $\textbf{0.8729}$ | 0.0638 | 4.1132 |
| AR2: 2200$ \times $3, IR 1:10 | RUS-SVM | 0.4802 | 0.2185 | 0.5771 | 0.2159 | $\textbf{13.4868}$ |
| | ROS-SVM | 0.8855 | 0.0222 | 0.8799 | 0.039 | 53.3882 |
| | SMOTE-SVM | 0.4396 | 0.1028 | 0.47996 | 0.0845 | 130.0079 |
| | borderline-SMOTE1-SVM | 0.871 | $\textbf{0.0185}$ | 0.8628 | $\textbf{0.0324}$ | 101.8846 |
| | borderline-SMOTE2-SVM | 0.5126 | 0.1133 | 0.5578 | 0.1097 | 108.3143 |
| | MWMOTE-SVM | 0.6314 | 0.0463 | 0.6176 | 0.0587 | 90.1415 |
| | WSMOTE-QSSVM | $\textbf{0.9005}$ | 0.0355 | $\textbf{0.8960}$ | 0.0639 | 23.926 |
| AR3: 800$ \times $3, IR 1:3 | RUS-SVM | 0.5859 | 0.3457 | 0.4509 | 0.3265 | $\textbf{3.3678}$ |
| | ROS-SVM | 0.8978 | $\textbf{0.024}$ | 0.8065 | $\textbf{0.0431}$ | 6.8082 |
| | SMOTE-SVM | 0.6942 | 0.0527 | 0.4844 | 0.0716 | 10.9369 |
| | borderline-SMOTE1-SVM | 0.8735 | 0.061 | 0.7663 | 0.1038 | 8.8757 |
| | borderline-SMOTE2-SVM | 0.6095 | 0.1267 | 0.3859 | 0.164 | 8.0711 |
| | MWMOTE-SVM | 0.6893 | 0.0478 | 0.4772 | 0.0664 | 11.2833 |
| | WSMOTE-QSSVM | $\textbf{0.9148}$ | 0.041 | $\textbf{0.8384}$ | 0.0744 | 3.7026 |
| AR3: 2200$ \times $3, IR 1:10 | RUS-SVM | 0.5809 | 0.2743 | 0.4052 | 0.2942 | $\textbf{12.913}$ |
| | ROS-SVM | 0.8788 | 0.055 | 0.775 | 0.0965 | 50.4581 |
| | SMOTE-SVM | 0.4874 | 0.0875 | 0.2445 | 0.0847 | 125.7192 |
| | borderline-SMOTE1-SVM | 0.8586 | $\textbf{0.0284}$ | 0.7379 | 0.0485 | 113.3755 |
| | borderline-SMOTE2-SVM | 0.5892 | 0.0387 | 0.3485 | $\textbf{0.046}$ | 82.4204 |
| | MWMOTE-SVM | 0.6045 | 0.0525 | 0.3679 | 0.0621 | 90.1298 |
| | WSMOTE-QSSVM | $\textbf{0.8825}$ | 0.051 | $\textbf{0.7811}$ | 0.0896 | 22.8422 |
Table 3.  Average $ AUC $ and $ G-mean $ values over AR1 for different mislabeled levels
| Mislabeled level (%) | Indicator | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
|---|---|---|---|---|---|---|---|---|
| 10 | $ AUC $ | 0.8552 | 0.8325 | 0.573 | 0.8716 | 0.602 | 0.652 | $\textbf{0.8729}$ |
| 10 | $ G-Mean $ | 0.9245 | 0.912 | 0.7457 | 0.9332 | 0.767 | 0.803 | $\textbf{0.9342}$ |
| 15 | $ AUC $ | $\textbf{0.898}$ | 0.8343 | 0.6416 | 0.8272 | 0.4717 | 0.6065 | 0.8489 |
| 15 | $ G-Mean $ | $\textbf{0.9474}$ | 0.9127 | 0.8005 | 0.9091 | 0.6707 | 0.7781 | 0.921 |
| 20 | $ AUC $ | 0.4509 | 0.8065 | 0.4844 | 0.7663 | 0.3859 | 0.4772 | $\textbf{0.8384}$ |
| 20 | $ G-Mean $ | 0.5859 | 0.8978 | 0.6942 | 0.8735 | 0.6095 | 0.6893 | $\textbf{0.9148}$ |
| 25 | $ AUC $ | 0.2005 | 0.7544 | 0.3944 | 0.6254 | 0.3238 | 0.4363 | $\textbf{0.7892}$ |
| 25 | $ G-Mean $ | 0.2835 | 0.8681 | 0.6213 | 0.7874 | 0.5615 | 0.6595 | $\textbf{0.8876}$ |
| 30 | $ AUC $ | 0.0314 | 0.7754 | 0.4056 | 0.5259 | 0.1855 | 0.3721 | $\textbf{0.7758}$ |
| 30 | $ G-Mean $ | 0.1076 | 0.8787 | 0.6346 | 0.7229 | 0.4216 | 0.6067 | $\textbf{0.8804}$ |
Table 4.  Detailed information about KEEL data sets
| Dataset | Data size | Features | Imbalance ratio |
|---|---|---|---|
| Pima | 214 | 9 | 1.82 |
| Glass4 | 214 | 10 | 1.82 |
| Glass5 | 214 | 10 | 1.82 |
| Haberman | 306 | 4 | 2.78 |
| Vehicle1 | 846 | 19 | 2.9 |
| Glass0123vs456 | 214 | 10 | 3.2 |
| Abalone21vs8 | 581 | 9 | 40.5 |
| Vowel0 | 988 | 14 | 9.98 |
| Shuttlec0vsc4 | 1829 | 10 | 13.87 |
| Pageblocks13vs4 | 472 | 11 | 15.86 |
| Glass016vs5 | 184 | 10 | 19.44 |
| Abalone918 | 731 | 9 | 16.4 |
| Newthyroid2 | 215 | 6 | 5.14 |
| Yeast4 | 1484 | 9 | 28.1 |
| Yeast6 | 1484 | 9 | 41.4 |
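To make the last column of Table 4 concrete, the sketch below recovers approximate class counts from a data size and an imbalance ratio. It assumes the usual convention that the imbalance ratio is the number of majority examples divided by the number of minority examples; the excerpt does not state the convention, and the helper name `class_counts` is hypothetical.

```python
def class_counts(n_samples, imbalance_ratio):
    """Return (majority, minority) counts implied by size and imbalance ratio.

    Assumes imbalance_ratio = majority / minority, so that
    n_samples = minority * (1 + imbalance_ratio).
    """
    minority = round(n_samples / (1.0 + imbalance_ratio))
    return n_samples - minority, minority

# Rows taken from Table 4.
abalone = class_counts(581, 40.5)   # Abalone21vs8
haberman = class_counts(306, 2.78)  # Haberman
```

For Abalone21vs8 this gives 567 majority and 14 minority examples, which shows how few minority points an oversampling method has to work with at the high end of the table.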
Table 5.  $ G-mean $ of RUS-SVM, ROS-SVM, SMOTE-SVM, borderline-SMOTE1-SVM, borderline-SMOTE2-SVM, MWMOTE-SVM, WSMOTE-QSSVM on benchmark datasets with RBF kernel
| Dataset | Entry | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
|---|---|---|---|---|---|---|---|---|
| Pima | mean/std/rank | 0.2726/0.1996/7 | 0.6292/0.0446/2 | 0.5802/0.0802/4 | 0.6211/0.0628/3 | 0.3459/0.1442/6 | 0.5681/0.0529/5 | 0.6521/0.0505/1 |
| | time (s) | 16.3335 | 8.2276 | 16.6804 | 18.3543 | 18.381 | 18.414 | 11.0034 |
| Vowel0 | mean/std/rank | 0.4675/0.3726/7 | 0.918/0.0845/2 | 0.8054/0.042/4 | 0.8536/0.0759/3 | 0.7976/0.0671/5 | 0.7594/0.0529/6 | 0.9199/0.0583/1 |
| | time (s) | 14.1906 | 4.3252 | 33.0654 | 31.7508 | 31.5846 | 25.7455 | 25.7865 |
| Glass0123vs456 | mean/std/rank | 0.7099/0.1468/6 | 0.7874/0.0861/4 | 0.8037/0.0786/2 | 0.8291/0.0837/1 | 0.7418/0.1339/5 | 0.7048/0.1415/7 | 0.7921/0.0892/3 |
| | time (s) | 0.6951 | 0.3932 | 1.0297 | 1.3301 | 1.2478 | 1.1803 | 3.056 |
| Haberman | mean/std/rank | 0.3932/0.0794/5 | 0.2286/0.1846/7 | 0.4457/0.0944/4 | 0.4687/0.1048/3 | 0.3468/0.1334/6 | 0.5168/0.0544/2 | 0.6394/0.1057/1 |
| | time (s) | 1.0481 | 0.5925 | 78.2541 | 1.7105 | 22.0941 | 2.8829 | 1.2026 |
| Vehicle1 | mean/std/rank | 0.2926/0.2722/6 | 0.6873/0.1027/2 | 0.38/0.1016/5 | 0.61/0.0604/3 | 0.2131/0.144/7 | 0.479/0.0467/4 | 0.7254/0.041/1 |
| | time (s) | 14.1348 | 7.3443 | 23.71 | 23.79 | 24.2277 | 20.6886 | 47.0032 |
| Shuttlec0vsc4 | mean/std/rank | 0.7955/0.1095/6 | 0.9939/0.0098/1 | 0.7633/0.0287/7 | 0.8513/0.045/5 | 0.8527/0.0454/4 | 0.9238/0.0273/3 | 0.9819/0.0131/2 |
| | time (s) | 15.0486 | 75.75541 | 150.7936 | 145.2165 | 147.0763 | 78.9499 | 125.5474 |
| Pageblocks13vs4 | mean/std/rank | 0.6375/0.1509/3 | 0.5887/0.2618/5 | 0.608/0.1809/4 | 0.8848/0.0649/2 | 0.5607/0.2308/7 | 0.5771/0.1332/6 | 0.9081/0.0699/1 |
| | time (s) | 1.8912 | 5.4061 | 11.0931 | 9.9519 | 10.4541 | 20.2431 | 22.9759 |
| Abalone21vs8 | mean/std/rank | 0.3995/0.2119/3 | 0.3773/0.4158/4 | 0.1151/0.247/7 | 0.5252/0.3741/2 | 0.2321/0.2448/5 | 0.1937/0.2501/6 | 0.6873/0.2725/1 |
| | time (s) | 1.0082 | 3.846 | 9.8971 | 8.5385 | 8.5024 | 8.7712 | 6.9418 |
| Yeast6 | mean/std/rank | 0.3806/0.2218/5 | 0.3719/0.2875/6 | 0.4615/0.1965/4 | 0.5946/0.1959/3 | 0.3188/0.2381/7 | 0.6378/0.1015/2 | 0.7146/0.1159/1 |
| | time (s) | 4.6745 | 27.0141 | 60.5981 | 60.6621 | 62.1266 | 40.8791 | 20.2801 |
| Abalone918 | mean/std/rank | 0.3443/0.2325/3 | 0.1556/0.2053/6 | 0.2063/0.2299/5 | 0.3263/0.252/4 | 0.0348/0.1102/7 | 0.3947/0.1589/2 | 0.6533/0.0978/1 |
| | time (s) | 1.6203 | 6.1747 | 14.5602 | 10.559 | 5.9857 | 13.8744 | 8.594 |
| Newthyroid2 | mean/std/rank | 0.6419/0.1922/5 | 0.7352/0.1259/4 | 0.5555/0.2352/6 | 0.8474/0.0705/2 | 0.555/0.2077/7 | 0.7676/0.1135/3 | 0.8774/0.096/1 |
| | time (s) | 0.3228 | 0.6046 | 1.1284 | 0.8065 | 0.7775 | 0.8403 | 1.4356 |
| Glass016vs5 | mean/std/rank | 0.5164/0.2742/3 | 0.2746/0.3546/7 | 0.5129/0.2897/4 | 0.6011/0.3381/2 | 0.4386/0.389/5 | 0.3389/0.3664/6 | 0.7132/0.1589/1 |
| | time (s) | 0.1869 | 0.5241 | 1.175 | 1.1612 | 1.1598 | 0.5781 | 3.3859 |
| Glass5 | mean/std/rank | 0.5388/0.2548/2 | 0.4234/0.3644/7 | 0.4735/0.3416/6 | 0.5215/0.3736/3 | 0.4737/0.3272/5 | 0.4896/0.35/4 | 0.8439/0.1328/1 |
| | time (s) | 0.2077 | 0.6566 | 1.5191 | 1.3504 | 1.355 | 0.778 | 4.0237 |
| Yeast4 | mean/std/rank | 0.1843/0.0748/6 | 0.2065/0.2272/5 | 0.3751/0.2332/4 | 0.5555/0.0802/3 | 0.0627/0.1323/7 | 0.6059/0.1153/1 | 0.5941/0.1145/2 |
| | time (s) | 9.0832 | 45.4406 | 161.9439 | 78.493 | 81.4676 | 78.5899 | 35.4481 |
| Glass4 | mean/std/rank | 0.431/0.2546/3 | 0.1147/0.2419/7 | 0.4076/0.3698/4 | 0.6286/0.2719/2 | 0.2663/0.3521/6 | 0.378/0.2713/5 | 0.7867/0.1263/1 |
| | time (s) | 0.2045 | 0.7664 | 1.4998 | 1.3387 | 1.3477 | 1.4321 | 3.4769 |
| Average Rank | | 4.7 | 4.6 | 4.7 | 2.7 | 5.9 | 4.1 | 1.3 |
| Final Rank | | 6 | 4 | 5 | 2 | 7 | 3 | 1 |
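The per-dataset ranks and the "Average Rank" row of Table 5 can be reproduced with a short ranking routine. The sketch below uses the mean G-mean values of three rows copied from Table 5 and assumes that rank 1 goes to the highest mean; ties are not averaged here, which is a simplification, and the helper names are illustrative.

```python
METHODS = ["RUS-SVM", "ROS-SVM", "SMOTE-SVM", "borderline-SMOTE1-SVM",
           "borderline-SMOTE2-SVM", "MWMOTE-SVM", "WSMOTE-QSSVM"]

# Mean G-mean per method, copied from three rows of Table 5.
G_MEAN = {
    "Pima":           [0.2726, 0.6292, 0.5802, 0.6211, 0.3459, 0.5681, 0.6521],
    "Vowel0":         [0.4675, 0.918, 0.8054, 0.8536, 0.7976, 0.7594, 0.9199],
    "Glass0123vs456": [0.7099, 0.7874, 0.8037, 0.8291, 0.7418, 0.7048, 0.7921],
}

def ranks_desc(scores):
    """Rank scores so that the largest value gets rank 1 (ties broken by order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def average_ranks(table):
    """Average each method's rank over all data sets in the table."""
    per_dataset = [ranks_desc(row) for row in table.values()]
    n = len(per_dataset)
    return [sum(col) / n for col in zip(*per_dataset)]

avg = dict(zip(METHODS, average_ranks(G_MEAN)))
```

On these three rows the computed ranks match the rank entries printed in the table; the averages differ from the "Average Rank" row only because that row is taken over all fifteen data sets.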
Table 6.  $ AUC $ of RUS-SVM, ROS-SVM, SMOTE-SVM, borderline-SMOTE1-SVM, borderline-SMOTE2-SVM, MWMOTE-SVM, WSMOTE-QSSVM on benchmark datasets with RBF kernel
| Dataset | Entry | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
|---|---|---|---|---|---|---|---|---|
| Pima | mean/std/rank | 0.1102/0.0913/7 | 0.3977/0.0534/2 | 0.3424/0.0928/4 | 0.3893/0.074/3 | 0.1384/0.0965/6 | 0.3252/0.0609/5 | $\textbf{0.4275/0.0646/1}$ |
| | time (s) | 16.3335 | 8.2276 | 16.6804 | 18.3543 | 18.381 | 18.414 | $\textbf{11.0034}$ |
| Vowel0 | mean/std/rank | 0.3435/0.3226/7 | 0.8491/0.1441/2 | 0.6502/0.0677/4 | 0.7338/0.1277/3 | 0.6402/0.1017/5 | 0.5792/0.0777/6 | $\textbf{0.8493/0.1051/1}$ |
| | time (s) | 14.1906 | $\textbf{4.3252}$ | 33.0654 | 31.7508 | 31.5846 | 25.7455 | 25.7865 |
| Glass0123vs456 | mean/std/rank | 0.5233/0.204/6 | 0.6267/0.1316/4 | 0.6515/0.1289/2 | $\textbf{0.6936/0.139/1}$ | 0.5664/0.1878/5 | 0.5148/0.1721/7 | 0.6345/0.1353/3 |
| | time (s) | 0.6951 | $\textbf{0.3932}$ | 1.0297 | 1.3301 | 1.2478 | 1.1803 | 3.056 |
| Haberman | mean/std/rank | 0.1603/0.0604/5 | 0.0829/0.0931/7 | 0.2067/0.0798/4 | 0.2296/0.0854/3 | 0.1363/0.0655/6 | 0.2697/0.0573/2 | $\textbf{0.4189/0.1375/1}$ |
| | time (s) | 1.0481 | $\textbf{0.5925}$ | 78.2541 | 1.7105 | 22.0941 | 2.8829 | 1.2026 |
| Vehicle1 | mean/std/rank | 0.1523/0.1936/6 | 0.4819/0.1337/2 | 0.1537/0.0884/5 | 0.3753/0.0741/3 | 0.0641/0.0603/7 | 0.2314/0.0439/4 | $\textbf{0.5276/0.0588/1}$ |
| | time (s) | 14.1348 | $\textbf{7.3443}$ | 23.71 | 23.79 | 24.2277 | 20.6886 | 47.0032 |
| Shuttlec0vsc4 | mean/std/rank | 0.6436/0.1837/6 | $\textbf{0.988/0.0193/1}$ | 0.5834/0.044/7 | 0.7265/0.0761/5 | 0.729/0.0775/4 | 0.8541/0.0516/3 | 0.9643/0.0257/2 |
| | time (s) | $\textbf{15.0486}$ | 75.75541 | 150.7936 | 145.2165 | 147.0763 | 78.9499 | 125.5474 |
| Pageblocks13vs4 | mean/std/rank | 0.427/0.1861/3 | 0.4082/0.263/4 | 0.3991/0.2052/5 | 0.7867/0.1124/2 | 0.3624/0.1935/6 | 0.3491/0.155/7 | $\textbf{0.829/0.1209/1}$ |
| | time (s) | $\textbf{1.8912}$ | 5.4061 | 11.0931 | 9.9519 | 10.4541 | 20.2431 | 22.9759 |
| Abalone21vs8 | mean/std/rank | 0.2/0.1464/5 | 0.2979/0.3654/3 | 0.0681/0.1533/7 | 0.4018/0.3065/2 | 0.2526/0.2665/4 | 0.0938/0.1212/6 | $\textbf{0.5392/0.2631/1}$ |
| | time (s) | $\textbf{1.0082}$ | 3.846 | 9.8971 | 8.5385 | 8.5024 | 8.7712 | 6.9418 |
| Yeast6 | mean/std/rank | 0.1892/0.1707/6 | 0.2127/0.2034/5 | 0.2477/0.144/4 | 0.3881/0.232/3 | 0.1527/0.1385/7 | 0.4161/0.1298/2 | $\textbf{0.5228/0.1558/1}$ |
| | time (s) | $\textbf{4.6745}$ | 27.0141 | 60.5981 | 60.6621 | 62.1266 | 40.8791 | 20.2801 |
| Abalone918 | mean/std/rank | 0.1672/0.1431/3 | 0.0621/0.0878/6 | 0.0901/0.1128/5 | 0.1637/0.1563/4 | 0.0121/0.0384/7 | 0.1785/0.09/2 | $\textbf{0.4354/0.1282/1}$ |
| | time (s) | $\textbf{1.6203}$ | 6.1747 | 14.5602 | 10.559 | 5.9857 | 13.8744 | 8.594 |
| Newthyroid2 | mean/std/rank | 0.4452/0.2622/5 | 0.5548/0.1858/4 | 0.3583/0.2065/6 | 0.7226/0.1202/2 | 0.3468/0.1504/7 | 0.6008/0.1692/3 | $\textbf{0.7782/0.1661/1}$ |
| | time (s) | $\textbf{0.3228}$ | 0.6046 | 1.1284 | 0.8065 | 0.7775 | 0.8403 | 1.4356 |
| Glass016vs5 | mean/std/rank | 0.3343/0.2469/4 | 0.1886/0.2437/7 | 0.3386/0.2297/3 | 0.4643/0.3076/2 | 0.3286/0.3228/5 | 0.2357/0.2752/6 | $\textbf{0.5314/0.2463/1}$ |
| | time (s) | $\textbf{0.1869}$ | 0.5241 | 1.175 | 1.1612 | 1.1598 | 0.5781 | 3.3859 |
| Glass5 | mean/std/rank | 0.3488/0.2479/4 | 0.2988/0.2572/7 | 0.3293/0.2687/5 | 0.3976/0.3177/2 | 0.3207/0.2221/6 | 0.35/0.2774/3 | $\textbf{0.728/0.2101/1}$ |
| | time (s) | $\textbf{0.2077}$ | 0.6566 | 1.5191 | 1.3504 | 1.355 | 0.778 | 4.0237 |
| Yeast4 | mean/std/rank | 0.039/0.0305/6 | 0.0891/0.1087/5 | 0.1896/0.1526/4 | 0.3144/0.0918/3 | 0.0197/0.0415/7 | $\textbf{0.3791/0.1258/1}$ | 0.3647/0.1404/2 |
| | time (s) | $\textbf{9.0832}$ | 45.4406 | 161.9439 | 78.493 | 81.4676 | 78.5899 | 35.4481 |
| Glass4 | mean/std/rank | 0.2442/0.2012/4 | 0.0658/0.1388/7 | 0.2892/0.2938/3 | 0.4617/0.2738/2 | 0.1825/0.2566/6 | 0.2092/0.1679/5 | $\textbf{0.6333/0.1856/1}$ |
| | time (s) | $\textbf{0.2045}$ | 0.7664 | 1.4998 | 1.3387 | 1.3477 | 1.4321 | 3.4769 |
| $\textbf{Average Rank}$ | | $\textbf{5.13}$ | $\textbf{4.4}$ | $\textbf{4.53}$ | $\textbf{2.67}$ | $\textbf{5.87}$ | $\textbf{4.13}$ | $\textbf{1.27}$ |
| $\textbf{Final Rank}$ | | $\textbf{6}$ | $\textbf{4}$ | $\textbf{5}$ | $\textbf{2}$ | $\textbf{7}$ | $\textbf{3}$ | $\textbf{1}$ |
Table 7. $ H_1 $ values of seven methods on fifteen benchmark datasets
Datasets RUS-SVM ROS-SVM SMOTE-SVM borderline-SMOTE1-SVM borderline-SMOTE2-SVM MWMOTE-SVM WSMOTE-QSSVM
Pima 0.1918 0.5924 0.5728 0.6045 0.2423 0.5608 $\textbf{0.6347}$
Vowel0 0.4398 0.912 0.8038 0.8487 0.7864 0.7563 $\textbf{0.9181}$
Glass0123vs456 0.6668 0.7676 0.7948 $\textbf{0.8203}$ 0.713 0.6918 0.7789
Haberman 0.2898 0.1472 0.4129 0.4181 0.2635 0.5074 $\textbf{0.6115}$
Vehicle1 0.2354 0.6584 0.2784 0.5787 0.1197 0.4042 $\textbf{0.7231}$
Shuttlec0vsc4 0.7749 $\textbf{0.9939}$ 0.7578 0.85 0.8503 0.9214 0.9817
Pageblocks13vs4 0.5789 0.5362 0.5798 0.8831 0.5212 0.5547 $\textbf{0.9063}$
Abalone21vs8 0.3178 0.3591 0.1122 0.5162 0.4873 0.1809 $\textbf{0.677}$
Yeast6 0.3013 0.3116 0.4046 0.545 0.2474 0.6243 $\textbf{0.7041}$
Abalone918 0.2748 0.1066 0.1785 0.2729 0.0221 0.3497 $\textbf{0.6384}$
Newthyroid2 0.5924 0.6984 0.5332 0.8422 0.5072 0.7563 $\textbf{0.8736}$
Glass016vs5 0.4586 0.2613 0.5045 0.5878 0.423 0.331 $\textbf{0.7031}$
Glass5 0.4732 0.3994 0.465 0.5064 0.4526 0.479 $\textbf{0.835}$
Yeast4 0.0754 0.1489 0.3238 0.5015 0.0363 $\textbf{0.5905}$ 0.5632
Glass4 0.3774 0.0997 0.3971 0.6028 0.2517 0.3611 $\textbf{0.7768}$