A SMOTE-based quadratic surface support vector machine for imbalanced classification with mislabeled information

  • * Corresponding author: Ye Tian

  • Recently, the Synthetic Minority Over-Sampling Technique (SMOTE) has been widely used to handle imbalanced classification. To address the shortcomings of existing benchmark methods, we propose a novel SMOTE scheme based on K-means and intuitionistic fuzzy set theory that assigns proper weights to the existing points and generates new synthetic points from them. In addition, we use the state-of-the-art kernel-free fuzzy quadratic surface support vector machine (QSSVM) to perform the classification. Finally, numerical experiments on various artificial and real data sets demonstrate the validity and applicability of the proposed method, especially in the presence of mislabeled information.

    Mathematics Subject Classification: Primary: 65K10, 90C25; Secondary: 90B90.
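
The abstract describes the classifier as a kernel-free quadratic surface SVM. As a rough illustration only, the sketch below evaluates a quadratic decision surface of the commonly used form f(x) = 0.5 x^T W x + b^T x + c and classifies by its sign. The functional form, the parameter names `W`, `b`, `c`, and the example values are assumptions made for illustration; this page does not show the authors' fuzzy QSSVM formulation or how its parameters are trained.

```python
def quadratic_surface(x, W, b, c):
    """Evaluate f(x) = 0.5 * x^T W x + b^T x + c for a point x."""
    n = len(x)
    quad = sum(x[i] * W[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return 0.5 * quad + lin + c


def predict(x, W, b, c):
    """Assign +1 on or outside the surface f(x) = 0 and -1 inside it."""
    return 1 if quadratic_surface(x, W, b, c) >= 0 else -1


# Hypothetical parameters (not from the paper): the surface is the
# unit circle x1^2 + x2^2 = 1, so no kernel is needed to separate
# points inside the circle from points outside it.
W = [[2.0, 0.0], [0.0, 2.0]]
b = [0.0, 0.0]
c = -1.0

inside = predict([0.0, 0.0], W, b, c)
outside = predict([2.0, 0.0], W, b, c)
```

Because the surface is quadratic in the original features, no kernel function or kernel parameter has to be chosen, which is what "kernel-free" refers to here.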

  • Figure 1.  (a) The original distribution of the data set. (b) The synthetic examples generated by SMOTE (k = 5). (c) The synthetic examples generated by K-means. (d) The data set with mislabeled information (red dot). (e) The synthetic examples generated by SMOTE (k = 5). (f) The synthetic examples generated by K-means

    Figure 2.  (a) The original distribution of the data set. (b) The synthetic minority examples generated by SMOTE (red triangular point). (c) The synthetic minority examples generated by our method (red triangular point)

    Figure 3.  $ AUC $ and $ G-mean $ values of different methods with various imbalance ratios on artificial data set 1

    Figure 4.  ROC curves of seven methods over nine data sets: (a) Abalone21vs8 data set, (b) Abalone918 data set, (c) Haberman data set, (d) Glass4 data set, (e) Pima data set, (f) Glass5 data set, (g) Glass016vs5 data set, (h) Vehicle1 data set, (i) Yeast6 data set
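
The figure captions above refer to synthetic examples generated by SMOTE with k = 5. As a minimal sketch of classical SMOTE only (not the weighted K-means and intuitionistic fuzzy set variant the paper proposes), the code below picks a minority point, chooses one of its k nearest minority neighbours, and interpolates between the two. The function name `smote_samples` and the toy points are illustrative assumptions.

```python
import math
import random


def smote_samples(minority, k=5, n_new=10, seed=0):
    """Generate n_new synthetic points by classical SMOTE interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the base point (itself excluded).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        # New point on the segment between the base point and its neighbour.
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class in the unit square (hypothetical data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8)]
new_points = smote_samples(minority, k=5, n_new=10)
```

Every synthetic point lies on a segment between two existing minority points, so a mislabeled minority point can pull new samples into the majority region, which is the situation panels (d) to (f) of Figure 1 illustrate.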

    Table 1.  Confusion matrix for a two-class problem

    | | Predicted Positive | Predicted Negative |
    |---|---|---|
    | Actual Positive | TP | FN |
    | Actual Negative | FP | TN |
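
The tables that follow report $ G-mean $ values. As a hedged sketch, assuming the standard definition of G-mean as the geometric mean of sensitivity and specificity (this page does not restate the paper's formula), it can be computed from the four confusion-matrix entries as follows. The counts in the example are hypothetical.

```python
import math


def g_mean(tp, fn, fp, tn):
    """Geometric mean of sensitivity (TP rate) and specificity (TN rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def accuracy(tp, fn, fp, tn):
    """Fraction of all examples that are classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical imbalanced result: 50 positives, 950 negatives,
# and a classifier that finds only 5 of the positives.
tp, fn, fp, tn = 5, 45, 0, 950
acc = accuracy(tp, fn, fp, tn)  # 0.955
gm = g_mean(tp, fn, fp, tn)     # about 0.316
```

The example shows why accuracy alone is misleading on imbalanced data: the classifier misses 90% of the minority class yet still reaches 95.5% accuracy, while the G-mean drops to roughly 0.32.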

    Table 2.  Results of RUS-SVM, ROS-SVM, SMOTE-SVM, borderline-SMOTE1-SVM, borderline-SMOTE2-SVM, MWMOTE-SVM, WSMOTE-QSSVM on quadratic artificial datasets with RBF kernel

    | Data set | Methods | $ G-mean $ | std of $ G-mean $ | $ AUC $ | std of $ AUC $ | Time(s) |
    |---|---|---|---|---|---|---|
    | AR1: 800$ \times $3, IR 1:3 | RUS-SVM | 0.5414 | 0.3175 | 0.3838 | 0.3258 | $\textbf{2.7148}$ |
    | | ROS-SVM | 0.8886 | 0.0446 | 0.7914 | 0.0783 | 4.9408 |
    | | SMOTE-SVM | 0.6453 | 0.1181 | 0.429 | 0.1421 | 8.752 |
    | | borderline-SMOTE1-SVM | 0.846 | 0.0652 | 0.7195 | 0.1095 | 5.7484 |
    | | borderline-SMOTE2-SVM | 0.595 | 0.1774 | 0.3823 | 0.2126 | 6.378 |
    | | MWMOTE-SVM | 0.6982 | 0.0526 | 0.49 | 0.072 | 10.9427 |
    | | WSMOTE-QSSVM | $\textbf{0.9051}$ | $\textbf{0.0363}$ | $\textbf{0.8203}$ | $\textbf{0.0642}$ | 4.0263 |
    | AR1: 2200$ \times $3, IR 1:10 | RUS-SVM | 0.516 | 0.2291 | 0.3135 | 0.203 | $\textbf{13.6355}$ |
    | | ROS-SVM | 0.8508 | 0.0604 | 0.7271 | 0.0985 | 52.8877 |
    | | SMOTE-SVM | 0.4311 | 0.1005 | 0.1949 | 0.0874 | 127.6243 |
    | | borderline-SMOTE1-SVM | 0.7897 | 0.0479 | 0.6257 | 0.0764 | 104.4459 |
    | | borderline-SMOTE2-SVM | 0.5014 | 0.1418 | 0.2695 | 0.1347 | 86.6192 |
    | | MWMOTE-SVM | 0.5971 | 0.0421 | 0.3581 | $\textbf{0.0491}$ | 93.0159 |
    | | WSMOTE-QSSVM | $\textbf{0.8917}$ | $\textbf{0.0301}$ | $\textbf{0.796}$ | 0.054 | 23.2509 |
    | AR2: 800$ \times $3, IR 1:3 | RUS-SVM | 0.7449 | 0.2905 | 0.6309 | 0.3688 | $\textbf{3.5008}$ |
    | | ROS-SVM | 0.9295 | $\textbf{0.0337}$ | 0.865 | $\textbf{0.0621}$ | 6.7627 |
    | | SMOTE-SVM | 0.6407 | 0.1499 | 0.486 | 0.4307 | 10.6513 |
    | | borderline-SMOTE1-SVM | 0.8778 | 0.0671 | 0.8062 | 0.7745 | 8.3038 |
    | | borderline-SMOTE2-SVM | 0.6218 | 0.1317 | 0.5273 | 0.4022 | 8.1676 |
    | | MWMOTE-SVM | 0.7123 | 0.0414 | 0.5626 | 0.5089 | 11.0469 |
    | | WSMOTE-QSSVM | $\textbf{0.9337}$ | 0.0345 | $\textbf{0.8729}$ | 0.0638 | 4.1132 |
    | AR2: 2200$ \times $3, IR 1:10 | RUS-SVM | 0.4802 | 0.2185 | 0.5771 | 0.2159 | $\textbf{13.4868}$ |
    | | ROS-SVM | 0.8855 | 0.0222 | 0.8799 | 0.039 | 53.3882 |
    | | SMOTE-SVM | 0.4396 | 0.1028 | 0.47996 | 0.0845 | 130.0079 |
    | | borderline-SMOTE1-SVM | 0.871 | $\textbf{0.0185}$ | 0.8628 | $\textbf{0.0324}$ | 101.8846 |
    | | borderline-SMOTE2-SVM | 0.5126 | 0.1133 | 0.5578 | 0.1097 | 108.3143 |
    | | MWMOTE-SVM | 0.6314 | 0.0463 | 0.6176 | 0.0587 | 90.1415 |
    | | WSMOTE-QSSVM | $\textbf{0.9005}$ | 0.0355 | $\textbf{0.8960}$ | 0.0639 | 23.926 |
    | AR3: 800$ \times $3, IR 1:3 | RUS-SVM | 0.5859 | 0.3457 | 0.4509 | 0.3265 | $\textbf{3.3678}$ |
    | | ROS-SVM | 0.8978 | $\textbf{0.024}$ | 0.8065 | $\textbf{0.0431}$ | 6.8082 |
    | | SMOTE-SVM | 0.6942 | 0.0527 | 0.4844 | 0.0716 | 10.9369 |
    | | borderline-SMOTE1-SVM | 0.8735 | 0.061 | 0.7663 | 0.1038 | 8.8757 |
    | | borderline-SMOTE2-SVM | 0.6095 | 0.1267 | 0.3859 | 0.164 | 8.0711 |
    | | MWMOTE-SVM | 0.6893 | 0.0478 | 0.4772 | 0.0664 | 11.2833 |
    | | WSMOTE-QSSVM | $\textbf{0.9148}$ | 0.041 | $\textbf{0.8384}$ | 0.0744 | 3.7026 |
    | AR3: 2200$ \times $3, IR 1:10 | RUS-SVM | 0.5809 | 0.2743 | 0.4052 | 0.2942 | $\textbf{12.913}$ |
    | | ROS-SVM | 0.8788 | 0.055 | 0.775 | 0.0965 | 50.4581 |
    | | SMOTE-SVM | 0.4874 | 0.0875 | 0.2445 | 0.0847 | 125.7192 |
    | | borderline-SMOTE1-SVM | 0.8586 | $\textbf{0.0284}$ | 0.7379 | 0.0485 | 113.3755 |
    | | borderline-SMOTE2-SVM | 0.5892 | 0.0387 | 0.3485 | $\textbf{0.046}$ | 82.4204 |
    | | MWMOTE-SVM | 0.6045 | 0.0525 | 0.3679 | 0.0621 | 90.1298 |
    | | WSMOTE-QSSVM | $\textbf{0.8825}$ | 0.051 | $\textbf{0.7811}$ | 0.0896 | 22.8422 |

    Table 3.  Average $ AUC $ and $ G-mean $ values over AR1 for different mislabeled levels

    | Mislabeled level (%) | Indicator | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
    |---|---|---|---|---|---|---|---|---|
    | 10 | $ AUC $ | 0.8552 | 0.8325 | 0.573 | 0.8716 | 0.602 | 0.652 | $\textbf{0.8729}$ |
    | | $ G-mean $ | 0.9245 | 0.912 | 0.7457 | 0.9332 | 0.767 | 0.803 | $\textbf{0.9342}$ |
    | 15 | $ AUC $ | $\textbf{0.898}$ | 0.8343 | 0.6416 | 0.8272 | 0.4717 | 0.6065 | 0.8489 |
    | | $ G-mean $ | $\textbf{0.9474}$ | 0.9127 | 0.8005 | 0.9091 | 0.6707 | 0.7781 | 0.921 |
    | 20 | $ AUC $ | 0.4509 | 0.8065 | 0.4844 | 0.7663 | 0.3859 | 0.4772 | $\textbf{0.8384}$ |
    | | $ G-mean $ | 0.5859 | 0.8978 | 0.6942 | 0.8735 | 0.6095 | 0.6893 | $\textbf{0.9148}$ |
    | 25 | $ AUC $ | 0.2005 | 0.7544 | 0.3944 | 0.6254 | 0.3238 | 0.4363 | $\textbf{0.7892}$ |
    | | $ G-mean $ | 0.2835 | 0.8681 | 0.6213 | 0.7874 | 0.5615 | 0.6595 | $\textbf{0.8876}$ |
    | 30 | $ AUC $ | 0.0314 | 0.7754 | 0.4056 | 0.5259 | 0.1855 | 0.3721 | $\textbf{0.7758}$ |
    | | $ G-mean $ | 0.1076 | 0.8787 | 0.6346 | 0.7229 | 0.4216 | 0.6067 | $\textbf{0.8804}$ |
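
Table 3 varies the mislabeled level from 10% to 30%. This page does not say how the mislabeled points were produced, so the sketch below is only one plausible setup: it assumes labels in {+1, -1} and flips a fixed fraction of them chosen uniformly at random. The function name `flip_labels` and the class sizes are hypothetical.

```python
import random


def flip_labels(labels, level, seed=0):
    """Return a copy of labels with a fraction `level` of entries flipped.

    Assumption: labels are +1 / -1 and the flipped positions are drawn
    uniformly at random without replacement.
    """
    rng = random.Random(seed)
    n_flip = round(level * len(labels))
    flipped_positions = rng.sample(range(len(labels)), n_flip)
    noisy = list(labels)
    for i in flipped_positions:
        noisy[i] = -noisy[i]
    return noisy


# Hypothetical data set: 20 minority (+1) and 80 majority (-1) labels.
labels = [1] * 20 + [-1] * 80
noisy = flip_labels(labels, level=0.10)
n_changed = sum(a != b for a, b in zip(labels, noisy))
```

With 100 labels and a 10% level, exactly 10 labels differ from the originals; repeating the call with levels 0.15 through 0.30 would give a sweep comparable in spirit to the rows of Table 3.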

    Table 4.  Detailed information about KEEL data sets

    | Datasets | Data size | Features | Imbalanced ratio |
    |---|---|---|---|
    | Pima | 214 | 9 | 1.82 |
    | Glass4 | 214 | 10 | 1.82 |
    | Glass5 | 214 | 10 | 1.82 |
    | Haberman | 306 | 4 | 2.78 |
    | Vehicle1 | 846 | 19 | 2.9 |
    | Glass0123vs456 | 214 | 10 | 3.2 |
    | Abalone21vs8 | 581 | 9 | 40.5 |
    | Vowel0 | 988 | 14 | 9.98 |
    | Shuttlec0vsc4 | 1829 | 10 | 13.87 |
    | Pageblocks13vs4 | 472 | 11 | 15.86 |
    | Glass016vs5 | 184 | 10 | 19.44 |
    | Abalone918 | 731 | 9 | 16.4 |
    | Newthyroid2 | 215 | 6 | 5.14 |
    | Yeast4 | 1484 | 9 | 28.1 |
    | Yeast6 | 1484 | 9 | 41.4 |

    Table 5.  $ G-mean $ of RUS-SVM, ROS-SVM, SMOTE-SVM, borderline-SMOTE1-SVM, borderline-SMOTE2-SVM, MWMOTE-SVM, WSMOTE-QSSVM on benchmark datasets with RBF kernel

    | Datasets | Entry | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
    |---|---|---|---|---|---|---|---|---|
    | Pima | mean/std/rank | 0.2726/0.1996/7 | 0.6292/0.0446/2 | 0.5802/0.0802/4 | 0.6211/0.0628/3 | 0.3459/0.1442/6 | 0.5681/0.0529/5 | 0.6521/0.0505/1 |
    | | time(s) | 16.3335 | 8.2276 | 16.6804 | 18.3543 | 18.381 | 18.414 | 11.0034 |
    | Vowel0 | mean/std/rank | 0.4675/0.3726/7 | 0.918/0.0845/2 | 0.8054/0.042/4 | 0.8536/0.0759/3 | 0.7976/0.0671/5 | 0.7594/0.0529/6 | 0.9199/0.0583/1 |
    | | time(s) | 14.1906 | 4.3252 | 33.0654 | 31.7508 | 31.5846 | 25.7455 | 25.7865 |
    | Glass0123vs456 | mean/std/rank | 0.7099/0.1468/6 | 0.7874/0.0861/4 | 0.8037/0.0786/2 | 0.8291/0.0837/1 | 0.7418/0.1339/5 | 0.7048/0.1415/7 | 0.7921/0.0892/3 |
    | | time(s) | 0.6951 | 0.3932 | 1.0297 | 1.3301 | 1.2478 | 1.1803 | 3.056 |
    | Haberman | mean/std/rank | 0.3932/0.0794/5 | 0.2286/0.1846/7 | 0.4457/0.0944/4 | 0.4687/0.1048/3 | 0.3468/0.1334/6 | 0.5168/0.0544/2 | 0.6394/0.1057/1 |
    | | time(s) | 1.0481 | 0.5925 | 78.2541 | 1.7105 | 22.0941 | 2.8829 | 1.2026 |
    | Vehicle1 | mean/std/rank | 0.2926/0.2722/6 | 0.6873/0.1027/2 | 0.38/0.1016/5 | 0.61/0.0604/3 | 0.2131/0.144/7 | 0.479/0.0467/4 | 0.7254/0.041/1 |
    | | time(s) | 14.1348 | 7.3443 | 23.71 | 23.79 | 24.2277 | 20.6886 | 47.0032 |
    | Shuttlec0vsc4 | mean/std/rank | 0.7955/0.1095/6 | 0.9939/0.0098/1 | 0.7633/0.0287/7 | 0.8513/0.045/5 | 0.8527/0.0454/4 | 0.9238/0.0273/3 | 0.9819/0.0131/2 |
    | | time(s) | 15.0486 | 75.75541 | 150.7936 | 145.2165 | 147.0763 | 78.9499 | 125.5474 |
    | Pageblocks13vs4 | mean/std/rank | 0.6375/0.1509/3 | 0.5887/0.2618/5 | 0.608/0.1809/4 | 0.8848/0.0649/2 | 0.5607/0.2308/7 | 0.5771/0.1332/6 | 0.9081/0.0699/1 |
    | | time(s) | 1.8912 | 5.4061 | 11.0931 | 9.9519 | 10.4541 | 20.2431 | 22.9759 |
    | Abalone21vs8 | mean/std/rank | 0.3995/0.2119/3 | 0.3773/0.4158/4 | 0.1151/0.247/7 | 0.5252/0.3741/2 | 0.2321/0.2448/5 | 0.1937/0.2501/6 | 0.6873/0.2725/1 |
    | | time(s) | 1.0082 | 3.846 | 9.8971 | 8.5385 | 8.5024 | 8.7712 | 6.9418 |
    | Yeast6 | mean/std/rank | 0.3806/0.2218/5 | 0.3719/0.2875/6 | 0.4615/0.1965/4 | 0.5946/0.1959/3 | 0.3188/0.2381/7 | 0.6378/0.1015/2 | 0.7146/0.1159/1 |
    | | time(s) | 4.6745 | 27.0141 | 60.5981 | 60.6621 | 62.1266 | 40.8791 | 20.2801 |
    | Abalone918 | mean/std/rank | 0.3443/0.2325/3 | 0.1556/0.2053/6 | 0.2063/0.2299/5 | 0.3263/0.252/4 | 0.0348/0.1102/7 | 0.3947/0.1589/2 | 0.6533/0.0978/1 |
    | | time(s) | 1.6203 | 6.1747 | 14.5602 | 10.559 | 5.9857 | 13.8744 | 8.594 |
    | Newthyroid2 | mean/std/rank | 0.6419/0.1922/5 | 0.7352/0.1259/4 | 0.5555/0.2352/6 | 0.8474/0.0705/2 | 0.555/0.2077/7 | 0.7676/0.1135/3 | 0.8774/0.096/1 |
    | | time(s) | 0.3228 | 0.6046 | 1.1284 | 0.8065 | 0.7775 | 0.8403 | 1.4356 |
    | Glass016vs5 | mean/std/rank | 0.5164/0.2742/3 | 0.2746/0.3546/7 | 0.5129/0.2897/4 | 0.6011/0.3381/2 | 0.4386/0.389/5 | 0.3389/0.3664/6 | 0.7132/0.1589/1 |
    | | time(s) | 0.1869 | 0.5241 | 1.175 | 1.1612 | 1.1598 | 0.5781 | 3.3859 |
    | Glass5 | mean/std/rank | 0.5388/0.2548/2 | 0.4234/0.3644/7 | 0.4735/0.3416/6 | 0.5215/0.3736/3 | 0.4737/0.3272/5 | 0.4896/0.35/4 | 0.8439/0.1328/1 |
    | | time(s) | 0.2077 | 0.6566 | 1.5191 | 1.3504 | 1.355 | 0.778 | 4.0237 |
    | Yeast4 | mean/std/rank | 0.1843/0.0748/6 | 0.2065/0.2272/5 | 0.3751/0.2332/4 | 0.5555/0.0802/3 | 0.0627/0.1323/7 | 0.6059/0.1153/1 | 0.5941/0.1145/2 |
    | | time(s) | 9.0832 | 45.4406 | 161.9439 | 78.493 | 81.4676 | 78.5899 | 35.4481 |
    | Glass4 | mean/std/rank | 0.431/0.2546/3 | 0.1147/0.2419/7 | 0.4076/0.3698/4 | 0.6286/0.2719/2 | 0.2663/0.3521/6 | 0.378/0.2713/5 | 0.7867/0.1263/1 |
    | | time(s) | 0.2045 | 0.7664 | 1.4998 | 1.3387 | 1.3477 | 1.4321 | 3.4769 |
    | Average Rank | | 4.7 | 4.6 | 4.7 | 2.7 | 5.9 | 4.1 | 1.3 |
    | Final Rank | | 6 | 4 | 5 | 2 | 7 | 3 | 1 |
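
The "Average Rank" row of Table 5 appears to be the arithmetic mean of each method's per-data-set ranks, rounded to one decimal. The short check below reproduces it for two of the seven methods, using the rank values listed in the table in row order (Pima through Glass4). The helper name `average_rank` is illustrative.

```python
def average_rank(ranks):
    """Arithmetic mean of a method's ranks over all data sets."""
    return sum(ranks) / len(ranks)


# Per-data-set G-mean ranks copied from Table 5, in row order.
table5_ranks = {
    "RUS-SVM": [7, 7, 6, 5, 6, 6, 3, 3, 5, 3, 5, 3, 2, 6, 3],
    "WSMOTE-QSSVM": [1, 1, 3, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1],
}

rus_avg = round(average_rank(table5_ranks["RUS-SVM"]), 1)         # 4.7
wsmote_avg = round(average_rank(table5_ranks["WSMOTE-QSSVM"]), 1)  # 1.3
```

Both values match the table: 70/15 rounds to 4.7 for RUS-SVM and 19/15 rounds to 1.3 for WSMOTE-QSSVM.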

    Table 6.  $ AUC $ of RUS-SVM, ROS-SVM, SMOTE-SVM, borderline-SMOTE1-SVM, borderline-SMOTE2-SVM, MWMOTE-SVM, WSMOTE-QSSVM on benchmark datasets with RBF kernel

    | Datasets | Entry | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
    |---|---|---|---|---|---|---|---|---|
    | Pima | mean/std/rank | 0.1102/0.0913/7 | 0.3977/0.0534/2 | 0.3424/0.0928/4 | 0.3893/0.074/3 | 0.1384/0.0965/6 | 0.3252/0.0609/5 | $\textbf{0.4275/0.0646/1}$ |
    | | time(s) | 16.3335 | 8.2276 | 16.6804 | 18.3543 | 18.381 | 18.414 | $\textbf{11.0034}$ |
    | Vowel0 | mean/std/rank | 0.3435/0.3226/7 | 0.8491/0.1441/2 | 0.6502/0.0677/4 | 0.7338/0.1277/3 | 0.6402/0.1017/5 | 0.5792/0.0777/6 | $\textbf{0.8493/0.1051/1}$ |
    | | time(s) | 14.1906 | $\textbf{4.3252}$ | 33.0654 | 31.7508 | 31.5846 | 25.7455 | 25.7865 |
    | Glass0123vs456 | mean/std/rank | 0.5233/0.204/6 | 0.6267/0.1316/4 | 0.6515/0.1289/2 | $\textbf{0.6936/0.139/1}$ | 0.5664/0.1878/5 | 0.5148/0.1721/7 | 0.6345/0.1353/3 |
    | | time(s) | 0.6951 | $\textbf{0.3932}$ | 1.0297 | 1.3301 | 1.2478 | 1.1803 | 3.056 |
    | Haberman | mean/std/rank | 0.1603/0.0604/5 | 0.0829/0.0931/7 | 0.2067/0.0798/4 | 0.2296/0.0854/3 | 0.1363/0.0655/6 | 0.2697/0.0573/2 | $\textbf{0.4189/0.1375/1}$ |
    | | time(s) | 1.0481 | $\textbf{0.5925}$ | 78.2541 | 1.7105 | 22.0941 | 2.8829 | 1.2026 |
    | Vehicle1 | mean/std/rank | 0.1523/0.1936/6 | 0.4819/0.1337/2 | 0.1537/0.0884/5 | 0.3753/0.0741/3 | 0.0641/0.0603/7 | 0.2314/0.0439/4 | $\textbf{0.5276/0.0588/1}$ |
    | | time(s) | 14.1348 | $\textbf{7.3443}$ | 23.71 | 23.79 | 24.2277 | 20.6886 | 47.0032 |
    | Shuttlec0vsc4 | mean/std/rank | 0.6436/0.1837/6 | $\textbf{0.988/0.0193/1}$ | 0.5834/0.044/7 | 0.7265/0.0761/5 | 0.729/0.0775/4 | 0.8541/0.0516/3 | 0.9643/0.0257/2 |
    | | time(s) | $\textbf{15.0486}$ | 75.75541 | 150.7936 | 145.2165 | 147.0763 | 78.9499 | 125.5474 |
    | Pageblocks13vs4 | mean/std/rank | 0.427/0.1861/3 | 0.4082/0.263/4 | 0.3991/0.2052/5 | 0.7867/0.1124/2 | 0.3624/0.1935/6 | 0.3491/0.155/7 | $\textbf{0.829/0.1209/1}$ |
    | | time(s) | $\textbf{1.8912}$ | 5.4061 | 11.0931 | 9.9519 | 10.4541 | 20.2431 | 22.9759 |
    | Abalone21vs8 | mean/std/rank | 0.2/0.1464/5 | 0.2979/0.3654/3 | 0.0681/0.1533/7 | 0.4018/0.3065/2 | 0.2526/0.2665/4 | 0.0938/0.1212/6 | $\textbf{0.5392/0.2631/1}$ |
    | | time(s) | $\textbf{1.0082}$ | 3.846 | 9.8971 | 8.5385 | 8.5024 | 8.7712 | 6.9418 |
    | Yeast6 | mean/std/rank | 0.1892/0.1707/6 | 0.2127/0.2034/5 | 0.2477/0.144/4 | 0.3881/0.232/3 | 0.1527/0.1385/7 | 0.4161/0.1298/2 | $\textbf{0.5228/0.1558/1}$ |
    | | time(s) | $\textbf{4.6745}$ | 27.0141 | 60.5981 | 60.6621 | 62.1266 | 40.8791 | 20.2801 |
    | Abalone918 | mean/std/rank | 0.1672/0.1431/3 | 0.0621/0.0878/6 | 0.0901/0.1128/5 | 0.1637/0.1563/4 | 0.0121/0.0384/7 | 0.1785/0.09/2 | $\textbf{0.4354/0.1282/1}$ |
    | | time(s) | $\textbf{1.6203}$ | 6.1747 | 14.5602 | 10.559 | 5.9857 | 13.8744 | 8.594 |
    | Newthyroid2 | mean/std/rank | 0.4452/0.2622/5 | 0.5548/0.1858/4 | 0.3583/0.2065/6 | 0.7226/0.1202/2 | 0.3468/0.1504/7 | 0.6008/0.1692/3 | $\textbf{0.7782/0.1661/1}$ |
    | | time(s) | $\textbf{0.3228}$ | 0.6046 | 1.1284 | 0.8065 | 0.7775 | 0.8403 | 1.4356 |
    | Glass016vs5 | mean/std/rank | 0.3343/0.2469/4 | 0.1886/0.2437/7 | 0.3386/0.2297/3 | 0.4643/0.3076/2 | 0.3286/0.3228/5 | 0.2357/0.2752/6 | $\textbf{0.5314/0.2463/1}$ |
    | | time(s) | $\textbf{0.1869}$ | 0.5241 | 1.175 | 1.1612 | 1.1598 | 0.5781 | 3.3859 |
    | Glass5 | mean/std/rank | 0.3488/0.2479/4 | 0.2988/0.2572/7 | 0.3293/0.2687/5 | 0.3976/0.3177/2 | 0.3207/0.2221/6 | 0.35/0.2774/3 | $\textbf{0.728/0.2101/1}$ |
    | | time(s) | $\textbf{0.2077}$ | 0.6566 | 1.5191 | 1.3504 | 1.355 | 0.778 | 4.0237 |
    | Yeast4 | mean/std/rank | 0.039/0.0305/6 | 0.0891/0.1087/5 | 0.1896/0.1526/4 | 0.3144/0.0918/3 | 0.0197/0.0415/7 | $\textbf{0.3791/0.1258/1}$ | 0.3647/0.1404/2 |
    | | time(s) | $\textbf{9.0832}$ | 45.4406 | 161.9439 | 78.493 | 81.4676 | 78.5899 | 35.4481 |
    | Glass4 | mean/std/rank | 0.2442/0.2012/4 | 0.0658/0.1388/7 | 0.2892/0.2938/3 | 0.4617/0.2738/2 | 0.1825/0.2566/6 | 0.2092/0.1679/5 | $\textbf{0.6333/0.1856/1}$ |
    | | time(s) | $\textbf{0.2045}$ | 0.7664 | 1.4998 | 1.3387 | 1.3477 | 1.4321 | 3.4769 |
    | $\textbf{Average Rank}$ | | $\textbf{5.13}$ | $\textbf{4.4}$ | $\textbf{4.53}$ | $\textbf{2.67}$ | $\textbf{5.87}$ | $\textbf{4.13}$ | $\textbf{1.27}$ |
    | $\textbf{Final Rank}$ | | $\textbf{6}$ | $\textbf{4}$ | $\textbf{5}$ | $\textbf{2}$ | $\textbf{7}$ | $\textbf{3}$ | $\textbf{1}$ |

    Table 7.  $ H_1 $ value of seven methods on fifteen benchmark datasets

    | Datasets | RUS-SVM | ROS-SVM | SMOTE-SVM | borderline-SMOTE1-SVM | borderline-SMOTE2-SVM | MWMOTE-SVM | WSMOTE-QSSVM |
    |---|---|---|---|---|---|---|---|
    | Pima | 0.1918 | 0.5924 | 0.5728 | 0.6045 | 0.2423 | 0.5608 | $\textbf{0.6347}$ |
    | Vowel0 | 0.4398 | 0.912 | 0.8038 | 0.8487 | 0.7864 | 0.7563 | $\textbf{0.9181}$ |
    | Glass0123vs456 | 0.6668 | 0.7676 | 0.7948 | $\textbf{0.8203}$ | 0.713 | 0.6918 | 0.7789 |
    | Haberman | 0.2898 | 0.1472 | 0.4129 | 0.4181 | 0.2635 | 0.5074 | $\textbf{0.6115}$ |
    | Vehicle1 | 0.2354 | 0.6584 | 0.2784 | 0.5787 | 0.1197 | 0.4042 | 0.7231 |
    | Shuttlec0vsc4 | 0.7749 | $\textbf{0.9939}$ | 0.7578 | 0.85 | 0.8503 | 0.9214 | 0.9817 |
    | Pageblocks13vs4 | 0.5789 | 0.5362 | 0.5798 | 0.8831 | 0.5212 | 0.5547 | $\textbf{0.9063}$ |
    | Abalone21vs8 | 0.3178 | 0.3591 | 0.1122 | 0.5162 | 0.4873 | 0.1809 | $\textbf{0.677}$ |
    | Yeast6 | 0.3013 | 0.3116 | 0.4046 | 0.545 | 0.2474 | 0.6243 | $\textbf{0.7041}$ |
    | Abalone918 | 0.2748 | 0.1066 | 0.1785 | 0.2729 | 0.0221 | 0.3497 | $\textbf{0.6384}$ |
    | Newthyroid2 | 0.5924 | 0.6984 | 0.5332 | 0.8422 | 0.5072 | 0.7563 | $\textbf{0.8736}$ |
    | Glass016vs5 | 0.4586 | 0.2613 | 0.5045 | 0.5878 | 0.423 | 0.331 | $\textbf{0.7031}$ |
    | Glass5 | 0.4732 | 0.3994 | 0.465 | 0.5064 | 0.4526 | 0.479 | $\textbf{0.835}$ |
    | Yeast4 | 0.0754 | 0.1489 | 0.3238 | 0.5015 | 0.0363 | $\textbf{0.5905}$ | 0.5632 |
    | Glass4 | 0.3774 | 0.0997 | 0.3971 | 0.6028 | 0.2517 | 0.3611 | $\textbf{0.7768}$ |