
Data Mining (4): Evaluating Different Classification Models on Financial Data (accuracy, precision, recall and F1-score)


1. First, read in the data (already pre-processed: rows dropped, missing values filled, types converted, normalized), define a function that computes accuracy, precision, recall and F1-score, and then select features with scikit-learn's recursive feature elimination (RFE).
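For reference, writing TP, FP, TN, FN for the counts of true/false positives and negatives on the positive class, the four metrics computed below have the standard binary-classification definitions:

$$
\text{accuracy}=\frac{TP+TN}{TP+TN+FP+FN},\quad
\text{precision}=\frac{TP}{TP+FP},\quad
\text{recall}=\frac{TP}{TP+FN},\quad
F_1=\frac{2\cdot\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}}
$$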

    from sklearn import metrics
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import RFE
    import pandas as pd

    def metrics_result(true, predict):
        # compute accuracy, precision, recall and F1-score
        acc = metrics.accuracy_score(true, predict)
        pre = metrics.precision_score(true, predict)
        reca = metrics.recall_score(true, predict)
        f_sco = metrics.f1_score(true, predict)
        # auc_ = metrics.auc(true, predict)
        return acc, pre, reca, f_sco

    data = pd.read_csv('data_imp.csv')  # pre-processed data (rows dropped, missing values filled, types converted, normalized)
    label = data['status']
    data = data.iloc[:, :-1]  # features: every column except the last (the 'status' label)
    data_ = RFE(estimator=RandomForestClassifier(), n_features_to_select=30).fit_transform(data, label)
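fit_transform above returns only the reduced feature matrix; if you also want to know which 30 columns RFE kept, the fitted selector's support_ mask can be used. A minimal sketch, reusing data and label from above (the names selector and selected_cols are mine, not from the original post):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import RFE

    # keep the fitted selector so its boolean support_ mask can be inspected
    selector = RFE(estimator=RandomForestClassifier(), n_features_to_select=30).fit(data, label)
    selected_cols = data.columns[selector.support_]  # names of the 30 retained features
    print(selected_cols.tolist())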

2. Define the classifier functions.

    from sklearn.model_selection import StratifiedKFold
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn import svm
    from xgboost.sklearn import XGBClassifier

    # Each function fits its model on the training split once, then returns the
    # predictions on the test split and on the training split.
    def LR_classifier(train_data, train_label, test_data, test_label):
        clf = LogisticRegression(C=1.0, max_iter=1000).fit(train_data, train_label)
        prediction_test = clf.predict(test_data)
        prediction_train = clf.predict(train_data)
        return prediction_test, prediction_train

    def svm_classifier(train_data, train_label, test_data, test_label):
        clf = svm.SVC(C=1.0, kernel='linear', gamma=20).fit(train_data, train_label)
        prediction_test = clf.predict(test_data)
        prediction_train = clf.predict(train_data)
        return prediction_test, prediction_train

    def dt_classifier(train_data, train_label, test_data, test_label):
        clf = DecisionTreeClassifier(max_depth=5).fit(train_data, train_label)
        prediction_test = clf.predict(test_data)
        prediction_train = clf.predict(train_data)
        return prediction_test, prediction_train

    def rf_classifier(train_data, train_label, test_data, test_label):
        clf = RandomForestClassifier(n_estimators=8, random_state=5, max_depth=6,
                                     min_samples_split=2).fit(train_data, train_label)
        prediction_test = clf.predict(test_data)
        prediction_train = clf.predict(train_data)
        return prediction_test, prediction_train

    def xgb_classifier(train_data, train_label, test_data, test_label):
        # num_class is only meaningful for multi-class objectives, so it is dropped for this binary task
        clf = XGBClassifier(n_estimators=8, learning_rate=0.25, max_depth=20,
                            subsample=1, gamma=13, seed=1000).fit(train_data, train_label)
        prediction_test = clf.predict(test_data)
        prediction_train = clf.predict(train_data)
        return prediction_test, prediction_train

Each model is then evaluated with four-fold cross-validation; a sketch of the evaluation loop is shown below, followed by the results:
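StratifiedKFold is imported in step 2 but the splitting loop itself is not shown in the original post; a minimal sketch of how the four folds could be wired together for one model, reusing data_, label, metrics_result and xgb_classifier from above (the shuffling, the random_state and the final averaging are my assumptions):

    import numpy as np
    from sklearn.model_selection import StratifiedKFold

    # four stratified folds: each fold trains on 3/4 of the rows and tests on the rest
    skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=1)
    train_scores, test_scores = [], []
    for train_idx, test_idx in skf.split(data_, label):
        X_train, X_test = data_[train_idx], data_[test_idx]
        y_train, y_test = label.iloc[train_idx], label.iloc[test_idx]
        pred_test, pred_train = xgb_classifier(X_train, y_train, X_test, y_test)
        train_scores.append(metrics_result(y_train, pred_train))
        test_scores.append(metrics_result(y_test, pred_test))
    # fold-averaged accuracy, precision, recall and F1 on train and test
    print(np.mean(train_scores, axis=0), np.mean(test_scores, axis=0))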

| Model | accuracy (train / test) | precision (train / test) | recall (train / test) | f1_score (train / test) | auc (train / test) |
| --- | --- | --- | --- | --- | --- |
| Logistic Regression | 0.7909 / 0.7868 | 0.7352 / 0.7209 | 0.2638 / 0.2525 | 0.3883 / 0.3731 | 0.6132 / 0.6216 |
| Support Vector Machine | 0.778 / 0.7753 | 0.8144 / 0.8023 | 0.1524 / 0.1437 | 0.2568 / 0.2433 | 0.5694 / 0.5743 |
| Decision Tree | 0.8082 / 0.7735 | 0.766 / 0.6196 | 0.345 / 0.2704 | 0.4734 / 0.3755 | 0.6272 / 0.6138 |
| Random Forest | 0.8218 / 0.7768 | 0.8931 / 0.661 | 0.3321 / 0.2354 | 0.4839 / 0.3457 | 0.6642 / 0.6119 |
| XGBoost | 0.8138 / 0.7845 | 0.7897 / 0.6655 | 0.3543 / 0.2875 | 0.4889 / 0.4012 | 0.6610 / 0.6455 |
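A note on the auc column: the metrics.auc(true, predict) call that is commented out in step 1 would not produce these numbers, because metrics.auc expects FPR/TPR coordinate arrays rather than labels; metrics.roc_auc_score is the usual choice. A self-contained sketch with placeholder labels and scores (not data from this post):

    from sklearn import metrics

    # roc_auc_score takes true labels and scores, ideally predicted probabilities
    y_true = [0, 1, 1, 0, 1]
    y_score = [0.2, 0.8, 0.4, 0.3, 0.9]
    print(metrics.roc_auc_score(y_true, y_score))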

 
