skearn.metrics.confusion_matrix(
    y_true,   # array, Gound true (correct) target values
    y_pred,  # array, Estimated targets as returned by a classifier
    labels=None,  # array, List of labels to index the matrix.
    sample_weight=None  # array-like of shape = [n_samples], Optional sample weights
)

完整示例代码如下：


__author__ = "lingjun"
# E-mail: 1763469890@qq.com
 
import seaborn as sns
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
sns.set()
 
f, (ax1,ax2) = plt.subplots(figsize = (10, 8),nrows=2)
y_true = ["dog", "dog", "dog", "cat", "cat", "cat", "cat"]
y_pred = ["cat", "cat", "dog", "cat", "cat", "cat", "cat"]
C2= confusion_matrix(y_true, y_pred, labels=["dog", "cat"])
print(C2)
print(C2.ravel())
sns.heatmap(C2,annot=True)
 
ax2.set_title('sns_heatmap_confusion_matrix')
ax2.set_xlabel('Pred')
ax2.set_ylabel('True')
f.savefig('sns_heatmap_confusion_matrix.jpg', bbox_inches='tight')

保存的图像如下所示：

这个时候我们还是不知道skearn.metrics.confusion_matrix做了些什么，这个时候print(C2)，打印看下C2究竟里面包含着什么。最终的打印结果如下所示：


[[1 2]
 [0 4]]
[1 2 0 4]

解释下上面这几个数字的意思：


C2= confusion_matrix(y_true, y_pred, labels=["dog", "cat"])中的labels的顺序就分布是0、1，negative和positive
 
注：labels=[]可加可不加，不加情况下会自动识别，自己定义

在计算cat的混淆矩阵的时候，cat就是阳性，dog和其他，就是阴性，如下面这样：

cat为1-positive，其中真实值中cat有4个，4个被预测为cat，预测正确T，0个被预测为dog，预测错误F；
dog为0-negative，其中真实值中dog有3个，1个被预测为dog，预测正确T，2个被预测为cat，预测错误F。

定义：

TP：正确的预测为正例，也就是预测为正例，预测对了
TN：正确的预测为反例，也就是预测为反例，预测对了
FP：错误的预测为正例，也就是预测为正例，预测错了
FN：预测的预测为反例，也就是预测为反例，预测错了

所以：在分别以狗dog和猫cat为正例，预测错位为反例中，会分别得到如下两个混淆矩阵：


dog-1,其他为0:
y_true = ["1", "1", "1", "0", "0", "0", "0"]
y_pred = ["0", "0", "1", "0", "0", "0", "0"]
 
TP:1
TN:4
FP:0
FN:2
 
 
cat-1,其他为0:
y_true = ["0", "0", "0", "1", "1", "1", "1"]
y_pred = ["1", "1", "0", "1", "1", "1", "1"]
 
TP:4
TN:1
FP:2
FN:0

注意：混淆矩阵是评价某一模型预测结果好坏的方法，预测对与错的参照标准是标注结果。其中，需要对预测置信度进行阈值分割。

大于该阈值的，为预测阳性
小于该阈值的，为预测阴性

所以，确定该类的阈值是多少，很重要，直接决定了混淆矩阵的数值分布。其中，该阈值可根据ROC曲线进行确定，这块下文会详述，继续往后看。

从这里就可以看出，混淆矩阵的衡量是很片面的，依据混淆矩阵计算的精确率、召回率、准确率等等评价方法，也是很片面的。这就是他们的缺点，需要一个更加全面的评价指标的出现。

二、准确率(Accuracy)、精确率(Precision)、召回率(Recall)、F1score

有了上面的这些混淆矩阵的数值，就可以进行如下准确率、精确率、召回率、F1score等等评价指标的的计算工作了

2.1、准确率(Accuracy)

这三个指标里最直观的就是准确率: 模型判断正确的数据(TP+TN)占总数据的比例

"Accuracy: "+str(round((tp+tn)/(tp+fp+fn+tn), 3))

2.2、召回率(Recall)

针对数据集中的所有正例label(TP+FN)而言，模型正确判断出的正例(TP)占数据集中所有正例的比例；FN表示被模型误认为是负例但实际是正例的数据；

召回率也叫查全率，以物体检测为例，我们往往把图片中的物体作为正例，此时召回率高代表着模型可以找出图片中更多的物体!

"Recall: "+str(round((tp)/(tp+fn), 3))

2.3、精确率(Precision)

针对模型判断出的所有正例(TP+FP)而言，其中真正例(TP)占的比例。精确率也叫查准率，还是以物体检测为例，精确率高表示模型检测出的物体中大部分确实是物体，只有少量不是物体的对象被当成物体。

"Precision: "+str(round((tp)/(tp+fp), 3))

2.4、敏感度、特异度、假阳性率、阳性预测值、阴性预测值

还有，敏感度Sensitivity、特异度Specificity、假阳性率False positive rate、阳性预测值Positive predictive value、阴性预测值Negative predictive value，分别的计算方法如下所示：


("Sensitivity: "+str(round(tp/(tp+fn+0.01), 3)))
("Specificity: "+str(round(1-(fp/(fp+tn+0.01)), 3)))
("False positive rate: "+str(round(fp/(fp+tn+0.01), 3)))
("Positive predictive value: "+str(round(tp/(tp+fp+0.01), 3)))
("Negative predictive value: "+str(round(tn/(fn+tn+0.01), 3)))

其中：

敏感度=召回率，都是看label标记是阳性中，预测pd有多少真是阳性
特异度是看label标记是阴性中，预测pd有多少是真的阴性，这里的阴性可以是一大类。假设需要评估的类是马路上的人，那除人之外，其他类别均可以作为人相对应的阴性
在医学领域，敏感度更关注漏诊率（有病之人不能漏），特异度更关注误诊率（无病之人不能误）
假阳性率 = 1 - 特异度，假阳性越多，误诊越多
阳性预测值 = 精确率，是看预测为阳性中，有多少是真阳性
阴性预测值是看预测为阴性中，有多少是真阴性

2.5、F1score

要计算F1score，需要先计算精确率和召回率。其中：


Precision = tp/tp+fp
 
Recall = tp/tp+fn
 
进而计算得到：
 
F1score = 2 * Precision * Recall /（Precision + Recall）

那么，你有没有想过，F1score中，recall和Precision对其的影响是怎么样的。我们用如下代码，绘制出来看看。


import numpy as np
import matplotlib.pyplot as plt
 
fig = plt.figure()  #定义新的三维坐标轴
ax3 = plt.axes(projection='3d')
 
#定义三维数据
precision = np.arange(0.01, 1, 0.1)
recall = np.arange(0.01, 1, 0.1)
X, Y = np.meshgrid(precision, recall)   # 用两个坐标轴上的点在平面上画网格
Z = 2*X*Y/(X+Y)
 
# 作图
ax3.plot_surface(X, Y, Z, rstride = 1, cstride = 1, cmap='rainbow')
plt.xlabel('precision')
plt.ylabel('recall')
plt.title('F1 score')
plt.show()

数据分布图如下：

可以看出，精准度和recall，无论任何一个低，F1score都不会高，只有两个都高的时候，分数才会高，这也能够说明，为啥很多评价都是采用F1 score。

三、绘制ROC曲线，及计算以上评价参数

3.1、什么是ROC

在上文混淆矩阵时候，我们提到：混淆矩阵的绘制严重依赖一个固定的阈值，在确定该阈值的前提下，才能确定混淆矩阵的数值，这种对模型评价方式是面片，不全面的。

此时，迫切需要一中评价方式，能够更加全面的对模型进行评估，于是就出现的ROC曲线，如下所示：

其中：

横轴：False Positive Rate（假阳率，FPR）
纵轴：True Positive Rate（真阳率，TPR）

连接（0，0）和（1，1）绿色曲线上的任意一点，是在该阈值下，对应的混淆矩阵下的假阳性率和真阳性率。例如图中的（0.1，0.8），即该阈值 t 下，假阳性率为0.1，真阳性率为0.8。

一个简单示例：


import numpy as np
from sklearn.metrics import roc_curve, auc
y      = np.array([1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.7, 0.6, 0.4, 0.2, 0.1, 0.2, 0.9, 0.8, 0.65, 0.85, 0.67, 0.75, 0.74, 0.36, 0.85, 0.48, 0.95, 1, 0.65,
                   0.85, 0.75, 0.95, 0.84, 0.74, 0.58, 0.95])
fpr, tpr, thresholds = roc_curve(y, scores, pos_label=1)
print(fpr)
print(tpr)
print(thresholds)
 
roc_auc = auc(fpr, tpr)
 
plt.plot(fpr, tpr, lw=1, label="COVID vs NotCOVID, AUC=%0.3f)" % (roc_auc))
 
plt.xlim([0.00, 1.0])
plt.ylim([0.00, 1.0])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC")
plt.legend(loc="lower right")
plt.savefig(r"./ROC.png")
print("ok")

更多详尽的，参考sklearn官方文档即可。后续的几个画图，也是根据这里展开的。可以不看，自己构建即可。

3.2、绘制二分类ROC

如下为统计数据：

下面的代码，就是要读取上面的CSV文档，对该类别绘制ROC曲线。这里只是一个范例，你根据自己的数据情况进行针对性的修改即可。其中+表示阳性，-表示阴性


__author__ = "lingjun"
# E-mail: 1763469890@qq.com
 
from sklearn.metrics import roc_auc_score, confusion_matrix, roc_curve, auc
from matplotlib import pyplot as plt
import numpy as np
import torch
import csv
 
def confusion_matrix_roc(GT, PD, experiment, n_class):
    GT = GT.numpy()
    PD = PD.numpy()
 
    y_gt = np.argmax(GT, 1)
    y_gt = np.reshape(y_gt, [-1])
    y_pd = np.argmax(PD, 1)
    y_pd = np.reshape(y_pd, [-1])
 
    # ---- Confusion Matrix and Other Statistic Information ----
    if n_class > 2:
        c_matrix = confusion_matrix(y_gt, y_pd)
        # print("Confussion Matrix:\n", c_matrix)
        list_cfs_mtrx = c_matrix.tolist()
        # print("List", type(list_cfs_mtrx[0]))
 
        path_confusion = r"./records/" + experiment + "/confusion_matrix.txt"
        # np.savetxt(path_confusion, (c_matrix))
        np.savetxt(path_confusion, np.reshape(list_cfs_mtrx, -1), delimiter=',', fmt='%5s')
 
    if n_class == 2:
        list_cfs_mtrx = []
        tn, fp, fn, tp = confusion_matrix(y_gt, y_pd).ravel()
 
        list_cfs_mtrx.append("TN: " + str(tn))
        list_cfs_mtrx.append("FP: " + str(fp))
        list_cfs_mtrx.append("FN: " + str(fn))
        list_cfs_mtrx.append("TP: " + str(tp))
        list_cfs_mtrx.append(" ")
        list_cfs_mtrx.append("Accuracy: " + str(round((tp + tn) / (tp + fp + fn + tn), 3)))
        list_cfs_mtrx.append("Sensitivity: " + str(round(tp / (tp + fn + 0.01), 3)))
        list_cfs_mtrx.append("Specificity: " + str(round(1 - (fp / (fp + tn + 0.01)), 3)))
        list_cfs_mtrx.append("False positive rate: " + str(round(fp / (fp + tn + 0.01), 3)))
        list_cfs_mtrx.append("Positive predictive value: " + str(round(tp / (tp + fp + 0.01), 3)))
        list_cfs_mtrx.append("Negative predictive value: " + str(round(tn / (fn + tn + 0.01), 3)))
 
        path_confusion = r"./records/" + experiment + "/confusion_matrix.txt"
        np.savetxt(path_confusion, np.reshape(list_cfs_mtrx, -1), delimiter=',', fmt='%5s')
 
    # ---- ROC ----
    plt.figure(1)
    plt.figure(figsize=(6, 6))
 
    fpr, tpr, thresholds = roc_curve(GT[:, 1], PD[:, 1])
    roc_auc = auc(fpr, tpr)
 
    plt.plot(fpr, tpr, lw=1, label="positive vs negative, area=%0.3f)" % (roc_auc))
    # plt.plot(thresholds, tpr, lw=1, label='Thr%d area=%0.2f)' % (1, roc_auc))
    # plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Luck')
 
    plt.xlim([0.00, 1.0])
    plt.ylim([0.00, 1.0])
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.title("ROC")
    plt.legend(loc="lower right")
    plt.savefig(r"./records/" + experiment + "/ROC.png")
    print("ok")
 
def inference():
    GT = torch.FloatTensor()
    PD = torch.FloatTensor()
    file = r"Sensitive_rename_inform.csv"
    with open(file, 'r', encoding='UTF-8') as f:
        reader = csv.DictReader(f)
        for row in reader:
            # TODO
            max_patient_score = float(row['ai1'])
            doctor_gt = row['gt2']
 
            print(max_patient_score,doctor_gt)
 
            pd = [[max_patient_score, 1-max_patient_score]]
            output_pd = torch.FloatTensor(pd).to(device)
 
            if doctor_gt == "+":
                target = [[1.0, 0.0]]
            else:
                target = [[0.0, 1.0]]
            target = torch.FloatTensor(target)  # 类型转换, 将list转化为tensor, torch.FloatTensor([1,2])
            Target = torch.autograd.Variable(target).long().to(device)
 
            GT = torch.cat((GT, Target.float().cpu()), 0)  # 在行上进行堆叠
            PD = torch.cat((PD, output_pd.float().cpu()), 0)
 
    confusion_matrix_roc(GT, PD, "ROC", 2)
 
if __name__ == "__main__":
    inference()

若是表格里面有中文，则记得这里进行修改，否则报错

with open(file, 'r') as f:

四、更普遍的方法，绘制ROC

这是两个文件夹，存储的都是记录图像预测结果的txt文件，分别是：

tb表示的是positive样本预测的txt文件集，一个图像对应一个txt
nontb表示的是negative样本预测的txt文件集，一个图像对应一个txt
每一个txt文件记录的都是该图像预测的置信率（我这里是目标检测、分割问题，所以一个图像里面，可能会存在两个目标的问题）

如下是实现的代码，整体内容与上一个绘制ROC的方法差不多，只是读取数据的方式不同。


import csv
import numpy as np
import torch
import os
from matplotlib import pyplot as plt
from sklearn.metrics import roc_auc_score, confusion_matrix, roc_curve, auc
def confusion_matrix_roc(GT, PD, experiment, n_class):
    GT = GT.numpy()
    PD = PD.numpy()
 
    y_gt = np.argmax(GT, 1)
    y_gt = np.reshape(y_gt, [-1])
    y_pd = np.argmax(PD, 1)
    y_pd = np.reshape(y_pd, [-1])
 
    # ---- Confusion Matrix and Other Statistic Information ----
    if n_class > 2:
        c_matrix = confusion_matrix(y_gt, y_pd)
        # print("Confussion Matrix:\n", c_matrix)
        list_cfs_mtrx = c_matrix.tolist()
        # print("List", type(list_cfs_mtrx[0]))
 
        path_confusion = r"./records/" + experiment + "/confusion_matrix.txt"
        # np.savetxt(path_confusion, (c_matrix))
        np.savetxt(path_confusion, np.reshape(list_cfs_mtrx, -1), delimiter=',', fmt='%5s')
 
    if n_class == 2:
        list_cfs_mtrx = []
        tn, fp, fn, tp = confusion_matrix(y_gt, y_pd).ravel()
 
        list_cfs_mtrx.append("TN: " + str(tn))
        list_cfs_mtrx.append("FP: " + str(fp))
        list_cfs_mtrx.append("FN: " + str(fn))
        list_cfs_mtrx.append("TP: " + str(tp))
        list_cfs_mtrx.append(" ")
        list_cfs_mtrx.append("Accuracy: " + str(round((tp + tn) / (tp + fp + fn + tn), 3)))
        list_cfs_mtrx.append("Sensitivity: " + str(round(tp / (tp + fn + 0.01), 3)))
        list_cfs_mtrx.append("Specificity: " + str(round(1 - (fp / (fp + tn + 0.01)), 3)))
        list_cfs_mtrx.append("False positive rate: " + str(round(fp / (fp + tn + 0.01), 3)))
        list_cfs_mtrx.append("Positive predictive value: " + str(round(tp / (tp + fp + 0.01), 3)))
        list_cfs_mtrx.append("Negative predictive value: " + str(round(tn / (fn + tn + 0.01), 3)))
 
        path_confusion = r"./records/confusion_matrix.txt"
        np.savetxt(path_confusion, np.reshape(list_cfs_mtrx, -1), delimiter=',', fmt='%5s')
 
    # ---- ROC ----
    plt.figure(1)
    plt.figure(figsize=(6, 6))
 
    fpr, tpr, thresholds = roc_curve(GT[:, 1], PD[:, 1])
    roc_auc = auc(fpr, tpr)
 
    return fpr, tpr, roc_auc
 
    # plt.plot(fpr, tpr, lw=1, label="ATB vs NotTB, area=%0.3f)" % (roc_auc))
    # # plt.plot(thresholds, tpr, lw=1, label='Thr%d area=%0.2f)' % (1, roc_auc))
    # # plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Luck')
    #
    # plt.xlim([0.00, 1.0])
    # plt.ylim([0.00, 1.0])
    # plt.xlabel("False Positive Rate")
    # plt.ylabel("True Positive Rate")
    # plt.title("ROC")
    # plt.legend(loc="lower right")
    # plt.savefig(r"./records/" + experiment + "/ROC.png")
 
def draw_ROC(file):
    GT = torch.FloatTensor()
    PD = torch.FloatTensor()
 
    pred_tb_num = 0
 
    for (path, dirs, files) in os.walk(file):
        for filename in files:
            txt_path = os.path.join(path, filename)
            label_flag = txt_path.split("\\")[-2]
 
            size = os.path.getsize(txt_path)
            if size != 0:
                print('文件不是空的')
 
                list_socre = []
                with open(txt_path, "r") as f:
                    for line in f.readlines():
                        line = line.strip('\n')  # 去掉列表中每一个元素的换行符
                        list_socre.append(line)
 
                max_score=max(list_socre)
                print("max_score:",max_score)
 
                output = [[1-float(max_score), float(max_score)]]
                print("output=", output)
                output = torch.FloatTensor(output)  # 类型转换, 将list转化为tensor, torch.FloatTensor([1,2])
                # output_pd = torch.autograd.Variable(output).long().to('cpu')
                PD = torch.cat((PD, output.float().cpu()), 0)  # 在行上进行堆叠
 
                print(label_flag)
                if label_flag == "tb":
 
                    target = [[0.0, 1.0]]
 
                else:
                    target = [[1.0, 0.0]]
 
                print("target=", target)
                target = torch.FloatTensor(target)  # 类型转换, 将list转化为tensor, torch.FloatTensor([1,2])
                GT = torch.cat((GT, target.float().cpu()), 0)  # 在行上进行堆叠
 
            else:
                output = [[1.0, 0.0]]
                print("output=", output)
                output = torch.FloatTensor(output)  # 类型转换, 将list转化为tensor, torch.FloatTensor([1,2])
                # output_pd = torch.autograd.Variable(output).long().to('cpu')
                PD = torch.cat((PD, output.float().cpu()), 0)  # 在行上进行堆叠
 
                print(label_flag)
                if label_flag == "tb":
 
                    target = [[0.0, 1.0]]
 
                else:
                    target = [[1.0, 0.0]]
 
                print("target=", target)
                target = torch.FloatTensor(target)  # 类型转换, 将list转化为tensor, torch.FloatTensor([1,2])
                GT = torch.cat((GT, target.float().cpu()), 0)  # 在行上进行堆叠
 
    print(len(GT))
    return GT, PD
 
if __name__=='__main__':
    file = r"Z:\reslt_pd"
    GT_no, PD_no = draw_ROC(file)
    fpr_no, tpr_no, roc_auc_no = confusion_matrix_roc(GT_no, PD_no, "ROC", 2)
 
    plt.plot(fpr_no, tpr_no, lw=1, label="positive vs negative, area=%0.3f)" % (roc_auc_no))
    #plt.plot(fpr, tpr, lw=1, label="ATB vs NotTB ReduceFP, area=%0.3f)" % (roc_auc))
    # plt.plot(thresholds, tpr, lw=1, label='Thr%d area=%0.2f)' % (1, roc_auc))
    # plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Luck')
 
    plt.xlim([0.00, 1.0])
    plt.ylim([0.00, 1.0])
    plt.xlabel("1-specificity")
    plt.ylabel("sensitivity")
    plt.title("ROC")
    plt.legend(loc="lower right")
    plt.savefig(r"ROC.png")

五、多分类的ROC曲线

参考链接：ROC原理介绍及利用python实现二分类和多分类的ROC曲线_闰土不用叉的博客-CSDN博客_roc曲线python


# 引入必要的库
import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle
from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
from scipy import interp
 
# 加载数据
iris = datasets.load_iris()
X = iris.data
y = iris.target
# 将标签二值化
y = label_binarize(y, classes=[0, 1, 2])
# 设置种类
n_classes = y.shape[1]
 
# 训练模型并预测
random_state = np.random.RandomState(0)
n_samples, n_features = X.shape
 
# shuffle and split training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5,random_state=0)
 
# Learn to predict each class against the other
classifier = OneVsRestClassifier(svm.SVC(kernel='linear', probability=True,
                                 random_state=random_state))
y_score = classifier.fit(X_train, y_train).decision_function(X_test)
 
# 计算每一类的ROC
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
 
# Compute micro-average ROC curve and ROC area（方法二）
fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
 
# Compute macro-average ROC curve and ROC area（方法一）
# First aggregate all false positive rates
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))
# Then interpolate all ROC curves at this points
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += interp(all_fpr, fpr[i], tpr[i])
# Finally average it and compute AUC
mean_tpr /= n_classes
fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])
 
# Plot all ROC curves
lw=2
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["micro"]),
         color='deeppink', linestyle=':', linewidth=4)
 
plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)
 
colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label='ROC curve of class {0} (area = {1:0.2f})'
             ''.format(i, roc_auc[i]))
 
plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Some extension of Receiver operating characteristic to multi-class')
plt.legend(loc="lower right")
plt.show()

图像如下：

补充一个更普世的绘制方法，如下：


def readtxt(txtfile_path):
    list_info = []
    with open(txtfile_path, "r") as f:
        for line in f.readlines():
            line = line.strip('\n')  # 去掉列表中每一个元素的换行符
            list_info.append(float(line))
    return list_info
 
def plot_roc():
    fig = plt.figure(figsize=(6, 6))
 
    # base line
    fpr1 = readtxt(r'./roc/lidc_fpr.txt')
    tpr1 = readtxt(r'./roc/lidc_tpr.txt')
    print(fpr1)
    print(tpr1)
    roc_auc, auc_l, auc_h = 0.784, 0.724, 0.836
    plt.plot(fpr1, tpr1, lw=2, label='{} AUC={} (95% CI: {}-{})'.format('LIDC', roc_auc, auc_l, auc_h))
 
    #
    fpr = readtxt(r'./roc/a_fpr.txt')
    tpr = readtxt(r'./roc/a_tpr.txt')
    print(fpr)
    print(tpr)
    roc_auc, auc_l, auc_h = 0.762, 0.730, 0.792
    plt.plot(fpr, tpr, lw=2, label='{} AUC={} (95% CI: {}-{})'.format('a', roc_auc, auc_l, auc_h))
 
 
    plt.xlim([0.00, 1.0])
    plt.ylim([0.00, 1.0])
    plt.xlabel('1-Specificity')
    plt.ylabel('Sensitivity')
    # plt.title('ROC')
    plt.legend(loc="lower right")
    plt.plot([0,1], [0, 1], color='gray', linestyle='dashed')
    plt.show()
 
if __name__=='__main__':
    plot_roc()

展示出来的结果，大致如下：

六、FROC拓展

代码如下：


# coding=UTF-8
# 
from sklearn import metrics
import matplotlib.pylab as plt
 
GTlist = [1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
          0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
 
Problist = [0.99, 0.98, 0.97, 0.93, 0.85, 0.80, 0.79, 0.75, 0.70, 0.65,
            0.64, 0.63, 0.55, 0.54, 0.51, 0.49, 0.30, 0.2, 0.1, 0.09,
            0.1, 0.5, 0.6, 0.7, 0.8, 0.5, 0.2, 0.3, 0.2, 0.5]
 
# num of image
totalNumberOfImages = 2
numberOfDetectedLesions = sum(GTlist)
totalNumberOfCandidates = len(Problist)
 
fpr, tpr, thresholds = metrics.roc_curve(GTlist, Problist, pos_label=1)
 
# FROC
fps = fpr * (totalNumberOfCandidates - numberOfDetectedLesions) / totalNumberOfImages
sens = tpr
 
print(fps)
print(sens)
plt.plot(fps, sens, color='b', lw=2)
plt.legend(loc='lower right')
# plt.plot([0, 1], [0, 1], 'r--')
plt.xlim([fps.min(), fps.max()])
plt.ylim([0, 1.1])
plt.xlabel('Average number of false positives per scan')  # 横坐标是fpr
plt.ylabel('True Positive Rate')  # 纵坐标是tpr
plt.title('FROC performence')
plt.show()

展示结果如下：

七、MedCalc统计软件绘制ROC

MedCalc是一款医学专用的统计计算软件，在研究医学领域有较为广泛的应用，软件不大，而功能却很强大，用图形化的界面直观明了的显示所统计的结果，这里就简单介绍下medcalc的统计教程。

官方下载地址：(只有15天试用期。由于不能乱传播，仅用作学习使用。若想获得免费版本，评论备注信息，留下邮箱)Download MedCalc Version 20.106https://www.medcalc.org/download/下面已绘制ROC曲线为例，进行介绍，步骤如图所示：

绘制的结果如下：

这也是绘制ROC的一种方式，比较快捷。只要准备好需要的数据，既可以直接绘制。注意，这里统计预测分数时候，阈值一定要取的比较低，比如0.01。这样在绘制曲线时候，阈值的选择面才会大。

百度文档对这块进行了详述，更多内容去看这里：MedCalc常用统计教程https://jingyan.baidu.com/article/ca41422f219a641eae99edea.html

八、AUC值

AUC 是 ROC 曲线下面的面积，AUC 可以解读为从所有正例中随机选取一个样本 A，再从所有负例中随机选取一个样本 B，分类器将 A 判为正例的概率比将 B 判为正例的概率大的可能性。

也就是：任意取一个正样本和负样本，正样本得分大于负样本的概率。

AUC 反映的是分类器对样本的排序能力。AUC 越大，自然排序能力越好，即分类器将越多的正例排在负例之前。

AUC ＝ 1，代表完美分类器
0.5 < AUC < 1，优于随机分类器
0 < AUC < 0.5，差于随机分类器

AUC的公式：

问1：数据不平衡，对AUC有影响吗？

答1：数据不平衡对 auc 影响不大（ROC曲线下的面积，ROC的横纵坐标分别是真阳性率和1-真阴性率）。

问2：还有什么指标可以针对不平衡数据进行评估？

答2：还可以使用 PR(Precision-Recall )曲线。

九、mAP

更高的mAP，意味着模型的表现更优秀。实现代码部分，参考这里，大致分为三个步骤：

构建ground-truth files
预测生成detection-results files
python main.py

github mAPhttps://github.com/Cartucho/mAP下面的图，就是AP的绘制图，其中：

横轴是recall
数轴是precision
蓝色区域的面积，就是AP
多个类的平均值，就是mAP

十、总结

本篇对模型评估的方式做了一个汇总，同时对其中几个重要的、常用的方法进行了单独的文章介绍。同时，也提供了一些软件，可以帮助我们再不使用代码，或者少使用代码的情况下，绘制对应的图。

但是呢，究竟针对你自己的项目，需要使用哪些评估方法，还有根据你自己具体的项目进行选择，这个相信你学习透了，自己也能够想清楚应该选择哪个，并且知道了为什么。

最后，如果您觉得本篇文章对你有帮助，欢迎点赞，让更多人看到，这是对我继续写下去的鼓励。如果能再点击下方的红包打赏，给博主来一杯咖啡，那就太好了。

参考资料：

1.模型评估之混淆矩阵（confusion_matrix）含义及Python代码实现

2.混淆矩阵(Confusion matrix)的原理及使用(scikit-learn 和 tensorflow) - klchang - 博客园

3.FP,FN,TP,TN与精确率(Precision),召回率(Recall),准确率(Accuracy)_littlehaes的博客-CSDN博客_fp tphttps://blog.csdn.net/littlehaes/article/details/83278256

本文内容由网友自发贡献，转载请注明出处：https://www.wpsshop.cn/w/知新_RL/article/detail/151742