This series documents my self-study of the YOLO family of object detection algorithms during my master's program, together with their code implementations. The implementations are adapted from the ultralytics YOLO source code on GitHub, with some parts of the source removed to suit my own research needs.
Building on the YOLOv5 implementation, this article completes the implementation of YOLOv7. Compared with YOLOv5, the main differences of YOLOv7 are as follows:
- Model structure: introduces a more efficient feature-extraction module (ELAN), a downsampling module (MP), a different spatial pooling layer (SPPCSPC), and re-parameterized convolution (RepConv)
- Positive sample matching: combines YOLOv5's positive sample matching with YOLOX's positive sample filtering method (SimOTA)
Articles in this series:
YOLOv7 Implementation (1): Model Construction
YOLOv7 Implementation (2): Positive Sample Matching (SimOTA) and Loss Computation
Positive sample matching in YOLOv7 builds on that of YOLOv5 and additionally filters the positives with SimOTA; the loss computation pipeline is shown in Figure 1.
YOLOv5's positive sample matching is described in the article YOLOv5 Implementation (4): Loss Computation. In YOLOv5's matching, on each feature map at most three prediction cells are matched to a target, according to where the target's center falls, and within each cell at most three anchors are matched, according to the width/height ratio between target and anchor. After YOLOv5's matching, a target therefore obtains at most 3 feature maps × 3 cells × 3 anchors = 27 matched samples.
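To make this counting rule concrete, here is a minimal sketch of the cell and anchor selection for a single target on one feature map. All tensor values are invented for illustration and are not from the original code:

```python
import torch

# Minimal sketch (values invented) of YOLOv5's matching rule for one target
# on a single 80x80 feature map.
g = 0.5           # neighbor-cell bias used by YOLOv5
anchor_t = 4.0    # wh-ratio threshold, i.e. hyp['anchor_t']

gxy = torch.tensor([12.3, 40.8])      # target center on the feature map
gxi = torch.tensor([80., 80.]) - gxy  # offset from the bottom-right corner

# The cell containing the center is always used; a horizontal and a vertical
# neighbor are added when the center lies in the nearer half of the cell.
j, k = (gxy % 1. < g) & (gxy > 1.)    # add left / up neighbor?
l, m = (gxi % 1. < g) & (gxi > 1.)    # add right / down neighbor?
n_cells = 1 + (bool(j) or bool(l)) + (bool(k) or bool(m))  # at most 3 cells

# Anchor filter: keep anchors whose w/h ratio to the target stays below anchor_t.
twh = torch.tensor([30., 60.])  # target w, h on this feature map
anchors = torch.tensor([[10., 13.], [33., 23.], [62., 45.]])
r = twh / anchors                                 # [3, 2] wh ratios
keep = torch.max(r, 1. / r).max(1)[0] < anchor_t  # [3] bool, at most 3 anchors

# Over 3 feature maps this yields at most 3 x 3 x 3 = 27 samples per target.
print(n_cells * int(keep.sum()))  # here: 3 cells x 2 anchors = 6 on this map
```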
The SimOTA positive sample filtering pipeline is as follows:
1. Compute the IoU loss between each target nt and its matched samples nt_n: loss_iou = -log(IoU(nt, nt_n) + 1e-8);
2. Compute the class cross-entropy loss between nt and its matched samples nt_n;
3. Derive nt's dynamic_k (the number of samples each nt keeps) from the sum of its top-10 IoUs, clamped to at least 1;
4. Use the cost matrix cost = loss_cls + 3 × loss_iou together with dynamic_k to filter the matched positives (see the sketch below).
An example of SimOTA positive sample filtering is shown in Figure 2.
The loss in YOLOv7 is computed in the same way as in YOLOv5 and consists of the following three parts:
- Box loss (positives only): CIoU loss between the decoded predicted boxes and their matched targets;
- Classification loss (positives only): BCE loss over the class scores;
- Objectness loss (all samples): BCE loss, with positive targets set to the predicted-box IoU and negative targets set to 0.
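Putting the parts together, the total loss implemented by ComputeLossOTA below can be summarized as follows, where the λ symbols stand for the box/obj/cls hyperparameter gains and w_l is the per-layer objectness balance weight:

```latex
L = \lambda_{box}\sum_{l}\frac{1}{N_l}\sum_{i\in pos_l}\bigl(1-\mathrm{CIoU}(b_i,\hat{b}_i)\bigr)
  + \lambda_{obj}\sum_{l} w_l\,\mathrm{BCE}\bigl(p_{obj}^{(l)},\,t_{obj}^{(l)}\bigr)
  + \lambda_{cls}\sum_{l}\mathrm{BCE}\bigl(p_{cls}^{(l)},\,t_{cls}^{(l)}\bigr)
```

Here N_l is the number of positives on layer l, and t_obj is the clamped CIoU for positive cells and 0 elsewhere.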
YOLOv5 matching method (find_3_positive):
```python
def find_3_positive(self, p, targets):
    # Build targets for compute_loss().
    # targets: [num_gt, (image_index, class, x, y, w, h)], normalized coordinates
    # p: [num_feature_map, bs, na, y, x, (x, y, w, h, obj, classes)]
    # na: number of anchors per feature map; nt: number of targets in the batch
    na, nt = self.na, targets.shape[0]  # number of anchors, targets
    indices, anch = [], []
    # gain maps the normalized xywh in targets onto the current feature-map scale
    # layout: image_index + class + xywh + anchor_index
    gain = torch.ones(7, device=targets.device).long()
    ai = torch.arange(na, device=targets.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)
    # targets: [na, num_gt, (image_index, class, x, y, w, h, anchor_index)]
    # every anchor of a feature map takes part in the matching
    targets = torch.cat((targets.repeat(na, 1, 1), ai[:, :, None]), 2)  # append anchor indices

    g = 0.5  # bias for selecting neighboring grid cells
    off = torch.tensor([[0, 0],
                        [1, 0], [0, 1], [-1, 0], [0, -1],  # j,k,l,m
                        # [1, 1], [1, -1], [-1, 1], [-1, -1],  # jk,jm,lk,lm
                        ], device=targets.device).float() * g  # offsets

    # match positives on every feature-map scale
    for i in range(self.nl):
        anchors = self.anchors[i]  # anchor sizes on the current feature map (absolute)
        # xyxy gain: converts the normalized (image_index, class, x, y, w, h, anchor_index)
        # in targets into absolute coordinates on the current feature map
        gain[2:6] = torch.tensor(p[i].shape)[[3, 2, 3, 2]]  # xyxy gain
        t = targets * gain
        if nt:
            # filter anchors by the wh ratio between targets and anchors
            r = t[:, :, 4:6] / anchors[:, None]  # wh ratio
            # torch.max(r, 1. / r).max(2) -> returns (values, indices)
            j = torch.max(r, 1. / r).max(2)[0] < self.hyp['anchor_t']  # compare
            # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t']  # iou(3,n)=wh_iou(anchors(3,2), gwh(n,2))
            t = t[j]  # keep the positives that pass the wh-ratio filter

            # Offsets
            gxy = t[:, 2:4]  # center offset from the top-left corner (selects left/up cells)
            gxi = gain[[2, 3]] - gxy  # center offset from the bottom-right corner (selects right/down cells)
            j, k = ((gxy % 1. < g) & (gxy > 1.)).T
            l, m = ((gxi % 1. < g) & (gxi > 1.)).T
            j = torch.stack((torch.ones_like(j), j, k, l, m))
            # replicate t five times and keep the samples selected by j
            t = t.repeat((5, 1, 1))[j]
            # [0, 0], [1, 0], [0, 1], [-1, 0], [0, -1]
            # build the offsets for all kept positives
            offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]
        else:
            t = targets[0]
            offsets = 0

        # Define
        b, c = t[:, :2].long().T  # image indices, class
        gxy = t[:, 2:4]  # grid xy, absolute coordinates on the feature map
        gwh = t[:, 4:6]  # grid wh
        gij = (gxy - offsets).long()  # subtract the offsets to get the matched grid cells
        gi, gj = gij.T  # grid xy indices

        # Append
        a = t[:, 6].long()  # anchor indices
        # image indices, anchor indices, gj, gi
        indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))
        # anchor sizes of the positives, absolute on the current feature map
        anch.append(anchors[a])
    return indices, anch
```
SimOTA positive sample filtering:
```python
def build_targets(self, p, targets, imgs):
    '''
    :param p: [num_feature_map, bs, na, y, x, (x, y, w, h, obj, num_class)] regression outputs
    :param targets: [num_gt, (image_index, class, x, y, w, h)], normalized coordinates
    :param imgs: [num_img, 3, y, x]
    '''
    '''
    1. Pick the positive cells (gj, gi) from each target's center x, y, and pick
       the matching anchors in each cell from the wh ratio between target and anchors.
       indices: per-feature-map list of (image indices, anchor indices, gj, gi)
       anch:    per-feature-map list of anchor sizes (absolute on that feature map)
    '''
    indices, anch = self.find_3_positive(p, targets)
    device = torch.device(targets.device)
    '''
    2. Filter the matched positives further with the OTA algorithm.
       For each image: nt ground truths, n_gt matched candidate positives.
       a. compute the IoU matrix [nt, n_gt] between each nt and each candidate,
          then the IoU loss [nt, n_gt]
       b. compute the class loss matrix [nt, n_gt]
       c. derive dynamic_k from the IoU sums (how many candidates each nt keeps)
       d. compute the cost matrix (loss_iou + a * cls_loss)
       e. from the cost matrix and dynamic_k, decide the feature_map, gj, gi and
          anchor of the positives finally matched to each nt
    '''
    matching_bs = [[] for pp in p]       # image indices
    matching_as = [[] for pp in p]       # anchor indices
    matching_gjs = [[] for pp in p]      # gj
    matching_gis = [[] for pp in p]      # gi
    matching_targets = [[] for pp in p]  # matched targets
    matching_anchs = [[] for pp in p]    # matched anchor sizes

    nl = len(p)  # number of output feature maps
    # match positives image by image
    for batch_idx in range(p[0].shape[0]):
        b_idx = targets[:, 0] == batch_idx
        this_target = targets[b_idx]  # ground truths of the current image
        if this_target.shape[0] == 0:
            continue

        # absolute (x, y, w, h) on the original image -> (xmin, ymin, xmax, ymax)
        txywh = this_target[:, 2:6] * imgs[batch_idx].shape[1]
        txyxy = xywh2xyxy(txywh)

        pxyxys = []            # predicted boxes
        p_cls = []             # predicted class scores
        p_obj = []             # predicted objectness
        from_which_layer = []  # which feature map each prediction comes from
        all_b = []             # image indices (all feature maps)
        all_a = []             # anchor indices (all feature maps)
        all_gj = []            # gj (all feature maps)
        all_gi = []            # gi (all feature maps)
        all_anch = []          # anchor sizes (all feature maps)

        # gather the first-round positives of every feature map for the OTA cost
        for i, pi in enumerate(p):
            b, a, gj, gi = indices[i]  # image indices, anchor indices, gj, gi
            idx = (b == batch_idx)  # first-round positives belonging to this image
            b, a, gj, gi = b[idx], a[idx], gj[idx], gi[idx]
            all_b.append(b)                # matched image indices on feature map i
            all_a.append(a)                # matched anchor indices on feature map i
            all_gj.append(gj)              # matched gj on feature map i
            all_gi.append(gi)              # matched gi on feature map i
            all_anch.append(anch[i][idx])  # matched anchor sizes (absolute on feature map i)
            from_which_layer.append((torch.ones(size=(len(b),)) * i).to(device))  # source feature map

            fg_pred = pi[b, a, gj, gi]     # predictions of the positives (x, y, w, h, obj, cls)
            p_obj.append(fg_pred[:, 4:5])  # predicted objectness
            p_cls.append(fg_pred[:, 5:])   # predicted class scores

            grid = torch.stack([gi, gj], dim=1)
            # decode the predicted (x, y) into absolute coordinates on the original image
            pxy = (fg_pred[:, :2].sigmoid() * 2. - 0.5 + grid) * self.stride[i]  # / 8.
            # pxy = (fg_pred[:, :2].sigmoid() * 3. - 1. + grid) * self.stride[i]
            # decode the predicted (w, h) into absolute sizes on the original image
            pwh = (fg_pred[:, 2:4].sigmoid() * 2) ** 2 * anch[i][idx] * self.stride[i]  # / 8.
            # absolute (x, y, w, h) on the original image -> (xmin, ymin, xmax, ymax)
            pxywh = torch.cat([pxy, pwh], dim=-1)
            pxyxy = xywh2xyxy(pxywh)
            pxyxys.append(pxyxy)

        pxyxys = torch.cat(pxyxys, dim=0)  # predicted xyxy, absolute on the original image
        if pxyxys.shape[0] == 0:
            continue
        p_obj = torch.cat(p_obj, dim=0)                          # predicted objectness
        p_cls = torch.cat(p_cls, dim=0)                          # predicted class scores
        from_which_layer = torch.cat(from_which_layer, dim=0)    # source feature map
        all_b = torch.cat(all_b, dim=0)                          # source image in the batch
        all_a = torch.cat(all_a, dim=0)                          # source anchor
        all_gj = torch.cat(all_gj, dim=0)                        # source gj
        all_gi = torch.cat(all_gi, dim=0)                        # source gi
        all_anch = torch.cat(all_anch, dim=0)                    # anchor sizes (absolute on the feature maps)

        # IoU between pxyxy and txyxy (both absolute on the original image)
        # txyxy: [nt, 4], pxyxys: [np, 4] -> pair_wise_iou: [nt, np]
        pair_wise_iou = box_iou(txyxy, pxyxys)
        pair_wise_iou_loss = -torch.log(pair_wise_iou + 1e-8)  # IoU loss

        # take the (at most) 10 largest IoUs per target
        top_k, _ = torch.topk(pair_wise_iou, min(10, pair_wise_iou.shape[1]), dim=1)
        # dynamic_ks from the IoU sums (positives each target keeps, at least 1)
        dynamic_ks = torch.clamp(top_k.sum(1).int(), min=1)

        # one-hot encode the class labels and expand them to the candidate count
        gt_cls_per_image = (
            F.one_hot(this_target[:, 1].to(torch.int64), self.nc)  # [nt, nc]
            .float()
            .unsqueeze(1)                   # [nt, 1, nc]
            .repeat(1, pxyxys.shape[0], 1)  # [nt, np, nc]
        )

        # expand the predicted score (class score x objectness) per target
        num_gt = this_target.shape[0]
        cls_preds_ = (
            p_cls.float().unsqueeze(0).repeat(num_gt, 1, 1).sigmoid_()
            * p_obj.unsqueeze(0).repeat(num_gt, 1, 1).sigmoid_()
        )
        y = cls_preds_.sqrt_()
        pair_wise_cls_loss = F.binary_cross_entropy_with_logits(
            torch.log(y / (1 - y)), gt_cls_per_image, reduction="none"
        ).sum(-1)  # class loss
        del cls_preds_

        cost = (
            pair_wise_cls_loss
            + 3.0 * pair_wise_iou_loss
        )

        matching_matrix = torch.zeros_like(cost, device=device)
        # pick each target's positives from the cost and dynamic_k
        for gt_idx in range(num_gt):
            _, pos_idx = torch.topk(
                cost[gt_idx], k=dynamic_ks[gt_idx].item(), largest=False
            )
            matching_matrix[gt_idx][pos_idx] = 1.0
        del top_k, dynamic_ks

        # if a candidate is matched to several targets, keep only the cheapest match
        anchor_matching_gt = matching_matrix.sum(0)
        if (anchor_matching_gt > 1).sum() > 0:
            _, cost_argmin = torch.min(cost[:, anchor_matching_gt > 1], dim=0)
            matching_matrix[:, anchor_matching_gt > 1] *= 0.0
            matching_matrix[cost_argmin, anchor_matching_gt > 1] = 1.0
        fg_mask_inboxes = (matching_matrix.sum(0) > 0.0).to(device)       # keep the matched positives
        matched_gt_inds = matching_matrix[:, fg_mask_inboxes].argmax(0)   # target index of each kept positive

        # keep only the positives the OTA step selected
        from_which_layer = from_which_layer[fg_mask_inboxes]
        all_b = all_b[fg_mask_inboxes]
        all_a = all_a[fg_mask_inboxes]
        all_gj = all_gj[fg_mask_inboxes]
        all_gi = all_gi[fg_mask_inboxes]
        all_anch = all_anch[fg_mask_inboxes]
        this_target = this_target[matched_gt_inds]

        # split the results back per feature map
        for i in range(nl):
            layer_idx = from_which_layer == i
            matching_bs[i].append(all_b[layer_idx])
            matching_as[i].append(all_a[layer_idx])
            matching_gjs[i].append(all_gj[layer_idx])
            matching_gis[i].append(all_gi[layer_idx])
            matching_targets[i].append(this_target[layer_idx])
            matching_anchs[i].append(all_anch[layer_idx])

    # merge the matched positives of all images
    for i in range(nl):
        if matching_targets[i] != []:
            matching_bs[i] = torch.cat(matching_bs[i], dim=0)
            matching_as[i] = torch.cat(matching_as[i], dim=0)
            matching_gjs[i] = torch.cat(matching_gjs[i], dim=0)
            matching_gis[i] = torch.cat(matching_gis[i], dim=0)
            matching_targets[i] = torch.cat(matching_targets[i], dim=0)
            matching_anchs[i] = torch.cat(matching_anchs[i], dim=0)
        else:
            # use the targets' device rather than hard-coding 'cuda:0'
            matching_bs[i] = torch.tensor([], device=device, dtype=torch.int64)
            matching_as[i] = torch.tensor([], device=device, dtype=torch.int64)
            matching_gjs[i] = torch.tensor([], device=device, dtype=torch.int64)
            matching_gis[i] = torch.tensor([], device=device, dtype=torch.int64)
            matching_targets[i] = torch.tensor([], device=device, dtype=torch.int64)
            matching_anchs[i] = torch.tensor([], device=device, dtype=torch.int64)

    return matching_bs, matching_as, matching_gjs, matching_gis, matching_targets, matching_anchs
```
```python
class ComputeLossOTA:
    # Compute losses
    def __init__(self, model, autobalance=False):
        super(ComputeLossOTA, self).__init__()
        device = next(model.parameters()).device  # get model device
        h = model.hyp  # hyperparameters

        # Define criteria
        BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))
        BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device))

        # Class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3
        self.cp, self.cn = smooth_BCE(eps=h.get('label_smoothing', 0.0))  # positive, negative BCE targets

        # Focal loss
        g = h['fl_gamma']  # focal loss gamma
        if g > 0:
            BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)

        m = model.model[-1]  # Detect() module
        self.balance = {3: [4.0, 1.0, 0.4]}.get(m.nl, [4.0, 1.0, 0.25, 0.06, .02])  # P3-P7
        self.ssi = list(m.stride).index(16) if autobalance else 0  # stride 16 index
        self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, 1.0, h, autobalance
        self.na = m.na            # number of anchors
        self.nc = m.nc            # number of classes
        self.nl = m.nl            # number of output feature maps
        self.anchors = m.anchors  # anchors [3, 3, 2], scaled to feature-map size
        self.stride = m.stride    # stride of each output feature map w.r.t. the input
        self.device = device      # device holding the tensors

    def __call__(self, p, targets, imgs):  # predictions, targets, images
        '''
        Match positives and compute the losses.
        :param p: [num_feature_map, batch_size, num_anchors, y, x, (x + y + w + h + obj + num_class)]
        :param targets: [num_gt, (image indices, classes, x, y, w, h)]
        :param imgs: [num_img, 3, y, x]
        '''
        device = targets.device
        # classification, box and objectness losses
        lcls, lbox, lobj = torch.zeros(1, device=device), torch.zeros(1, device=device), torch.zeros(1, device=device)
        '''
        Positive sample matching:
        1. pick the positive cells (gj, gi) from each target's center x, y, and the
           matching anchors in each cell from the wh ratio:
           input [nt, 6] -> output [nt*cell_num*anchor_num, 6];
        2. filter those positives further with the Optimal Transport Assignment (OTA) cost.
        bs: image indices of the positives; as_: anchor indices of the positives;
        gjs, gis: gj, gi of the cells predicting the positives;
        targets: the matched targets (image indices, class, x, y, w, h), normalized;
        anchors: anchor sizes of the positives (absolute on the feature maps)
        '''
        bs, as_, gjs, gis, targets, anchors = self.build_targets(p, targets, imgs)
        # x, y, w, h gains of the predictions (feature-map scale)
        pre_gen_gains = [torch.tensor(pp.shape, device=device)[[3, 2, 3, 2]] for pp in p]

        # compute the losses from the matched positives
        for i, pi in enumerate(p):  # layer index, layer predictions
            b, a, gj, gi = bs[i], as_[i], gjs[i], gis[i]  # image, anchor, gridy, gridx
            tobj = torch.zeros_like(pi[..., 0], device=device)  # target obj

            n = b.shape[0]  # number of matched positives
            if n:
                ps = pi[b, a, gj, gi]  # predictions (x, y, w, h, obj, classes)

                # decode the predictions
                grid = torch.stack([gi, gj], dim=1)
                pxy = ps[:, :2].sigmoid() * 2. - 0.5
                # pxy = ps[:, :2].sigmoid() * 3. - 1.
                pwh = (ps[:, 2:4].sigmoid() * 2) ** 2 * anchors[i]
                pbox = torch.cat((pxy, pwh), 1)  # predicted box (feature-map scale)
                selected_tbox = targets[i][:, 2:6] * pre_gen_gains[i]  # to feature-map scale
                selected_tbox[:, :2] -= grid
                iou = bbox_iou(pbox, selected_tbox, CIoU=True).squeeze()  # iou(prediction, target)
                lbox += (1.0 - iou).mean()  # iou loss

                # objectness target (IoU for positives, 0 for negatives)
                tobj[b, a, gj, gi] = (1.0 - self.gr) + self.gr * iou.detach().clamp(0).type(tobj.dtype)  # iou ratio

                # class targets
                selected_tcls = targets[i][:, 1].long()
                if self.nc > 1:  # classification loss (multi-class only, positives only)
                    t = torch.full_like(ps[:, 5:], self.cn, device=device)  # negatives: cn
                    t[range(n), selected_tcls] = self.cp  # positives: cp
                    lcls += self.BCEcls(ps[:, 5:], t)  # BCE

                # Append targets to text file
                # with open('targets.txt', 'a') as file:
                #     [file.write('%11.5g ' * 4 % tuple(x) + '\n') for x in torch.cat((txy[i], twh[i]), 1)]

            obji = self.BCEobj(pi[..., 4], tobj)
            lobj += obji * self.balance[i]  # obj loss
            if self.autobalance:
                self.balance[i] = self.balance[i] * 0.9999 + 0.0001 / obji.detach().item()

        if self.autobalance:
            self.balance = [x / self.balance[self.ssi] for x in self.balance]
        lbox *= self.hyp['box']
        lobj *= self.hyp['obj']
        lcls *= self.hyp['cls']
        bs = tobj.shape[0]  # batch size

        loss = lbox + lobj + lcls
        # return loss * bs, torch.cat((lbox, lobj, lcls, loss)).detach()
        return {"box_loss": lbox, "obj_loss": lobj, "class_loss": lcls}
```
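A hypothetical usage sketch follows. The names model, imgs and targets are assumptions following the training-loop conventions of this series, not definitions from this article:

```python
# Hypothetical training-step sketch; assumes a model whose last module is
# Detect() and whose hyp dict carries the gains used by ComputeLossOTA.
compute_loss = ComputeLossOTA(model)
preds = model(imgs)  # per-layer predictions [nl, bs, na, y, x, 5 + nc]
losses = compute_loss(preds, targets, imgs)
loss = losses["box_loss"] + losses["obj_loss"] + losses["class_loss"]
loss.backward()
```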