[Object Detection: YOLO] YOLOv5 v6.0 Network Architecture Explained (Part 2)

Reference: YOLOv5-5.0v-yaml parsing and model construction (Part 2), 星魂非梦's blog, CSDN

Previous article: YOLOv5-v6.0-yolov5s Network Architecture Explained (Part 1), 星魂非梦's blog, CSDN

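The goal of this article is to draw a more standard architecture diagram; the one in the previous article did not follow the usual conventions closely.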

1. Key updates in v6.0 compared with v5.0 (source: Releases · ultralytics/yolov5 · GitHub)

  • Roboflow Integration ⭐ NEW: Train YOLOv5 models directly on any Roboflow dataset with our new integration! (#4975 by @Jacobsolawetz)

  • YOLOv5n 'Nano' models ⭐ NEW: New smaller YOLOv5n (1.9M params) model below YOLOv5s (7.5M params), exports to 2.1 MB INT8 size, ideal for ultralight mobile solutions. (#5027 by @glenn-jocher)

  • TensorFlow and Keras Export: TensorFlow, Keras, TFLite, TF.js model export now fully integrated using python export.py --include saved_model pb tflite tfjs (#1127 by @zldrobit)

  • OpenCV DNN: YOLOv5 ONNX models are now compatible with both OpenCV DNN and ONNX Runtime (#4833 by @SamFC10).

  • Model Architecture: Updated backbones are slightly smaller, faster and more accurate.

      • Replacement of Focus() with an equivalent Conv(k=6, s=2, p=2) layer (#4825 by @thomasbi1) for improved exportability
      • New SPPF() replacement for SPP() layer for reduced ops (#4420 by @glenn-jocher)
      • Reduction in P3 backbone layer C3() repeats from 9 to 6 for improved speeds
      • Reorder places SPPF() at end of backbone
      • Reintroduction of shortcut in the last C3() backbone layer
      • Updated hyperparameters with increased mixup and copy-paste augmentation

This article focuses only on the Model Architecture changes.

2. Configuration file: models/yolov5s.yaml

    # YOLOv5 🚀 by Ultralytics, GPL-3.0 license
    
    # Parameters
    nc: 80  # number of classes
    depth_multiple: 0.33  # model depth multiple
    width_multiple: 0.50  # layer channel multiple
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32
    
    # YOLOv5 v6.0 backbone
    backbone:
      # [from, number, module, args]
      [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
       [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
       [-1, 3, C3, [128]],
       [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
       [-1, 6, C3, [256]],
       [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
       [-1, 9, C3, [512]],
       [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
       [-1, 3, C3, [1024]],
       [-1, 1, SPPF, [1024, 5]],  # 9
      ]
    
    # YOLOv5 v6.0 head
    head:
      [[-1, 1, Conv, [512, 1, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 6], 1, Concat, [1]],  # cat backbone P4
       [-1, 3, C3, [512, False]],  # 13
    
       [-1, 1, Conv, [256, 1, 1]],
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],
       [[-1, 4], 1, Concat, [1]],  # cat backbone P3
       [-1, 3, C3, [256, False]],  # 17 (P3/8-small)
    
       [-1, 1, Conv, [256, 3, 2]],
       [[-1, 14], 1, Concat, [1]],  # cat head P4
       [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)
    
       [-1, 1, Conv, [512, 3, 2]],
       [[-1, 10], 1, Concat, [1]],  # cat head P5
       [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)
    
       [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
      ]
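
As a rough illustration of how the parameters at the top of this file act on the layer list, the sketch below applies depth_multiple to the `number` column and width_multiple to the first entry of `args`. It assumes the scaling rule used by parse_model in models/yolo.py (repeat counts are rounded with a floor of 1, channels are rounded up to a multiple of 8). The helper name scale_layer is hypothetical, and only the four backbone C3 rows are shown.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_layer(number, channels, depth_multiple=0.33, width_multiple=0.50):
    """Hypothetical helper: scale one [from, number, module, args] row.

    Assumed rule: repeats are scaled only when number > 1, and never drop below 1.
    """
    repeats = max(round(number * depth_multiple), 1) if number > 1 else number
    out_channels = make_divisible(channels * width_multiple, 8)
    return repeats, out_channels


# (number, channels) of the four backbone C3 rows in the yaml above
backbone_c3 = [(3, 128), (6, 256), (9, 512), (3, 1024)]
scaled = [scale_layer(n, c) for n, c in backbone_c3]
print(scaled)  # [(1, 64), (2, 128), (3, 256), (1, 512)]
```

Under these assumptions, the 6 repeats listed for the P3 stage become 2 actual C3 bottlenecks in yolov5s, while the 9 repeats used by v5.0 would have produced 3.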
    

2.1 Replacement of Focus() with an equivalent Conv(k=6, s=2, p=2) layer (#4825 by @thomasbi1) for improved exportability
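
The equivalence of the two layers can be checked numerically. The sketch below is a pure-Python illustration, not the repository's code. It assumes that Focus() slices the input into four half-resolution maps (even/odd rows and columns), concatenates them, and applies a 3x3, stride-1, padding-1 convolution, as configured in the v5.0 yaml. The helper names conv2d and focus_slices are hypothetical, and a single input channel with integer weights is used so that the comparison is exact.

```python
import random


def conv2d(x, w, stride, pad):
    """Naive single-output-channel convolution (cross-correlation) with zero padding.

    x: [C][H][W] input, w: [C][K][K] kernel. Returns an [H_out][W_out] map.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K = len(w[0])
    h_out = (H + 2 * pad - K) // stride + 1
    w_out = (W + 2 * pad - K) // stride + 1
    out = [[0] * w_out for _ in range(h_out)]
    for i in range(h_out):
        for j in range(w_out):
            acc = 0
            for c in range(C):
                for u in range(K):
                    for v in range(K):
                        r, s = i * stride + u - pad, j * stride + v - pad
                        if 0 <= r < H and 0 <= s < W:  # padded cells contribute 0
                            acc += w[c][u][v] * x[c][r][s]
            out[i][j] = acc
    return out


def focus_slices(img):
    """Space-to-depth for one channel, in the assumed Focus() slice order."""
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (row offset, column offset)
    return [[row[dx::2] for row in img[dy::2]] for dy, dx in offsets], offsets


rng = random.Random(0)
img = [[rng.randint(-5, 5) for _ in range(8)] for _ in range(8)]
w3 = [[[rng.randint(-3, 3) for _ in range(3)] for _ in range(3)] for _ in range(4)]

# Path A: Focus-style slicing, then a 3x3 / stride 1 / pad 1 convolution.
slices, offsets = focus_slices(img)
out_focus = conv2d(slices, w3, stride=1, pad=1)

# Path B: a single 6x6 / stride 2 / pad 2 convolution with rearranged weights.
w6 = [[0] * 6 for _ in range(6)]
for c, (dy, dx) in enumerate(offsets):
    for u in range(3):
        for v in range(3):
            w6[2 * u + dy][2 * v + dx] = w3[c][u][v]
out_conv = conv2d([img], [w6], stride=2, pad=2)

print(out_focus == out_conv)  # True
```

Each 6x6 kernel position (2u+dy, 2v+dx) takes the weight that the 3x3 kernel applies at (u, v) to the slice with offset (dy, dx), so both paths read the same input pixels with the same weights.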

2.2 New SPPF() replacement for SPP() layer for reduced ops (#4420 by @glenn-jocher)
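
The reduction in operations comes from reusing pooling results. As a hedged sketch (pure Python, with a hypothetical max_pool helper), the following assumes that SPP() pools in parallel with kernel sizes 5, 9 and 13, and that SPPF() applies one 5x5 pool three times in sequence, all with stride 1 and padding k // 2. Under those assumptions the two produce identical pooled maps:

```python
import random


def max_pool(x, k):
    """Stride-1 max pooling with padding k // 2 on an [H][W] map.

    Cells outside the map are ignored, which matches padding with -inf.
    """
    H, W, r = len(x), len(x[0]), k // 2
    return [
        [
            max(
                x[a][b]
                for a in range(max(0, i - r), min(H, i + r + 1))
                for b in range(max(0, j - r), min(W, j + r + 1))
            )
            for j in range(W)
        ]
        for i in range(H)
    ]


rng = random.Random(0)
x = [[rng.random() for _ in range(16)] for _ in range(16)]

# SPP-style: three independent pools over the same input.
spp = [max_pool(x, 5), max_pool(x, 9), max_pool(x, 13)]

# SPPF-style: one 5x5 pool chained three times, keeping every intermediate map.
y1 = max_pool(x, 5)
y2 = max_pool(y1, 5)
y3 = max_pool(y2, 5)
sppf = [y1, y2, y3]

print(spp == sppf)  # True
```

Two chained 5x5 pools cover a 9x9 window and three cover a 13x13 window, so the sequential form needs only small windows while yielding the same feature maps for concatenation.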

2.3 Reduction in P3 backbone layer C3() repeats from 9 to 6 for improved speeds

3. Overall architecture diagrams

[Figure: YOLOv5 v5.0 architecture diagram]

[Figure: YOLOv5 v6.0 architecture diagram]

  • Reorder places SPPF() at end of backbone
  • Reintroduction of shortcut in the last C3() backbone layer
  • As the two diagrams show, v6.0 places SPPF() at the end of the backbone, and layer 8, a C3 block with a single repeat (C3_1), now uses a shortcut.
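
Since the layout is easier to follow with concrete shapes, here is a small sketch that traces the v6.0 backbone rows of models/yolov5s.yaml for a 640x640 input. It assumes width_multiple = 0.50, that Conv layers without an explicit padding use k // 2, and that C3 and SPPF keep the spatial size. The names conv_out and trace_backbone are illustrative only, and channel scaling is simplified to a plain multiplication because every value here is already a multiple of 8.

```python
def conv_out(size, k, s, p):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * p - k) // s + 1


# (module, yaml channels, (kernel, stride, padding) or None if size-preserving)
BACKBONE = [
    ("Conv", 64, (6, 2, 2)),    # 0-P1/2
    ("Conv", 128, (3, 2, 1)),   # 1-P2/4
    ("C3", 128, None),
    ("Conv", 256, (3, 2, 1)),   # 3-P3/8
    ("C3", 256, None),
    ("Conv", 512, (3, 2, 1)),   # 5-P4/16
    ("C3", 512, None),
    ("Conv", 1024, (3, 2, 1)),  # 7-P5/32
    ("C3", 1024, None),
    ("SPPF", 1024, None),       # 9
]


def trace_backbone(img_size=640, width_multiple=0.50):
    """Return (index, module, out_channels, feature_map_size) for each row."""
    rows, size = [], img_size
    for index, (module, channels, conv) in enumerate(BACKBONE):
        if conv is not None:
            size = conv_out(size, *conv)
        rows.append((index, module, int(channels * width_multiple), size))
    return rows


rows = trace_backbone()
for row in rows:
    print(row)
```

With these assumptions, layers 4 and 6 (the sources of the two backbone Concat inputs in the head) come out at 80x80 and 40x40, and SPPF sits last at index 9 on the 20x20 map.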

Supplementary note on data augmentation: v6.0 uses increased mixup and copy-paste augmentation.

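4. Inference

The architecture diagrams above show the model as it is used during training. At inference time, the Detect class in models/yolo.py concatenates the outputs of the three heads.

For an explanation, see: YOLOv5-5.0v-yaml parsing and model construction (Part 2), 星魂非梦's blog, CSDN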

                if not self.training:  # inference
                    if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic:
                        self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
    
                    y = x[i].sigmoid()
                    if self.inplace:
                        y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                        y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                    else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
                        xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                        wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                        y = torch.cat((xy, wh, y[..., 4:]), -1)
                    z.append(y.view(bs, -1, self.no))

Post-processing (such as non-maximum suppression) is then applied to the concatenated output.
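
To make the decoding arithmetic in the excerpt concrete, the sketch below applies the same xy and wh formulas to a single prediction and counts how many predictions the three heads contribute once concatenated. It is a plain-Python illustration with hypothetical helper names (decode_box, num_predictions); it assumes strides of 8, 16 and 32, three anchors per scale, and anchors expressed in pixels as in the yaml.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(raw_xywh, cell_xy, anchor_wh, stride):
    """Decode one raw (tx, ty, tw, th) prediction into pixel-space (cx, cy, w, h)."""
    sx, sy, sw, sh = (sigmoid(v) for v in raw_xywh)
    cx = (sx * 2.0 - 0.5 + cell_xy[0]) * stride  # same form as the "xy" line
    cy = (sy * 2.0 - 0.5 + cell_xy[1]) * stride
    w = (sw * 2.0) ** 2 * anchor_wh[0]           # same form as the "wh" line
    h = (sh * 2.0) ** 2 * anchor_wh[1]
    return cx, cy, w, h


def num_predictions(img_size, strides=(8, 16, 32), anchors_per_scale=3):
    """Rows in the concatenated output: anchors x grid cells, summed over scales."""
    return sum(anchors_per_scale * (img_size // s) ** 2 for s in strides)


# All-zero logits: sigmoid gives 0.5, so the box sits at the cell centre
# and its size equals the anchor size.
box = decode_box((0.0, 0.0, 0.0, 0.0), cell_xy=(10, 20), anchor_wh=(10, 13), stride=8)
print(box)                   # (84.0, 164.0, 10.0, 13.0)
print(num_predictions(640))  # 25200
```

For a 640x640 input and nc = 80, this suggests a concatenated tensor of shape (bs, 25200, 85), where 85 is the 4 box values plus objectness plus 80 class scores.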

Source: 理心炼丹

Compiled and shared by 物联沃 (IOTWORD)