Training YOLOv7 on Your Own Dataset
0. Preface
This is my first tutorial as a deep-learning beginner; if anything here is wrong, corrections from more experienced readers are welcome.
Training YOLOv7 on a custom dataset is actually not very different from doing the same with YOLOv5.
1. Environment Setup
Code: https://github.com/WongKinYiu/yolov7
Official requirements:
matplotlib>=3.2.2
numpy>=1.18.5
opencv-python>=4.1.1
Pillow>=7.1.2
PyYAML>=5.3.1
requests>=2.23.0
scipy>=1.4.1
torch>=1.7.0 (version 1.12.0 is not supported)
torchvision>=0.8.1 (version 0.13.0 is not supported)
tqdm>=4.41.0
protobuf<4.21.3
Download the source with git clone https://github.com/WongKinYiu/yolov7.git, cd into the source directory, and install the dependencies with pip install -r requirements.txt.
Note: if you cannot reach GitHub, you can create a repository on Gitee, import the GitHub repository into it, and then git clone the Gitee repository instead.
Download the pretrained model (.pt file) you need from the official GitHub repository.
I have also uploaded the six pretrained models to Baidu Cloud, so you can download them from there:
Baidu Cloud link: https://pan.baidu.com/s/162TROQhV95EC9eQfxCB9uQ
Extraction code: mkka
2. Preparing Your Own Dataset
2.1 Create the dataset directory
The dataset is structured as follows:
dataset
├── Annotations  // annotated XML files
├── images       // image files
└── ImageSets    // generated .txt files
2.2 Generate the txt files
Create a makeTXT.py file with the code below, changing all paths to your own dataset's paths (absolute or relative paths both work).
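The directory skeleton above can be created with a short script; the root path here is illustrative, so replace it with your own dataset location:

```python
from pathlib import Path

# Illustrative dataset root; point this at your own path
root = Path("dataset")
for sub in ("Annotations", "images", "ImageSets"):
    (root / sub).mkdir(parents=True, exist_ok=True)
```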
import os
import random

# Split ratios: 90% trainval (of which 90% train / 10% val), 10% test
trainval_percent = 0.9
train_percent = 0.9
xmlfilepath = 'E:/Study/DeepLearn/dataset/Annotations'
txtsavepath = 'E:/Study/DeepLearn/dataset/ImageSets'

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(txtsavepath + '/trainval.txt', 'w')
ftest = open(txtsavepath + '/test.txt', 'w')
ftrain = open(txtsavepath + '/train.txt', 'w')
fval = open(txtsavepath + '/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
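The split arithmetic is worth spelling out: with both ratios at 0.9, the script first takes 90% of the files as trainval, then 90% of those as train. A minimal sketch of the same arithmetic, assuming 100 annotation files for illustration:

```python
num = 100                          # assume 100 annotation files
trainval_percent = 0.9
train_percent = 0.9
tv = int(num * trainval_percent)   # trainval size: 90
tr = int(tv * train_percent)       # train size: 81
# prints trainval, train, val, test counts: 90 81 9 10
print(tv, tr, tv - tr, num - tv)
```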
2.3 Convert the dataset to YOLO format
Create a voc_label.py file with the code below, changing the paths to your dataset's paths (absolute or relative paths both work). Running it generates a labels folder and three files, train.txt, val.txt, and test.txt, in the dataset root.
import os
import xml.etree.ElementTree as ET

sets = ['train', 'val', 'test']
classes = ['car', 'airplane']  # replace with your own class names

def convert(size, box):
    # Convert a VOC box (xmin, xmax, ymin, ymax) in pixels to a
    # normalized YOLO box (x_center, y_center, width, height)
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    return x * dw, y * dh, w * dw, h * dh

def convert_annotation(image_id):
    in_file = open('E:/Study/DeepLearn/dataset/Annotations/%s.xml' % image_id, encoding='UTF-8')
    out_file = open('E:/Study/DeepLearn/dataset/labels/%s.txt' % image_id, 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b
        # Clip boxes that extend past the image border
        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join(str(a) for a in bb) + '\n')

if not os.path.exists('E:/Study/DeepLearn/dataset/labels/'):
    os.makedirs('E:/Study/DeepLearn/dataset/labels/')
for image_set in sets:
    image_ids = open('E:/Study/DeepLearn/dataset/ImageSets/%s.txt' % image_set).read().strip().split()
    list_file = open('E:/Study/DeepLearn/dataset/%s.txt' % image_set, 'w')
    for image_id in image_ids:
        list_file.write('E:/Study/DeepLearn/dataset/images/%s.jpg\n' % image_id)
        convert_annotation(image_id)
    list_file.close()
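To make the convert() math concrete, here is the same formula applied to one box; the image size and coordinates are made up for illustration:

```python
# Same math as convert() above: a VOC box (xmin, xmax, ymin, ymax) in pixels
# becomes a YOLO box (x_center, y_center, width, height) normalized to [0, 1]
def voc_to_yolo(size, box):
    w_img, h_img = size
    xmin, xmax, ymin, ymax = box
    x = ((xmin + xmax) / 2.0 - 1) / w_img  # normalized center x
    y = ((ymin + ymax) / 2.0 - 1) / h_img  # normalized center y
    w = (xmax - xmin) / w_img              # normalized width
    h = (ymax - ymin) / h_img              # normalized height
    return x, y, w, h

# A 640x480 image with a box spanning x in [100, 300] and y in [120, 260]
x, y, w, h = voc_to_yolo((640, 480), (100, 300, 120, 260))
print(x, y, w, h)
```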
3. Modifying the Files
3.1 Dataset yaml configuration
In the data folder of the source directory, create a new mydata.yaml file. Copy the contents of the official coco.yaml into it and modify them; the result should look like this:
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
# paths to the three txt files created by voc_label.py
train: E:/Study/DeepLearn/dataset/train.txt
val: E:/Study/DeepLearn/dataset/val.txt
test: E:/Study/DeepLearn/dataset/test.txt
# number of classes
nc: 2
# class names
names: ['car','airplane']
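A mismatch between nc and the length of names is a common source of training errors, so it is worth a quick sanity check. A sketch using PyYAML (already listed in the requirements), with the yaml content inlined for illustration:

```python
import yaml  # PyYAML, already in requirements.txt

cfg = yaml.safe_load("""
train: E:/Study/DeepLearn/dataset/train.txt
val: E:/Study/DeepLearn/dataset/val.txt
test: E:/Study/DeepLearn/dataset/test.txt
nc: 2
names: ['car', 'airplane']
""")
assert cfg['nc'] == len(cfg['names']), "nc must equal the number of class names"
print(cfg['names'])
```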
3.2 Modify the network configuration file
In the cfg/training folder of the source directory, edit the file corresponding to the model you want to use and change the number of classes.
3.3 Modify train.py
Open train.py; usually you only need to change:
--weights (the pretrained model you want to use, with its path)
--cfg (set default= to the path of the network configuration file you edited in step 3.2)
--data (the path of the dataset yaml file you created in step 3.1)
--epochs (the number of training epochs)
--batch-size (batch size, the number of samples per training step; my RTX 2080 with 8 GB of VRAM cannot handle a batch size of 16)
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='yolov7.pt', help='initial weights path')
parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/coco.yaml', help='data.yaml path')
parser.add_argument('--hyp', type=str, default='data/hyp.scratch.p5.yaml', help='hyperparameters path')
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs')
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
parser.add_argument('--rect', action='store_true', help='rectangular training')
parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
parser.add_argument('--notest', action='store_true', help='only test final epoch')
parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
parser.add_argument('--project', default='runs/train', help='save to project/name')
parser.add_argument('--entity', default=None, help='W&B entity')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--quad', action='store_true', help='quad dataloader')
parser.add_argument('--linear-lr', action='store_true', help='linear LR')
parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
parser.add_argument('--upload_dataset', action='store_true', help='Upload dataset as W&B artifact table')
parser.add_argument('--bbox_interval', type=int, default=-1, help='Set bounding-box image logging interval for W&B')
parser.add_argument('--save_period', type=int, default=-1, help='Log model after every "save_period" epoch')
parser.add_argument('--artifact_alias', type=str, default="latest", help='version of dataset artifact to be used')
opt = parser.parse_args()
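Instead of editing the defaults in train.py, the same options can be passed on the command line. A trimmed-down sketch of the parser above, showing how flags map to fields of opt (note that --batch-size becomes opt.batch_size):

```python
import argparse

# A reduced version of the train.py parser, just to show flag handling
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='yolov7.pt')
parser.add_argument('--data', type=str, default='data/coco.yaml')
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--batch-size', type=int, default=16)

# Equivalent to: python train.py --data data/mydata.yaml --epochs 100 --batch-size 8
opt = parser.parse_args(['--data', 'data/mydata.yaml', '--epochs', '100', '--batch-size', '8'])
print(opt.data, opt.epochs, opt.batch_size)
```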
4. Training the Model
Once all of the steps above are done, simply run train.py to start training. Your GPU will be working at full load; when the program's output looks like the log below, training is underway, and all you need to do is wait.
autoanchor: Analyzing anchors... anchors/target = 4.13, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs\train\exp3
Starting training for 300 epochs...
Epoch gpu_mem box obj cls total labels img_size
0/299 4.62G 0.06714 1.908 0 1.975 21 640: 100%|██████████| 30/30 [00:23<00:00, 1.28it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 2/2 [00:07<00:00, 3.77s/it]
all 27 62 0.00676 0.0806 0.00148 0.000214
Epoch gpu_mem box obj cls total labels img_size
1/299 4.7G 0.0673 0.4035 0 0.4708 27 640: 100%|██████████| 30/30 [00:13<00:00, 2.23it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 2/2 [00:01<00:00, 1.22it/s]
all 27 62 0.0198 0.0484 0.00255 0.000426
Source: 氯化氢323