Training YOLOv5 on Your Own Dataset (Most Detailed Tutorial)

I. Environment Setup

This tutorial assumes Anaconda is used to manage the Python environment.

1. Create a virtual environment

conda create -n yolov5 python=3.8
conda activate yolov5

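2. Install torch and the related packages from the PyTorch website, matching the CUDA version you have installed.

My CUDA version is 11.1, which the default selector does not offer, so click "previous versions of PyTorch" below it to browse older releases. torch 1.10.1 turned out to work: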

pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
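Before moving on, it is worth confirming that the install actually sees the GPU. The snippet below is a minimal sketch of such a check; the helper name describe_torch is only illustrative, and it falls back to a hint when torch is missing instead of crashing.

```python
import importlib.util


def describe_torch() -> str:
    """Return a one-line summary of the PyTorch install, or a hint if it is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the check also runs without torch

    return f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"


print(describe_torch())
```

If the line reports that CUDA is not available, the installed wheel most likely does not match your CUDA version or driver.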

3. Download the YOLOv5 code

    git clone https://github.com/ultralytics/yolov5  # clone
    cd yolov5
    pip install -r requirements.txt  # install

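II. Dataset Preparation

First annotate your data with labelImg, preferably in YOLO format, where each annotation file is a txt file. VOC-format annotations are fine too, because they can be converted.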
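For reference, a YOLO-format txt file holds one object per line: a class id followed by the box center x, center y, width, and height, all normalized to the image size. The sketch below parses and range-checks one such line; YoloBox and parse_yolo_line are illustrative names, not part of YOLOv5 or labelImg.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line: str) -> YoloBox:
    """Parse one 'class x_center y_center width height' line and check the value ranges."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    x, y, w, h = (float(p) for p in parts[1:])
    # All four box values are fractions of the image width or height
    if not all(0.0 <= v <= 1.0 for v in (x, y, w, h)):
        raise ValueError(f"coordinates must be normalized to [0, 1]: {line!r}")
    return YoloBox(class_id, x, y, w, h)


box = parse_yolo_line("0 0.5 0.5 0.25 0.4")
print(box)
```

Running a check like this over every label file catches pixel coordinates that were written without normalization.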

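2.1 VOC format

Step 1: Put JPEGImages, Annotations, and ImageSets (each is described in the code comments below) in the same directory as the script below. Running the script writes the training and validation txt files to ImageSets/Main/.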

    import os
    import random

    images_path = "JPEGImages/"  # folder containing the images
    xmls_path = "Annotations/"  # folder containing the XML annotation files
    # Initially empty folder; after running, it holds train.txt and val.txt
    train_val_txt_path = "ImageSets/Main/"
    val_percent = 0.1  # fraction of the images used for validation

    os.makedirs(train_val_txt_path, exist_ok=True)

    # splitext gives the correct file stem for any extension (.png, .jpg, ...)
    image_ids = [os.path.splitext(name)[0] for name in os.listdir(images_path)]
    random.shuffle(image_ids)

    # Everything not assigned to train goes to val, so rounding never drops an image
    train_images_count = int((1 - val_percent) * len(image_ids))
    train_ids = image_ids[:train_images_count]
    val_ids = image_ids[train_images_count:]

    for file_name, ids in (("train.txt", train_ids), ("val.txt", val_ids)):
        with open(os.path.join(train_val_txt_path, file_name), "w") as f:
            f.writelines(image_id + "\n" for image_id in ids)

    print("train_count : " + str(len(train_ids)))
    print("val_count : " + str(len(val_ids)))
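The split above changes on every run because the shuffle is unseeded. If you want a split you can reproduce later, one option is a seeded variant like the sketch below; split_ids and its seed parameter are my own illustrative additions, not part of the original script.

```python
import random


def split_ids(image_ids, val_percent=0.1, seed=0):
    """Return (train_ids, val_ids) using a private RNG so the split is repeatable."""
    ids = sorted(image_ids)  # sort first so the result does not depend on listing order
    random.Random(seed).shuffle(ids)
    train_count = int((1 - val_percent) * len(ids))
    return ids[:train_count], ids[train_count:]


all_ids = [f"img_{i:03d}" for i in range(20)]
train_ids, val_ids = split_ids(all_ids, val_percent=0.1, seed=42)
print(len(train_ids), len(val_ids))
```

Calling it again with the same seed yields the same two lists, which makes results comparable across training runs.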

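Step 2: Convert the VOC annotation files to YOLO format

Only two parameters need to be set in the main block: the directory of the VOC annotation files and the output directory for the YOLO files.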

    import os
    import xml.etree.ElementTree as ET


    def convert_folder_to_yolov5(input_folder, output_folder):
        # Ensure the output folder exists
        os.makedirs(output_folder, exist_ok=True)
        # Loop through each XML file in the input folder
        for xml_file_name in os.listdir(input_folder):
            if xml_file_name.endswith('.xml'):
                xml_file_path = os.path.join(input_folder, xml_file_name)
                # Generate the corresponding output txt file path
                txt_file_name = os.path.splitext(xml_file_name)[0] + '.txt'
                txt_file_path = os.path.join(output_folder, txt_file_name)
                # Convert XML to YOLOv5 format and save to the txt file
                convert_to_yolov5(xml_file_path, txt_file_path)


    def convert_to_yolov5(xml_file, output_file):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        # The image size is shared by all objects, so read it once
        img_width = int(root.find('size/width').text)
        img_height = int(root.find('size/height').text)
        with open(output_file, 'w') as f:
            for obj in root.findall('object'):
                class_name = obj.find('name').text
                if class_name == 'cone':  # only the 'cone' class is kept; it gets class id 0
                    # float() also accepts coordinates stored as "12.0"
                    xmin = float(obj.find('bndbox/xmin').text)
                    ymin = float(obj.find('bndbox/ymin').text)
                    xmax = float(obj.find('bndbox/xmax').text)
                    ymax = float(obj.find('bndbox/ymax').text)
                    # Box center and size, normalized by the image dimensions
                    x_center = (xmin + xmax) / 2.0 / img_width
                    y_center = (ymin + ymax) / 2.0 / img_height
                    width = (xmax - xmin) / img_width
                    height = (ymax - ymin) / img_height
                    line = f"0 {x_center} {y_center} {width} {height}\n"
                    f.write(line)


    if __name__ == "__main__":
        input_folder_path = "/home/wangchen/YOLOX/cone/Annotations"  # VOC-format annotation files
        output_folder_path = "/home/wangchen/YOLOX/cone/YOLOLabels"  # where the YOLO-format files are saved
        convert_folder_to_yolov5(input_folder_path, output_folder_path)
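To make the normalization concrete, here is a small worked example of the same arithmetic, together with the inverse mapping as a sanity check. The function names and the sample box (100, 200, 300, 400) in a 640x480 image are made up for illustration.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Corner box in pixels -> normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_to_voc(x_center, y_center, width, height, img_w, img_h):
    """Inverse mapping: normalized center box -> corner box in pixels."""
    xmin = (x_center - width / 2.0) * img_w
    ymin = (y_center - height / 2.0) * img_h
    xmax = (x_center + width / 2.0) * img_w
    ymax = (y_center + height / 2.0) * img_h
    return xmin, ymin, xmax, ymax


yolo_box = voc_to_yolo(100, 200, 300, 400, img_w=640, img_h=480)
print(yolo_box)
print(yolo_to_voc(*yolo_box, img_w=640, img_h=480))
```

Here the center is (200, 300) pixels, so x_center is 200/640 and y_center is 300/480. Converting back should give the original corners up to floating-point error.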

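Step 3: Use the VOC index files generated in Step 1 to split the YOLO data into train and val sets.

After the code below finishes, output_dataset_path contains two folders, train and val, each with its own images and labels subfolders. This directory structure is not the final one and needs to be rearranged into the structure shown in the figure.

[Figure: target dataset directory structure]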

    import os
    from shutil import copyfile


    def split_dataset(image_folder, txt_folder, output_folder, split_index):
        # Ensure the output folders exist: <output>/<train|val>/<images|labels>
        for dataset in ['train', 'val']:
            os.makedirs(os.path.join(output_folder, dataset, 'images'), exist_ok=True)
            os.makedirs(os.path.join(output_folder, dataset, 'labels'), exist_ok=True)

        # Read the image ids written in Step 1
        train_index = os.path.join(split_index, 'train.txt')
        val_index = os.path.join(split_index, 'val.txt')
        with open(train_index, 'r') as file:
            train_images = [i.strip() for i in file.readlines()]
        with open(val_index, 'r') as file:
            val_images = [i.strip() for i in file.readlines()]

        # Copy images and labels to their respective folders
        for dataset, images_list in zip(['train', 'val'], [train_images, val_images]):
            for image_file in images_list:
                # Images are assumed to be .png; change the extension if yours differ
                image_path = os.path.join(image_folder, image_file + '.png')
                copyfile(image_path, os.path.join(output_folder, dataset, 'images', image_file + '.png'))
                txt_file = image_file + '.txt'
                txt_path = os.path.join(txt_folder, txt_file)
                # Copy the corresponding txt file if it exists
                if os.path.exists(txt_path):
                    copyfile(txt_path, os.path.join(output_folder, dataset, 'labels', txt_file))


    if __name__ == "__main__":
        image_folder_path = "/home/wangchen/YOLOX/cone/JPEGImages"
        txt_folder_path = "/home/wangchen/YOLOX/cone/YOLOLabels"
        output_dataset_path = "/home/wangchen/YOLOX/yolo_data"
        split_index = "/home/wangchen/YOLOX/cone/ImageSets/Main"
        split_dataset(image_folder_path, txt_folder_path, output_dataset_path, split_index)
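When rearranging the folders, it helps to know how the label file for an image gets located. To my understanding, the YOLOv5 dataloader takes each image path, swaps the last /images/ path component for /labels/, and replaces the extension with .txt. The sketch below mirrors that rule; it is a simplified illustration, not the upstream code, and the example path is hypothetical.

```python
import os


def img2label_path(img_path: str) -> str:
    """Swap the last '/images/' component for '/labels/' and the extension for '.txt'."""
    sa = f"{os.sep}images{os.sep}"
    sb = f"{os.sep}labels{os.sep}"
    swapped = sb.join(img_path.rsplit(sa, 1))
    return os.path.splitext(swapped)[0] + ".txt"


example = os.path.join("yolo_data", "images", "train", "0001.png")
print(img2label_path(example))
```

Whatever layout you settle on, each image needs a label file at the path this rule produces, otherwise the image is treated as having no objects.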

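2.2 YOLO format

If your annotations are already in YOLO format, simply split them according to the directory structure from Step 3 above.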

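III. Modifying the YOLOv5 Configuration Files

Edit data/VOC.yaml so that the dataset paths, the number of classes, and the class names match your own dataset.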
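As a rough guide to what the edited file should contain, the sketch below generates a minimal dataset YAML as text. The keys train, val, nc, and names are the ones I believe the YOLOv5 dataset files use (newer releases may write names as a mapping and omit nc, so compare with the VOC.yaml in your checkout). The paths and the make_dataset_yaml helper are hypothetical.

```python
def make_dataset_yaml(train_dir: str, val_dir: str, names: list) -> str:
    """Build the text of a minimal dataset YAML with train/val paths and class names."""
    lines = [
        f"train: {train_dir}",
        f"val: {val_dir}",
        f"nc: {len(names)}",  # number of classes
        f"names: [{', '.join(repr(n) for n in names)}]",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical paths; point them at your own image folders
yaml_text = make_dataset_yaml(
    "path/to/yolo_data/images/train",
    "path/to/yolo_data/images/val",
    ["cone"],
)
print(yaml_text)
```

The order of names matters: the class id written in the label files (0 for cone in this tutorial) is the index into this list.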

Change the number of classes (the nc field) in models/yolov5s.yaml, then adjust the relevant hyperparameters in train.py.
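Instead of editing the defaults inside train.py, the same settings can usually be passed on the command line. The sketch below only assembles and prints such a command; it does not start training. The flags shown (--img, --batch-size, --epochs, --data, --cfg, --weights) are the ones I believe train.py accepts, so confirm them with python train.py --help. The values are placeholders, the model config is assumed to be models/yolov5s.yaml as in the upstream repository, and build_train_command is an illustrative helper.

```python
import shlex


def build_train_command(data_yaml, model_cfg, weights="yolov5s.pt",
                        img_size=640, batch_size=16, epochs=100):
    """Assemble the argument list for a YOLOv5 training run (nothing is executed)."""
    return [
        "python", "train.py",
        "--img", str(img_size),
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
        "--data", data_yaml,
        "--cfg", model_cfg,
        "--weights", weights,
    ]


command = build_train_command("data/VOC.yaml", "models/yolov5s.yaml")
print(shlex.join(command))
```

Run the printed command from inside the yolov5 directory. Reduce the batch size if the GPU runs out of memory.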
