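Cause: When I split my dataset with the program below, the number of images and the number of labels in the resulting splits did not match. (A beginner's mistake on my part.)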
    import os
    import shutil
    import random

    # Proportions of the training, validation and test sets
    test_percent = 0.1
    valid_percent = 0.09
    train_percent = 0.81

    # Paths to the image and label (annotation) directories
    image_path = 'images'
    label_path = 'labels'

    images_files_list = os.listdir(image_path)
    labels_files_list = os.listdir(label_path)
    print('images files: {}'.format(images_files_list))
    print('labels files: {}'.format(labels_files_list))
    total_num = len(images_files_list)
    print('total_num: {}'.format(total_num))

    test_num = int(total_num * test_percent)
    valid_num = int(total_num * valid_percent)
    train_num = int(total_num * train_percent)

    # Indices of the files assigned to each split
    test_image_index = random.sample(range(total_num), test_num)
    valid_image_index = random.sample(range(total_num), valid_num)
    train_image_index = random.sample(range(total_num), train_num)

    for i in range(total_num):
        print('src image: {}, i={}'.format(images_files_list[i], i))
        if i in test_image_index:
            # Copy the image and its label file into the matching folder
            shutil.copyfile('images/{}'.format(images_files_list[i]), 'test/images/{}'.format(images_files_list[i]))
            shutil.copyfile('labels/{}'.format(labels_files_list[i]), 'test/labels/{}'.format(labels_files_list[i]))
        elif i in valid_image_index:
            shutil.copyfile('images/{}'.format(images_files_list[i]), 'valid/images/{}'.format(images_files_list[i]))
            shutil.copyfile('labels/{}'.format(labels_files_list[i]), 'valid/labels/{}'.format(labels_files_list[i]))
        else:
            shutil.copyfile('images/{}'.format(images_files_list[i]), 'train/images/{}'.format(images_files_list[i]))
            shutil.copyfile('labels/{}'.format(labels_files_list[i]), 'train/labels/{}'.format(labels_files_list[i]))
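Before splitting, it may help to check that the two source directories really contain matching files. The following is a minimal sketch, not part of the original script. It assumes (this is an assumption, the post does not state the naming scheme) that an image and its label share the same file name apart from the extension, for example `0001.jpg` and `0001.txt`. The helper name `find_unpaired` is made up for illustration.

```python
import os


def find_unpaired(image_names, label_names):
    """Return (image stems without a label, label stems without an image).

    Assumption: an image and its label share the same stem,
    i.e. the file name without its extension.
    """
    image_stems = {os.path.splitext(name)[0] for name in image_names}
    label_stems = {os.path.splitext(name)[0] for name in label_names}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# 'images' and 'labels' are the directory names used in the script above.
# The check only runs when both directories exist.
if os.path.isdir('images') and os.path.isdir('labels'):
    missing_labels, missing_images = find_unpaired(os.listdir('images'),
                                                   os.listdir('labels'))
    print('images without a label: {}'.format(missing_labels))
    print('labels without an image: {}'.format(missing_images))
```

If either printed list is non-empty, the two directories do not line up one to one, and pairing the files by list position cannot be relied on.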
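Solution: I split the dataset by hand, which guaranteed that every image was matched with its label. The split ratio was training : validation : test = 0.81 : 0.09 : 0.1.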
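As an alternative to splitting by hand, the same 0.81 : 0.09 : 0.1 ratio can be produced programmatically. The sketch below is not the method used in this post. It assumes that each label shares its file stem with its image, and the names `split_pairs`, `copy_split` and the `seed` parameter are hypothetical. It pairs files by stem rather than by list position, shuffles the pairs once, and cuts the shuffled list into three disjoint slices.

```python
import os
import random
import shutil


def split_pairs(image_names, label_names, valid_percent=0.09, test_percent=0.1, seed=0):
    """Pair files by stem, shuffle once, and cut into disjoint train/valid/test lists."""
    labels_by_stem = {os.path.splitext(name)[0]: name for name in label_names}
    # Keep only images that have a label with the same stem (assumption: shared stems).
    # Sorting first makes the result independent of the order os.listdir returns.
    pairs = sorted(
        (name, labels_by_stem[os.path.splitext(name)[0]])
        for name in image_names
        if os.path.splitext(name)[0] in labels_by_stem
    )
    random.Random(seed).shuffle(pairs)
    test_num = int(len(pairs) * test_percent)
    valid_num = int(len(pairs) * valid_percent)
    return {
        'test': pairs[:test_num],
        'valid': pairs[test_num:test_num + valid_num],
        'train': pairs[test_num + valid_num:],  # everything that is left
    }


def copy_split(splits, image_dir='images', label_dir='labels'):
    """Copy each (image, label) pair into <split>/images and <split>/labels."""
    for split_name, pairs in splits.items():
        os.makedirs(os.path.join(split_name, 'images'), exist_ok=True)
        os.makedirs(os.path.join(split_name, 'labels'), exist_ok=True)
        for image_name, label_name in pairs:
            shutil.copyfile(os.path.join(image_dir, image_name),
                            os.path.join(split_name, 'images', image_name))
            shutil.copyfile(os.path.join(label_dir, label_name),
                            os.path.join(split_name, 'labels', label_name))
```

A typical call would be `copy_split(split_pairs(os.listdir('images'), os.listdir('labels')))`. Because every image travels together with its own label, each output folder receives the same number of images and labels, provided the `train`, `valid` and `test` folders are empty before the run.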
Puzzle: The first time I split a dataset with this code, the result was correct, but the second time I used it, the result was wrong.
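One possible explanation, which cannot be confirmed from the post alone, is that the script pairs image `i` with label `i` by list position. That only works when both directories hold exactly matching files and `os.listdir` returns them in the same order, and Python documents the order of `os.listdir` as arbitrary. A single extra or missing file shifts every later pair. The script also never clears the `train`, `valid` and `test` folders, so files from an earlier run stay in place. Separately, the three `random.sample` calls are independent, so the test and validation indices can overlap. The sketch below illustrates the first and last points with made-up file names; `classes.txt` is a hypothetical extra file, not something the post mentions.

```python
import os
import random


def positional_mismatches(image_names, label_names):
    """Count positions i where image i and label i have different file stems."""
    return sum(
        1
        for image_name, label_name in zip(image_names, label_names)
        if os.path.splitext(image_name)[0] != os.path.splitext(label_name)[0]
    )


def overlap_of_independent_samples(total_num, test_num, valid_num, seed=0):
    """Count indices drawn by both of two independent random.sample calls."""
    rng = random.Random(seed)
    test_index = rng.sample(range(total_num), test_num)
    valid_index = rng.sample(range(total_num), valid_num)
    return len(set(test_index) & set(valid_index))


images = ['img0.jpg', 'img1.jpg', 'img2.jpg']

# Same files, same order: pairing by position happens to work.
print(positional_mismatches(images, ['img0.txt', 'img1.txt', 'img2.txt']))

# One hypothetical extra file at the front shifts every later pair.
print(positional_mismatches(images, ['classes.txt', 'img0.txt', 'img1.txt', 'img2.txt']))

# Number of indices that land in both the test and the validation sample.
print(overlap_of_independent_samples(total_num=1000, test_num=100, valid_num=90))
```

Under these assumptions the script can give a correct split on one dataset and a wrong one on another, depending only on what the directories contain and on the order in which the files are listed.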