This article uses PaddlePaddle to build an image classifier, and designs and tunes the models involved. The problem comes from the Kaggle competition: Cassava Leaf Disease Classification | Kaggle
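As the second-largest provider of carbohydrates in Africa, cassava is a key food security crop grown by smallholder farmers because it can withstand harsh conditions. At least 80% of household farms in Sub-Saharan Africa grow this starchy root, but viral diseases are a major cause of poor yields. With the help of data science, it may be possible to identify common diseases so that they can be treated.

Existing methods of disease detection require farmers to seek the help of government-funded agricultural experts, who visually inspect and diagnose the plants. This approach is labor-intensive, in short supply, and costly. A further challenge is that an effective solution for farmers must perform well under significant constraints, since African farmers may only have access to mobile-quality cameras and low bandwidth.

The competition introduces a dataset of 21,367 labeled images collected during a regular survey in Uganda. Most of the images were crowdsourced from farmers taking photos of their gardens, and were annotated by experts at the National Crops Resources Research Institute (NaCRRI) in collaboration with the AI lab at Makerere University, Kampala. This format most realistically represents what farmers would need to diagnose in real life.

The task is to classify each cassava image into one of four disease categories or a fifth category indicating a healthy leaf. With such a classifier, farmers may be able to identify diseased plants quickly and save their crops before the damage becomes irreparable.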
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    """
    @version: 1.0
    @author: xjl
    @file: csv_to_txt.py
    @time: 2021/3/5 11:37
    """

    import os

    import pandas as pd


    def csv_to_txt(csv_file, txt_file, abs_path):
        """Write one '<image path> <label>' line per CSV row to txt_file."""
        if not os.path.exists(csv_file):
            print('No such file: %s' % csv_file)
            return
        data = pd.read_csv(csv_file, encoding='utf-8')
        # Open with 'w' so that re-running the script does not append duplicates.
        with open(txt_file, 'w', encoding='utf-8') as f:
            for line in data.values:
                # line[0] is the image file name, line[1] is the label.
                newdata = abs_path + str(line[0]) + ' ' + str(line[1]) + '\n'
                f.write(newdata)


    if __name__ == '__main__':
        path = os.path.abspath('.').replace('\\', '/')
        csv_file = path + "/train.csv"
        txt_file = path + "/train.txt"
        abs_path = path + "/train_images/"
        csv_to_txt(csv_file, txt_file, abs_path)

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    """
    @version: 1.0
    @author: xjl
    @file: split_date.py
    @time: 2021/3/5 11:53
    """

    import os
    import random


    def split(all_list, shuffle=False, ratio=0.8):
        """Randomly split a list into two parts according to ratio."""
        num = len(all_list)
        offset = int(num * ratio)
        if num == 0 or offset < 1:
            return [], all_list
        if shuffle:
            random.shuffle(all_list)  # shuffle the list in place
        train = all_list[:offset]
        test = all_list[offset:]
        return train, test


    def write_split(film, train, test):
        """Read the lines of film and write the two splits to train and test."""
        with open(film, 'r', encoding='utf-8') as infilm:
            li = [datas.rstrip('\n') for datas in infilm]
        traindatas, testdatas = split(li, shuffle=True, ratio=0.8)
        with open(train, 'w', encoding='utf-8') as trainfilm:
            for traindata in traindatas:
                trainfilm.write(traindata + '\n')
        with open(test, 'w', encoding='utf-8') as testfilm:
            for testdata in testdatas:
                testfilm.write(testdata + '\n')


    if __name__ == "__main__":
        path = os.path.abspath('.').replace('\\', '/')
        data_path = path + "/train.txt"
        train_path = path + "/train_list.txt"
        test_path = path + "/val_list.txt"
        write_split(data_path, train_path, test_path)

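The split script shuffles without a fixed seed and does not look at the labels, so every run produces a different split and rare classes can end up unevenly represented. As a minimal sketch of one alternative (not the method used in this article), the lines can be grouped by label and split per group with a seeded generator. `stratified_split` is a hypothetical helper, and it assumes each line has the form `<image path> <label>` as written by csv_to_txt.py.

```python
import random
from collections import defaultdict


def stratified_split(lines, ratio=0.8, seed=0):
    """Split 'path label' lines so each label keeps roughly the same ratio."""
    by_label = defaultdict(list)
    for line in lines:
        # The label is the last space-separated field of each line.
        label = line.rsplit(' ', 1)[1]
        by_label[label].append(line)

    rng = random.Random(seed)  # local generator, so the split is repeatable
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        offset = int(len(group) * ratio)
        train.extend(group[:offset])
        val.extend(group[offset:])
    return train, val


if __name__ == '__main__':
    # Ten made-up lines with two labels, five lines each.
    demo = ['img_%d.jpg %d' % (i, i % 2) for i in range(10)]
    train, val = stratified_split(demo)
    print(len(train), len(val))
```

Calling the function twice with the same seed returns the same split, which makes validation accuracies from different runs easier to compare.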
1. Using the ResNet50_vd network architecture
    mode: 'train'  # current mode; training and evaluation modes are supported
    ARCHITECTURE:
        name: 'ResNet50_vd'  # model architecture; other models in the model zoo can be selected by name
    checkpoints: ""
    pretrained_model: ""  # pretrained model; empty because this config trains without loading pretrained weights
    model_save_dir: "./output/"  # where models are saved
    classes_num: 4  # number of classes; set it to match the dataset (the task description lists five: four diseases plus healthy)
    total_images: 17117  # number of training images; used by the learning rate schedule, among others
    save_interval: 1  # save the model every N epochs
    validate: True  # whether to validate; if True, the config must contain a VALID section
    valid_interval: 1  # validate every N epochs
    epochs: 100  # total number of training epochs
    topk: 4  # report top-k accuracy in addition to top-1; must not exceed classes_num
    image_shape: [3, 224, 224]  # image shape

    LEARNING_RATE:  # learning rate policy; Linear/Cosine/Piecewise/CosineWarmup are currently supported
        function: 'Cosine'
        params:
            lr: 0.0125

    OPTIMIZER:  # optimizer settings
        function: 'Momentum'
        params:
            momentum: 0.9
        regularizer:
            function: 'L2'
            factor: 0.00001

    TRAIN:  # training settings
        batch_size: 32  # training batch size
        num_workers: 0  # number of processes per trainer (one GPU can be viewed as one trainer)
        file_list: "./dataset/cassava-leaf-disease-classification/train_list.txt"
        data_dir: "./dataset/cassava-leaf-disease-classification/train_images/"
        shuffle_seed: 0  # seed used to shuffle the data
        transforms:  # preprocessing of training images
            - DecodeImage:  # decode
                to_rgb: True
                to_np: False
                channel_first: False
            - RandCropImage:  # random crop
                size: 224
            - RandFlipImage:  # random horizontal flip
                flip_code: 1
            - NormalizeImage:  # normalize
                scale: 1./255.
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
                order: ''
            - ToCHWImage:  # channel conversion

    VALID:  # validation settings; used when validate is True
        batch_size: 20  # validation batch size
        num_workers: 0  # number of processes per trainer (one GPU can be viewed as one trainer)
        file_list: "./dataset/cassava-leaf-disease-classification/val_list.txt"
        data_dir: "./dataset/cassava-leaf-disease-classification/train_images/"
        shuffle_seed: 0  # seed used to shuffle the data
        transforms:
            - DecodeImage:
                to_rgb: True
                to_np: False
                channel_first: False
            - ResizeImage:
                resize_short: 256
            - CropImage:
                size: 224
            - NormalizeImage:
                scale: 1.0/255.0
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
                order: ''
            - ToCHWImage:

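The `Cosine` policy, together with `total_images`, `batch_size` and `epochs`, determines how the learning rate decays during training. The sketch below uses the textbook per-step cosine annealing formula with the values from the config. Whether PaddleClas follows exactly this shape is an assumption: its implementation may differ in details such as updating once per epoch or how the last partial batch is counted. `cosine_lr` is a hypothetical helper written for illustration.

```python
import math


def cosine_lr(step, base_lr, total_steps):
    """Cosine annealing from base_lr at step 0 down to 0 at total_steps."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


total_images = 17117  # from the config
batch_size = 32       # from the config
epochs = 100          # from the config
base_lr = 0.0125      # from the config

# Assumption: the last partial batch still counts as one step.
steps_per_epoch = math.ceil(total_images / batch_size)
total_steps = steps_per_epoch * epochs

# Print the learning rate at the start, the middle and the end of training.
for epoch in (0, 50, 100):
    step = epoch * steps_per_epoch
    print(epoch, cosine_lr(step, base_lr, total_steps))
```

Under these assumptions the rate starts at `lr`, reaches half of it midway through training, and approaches zero at the last step.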
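Training ran for the configured number of epochs with no further parameter tuning. After training, the accuracy on the validation set was 63.668%.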
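For reference, the config reports top-k accuracy alongside top-1 via `topk`. The following is a minimal sketch of how such a metric is commonly computed, using made-up scores rather than real model outputs. `topk_accuracy` is a hypothetical helper, not the PaddleClas implementation.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up scores for four samples and three classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 2, 2]
print(topk_accuracy(scores, labels, k=1))  # 0.5
print(topk_accuracy(scores, labels, k=2))  # 0.75
```

When `k` equals the number of classes the metric is always 1.0, so a top-k value is only informative for `k` smaller than `classes_num`.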
Training parameter settings
    mode: 'train'
    ARCHITECTURE:
        name: 'ResNet50_vd'
    checkpoints: ""
    pretrained_model: ""
    model_save_dir: "./output/"
    classes_num: 4
    total_images: 17117
    save_interval: 1
    validate: True
    valid_interval: 1
    epochs: 100
    topk: 4
    image_shape: [3, 224, 224]

    LEARNING_RATE:
        function: 'Cosine'
        params:
            lr: 0.0125

    OPTIMIZER:
        function: 'Momentum'
        params:
            momentum: 0.9
        regularizer:
            function: 'L2'
            factor: 0.00001

    TRAIN:
        batch_size: 32
        num_workers: 0
        file_list: "./dataset/cassava-leaf-disease-classification/train.txt"
        data_dir: "./dataset/cassava-leaf-disease-classification/train_images/"
        shuffle_seed: 0
        transforms:
            - DecodeImage:
                to_rgb: True
                to_np: False
                channel_first: False
            - RandCropImage:
                size: 224
            - RandFlipImage:
                flip_code: 1
            - NormalizeImage:
                scale: 1./255.
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
                order: ''
            - ToCHWImage:

    VALID:
        batch_size: 20
        num_workers: 0
        file_list: "./dataset/cassava-leaf-disease-classification/val_list.txt"
        data_dir: "./dataset/cassava-leaf-disease-classification/train_images/"
        shuffle_seed: 0
        transforms:
            - DecodeImage:
                to_rgb: True
                to_np: False
                channel_first: False
            - ResizeImage:
                resize_short: 256
            - CropImage:
                size: 224
            - NormalizeImage:
                scale: 1.0/255.0
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
                order: ''
            - ToCHWImage:

-c ./configs/quick_start/ResNet50_vd.yaml
Using a pretrained model
Downloading ResNet101_vd_ssld_pretrained.pdparams:
python tools/download.py -a ResNet101_vd_ssld_pretrained -p ./pretrained -d True
Where to find the available models:
https://paddleclas.readthedocs.io/zh_CN/latest/models/models_intro.html
    mode: 'train'
    ARCHITECTURE:
        name: 'ResNet101_vd'
    pretrained_model: "./pretrained/ResNet101_vd_ssld_pretrained"
    model_save_dir: "./output/"
    classes_num: 4
    total_images: 17117
    save_interval: 1
    validate: True
    valid_interval: 1
    epochs: 200
    topk: 4
    image_shape: [3, 224, 224]
    use_mix: True
    ls_epsilon: 0.1

    LEARNING_RATE:
        function: 'Cosine'
        params:
            lr: 0.1

    OPTIMIZER:
        function: 'Momentum'
        params:
            momentum: 0.9
        regularizer:
            function: 'L2'
            factor: 0.000100

    TRAIN:
        batch_size: 32
        num_workers: 0
        file_list: "./dataset/cassava-leaf-disease-classification/train_list.txt"
        data_dir: "./dataset/cassava-leaf-disease-classification/train_images/"
        shuffle_seed: 0
        transforms:
            - DecodeImage:
                to_rgb: True
                to_np: False
                channel_first: False
            - RandCropImage:
                size: 224
            - RandFlipImage:
                flip_code: 1
            - NormalizeImage:
                scale: 1./255.
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
                order: ''
            - ToCHWImage:
        mix:
            - MixupOperator:
                alpha: 0.2

    VALID:
        batch_size: 32
        num_workers: 0
        file_list: "./dataset/cassava-leaf-disease-classification/val_list.txt"
        data_dir: "./dataset/cassava-leaf-disease-classification/train_images/"
        shuffle_seed: 0
        transforms:
            - DecodeImage:
                to_rgb: True
                to_np: False
                channel_first: False
            - ResizeImage:
                resize_short: 256
            - CropImage:
                size: 224
            - NormalizeImage:
                scale: 1.0/255.0
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
                order: ''
            - ToCHWImage:

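Compared with the first config, this one enables mixup (`use_mix`, `MixupOperator` with `alpha: 0.2`) and label smoothing (`ls_epsilon: 0.1`). The sketch below illustrates the two techniques in their textbook form on tiny made-up vectors. It is not the PaddleClas implementation, which works on whole batches; `smooth_labels` and `mixup` are hypothetical helpers, and the two-element "images" are placeholders.

```python
import random


def smooth_labels(label, num_classes, epsilon=0.1):
    """One-hot target softened by spreading epsilon uniformly over all classes."""
    target = [epsilon / num_classes] * num_classes
    target[label] += 1.0 - epsilon
    return target


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their targets with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


rng = random.Random(0)  # seeded so the example is repeatable
num_classes = 4         # matches classes_num in the config

# Two placeholder "images" with labels 0 and 3.
x, y, lam = mixup([0.0, 1.0], smooth_labels(0, num_classes),
                  [1.0, 0.0], smooth_labels(3, num_classes), rng=rng)
print(lam, x, y)
```

With a small `alpha` such as 0.2, the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals. Both techniques act as regularizers, which is one plausible reason to pair them with a larger pretrained model and a longer schedule.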