
Building & Training a Deep Learning Network with the Caffe Python Interface

I. Data preparation: generating the LMDB files

Option 1: run the following command directly from cmd:

convert_imageset.exe --resize_height=240 --resize_width=240 --shuffle --backend="lmdb" G:\ G:\caffe_python\label\label_train.txt G:\caffe_python\label\train_lmdb

convert_imageset.exe is produced when Caffe is compiled and is usually found under \Build\x64\Release. resize_height and resize_width resize the original images to a uniform size. G:\ is the root path of the training images, and label_train.txt holds the training annotations, one "filename label" pair per line. The resulting LMDB is written to G:\caffe_python\label\train_lmdb. The validation-set LMDB is generated the same way.
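For reference, each line of label_train.txt is an image path (relative to the root folder given on the command line) followed by an integer class label; the file names below are made up:

    images/train/000001.jpg 0
    images/train/000002.jpg 1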

(Figure: data layout stored in train_lmdb)

Option 2: generate the LMDB files with a Python script

# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
sys.path.insert(0, 'C:/Anaconda3/envs/py27/Lib/site-packages/pycaffe')
import caffe
import lmdb
import random
import cv2
import numpy as np
from caffe.proto import caffe_pb2
from sklearn.model_selection import train_test_split
from Bconfig import config


def get_dataset(label_dir):
    with open(label_dir, 'r') as f:
        annotations = f.readlines()
    random.shuffle(annotations)
    dataset = []
    for annotation in annotations:
        annotation = annotation.strip().split(' ')
        data_example = dict()
        data_example['filename'] = annotation[0]
        data_example['label'] = int(annotation[1])
        dataset.append(data_example)
    return dataset


if __name__ == '__main__':
    train_image_path = config.train_image_path
    train_list_path = config.train_list_path
    batch_size = config.BATCH_SIZE
    patchSize = config.PatchSize
    lmdb_file = config.trainlmdb_file  # directory that will hold the training LMDB
    dataset = get_dataset(train_list_path)
    trainDataset, testDataset = train_test_split(dataset, test_size=0.1, random_state=1)
    fw = open(config.lmdb_record, 'w')
    # Open the LMDB environment: create the data file with a maximum map size of 1e12 bytes
    lmdb_env = lmdb.open(lmdb_file, map_size=int(1e12))
    lmdb_txn = lmdb_env.begin(write=True)  # open a write transaction on the database
    datum = caffe_pb2.Datum()
    for idx, image_example in enumerate(trainDataset):
        filename = image_example['filename']
        label = image_example['label']
        image = cv2.imdecode(np.fromfile(filename, dtype=np.uint8), -1)
        resizeImg = cv2.resize(image, (patchSize, patchSize))
        resizeImg = resizeImg.transpose((2, 0, 1))  # array_to_datum expects channels*height*width
        # save in a Datum; every LMDB record is a key-value pair
        datum = caffe.io.array_to_datum(resizeImg, label)
        key_str = '{:0>8d}'.format(idx)  # fixed-length, unique, increasing key
        lmdb_txn.put(key_str.encode(), datum.SerializeToString())  # write through the transaction
        print('{:0>8d}'.format(idx) + ':' + filename)
        # write batch
        if (idx + 1) % batch_size == 0:
            lmdb_txn.commit()
            lmdb_txn = lmdb_env.begin(write=True)
            print(idx + 1)
    # write last batch
    if (idx + 1) % batch_size != 0:
        lmdb_txn.commit()
        print('last batch')
        print(idx + 1)
    lmdb_env.close()  # release resources when done

    lmdb_env1 = lmdb.open(config.vallmdb_file, map_size=int(1e10))
    lmdb_txn1 = lmdb_env1.begin(write=True)  # open a write transaction on the database
    datum = caffe_pb2.Datum()
    for idt, image_example in enumerate(testDataset):
        filename = image_example['filename']
        label = image_example['label']
        image = cv2.imdecode(np.fromfile(filename, dtype=np.uint8), -1)
        resizeImg = cv2.resize(image, (patchSize, patchSize))
        resizeImg = resizeImg.transpose((2, 0, 1))  # array_to_datum expects channels*height*width
        # save in a Datum; every LMDB record is a key-value pair
        datum = caffe.io.array_to_datum(resizeImg, label)
        key_str = '{:0>8d}'.format(idt)  # fixed-length, unique, increasing key
        lmdb_txn1.put(key_str.encode(), datum.SerializeToString())  # write through the transaction
        print('{:0>8d}'.format(idt) + ':' + filename)
        # write batch
        if (idt + 1) % batch_size == 0:
            lmdb_txn1.commit()
            lmdb_txn1 = lmdb_env1.begin(write=True)
            print(idt + 1)
    # write last batch
    if (idt + 1) % batch_size != 0:
        lmdb_txn1.commit()
        print('last batch')
        print(idt + 1)
    lmdb_env1.close()  # release resources when done
    fw.write('trainDataset size: %d, testDataset size: %d' % (idx + 1, idt + 1))
    fw.close()
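Each record in the LMDB is a serialized Datum keyed by the zero-padded index written above. A minimal sketch (not part of the original script; the path is the one used earlier) for reading one record back and checking its label and shape:

import lmdb
import numpy as np
from caffe.proto import caffe_pb2

env = lmdb.open('G:/caffe_python/label/train_lmdb', readonly=True)
with env.begin() as txn:
    raw = txn.get(b'00000000')  # keys are the zero-padded indices
    datum = caffe_pb2.Datum()
    datum.ParseFromString(raw)
    img = np.frombuffer(datum.data, dtype=np.uint8).reshape(
        datum.channels, datum.height, datum.width)  # stored as C*H*W
    print(datum.label, img.shape)
env.close()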

II. Computing the training-data mean

compute_image_mean --backend="lmdb" train_lmdb mean.binaryproto

compute_image_mean.exe is also produced when Caffe is compiled; only the training-set mean needs to be computed.

The mean in .npy format can likewise be produced with a Python script, by converting the mean.binaryproto generated above:

import sys
sys.path.insert(0, 'C:/Anaconda3/envs/py27/Lib/site-packages/pycaffe')
import caffe
import numpy as np
from Bconfig import config

blob = caffe.proto.caffe_pb2.BlobProto()
bin_mean = open(config.mean_binary, 'rb').read()
blob.ParseFromString(bin_mean)
arr = np.array(caffe.io.blobproto_to_array(blob))
npy_mean = arr[0]
np.save(config.mean_npy, npy_mean)
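If you prefer to skip compute_image_mean altogether, the per-pixel mean can also be accumulated directly from the training LMDB in Python. A rough sketch (not from the original post; the LMDB path is the one used earlier, and the output file name is arbitrary):

import lmdb
import numpy as np
from caffe.proto import caffe_pb2

env = lmdb.open('G:/caffe_python/label/train_lmdb', readonly=True)
mean_sum, count = None, 0
with env.begin() as txn:
    for _, raw in txn.cursor():
        datum = caffe_pb2.Datum()
        datum.ParseFromString(raw)
        img = np.frombuffer(datum.data, dtype=np.uint8).reshape(
            datum.channels, datum.height, datum.width).astype(np.float64)
        mean_sum = img if mean_sum is None else mean_sum + img
        count += 1
env.close()
np.save('mean_from_lmdb.npy', mean_sum / count)  # same C*H*W layout as the converted mean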

What this produces is a per-pixel mean: the mean file has the same dimensions as the (resized) training images, so for M*N*C images the mean is also M*N*C.

Caffe can also subtract a per-channel mean, where each channel's mean is a single number specified directly as a parameter in the prototxt file, as in the following layer definition:

layer {
  name: "InputData"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    mirror: false
    crop_size: 240
    mean_value: 78.3
    mean_value: 76.7
    mean_value: 73.2
  }
  data_param {
    ...
  }
}

As for how the channel means are obtained: compute_image_mean.cpp saves the per-pixel mean, but at the end it also prints the channel means:

for (int c = 0; c < channels; ++c) {
  for (int i = 0; i < dim; ++i) {
    mean_values[c] += sum_blob.data(dim * c + i);
  }
  LOG(INFO) << "mean_value channel [" << c << "]: " << mean_values[c] / dim;
}
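The same per-channel values can be recovered in Python from the .npy mean saved above, since the per-pixel mean is stored channel-first; a minimal sketch:

import numpy as np
from Bconfig import config

npy_mean = np.load(config.mean_npy)          # C*H*W per-pixel mean saved earlier
channel_means = npy_mean.mean(axis=(1, 2))   # average over H and W: one value per channel
print(channel_means)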

III. Building the network structure

1. Convolution layer:

import caffe
from caffe import layers as L
from caffe import params as P

n = caffe.NetSpec()
n.data, n.label = L.Data(source=lmdb, name='InputData', backend=P.Data.LMDB,
                         batch_size=batch_size, ntop=2,
                         transform_param=dict(crop_size=240, mean_file=mean_file, mirror=False))
n.conv1 = L.Convolution(n.data, kernel_size=3, stride=1, pad=1, num_output=32,
                        weight_filler=dict(type='xavier'),
                        bias_term=True,
                        bias_filler=dict(type='constant'),
                        name='conv')

2. ReLU activation layer:

n.relu1 = L.ReLU(n.conv1, in_place=True, name='conv_relu')

Setting in_place to True means the layer's top and bottom blobs are the same blob.

3. Pooling layer:

n.pool_max = L.Pooling(n.relu1, pool=P.Pooling.MAX, kernel_size=4, stride=3, pad=0,
                       name='pool_max')
n.pool_ave = L.Pooling(n.conv_out, pool=P.Pooling.AVE, kernel_size=12, stride=1, pad=0,
                       name='pool_ave')
n.global_ave = L.Pooling(n.dense3, pool=P.Pooling.AVE, global_pooling=True,
                         name='pool_global_ave')

In Caffe, the output feature-map size of a convolution layer is computed as:

out_size = floor( (in_size + 2*pad - kernel_size) / stride ) + 1

and the output feature-map size of a pooling layer as:

out_size = ceil( (in_size + 2*pad - kernel_size) / stride ) + 1

So convolution rounds down while pooling rounds up, and both differ slightly from TensorFlow's padding conventions. For details see the Caffe sources "caffe/layers/conv_layer.cpp" and "caffe/layers/pooling_layer.cpp".
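A quick numeric check of the two formulas (not from the original post), using the 240x240 crop and the conv1/pool_max parameters from the examples above:

import math

def conv_out(in_size, kernel, stride, pad):
    return (in_size + 2 * pad - kernel) // stride + 1                        # floor

def pool_out(in_size, kernel, stride, pad):
    return int(math.ceil((in_size + 2 * pad - kernel) / float(stride))) + 1  # ceil

print(conv_out(240, 3, 1, 1))  # conv1 (kernel 3, stride 1, pad 1) keeps 240
print(pool_out(240, 4, 3, 0))  # pool_max (kernel 4, stride 3, pad 0) gives 80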

4. BatchNorm layer:

n.batchnorm = L.BatchNorm(n.pool_max, include=dict(phase=caffe.TRAIN), in_place=True,
                          batch_norm_param=dict(moving_average_fraction=0.9), name='bn')
n.scale_bn = L.Scale(n.batchnorm, scale_param=dict(bias_term=True), in_place=True, name='bn_scale')

5. Fully connected layer (InnerProduct):

n.innerP = L.InnerProduct(n.scale_bn, num_output=class_num,
                          weight_filler=dict(type='xavier'),
                          bias_filler=dict(type='constant', value=0),
                          name='inner_product')

6. Softmax loss layer:

n.loss = L.SoftmaxWithLoss(n.innerP, n.label)

7. Accuracy layer:

n.acc = L.Accuracy(n.innerP, n.label, accuracy_param=dict(top_k=1)) 

Finally, write the network structure to a prototxt file, i.e. Caffe's network-definition format. Paste the generated .prototxt into http://ethereon.github.io/netscope/#/editor to visualize the network graph.
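Writing the prototxt itself is a single call on the NetSpec object; a minimal sketch, assuming the NetSpec n built above and an arbitrary output path:

with open('G:/Belt_CaffeP/prototxt/train.prototxt', 'w') as f:  # path is a placeholder
    f.write(str(n.to_proto()))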

When data enters the network, the source of L.Data is the LMDB generated earlier. If you want extra online data augmentation, or want to change the augmentation applied to already-generated LMDB data, you can modify the source of data_layer.cpp (under .\src\caffe\layers in the Caffe tree) so that each batch is augmented on the fly as it is read. A detailed walk-through is at https://blog.csdn.net/qq295456059/article/details/53494612

IV. The solver file

The solver file configures the training and testing parameters.

from caffe.proto import caffe_pb2
from Bconfig import config

sp = caffe_pb2.SolverParameter()
solver_file = config.solver_file             # where to save the solver file
sp.train_net = config.train_proto            # training prototxt from the previous step
sp.test_net.append(config.val_proto)
sp.test_interval = 1405                      # test every this many iterations
sp.test_iter.append(157)                     # number of test iterations
sp.max_iter = 210750                         # maximum number of training iterations
sp.base_lr = 0.001                           # base learning rate
sp.momentum = 0.9                            # momentum
sp.weight_decay = 5e-4                       # weight decay
sp.lr_policy = 'step'                        # learning-rate decay policy
sp.stepsize = 70250
sp.gamma = 0.1                               # learning-rate decay factor
sp.display = 1405
sp.snapshot = 1405
sp.snapshot_prefix = './model/BeltClassify'  # prefix of the saved model files
sp.type = "SGD"                              # optimization algorithm
sp.solver_mode = caffe_pb2.SolverParameter.GPU
with open(solver_file, 'w') as f:
    f.write(str(sp))
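The file written by str(sp) is plain protobuf text format, roughly of the form below (the paths are placeholders and the exact field order follows the proto definition). Note that with these settings max_iter = 210750 corresponds to 150 epochs of 1405 iterations each, and stepsize = 70250 lowers the learning rate every 50 epochs.

train_net: "G:/Belt_CaffeP/prototxt/train.prototxt"
test_net: "G:/Belt_CaffeP/prototxt/val.prototxt"
test_iter: 157
test_interval: 1405
base_lr: 0.001
display: 1405
max_iter: 210750
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 70250
snapshot: 1405
snapshot_prefix: "./model/BeltClassify"
solver_mode: GPU
type: "SGD"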

V. Training the model

import caffe

caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver('G:/Belt_CaffeP/prototxt/solver.prototxt')
test_iter = 157
test_interval = 1405
epoch_num = 150
# solver.net.forward()
# solver.solve()
# iter = solver.iter
for i in range(epoch_num):
    for j in range(test_interval):
        solver.step(1)  # one training step: forward, backward and parameter update
        loss_train = solver.net.blobs['loss'].data
        acc_train = solver.net.blobs['acc'].data
        print('epoch %d %d/%d: loss_train: %.4f, accuracy_train: %.4f' % (i, j, test_interval, loss_train, acc_train))
    for test_i in range(test_iter):
        solver.test_nets[0].forward()  # run the test net on one batch
        loss_test = solver.test_nets[0].blobs['loss'].data
        acc_test = solver.test_nets[0].blobs['acc'].data
        print('epoch %d %d/%d: loss: %.4f, accuracy: %.4f' % (i, test_i, test_iter, loss_test, acc_test))

This yields the training loss and accuracy at every iteration and the test loss and accuracy after every test_interval iterations (one epoch here).

VI. Testing images

Method 1: feed in an image read with OpenCV directly

net = caffe.Net(prototxt, model, caffe.TEST)
img_bgr = cv2.imdecode(np.fromfile('XXX.jpg', dtype=np.uint8), -1)
image = img_bgr - mean                   # subtract the mean
input_img = image.transpose((2, 0, 1))   # H*W*C --> C*H*W
net.blobs['data'].data[...] = input_img
output = net.forward()
prob = output['prob'][0]                 # 'prob' is the top of the last layer in the prototxt
label_pre = prob.argsort()[-1]
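One way (an assumption; the original does not spell this out) to prepare mean for the snippet above is to load the per-pixel .npy mean saved earlier, put it back into H*W*C order, and resize the test image to the same patch size before subtracting:

import cv2
import numpy as np
from Bconfig import config

mean = np.load(config.mean_npy).transpose((1, 2, 0))  # C*H*W --> H*W*C
img_bgr = cv2.imdecode(np.fromfile('XXX.jpg', dtype=np.uint8), -1)
img_bgr = cv2.resize(img_bgr, (config.PatchSize, config.PatchSize))  # match the mean's H*W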

Method 2: load the image with Caffe

net = caffe.Net(prototxt, model, caffe.TEST)
image = caffe.io.load_image('XXX.jpg')
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))     # H*W*C --> C*H*W
transformer.set_raw_scale('data', 255)           # rescale from [0, 1] to [0, 255]
transformer.set_mean('data', np.array([mean1, mean2, mean3]))
transformer.set_channel_swap('data', (2, 1, 0))  # RGB --> BGR
net.blobs['data'].data[...] = transformer.preprocess('data', image)
output = net.forward()
prob = output['prob'][0]
label_pre = np.argsort(-prob)[0]

caffe.io.load_image returns an RGB image with float values in the range 0-1, while Caffe works with BGR images in the range 0-255, so the channel order must be swapped and the values rescaled to 0-255. The Transformer does not apply the transforms in the order you register them; transformer.preprocess applies them in a fixed order: (1) set_transpose, (2) channel_swap, (3) raw_scale, (4) mean subtraction.

Images read with OpenCV are already BGR in the range 0-255, so no such conversion is needed.
