
Web Scraping + CNN (Convolutional Neural Network): Recognizing and Classifying Famous Painters' Works

Example description:

We train a CNN on works by four painters (Van Gogh, Monet, Picasso, and Leonardo da Vinci) and produce a model that can recognize which of the four painted a given work.

Environment: Python 3.6 + TensorFlow

For the CPU version of TensorFlow, see: https://www.jianshu.com/p/da141c730180
For the GPU version, see: https://www.jianshu.com/p/62d414aa843e

The project has 3 steps:

  1. Scrape images from Baidu Images with a spider
  2. Build the neural network, train it, and save a model
  3. Use the saved model for recognition and classification

1. Scrape images from Baidu Images with a spider

Using the Chrome developer tools, we can find a Baidu Images API endpoint whose JSON response contains the image URLs, as shown below:

(Figure: analyzing the Baidu Images site to find the image API endpoint)

The resulting URL is: https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&is=&fp=result&queryWord=%E6%A2%B5%E9%AB%98%E4%BD%9C%E5%93%81&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=-1&z=&ic=&hd=&latest=&copyright=&word=%E6%A2%B5%E9%AB%98%E4%BD%9C%E5%93%81&s=&se=&tab=&width=&height=&face=0&istype=2&qc=&nc=1&fr=&expermode=&force=&pn=60&rn=30&gsm=3c&1550715038298=

From this URL, the three parameters that matter are:

  1. pn: the offset into the result list for the current page; for example, pn=60 skips the first 60 images
  2. rn: how many images to return per page; for example, rn=30 returns thirty images per page
  3. queryWord and word: the search keyword, e.g. 梵高作品 (Van Gogh works)

By adjusting these parameters we can fetch as many Baidu image results as we like for any keyword, and then download the images to a local directory with Python, as sketched below.
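Before writing the full spider, it is worth fetching a single page of this endpoint and inspecting the JSON. A minimal sketch (assumption: the endpoint tolerates a trimmed parameter set and its response format is unchanged; if it refuses, copy the full query string shown above):

import json

import requests

# Fetch one page of the Baidu Images endpoint and list the thumbnail URLs.
# requests URL-encodes the Chinese keyword for us.
params = {
    "tn": "resultjson_com", "ipn": "rj", "ct": "201326592", "fp": "result",
    "queryWord": "梵高作品", "word": "梵高作品",
    "cl": "2", "lm": "-1", "ie": "utf-8", "oe": "utf-8", "st": "-1",
    "face": "0", "istype": "2", "nc": "1",
    "pn": "0",   # offset into the result list
    "rn": "30",  # images per page
}
resp = requests.get("https://image.baidu.com/search/acjson", params=params, timeout=10)
# Strip the \' sequences that otherwise break json.loads (same fix as in spider.py below)
data = json.loads(resp.content.decode("utf-8").replace("\\'", ""))
for item in data.get("data", []):
    if "thumbURL" in item:
        print(item["thumbURL"])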

Create a new file, spider.py, with the following code:

import requests
import os
import urllib.parse
import json

# Download a single image to dirPath/imgName
def downImg(imgUrl, dirPath, imgName):
    filename = os.path.join(dirPath, imgName)
    try:
        res = requests.get(imgUrl, timeout=15)
        if str(res.status_code)[0] == "4":
            print(str(res.status_code), ":", imgUrl)
            return False
    except Exception as e:
        print("Exception raised:", imgUrl)
        print(e)
        return False
    with open(filename, "wb") as f:
        f.write(res.content)
    return True

# Search keywords and the class directory each one maps to, e.g. 梵高作品 -> FG
words = [["梵高作品", 'FG'], ['莫奈作品', 'MN'], ['毕加索作品', 'BJS'], ['达芬奇作品', 'DFQ']]
trainPath = "train_data/"
# Create the training directory if it does not exist
if not os.path.exists(trainPath):
    os.mkdir(trainPath)
for word in words:
    dirPath = trainPath + word[1]
    # Create the class directory if it does not exist
    if not os.path.exists(dirPath):
        os.mkdir(dirPath)
    word = urllib.parse.quote(word[0])  # the keyword is Chinese, so it must be URL-encoded
    pn = 30  # offset into the result list, e.g. 60 skips the first 60 images
    rn = 30  # images per page, e.g. 30 returns thirty images per page
    i = 1  # image counter used for file names
    while pn <= 30 * 20:  # fetch 20 pages, about 600 images; raise this to collect more
        try:
            url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&is=&fp=result&queryWord=' + word \
                  + '&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=-1&z=&ic=&hd=&latest=&copyright=&word=' + word \
                  + '&s=&se=&tab=&width=&height=&face=0&istype=2&qc=&nc=1&fr=&expermode=&force=&pn=' + str(pn) \
                  + '&rn=' + str(rn) + '&gsm=3c&1550715038298='
            jsonBytes = requests.get(url, timeout=10).content  # raw JSON response (bytes)
            jsonData = jsonBytes.decode('utf-8')  # bytes -> string
            print("---------------------------------------------------------")
            jsonData = jsonData.replace("\\'", '')  # strip \' sequences, otherwise json.loads raises an error
            print(jsonData)
            print("---------------------------------------------------------")
            jsonObj = json.loads(jsonData)  # string -> object
            if 'data' in jsonObj:
                for item in jsonObj['data']:
                    if 'thumbURL' in item:
                        imgName = str(i) + ".jpg"
                        downImg(item['thumbURL'], dirPath, imgName)  # download the image
                        print(item['thumbURL'])
                        i += 1
            pn += rn  # next page
        except Exception as e:
            print(e)

After the script finishes, the training samples used in the next step sit under the current directory:

(Figure: the train_data directory after scraping)

The sample data is now ready; next we build the neural network.
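As a quick sanity check (a small sketch, assuming the train_data/ layout produced by spider.py), count how many images landed in each class directory:

import glob
import os

# Count the images downloaded into each class directory under train_data/
for cls in sorted(os.listdir("train_data")):
    cls_dir = os.path.join("train_data", cls)
    if os.path.isdir(cls_dir):
        print(cls, len(glob.glob(os.path.join(cls_dir, "*.jpg"))))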

2. Build the neural network, read the images, train, and save a model

This step uses OpenCV, so install the opencv-python module.

Download: http://ai-download.xmgc360.com/opencv_python-3.3.0.10-cp36-cp36m-win_amd64.whl

For example, if the wheel was downloaded to drive D, install it with:

# install the wheel
pip install D:/opencv_python-3.3.0.10-cp36-cp36m-win_amd64.whl

The sklearn module is also required:

pip install sklearn
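A quick way to confirm both packages import correctly (just a verification snippet, nothing project-specific):

import cv2
import sklearn

# Print the installed versions to confirm the imports work
print("OpenCV:", cv2.__version__)
print("scikit-learn:", sklearn.__version__)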

Create a new file, dataset.py, which reads and preprocesses the images. The code is as follows:

import cv2
import os
import glob
from sklearn.utils import shuffle
import numpy as np


def load_train(train_path, image_size, classes):
    images = []
    labels = []
    img_names = []
    cls = []
    print('Going to read training images')
    for fields in classes:
        index = classes.index(fields)
        print('Now going to read {} files (Index: {})'.format(fields, index))
        path = os.path.join(train_path, fields, '*g')
        files = glob.glob(path)
        for fl in files:
            try:
                # read the image
                image = cv2.imread(fl)
                # resize to image_size x image_size (64x64)
                image = cv2.resize(image, (image_size, image_size), 0, 0, cv2.INTER_LINEAR)
                # convert to float
                image = image.astype(np.float32)
                # normalize pixel values to [0, 1]
                image = np.multiply(image, 1.0 / 255.0)
                images.append(image)
                label = np.zeros(len(classes))
                label[index] = 1.0
                labels.append(label)
                flbase = os.path.basename(fl)
                img_names.append(flbase)
                cls.append(fields)
            except Exception as e:
                print(e)
    images = np.array(images)
    labels = np.array(labels)
    img_names = np.array(img_names)
    cls = np.array(cls)
    return images, labels, img_names, cls


class DataSet(object):
    def __init__(self, images, labels, img_names, cls):
        self._num_examples = images.shape[0]
        self._images = images
        self._labels = labels
        self._img_names = img_names
        self._cls = cls
        self._epochs_done = 0
        self._index_in_epoch = 0

    @property
    def images(self):
        return self._images

    @property
    def labels(self):
        return self._labels

    @property
    def img_names(self):
        return self._img_names

    @property
    def cls(self):
        return self._cls

    @property
    def num_examples(self):
        return self._num_examples

    @property
    def epochs_done(self):
        return self._epochs_done

    def next_batch(self, batch_size):
        """Return the next `batch_size` examples from this data set."""
        start = self._index_in_epoch
        self._index_in_epoch += batch_size
        if self._index_in_epoch > self._num_examples:
            # After each epoch we update this
            self._epochs_done += 1
            start = 0
            self._index_in_epoch = batch_size
            assert batch_size <= self._num_examples
        end = self._index_in_epoch
        return self._images[start:end], self._labels[start:end], self._img_names[start:end], self._cls[start:end]


def read_train_sets(train_path, image_size, classes, validation_size):
    class DataSets(object):
        pass
    data_sets = DataSets()

    images, labels, img_names, cls = load_train(train_path, image_size, classes)
    images, labels, img_names, cls = shuffle(images, labels, img_names, cls)

    if isinstance(validation_size, float):
        validation_size = int(validation_size * images.shape[0])

    validation_images = images[:validation_size]
    validation_labels = labels[:validation_size]
    validation_img_names = img_names[:validation_size]
    validation_cls = cls[:validation_size]

    train_images = images[validation_size:]
    train_labels = labels[validation_size:]
    train_img_names = img_names[validation_size:]
    train_cls = cls[validation_size:]

    data_sets.train = DataSet(train_images, train_labels, train_img_names, train_cls)
    data_sets.valid = DataSet(validation_images, validation_labels, validation_img_names, validation_cls)

    return data_sets
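A short usage check for dataset.py (a sketch; it assumes the train_data/ directories created by the spider already exist) that loads the data and pulls one batch:

import dataset

# Load the scraped images, holding out 20% for validation
classes = ['BJS', 'DFQ', 'FG', 'MN']
data = dataset.read_train_sets('train_data', 64, classes, validation_size=0.2)

print("training images:", data.train.num_examples)
print("validation images:", data.valid.num_examples)

# Pull one batch and inspect the shapes: images should be (32, 64, 64, 3), labels (32, 4)
x_batch, y_batch, names, cls = data.train.next_batch(32)
print(x_batch.shape, y_batch.shape)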

Create a new file, train.py, which builds the network, trains it, and saves the model. The code is as follows:

import dataset
import tensorflow as tf
import time
from datetime import timedelta
import math
import random
import numpy as np

# conda install --channel https://conda.anaconda.org/menpo opencv3
# Adding seed so that random initialization is consistent
from numpy.random import seed
seed(10)
from tensorflow import set_random_seed
set_random_seed(20)

batch_size = 32

# Prepare input data
classes = ['BJS', 'DFQ', 'FG', 'MN']
num_classes = len(classes)

# 20% of the data will automatically be used for validation
validation_size = 0.2
img_size = 64
num_channels = 3
train_path = 'train_data'

# We shall load all the training and validation images and labels into memory using OpenCV and use that during training
data = dataset.read_train_sets(train_path, img_size, classes, validation_size=validation_size)

print("Complete reading input data. Will Now print a snippet of it")
print("Number of files in Training-set:\t\t{}".format(len(data.train.labels)))
print("Number of files in Validation-set:\t{}".format(len(data.valid.labels)))

session = tf.Session()
x = tf.placeholder(tf.float32, shape=[None, img_size, img_size, num_channels], name='x')

## labels
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')
y_true_cls = tf.argmax(y_true, dimension=1)

## Network graph params
filter_size_conv1 = 3
num_filters_conv1 = 32
filter_size_conv2 = 3
num_filters_conv2 = 32
filter_size_conv3 = 3
num_filters_conv3 = 64
fc_layer_size = 1024


def create_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))


def create_biases(size):
    return tf.Variable(tf.constant(0.05, shape=[size]))


def create_convolutional_layer(input,
                               num_input_channels,
                               conv_filter_size,
                               num_filters):
    ## We shall define the weights that will be trained using the create_weights function. 3 3 3 32
    weights = create_weights(shape=[conv_filter_size, conv_filter_size, num_input_channels, num_filters])
    ## We create biases using the create_biases function. These are also trained.
    biases = create_biases(num_filters)
    ## Creating the convolutional layer
    layer = tf.nn.conv2d(input=input,
                         filter=weights,
                         strides=[1, 1, 1, 1],
                         padding='SAME')
    layer += biases
    layer = tf.nn.relu(layer)
    ## We shall be using max-pooling.
    layer = tf.nn.max_pool(value=layer,
                           ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1],
                           padding='SAME')
    return layer


def create_flatten_layer(layer):
    # We know that the shape of the layer will be [batch_size img_size img_size num_channels]
    # But let's get it from the previous layer.
    layer_shape = layer.get_shape()
    ## Number of features will be img_height * img_width * num_channels. But we shall calculate it in place of hard-coding it.
    num_features = layer_shape[1:4].num_elements()
    ## Now, we flatten the layer so we shall have to reshape to num_features
    layer = tf.reshape(layer, [-1, num_features])
    return layer


def create_fc_layer(input,
                    num_inputs,
                    num_outputs,
                    use_relu=True):
    # Let's define trainable weights and biases.
    weights = create_weights(shape=[num_inputs, num_outputs])
    biases = create_biases(num_outputs)
    # A fully connected layer takes input x and produces wx+b. Since these are matrices, we use the matmul function in TensorFlow
    layer = tf.matmul(input, weights) + biases
    layer = tf.nn.dropout(layer, keep_prob=0.7)
    if use_relu:
        layer = tf.nn.relu(layer)
    return layer


# Convolutional layer 1 (convolution, activation, pooling)
layer_conv1 = create_convolutional_layer(input=x,
                                         num_input_channels=num_channels,
                                         conv_filter_size=filter_size_conv1,
                                         num_filters=num_filters_conv1)
# Convolutional layer 2 (convolution, activation, pooling)
layer_conv2 = create_convolutional_layer(input=layer_conv1,
                                         num_input_channels=num_filters_conv1,
                                         conv_filter_size=filter_size_conv2,
                                         num_filters=num_filters_conv2)
# Convolutional layer 3 (convolution, activation, pooling)
layer_conv3 = create_convolutional_layer(input=layer_conv2,
                                         num_input_channels=num_filters_conv2,
                                         conv_filter_size=filter_size_conv3,
                                         num_filters=num_filters_conv3)

# Flatten the output of the last convolutional layer so it can feed the fully connected layers
layer_flat = create_flatten_layer(layer_conv3)

# Fully connected layer 1
layer_fc1 = create_fc_layer(input=layer_flat,
                            num_inputs=layer_flat.get_shape()[1:4].num_elements(),
                            num_outputs=fc_layer_size,
                            use_relu=True)
# Fully connected layer 2
layer_fc2 = create_fc_layer(input=layer_fc1,
                            num_inputs=fc_layer_size,
                            num_outputs=num_classes,
                            use_relu=False)

y_pred = tf.nn.softmax(layer_fc2, name='y_pred')
y_pred_cls = tf.argmax(y_pred, dimension=1)
session.run(tf.global_variables_initializer())

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_true)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)
correct_prediction = tf.equal(y_pred_cls, y_true_cls)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
session.run(tf.global_variables_initializer())


def show_progress(epoch, feed_dict_train, feed_dict_validate, val_loss, i):
    acc = session.run(accuracy, feed_dict=feed_dict_train)
    val_acc = session.run(accuracy, feed_dict=feed_dict_validate)
    msg = "Training Epoch {0}--- iterations: {1}--- Training Accuracy: {2:>6.1%}, Validation Accuracy: {3:>6.1%}, Validation Loss: {4:.3f}"
    print(msg.format(epoch + 1, i, acc, val_acc, val_loss))


total_iterations = 0
saver = tf.train.Saver()


def train(num_iteration):
    global total_iterations
    for i in range(total_iterations,
                   total_iterations + num_iteration):
        x_batch, y_true_batch, _, cls_batch = data.train.next_batch(batch_size)
        x_valid_batch, y_valid_batch, _, valid_cls_batch = data.valid.next_batch(batch_size)
        feed_dict_tr = {x: x_batch,
                        y_true: y_true_batch}
        feed_dict_val = {x: x_valid_batch,
                         y_true: y_valid_batch}
        session.run(optimizer, feed_dict=feed_dict_tr)
        if i % int(data.train.num_examples / batch_size) == 0:
            val_loss = session.run(cost, feed_dict=feed_dict_val)
            epoch = int(i / int(data.train.num_examples / batch_size))
            show_progress(epoch, feed_dict_tr, feed_dict_val, val_loss, i)
            saver.save(session, './model/painting.ckpt', global_step=i)
    total_iterations += num_iteration


train(num_iteration=8000)
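One caveat worth noting (my own observation, not from the original text): tf.train.Saver.save does not create missing parent directories, so make sure ./model exists before running train.py, for example:

import os

# Create the checkpoint directory expected by saver.save in train.py
os.makedirs('./model', exist_ok=True)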

(Figure: the project directory layout)

Run train.py to start training. Screenshots taken during training:

(Figure: training in progress)

When training completes, the checkpoint files are written out:

(Figure: the generated model files under ./model)

With a model produced, we use the latest checkpoint for prediction. Two of its files matter here:

painting.ckpt-7998.meta stores the network graph structure
painting.ckpt-7998.data stores the model weights themselves

Both are referenced in the prediction code below.
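Rather than hard-coding the -7998 suffix, the newest checkpoint prefix can be looked up with tf.train.latest_checkpoint (a small sketch, assuming the checkpoints were saved under ./model as in train.py):

import tensorflow as tf

# Returns something like './model/painting.ckpt-7998', or None if no checkpoint exists
ckpt = tf.train.latest_checkpoint('./model')
print(ckpt)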

3. Recognition and classification

Create a new file, predict.py. It loads the saved model and specifies the image file to classify, fg_test_1.jpg.

The code is as follows:

import tensorflow as tf
import numpy as np
import os, glob, cv2
import sys, argparse

image_size = 64
num_channels = 3
images = []

path = 'fg_test_1.jpg'
image = cv2.imread(path)
# Resize the image to our desired size; preprocessing is done exactly as during training
image = cv2.resize(image, (image_size, image_size), 0, 0, cv2.INTER_LINEAR)
images.append(image)
images = np.array(images, dtype=np.uint8)
images = images.astype('float32')
images = np.multiply(images, 1.0 / 255.0)
# The input to the network is of shape [None image_size image_size num_channels]. Hence we reshape.
x_batch = images.reshape(1, image_size, image_size, num_channels)

## Let us restore the saved model
sess = tf.Session()
# Step 1: Recreate the network graph. At this step only the graph is created.
saver = tf.train.import_meta_graph('./model/painting.ckpt-7998.meta')
# Step 2: Load the saved weights using the restore method.
saver.restore(sess, './model/painting.ckpt-7998')

# Access the default graph which we have restored
graph = tf.get_default_graph()
# In the original network, y_pred is the tensor that holds the network's prediction
y_pred = graph.get_tensor_by_name("y_pred:0")

## Feed the image to the input placeholders
x = graph.get_tensor_by_name("x:0")
y_true = graph.get_tensor_by_name("y_true:0")
y_test_images = np.zeros((1, 4))

### Create the feed_dict required to compute y_pred
feed_dict_testing = {x: x_batch, y_true: y_test_images}
result = sess.run(y_pred, feed_dict=feed_dict_testing)
# result is a vector of class probabilities in the same order as res_label
res_label = ['BJS', 'DFQ', 'FG', 'MN']
print(res_label[result.argmax()])
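To see how confident the model is, the full probability vector can be printed instead of only the winning label. These lines are meant to be appended to the end of predict.py, since they reuse the result array and res_label list defined there:

# Append to the end of predict.py: print the probability for every class,
# not just the winning label.
for label, prob in zip(res_label, result[0]):
    print("{}: {:.2%}".format(label, prob))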

(Figure: setting the classification parameters in predict.py)

Place the test image fg_test_1.jpg in the current directory:

(Figure: fg_test_1.jpg, the test painting)

The prediction output looks like this:

(Figure: output of running predict.py)

The result is FG, so the painting is correctly recognized as a Van Gogh.

Note:

The final project directory structure:

(Figure: project directory structure)

References:

https://www.jianshu.com/p/8db0dd959897

https://github.com/xwdlyx/deeplearning/tree/master/Painting%20Classification%20CNN

 
