1. Import the libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
import os
import pathlib
import random
import matplotlib.pyplot as plt
2. Read the files
The dataset consists of images of humans and horses, stored under the parent directory humanandhorse, with one subdirectory of images for each class.
data_root = pathlib.Path('E:/tensorflowdataset/humanandhorse')
print(data_root)
for item in data_root.iterdir():
    print(item)
E:\tensorflowdataset\humanandhorse
E:\tensorflowdataset\humanandhorse\horses
E:\tensorflowdataset\humanandhorse\humans
Read all image paths with the glob method, store them in a list, and count how many images there are in total.
all_image_paths = list(data_root.glob('*/*'))
print(all_image_paths[:10])
all_image_paths = [str(path) for path in all_image_paths]
print(all_image_paths[:10])
image_count = len(all_image_paths)
print(image_count)
[WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-000.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-105.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-122.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-127.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-170.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-204.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-224.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-241.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-264.png'), WindowsPath('E:/tensorflowdataset/humanandhorse/horses/horse1-276.png')]
['E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-000.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-105.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-122.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-127.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-170.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-204.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-224.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-241.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-264.png', 'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-276.png']
256
3. Display a few images
label = image_path.split('\\')[-2] takes the second-to-last directory level of the path as the label.
import matplotlib.pyplot as plt
from PIL import Image
plt.figure('image show')
for n in range(3):
    image_path = random.choice(all_image_paths)
    label = image_path.split('\\')[-2]
    image = Image.open(image_path)
    print(image.size)
    plt.subplot(1, 3, n+1)
    plt.title(label)
    plt.imshow(image)
plt.show()
4. Set up the labels
First, find out which label names exist:
label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
print(label_names)
['horses', 'humans']
Then assign each label an index according to its sorted order:
label_to_index = dict((name, index) for index, name in enumerate(label_names))
print(label_to_index)
{'horses': 0, 'humans': 1}
Determine the label for each image:
all_image_labels = [label_to_index[pathlib.Path(path).parent.name]
                    for path in all_image_paths]
print(all_image_labels)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
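As a quick check (a minimal sketch), count how many images fall under each label and compare the total with image_count:
# Sketch: count images per label and compare with the total.
from collections import Counter
label_counts = Counter(all_image_labels)
print({label_names[idx]: count for idx, count in label_counts.items()})
print(sum(label_counts.values()) == image_count)  # expected: True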
5. Preprocess the data
Read the raw file contents with tf.io.read_file, decode them into an image tensor, resize it to 300x300, and scale every pixel value into the [0, 1] range (which makes training easier).
def preprocess_image(img_raw):
    img_tensor = tf.image.decode_jpeg(contents=img_raw, channels=3)  # can be used for plt.imshow(img_tensor)
    img_final = tf.image.resize(images=img_tensor, size=[300, 300])
    img_final /= 255.0  # normalize to [0,1] range
    return img_final

def load_and_preprocess_image(path):
    img_raw = tf.io.read_file(path)  # can't be used for plt.imshow(img_raw)
    return preprocess_image(img_raw)

def load_and_preprocess_from_path_label(path, label):
    return load_and_preprocess_image(path), label
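As a quick sanity check (a minimal sketch using the first path from all_image_paths), preprocess a single image and display it:
# Sketch: run the preprocessing on one image and show the result.
sample_path = all_image_paths[0]
sample_image = load_and_preprocess_image(sample_path)
print(sample_image.shape)  # expected: (300, 300, 3)
print(sample_image.numpy().min(), sample_image.numpy().max())  # values should lie in [0, 1]
plt.imshow(sample_image)
plt.title(pathlib.Path(sample_path).parent.name)
plt.show()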
6. Build the dataset
Pair up each image with its label.
The ds returned by tf.data.Dataset.from_tensor_slices offers many useful methods for manipulating the dataset, such as shuffle, batch, and repeat, which make it easy to feed into the model for training later (a short sketch of chaining these methods appears after the output below).
ds = tf.data.Dataset.from_tensor_slices((all_image_paths, all_image_labels))
for item_x, item_y in ds:
    print(item_x.numpy(), item_y.numpy())
b'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-000.png' 0
b'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-105.png' 0
b'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-122.png' 0
b'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-127.png' 0
b'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-170.png' 0
b'E:\\tensorflowdataset\\humanandhorse\\horses\\horse1-204.png' 0
…
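For reference, a minimal sketch of chaining these methods; the buffer and batch sizes below are illustrative assumptions, while the actual training in this post only uses batch(1):
# Illustrative sketch only: chaining shuffle / repeat / batch on the path/label dataset.
example_ds = ds.shuffle(buffer_size=image_count)  # shuffle with a buffer covering the whole dataset
example_ds = example_ds.repeat()                  # repeat indefinitely; epochs are controlled by fit()
example_ds = example_ds.batch(32)                 # illustrative batch size
print(example_ds)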
When calling the training method model.fit(), the expected signature is model.fit(x, y, batch_size, epochs).
If the argument x is a Dataset object, then y and batch_size must not be passed; in that case the elements stored in the Dataset must be batches, and each batch element must be a (features, labels) tuple.
So we want the Dataset to hold data in the (features, labels) structure. This is where tf.data.Dataset.from_tensors() and tf.data.Dataset.from_tensor_slices() come in handy: if what sits in memory is a single features-labels pair, load it with tf.data.Dataset.from_tensors(); if memory holds many feature vectors and many labels stacked along the first dimension, load them with tf.data.Dataset.from_tensor_slices(), which slices them into per-sample elements. A small sketch of the difference follows.
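A tiny sketch of the difference, using made-up arrays purely for illustration:
# Illustrative sketch: from_tensors vs. from_tensor_slices on the same (made-up) arrays.
demo_features = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # 2 samples, 2 features each
demo_labels = tf.constant([0, 1])
ds_whole = tf.data.Dataset.from_tensors((demo_features, demo_labels))         # 1 element: the whole pair at once
ds_sliced = tf.data.Dataset.from_tensor_slices((demo_features, demo_labels))  # 2 elements: one (feature, label) per sample
print(ds_whole.element_spec)   # features shape (2, 2)
print(ds_sliced.element_spec)  # features shape (2,)
Back to the actual pipeline: map the preprocessing function over the path/label dataset and batch it.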
image_label_ds = ds.map(load_and_preprocess_from_path_label)
image_label_ds = image_label_ds.batch(1)
Here image_label_ds is the mapped dataset after batching: each element is a pair of a 1*300*300*3 tensor and a label. Its first element looks like this:
[[[[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]
…
[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]]]] [0]
The batch() method splits the original dataset into batches, adding a new leading (batch) dimension to each element.
Without batch, the training step below fails with the error: expected conv2d_10_input to have 4 dimensions, but got array with shape (300,300,3), which means the dimensions don't match.
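A quick way to confirm this (a minimal sketch) is to compare the element shapes before and after batching:
# Sketch: element shapes before and after batch().
unbatched = ds.map(load_and_preprocess_from_path_label)
print(unbatched.element_spec)       # image shape (300, 300, 3): only 3 dimensions, no batch dimension
print(image_label_ds.element_spec)  # image shape (None, 300, 300, 3): 4 dimensions after batch(1)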
7. Build the model and train
model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image 300x300 with 3 bytes color
    # This is the first convolution
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The third convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fourth convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fifth convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    # Only 1 output neuron. It will contain a value from 0-1 where 0 for 1 class ('horses') and 1 for the other ('humans')
    tf.keras.layers.Dense(1, activation='sigmoid')
])

from tensorflow.keras.optimizers import RMSprop

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(lr=0.001),
              metrics=['acc'])

history = model.fit(
    image_label_ds,
    steps_per_epoch=8,
    epochs=15,
)
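Once training finishes, a minimal sketch for visualizing the recorded metrics; the history keys 'acc' and 'loss' follow from the metrics and loss configured above:
# Sketch: plot the training accuracy and loss stored in history.
plt.figure('training history')
plt.plot(history.history['acc'], label='acc')
plt.plot(history.history['loss'], label='loss')
plt.xlabel('epoch')
plt.legend()
plt.show()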