
Vision Datasets in AI Studio

Abstract: ※ The image datasets Cifar10 and FashionMNIST in Paddle's vision module are displayed and tested.

Keywords: Cifar10, FashionMNIST

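Vision Image Datasets
Contents
Cifar10 Dataset
Downloading the Dataset
Data File Operations
Dataset Operations
FashionMNIST Dataset
Downloading the Dataset
Displaying the Dataset
Data File Operations
Training LeNet to Recognize FashionMNIST
Using a Dense Network
Data Summary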

 

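§01 Vision Image Datasets


The PaddlePaddle environment ships with a number of built-in datasets. The machine vision (vision) datasets, along with the text datasets, can be listed as follows: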

print('Dataset for Vision:', paddle.vision.datasets.__all__)
print('Dataset for text:', paddle.text.__all__)
Dataset for Vision: ['DatasetFolder', 'ImageFolder', 'MNIST', 'FashionMNIST', 'Flowers', 'Cifar10', 'Cifar100', 'VOC2012']
Dataset for text: ['Conll05st', 'Imdb', 'Imikolov', 'Movielens', 'UCIHousing', 'WMT14', 'WMT16', 'ViterbiDecoder', 'viterbi_decode']

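1.1 Cifar10 Dataset

The Cifar10 dataset consists of color images and is an important dataset for machine learning and deep neural networks.

1.1.1 Downloading the Dataset

(1) Download code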
train_dataset = paddle.vision.datasets.Cifar10(mode='train')

After about 14 seconds, the dataset is downloaded from https://dataset.bj.bcebos.com/cifar/cifar-10-python.tar.gz to the local path /home/aistudio/.cache/paddle/dataset/cifar/cifar-10-python.tar.gz

Cache file /home/aistudio/.cache/paddle/dataset/cifar/cifar-10-python.tar.gz not found, downloading https://dataset.bj.bcebos.com/cifar/cifar-10-python.tar.gz 
Begin to download

Download finished

Run time: 13.845 s, end time: 2021-12-15 11:03:15
(2) Downloaded file

Check the local data file:

aistudio@jupyter-262579-3225298:~/.cache/paddle/dataset/cifar$ ls
cifar-10-python.tar.gz

Copy the cifar-10 file to the home directory, then download it to the local computer.

aistudio@jupyter-262579-3225298:~/.cache/paddle/dataset/cifar$ cp * $HOME/.

▲ Figure 1.1.1 Downloading the data file to the local computer

Opening the archive locally shows that it contains the following files.

└─cifar-10-batches-py
        batches.meta
        data_batch_1
        data_batch_2
        data_batch_3
        data_batch_4
        data_batch_5
        readme.html
        test_batch

The readme.html file in it links to the original description of the dataset: CIFAR-10 and CIFAR-100 datasets

▲ Figure 1.1.2 The Cifar-10 dataset

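1.1.2 Data File Operations

(1) Extracting the archive

First, in AI Studio, move the Cifar-10 archive to the **/data** directory.

▲ Figure 1.1.3 In the BML environment, right-click and choose "Extract archive" to decompress the archive into the current folder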
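
The extraction step can also be done from code rather than through the right-click menu. The following is a minimal sketch using Python's standard tarfile module; the helper name extract_archive and the paths in the usage lines are assumptions for illustration and should be adjusted to wherever the archive actually lives.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


if __name__ == "__main__":
    # Assumed location of the archive; not taken from the original article.
    archive = Path("data/cifar-10-python.tar.gz")
    if archive.exists():
        for name in extract_archive(archive, "data"):
            print(name)
```

Returning the member names makes it easy to confirm that the cifar-10-batches-py folder and its batch files were unpacked.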

(2) Reading the files

The CIFAR-10 and CIFAR-100 datasets website describes how to work with the data files.

 Ⅰ. Opening the file with pickle
filename = 'data/cifar-10-batches-py/data_batch_1'

def unpickle(file):
    # The standard pickle module uses the C implementation (_pickle) when available
    import pickle
    with open(file, 'rb') as f:
        # encoding='bytes' keeps the Python 2 strings as bytes, so keys look like b'data'
        batch = pickle.load(f, encoding='bytes')

    return batch

d = unpickle(filename)
 Ⅱ. Data format
print(type(d))
print(len(d))
print(d.keys())
<class 'dict'>
4
dict_keys([b'batch_label', b'labels', b'data', b'filenames'])
 Ⅲ. Reading each key
  • batch_label: b'training batch 1 of 5'
  • labels:
print(len(d[b'labels']))
print(type(d[b'labels']))
print(d[b'labels'])
10000
<class 'list'>
[6, 9, 9, 4, 1, 1, 2, 7, 8, 3, 4, 7, 7, 2, 9, 9, 9, 3, 2, 6, 4, 3, 6, 6, 2, 6, 3, 5, 4, 0, 0, 9, 1, 3, 4, 0, 3, 7, 3, 3, 5, 2, 2, 7, 1, 1, 1, 2, 2, 0, 9, 5, 7, 9, 2, 2, 5, 2, 4, 3, 1, 1, 
......
8, 2, 6, 2, 9, 7, 7, 7, 9, 8, 9, 4, 4, 7, 1, 0, 4, 3, 6, 3, 9, 8, 3, 6, 8, 3, 6, 6, 2, 6, 7, 3, 0, 0, 0, 2, 5, 1, 2, 9, 2, 2, 1, 6, 3, 9, 1, 1, 5]
  • data:
print(type(d[b'data']))
print(len(d[b'data']))
print(d[b'data'][0].shape)
print(d[b'data'][0])
<class 'numpy.ndarray'>
10000
(3072,)
[ 59  43  50 ... 140  84  72]

As shown, data holds the raw byte values of each image, laid out as follows:

  • The first 1024 entries contain the red channel values

  • the next 1024 the green

  • and the final 1024 the blue.

  • Displaying an image from data:

imgdata = d[b'data'][0]
# zip() is lazy in Python 3, so materialize it with list() before building the array
imgdata = array(list(zip(imgdata[:1024],
              imgdata[1024:1024*2],
              imgdata[1024*2:]))).reshape(32,32,3)
plt.imshow(imgdata)
plt.show()

▲ Figure 1.1.4 The first image
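
Given the channel layout described above (1024 red values, then 1024 green, then 1024 blue), the same height x width x channel image can also be obtained without zip, by reshaping to (3, 32, 32) and moving the channel axis last. The sketch below checks on random data that the two approaches agree; the function names are illustrative only.

```python
import numpy as np


def to_hwc_zip(flat):
    """Interleave the three colour planes pixel by pixel (the zip approach)."""
    pixels = list(zip(flat[:1024], flat[1024:2048], flat[2048:]))
    return np.array(pixels).reshape(32, 32, 3)


def to_hwc(flat):
    """Same result via reshape: (3, 32, 32) planes, then channels moved last."""
    return flat.reshape(3, 32, 32).transpose(1, 2, 0)


# Synthetic stand-in for one 3072-byte row of d[b'data']
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=3072, dtype=np.uint8)

same = np.array_equal(to_hwc_zip(flat), to_hwc(flat))
print(same)
```

For a whole batch, `data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)` would apply the same conversion to every row at once.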

plt.figure(figsize=(8,6))
for j in range(3):
    for i in range(5):
        imgdata = d[b'data'][i+j*5]
        imgdata = array(list(zip(imgdata[:1024], imgdata[1024:1024*2], imgdata[1024*2:]))).reshape(32,32,3)
        plt.subplot(3,5, j*5+i+1)
        plt.axis('off')
        plt.imshow(imgdata)

plt.show()

▲ Figure 1.1.5 The first 15 images in the data

  • filenames:
print(d[b'filenames'][:15])

[b'leptodactylus_pentadactylus_s_000004.png', b'camion_s_000148.png', b'tipper_truck_s_001250.png', b'american_elk_s_001521.png', b'station_wagon_s_000293.png', b'coupe_s_001735.png', b'cassowary_s_001300.png', b'cow_pony_s_001168.png', b'sea_boat_s_001584.png', b'tabby_s_001355.png', b'muntjac_s_001000.png', b'arabian_s_001354.png', b'quarter_horse_s_000672.png', b'passerine_s_000343.png', b'camion_s_001895.png']

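Words:   leptodactylus, camion, tipper, american, station
Glosses: hook-clawed fish, camion, dump truck, American, company
Words:   coupe, cassowary, cow, sea, tabby
Glosses: coupe, cassowary, cattle / cow, tabby cat
Words:   muntjac, arabian, quarter, passerine, camion
Glosses: Arabian, quarter (of a year), passerine, camion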
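
The file names become easier to interpret next to the class names. The sketch below maps the first ten labels printed earlier to the ten CIFAR-10 class names as listed on the CIFAR-10 dataset page; the list is hard-coded here as an assumption rather than read from batches.meta, and the helper name label_names is illustrative.

```python
# CIFAR-10 class names in label order (0..9), hard-coded for illustration
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]


def label_names(labels):
    """Translate integer labels into class names."""
    return [CIFAR10_CLASSES[i] for i in labels]


# First ten labels and file names, copied from the outputs shown above
first_labels = [6, 9, 9, 4, 1, 1, 2, 7, 8, 3]
first_files = [b'leptodactylus_pentadactylus_s_000004.png', b'camion_s_000148.png',
               b'tipper_truck_s_001250.png', b'american_elk_s_001521.png',
               b'station_wagon_s_000293.png', b'coupe_s_001735.png',
               b'cassowary_s_001300.png', b'cow_pony_s_001168.png',
               b'sea_boat_s_001584.png', b'tabby_s_001355.png']

for name, fname in zip(label_names(first_labels), first_files):
    print(f"{name:<10} {fname.decode()}")
```

Read this way, the leptodactylus file is labelled frog, the camion and tipper_truck files truck, and the american_elk file deer.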

1.1.3 Dataset Operations

(1) Loading the dataset
train_dataset = paddle.vision.datasets.Cifar10(mode='train')
test_dataset = paddle.vision.datasets.Cifar10(mode='test')
(2) Displaying the dataset
for i in train_dataset:
    print(type(i))
    print(i[1].shape)
    print(type(i[0]))
    plt.imshow(i[0])
    plt.show()
    break
<class 'tuple'>
()
<class 'PIL.Image.Image'>

▲ Figure 1.1.6 A Cifar10 image

# headm is the author's personal helper module; the explicit imports
# below cover what this snippet actually uses.
import matplotlib.pyplot as plt
from numpy import array

import paddle
import paddle.nn.functional as F
from paddle import nn
from paddle.metric import accuracy as acc
from paddle.vision.transforms import Compose, Normalize, Resize, ToTensor

train_dataset = paddle.vision.datasets.Cifar10(mode='train')
test_dataset = paddle.vision.datasets.Cifar10(mode='test')

for i in train_dataset:
    print(type(i))
    print(i[1].shape)
    # swapaxes(0,2) turns (H, W, C) into (C, W, H): channel first, rows and columns swapped
    print(array(i[0]).swapaxes(0,2))
    plt.imshow(i[0])
    plt.show()
    break

train_dataset[0][0]

# Wrap the PIL-based dataset so that it yields channel-first float32 arrays
class Dataset(paddle.io.Dataset):
    def __init__(self, num_samples):
        super(Dataset, self).__init__()
        self.num_samples = num_samples

    def __getitem__(self, index):
        data = array(train_dataset[index][0]).astype('float32').swapaxes(0,2)
        label = train_dataset[index][1].astype('int64')
        return data, label

    def __len__(self):
        return self.num_samples

_dataset = Dataset(len(train_dataset))
train_loader = paddle.io.DataLoader(_dataset, batch_size=100, shuffle=True)

# Fetch one batch to inspect its contents
data = next(iter(train_loader))
print(data)

# Number of batches per epoch
print(len(train_loader))

class cifar10(paddle.nn.Layer):
    def __init__(self, ):
        super(cifar10, self).__init__()
        self.cv1 = paddle.nn.Conv2D(in_channels=3, out_channels=6, kernel_size=5, stride=1, padding=0)
        self.cv2 = paddle.nn.Conv2D(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0)
        self.mp1 = paddle.nn.MaxPool2D(kernel_size=2, stride=2)
        self.mp2 = paddle.nn.MaxPool2D(kernel_size=2, stride=2)
        self.L1  = paddle.nn.Linear(in_features=16*5*5, out_features=120)
        self.L2  = paddle.nn.Linear(in_features=120, out_features=84)
        self.L3  = paddle.nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.cv1(x)
        x = F.relu(x)
        x = self.mp1(x)
        x = self.cv2(x)
        x = F.relu(x)
        x = self.mp2(x)
        x = paddle.flatten(x, start_axis=1, stop_axis=-1)
        x = self.L1(x)
        x = F.relu(x)
        x = self.L2(x)
        x = F.relu(x)
        x = self.L3(x)
        return x

model = paddle.Model(cifar10())
model.prepare(optimizer=paddle.optimizer.Adam(parameters=model.parameters()),
        loss=paddle.nn.CrossEntropyLoss(),
        metrics=paddle.metric.Accuracy())

model.fit(train_loader, epochs=5, batch_size=64, verbose=1)
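
The in_features=16*5*5 of the first Linear layer in the cifar10 network above follows from how the 5x5 convolutions (no padding) and the 2x2 max-pooling layers shrink a 32x32 input. A minimal sketch of that arithmetic, using the usual output-size formula for convolution and pooling; the helper names are illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pooling layer along one dimension."""
    return (size - kernel) // stride + 1


def flattened_features(size=32, channels=16):
    size = conv_out(size, kernel=5)   # cv1: 32 -> 28
    size = pool_out(size)             # mp1: 28 -> 14
    size = conv_out(size, kernel=5)   # cv2: 14 -> 10
    size = pool_out(size)             # mp2: 10 -> 5
    return channels * size * size


print(flattened_features())  # 16 * 5 * 5 = 400
```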

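1.2 FashionMNIST Dataset

We are fairly familiar with the ordinary MNIST dataset, but what on earth is FashionMNIST?

The article "Fashion-MNIST: an image dataset to replace the MNIST handwritten digit set" gives a basic introduction to Fashion_MNIST.

FashionMNIST is an image dataset that replaces the MNIST handwritten digit set. It is provided by the research department of Zalando, a German fashion technology company. It covers front-view images of 70,000 different products from 10 categories.

FashionMNIST matches the original MNIST exactly in size, format, and training/test split: a 60,000/10,000 split between training and test data, with 28x28 grayscale images. You can use it directly to test the performance of your machine learning and deep learning algorithms without changing any code.

▲ Figure 1.2.1 The Fashion-MNIST dataset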

1.2.1 Downloading the Dataset

(1) Download process
from paddle.vision.transforms import Compose, Normalize
transform = Compose([Normalize(mean=[127.5], std=[127.5], data_format='CHW')])
train_dataset = paddle.vision.datasets.FashionMNIST(mode='train', transform=transform)

Cache file /home/aistudio/.cache/paddle/dataset/fashion-mnist/train-images-idx3-ubyte.gz not found, downloading https://dataset.bj.bcebos.com/fashion_mnist/train-images-idx3-ubyte.gz
Begin to download
Download finished
Cache file /home/aistudio/.cache/paddle/dataset/fashion-mnist/train-labels-idx1-ubyte.gz not found, downloading https://dataset.bj.bcebos.com/fashion_mnist/train-labels-idx1-ubyte.gz
Begin to download

Download finished
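
The transform used in the download code, Normalize(mean=[127.5], std=[127.5]), subtracts 127.5 from each pixel and divides by 127.5, which maps the 0..255 grayscale range onto -1..1. The following NumPy sketch reproduces only that arithmetic; it is not Paddle's implementation, and the function name normalize is illustrative.

```python
import numpy as np


def normalize(pixels, mean=127.5, std=127.5):
    """Apply (x - mean) / std elementwise, as a float32 array."""
    return (np.asarray(pixels, dtype="float32") - mean) / std


# Black (0) maps to -1, mid-gray (127.5) to 0, white (255) to +1
scaled = normalize([0, 127.5, 255])
print(scaled)
```

This also explains why a mostly black background shows up as long runs of -1 when a normalized batch is printed.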

(2) Downloaded files

The fashion_mnist dataset can also be downloaded directly by clicking the link.

In AI Studio, the following two data files can be seen under **/home/aistudio/.cache/paddle/dataset/fashion-mnist/**.

train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz

1.2.2 Displaying the Dataset

(1) Data structure
print(train_dataset)
data_loader = paddle.io.DataLoader(train_dataset, batch_size=16, shuffle=True)
print(len(train_dataset))

# Fetch one batch of 16 images and labels
data = next(iter(data_loader))
print(data)
<paddle.vision.datasets.mnist.FashionMNIST object at 0x7fde091b0a50>
60000
[Tensor(shape=[16, 1, 28, 28], dtype=float32, place=CPUPlace, stop_gradient=True,
       [[[[-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          ...,
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
......
        [[[-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          ...,
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ],
          [-1.        , -1.        , -1.        , ..., -1.        , -1.        , -1.        ]]]]), Tensor(shape=[16, 1], dtype=int64, place=CPUPlace, stop_gradient=True,
       [[0],
        [3],
        [6],
        [1],
        [2],
        [2],
        [7],
        [4],
        [8],
        [3],
        [5],
        [8],
        [1],
        [3],
        [8],
        [5]])]
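
The integer labels in the batch above are easier to read as category names. The sketch below uses the ten Fashion-MNIST class names from the dataset's published description, hard-coded here as an assumption, and applies them to the 16 labels printed above.

```python
# Fashion-MNIST class names in label order (0..9), hard-coded for illustration
FASHION_MNIST_CLASSES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                         "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Labels of the 16-sample batch, copied from the output shown above
batch_labels = [0, 3, 6, 1, 2, 2, 7, 4, 8, 3, 5, 8, 1, 3, 8, 5]

names = [FASHION_MNIST_CLASSES[i] for i in batch_labels]
for label, name in zip(batch_labels, names):
    print(label, name)
```

With a live batch, the labels could be taken from the second tensor instead, for example via `data[1].numpy().flatten()`.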
(2) Data images
plt.figure(figsize=(5,5))
plt.imshow(data[0][0].numpy().reshape([28,28]), cmap=plt.cm.binary)
plt.show()

▲ Figure 1.2.2 The first fashion_mnist image

plt.figure(figsize=(5,5))
for j in range(3):
    for i in range(5):
        # i+j*5 walks through samples 0..14 of the batch, row by row
        imgdata = data[0][i+j*5].numpy().reshape([28,28])
        plt.subplot(3,5, j*5+i+1)
        plt.axis('off')
        plt.imshow(imgdata)

plt.show()

▲ Figure 1.2.3 The first 15 Fashion_mnist images

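1.2.3 Data File Operations

Since the introduction to this MNIST replacement already states that its format is the same as MNIST's, can the files be handled with the same pickle-based function used earlier?

Following the data structure described in LeYan Chen MNIST, we examine the Fashion_MNIST files.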
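
On the pickle question: going by their names and the published MNIST format description, train-images-idx3-ubyte.gz and train-labels-idx1-ubyte.gz are gzip-compressed IDX files rather than pickles, so pickle.load would not apply to them. An IDX file starts with a 4-byte header (two zero bytes, a data-type code, the number of dimensions), followed by one big-endian 32-bit size per dimension and then the raw data. The following is a minimal sketch of a reader; read_idx and the path in the usage lines are illustrative names, not part of Paddle.

```python
import gzip
import struct
from pathlib import Path

import numpy as np


def read_idx(path):
    """Read a gzip-compressed IDX file of unsigned bytes into a NumPy array."""
    with gzip.open(path, "rb") as f:
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file")
        # One big-endian uint32 per dimension, e.g. (60000, 28, 28) for images
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)


if __name__ == "__main__":
    # Assumed cache location, taken from the download log shown earlier
    images_path = Path.home() / ".cache/paddle/dataset/fashion-mnist/train-images-idx3-ubyte.gz"
    if images_path.exists():
        print(read_idx(images_path).shape)
```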

1.2.4 Training LeNet to Recognize FashionMNIST

(1) Building the network
aisi
aisi net conv#2 mp#2 L#3
aisi forward conv1. mp1 conv2. mp2- L1. L2. L3
import sys,os,math,time
import matplotlib.pyplot as plt
from numpy import *

import paddle
import paddle.nn.functional as F
from paddle import nn
from paddle.metric import accuracy as acc

from paddle.vision.transforms import Compose, Normalize
transform = Compose([Normalize(mean=[127.5], std=[127.5], data_format='CHW')])
train_dataset = paddle.vision.datasets.FashionMNIST(mode='train', transform=transform)
test_dataset = paddle.vision.datasets.FashionMNIST(mode='test', transform=transform)

class mnist(paddle.nn.Layer):
    def __init__(self, ):
        super(mnist, self).__init__()
        self.conv1 = paddle.nn.Conv2D(in_channels=1, out_channels=6, kernel_size=5, stride=1, padding=2)
        self.conv2 = paddle.nn.Conv2D(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0)
        self.mp1   = paddle.nn.MaxPool2D(kernel_size=2, stride=2)
        self.mp2   = paddle.nn.MaxPool2D(kernel_size=2, stride=2)
        self.L1    = paddle.nn.Linear(in_features=16*5*5, out_features=120)
        self.L2    = paddle.nn.Linear(in_features=120, out_features=84)
        self.L3    = paddle.nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.mp1(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = self.mp2(x)
        x = paddle.flatten(x, start_axis=1, stop_axis=-1)
        x = self.L1(x)
        x = F.relu(x)
        x = self.L2(x)
        x = F.relu(x)
        x = self.L3(x)
        return x

model = paddle.Model(mnist())

model.prepare(optimizer=paddle.optimizer.Adam(learning_rate=0.001,
        parameters=model.parameters()),
        loss=paddle.nn.CrossEntropyLoss(),
        metrics=paddle.metric.Accuracy())

model.fit(train_dataset, epochs=5, batch_size=64, verbose=1)
(2) Training results

▲ Figure 1.2.4 Training process

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Conv2D-11     [[64, 1, 28, 28]]     [64, 6, 28, 28]          156      
 MaxPool2D-11    [[64, 6, 28, 28]]     [64, 6, 14, 14]           0       
   Conv2D-12     [[64, 6, 14, 14]]     [64, 16, 10, 10]        2,416     
 MaxPool2D-12    [[64, 16, 10, 10]]     [64, 16, 5, 5]           0       
   Linear-16        [[64, 400]]           [64, 120]           48,120     
   Linear-17        [[64, 120]]            [64, 84]           10,164     
   Linear-18         [[64, 84]]            [64, 10]             850      
===========================================================================
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 3.95
Params size (MB): 0.24
Estimated Total Size (MB): 4.38
---------------------------------------------------------------------------

{'total_params': 61706, 'trainable_params': 61706}
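
The Param # column in the summary above can be checked by hand: a Conv2D layer has in_channels x out_channels x kernel x kernel weights plus one bias per output channel, and a Linear layer has in_features x out_features weights plus one bias per output. A short sketch of that arithmetic for the mnist network defined above; the helper names are illustrative.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features


lenet_params = {
    "conv1": conv2d_params(1, 6, 5),        # 156
    "conv2": conv2d_params(6, 16, 5),       # 2,416
    "L1": linear_params(16 * 5 * 5, 120),   # 48,120
    "L2": linear_params(120, 84),           # 10,164
    "L3": linear_params(84, 10),            # 850
}
total = sum(lenet_params.values())
print(lenet_params, total)                  # total: 61,706
```

Padding and stride do not affect the parameter count, and the pooling layers contribute no parameters, which matches the zeros in the table.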

1.2.5 Using a Dense Network

▲ Figure 1.2.5 Training process and results

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Linear-22        [[64, 784]]           [64, 120]           94,200     
   Linear-23        [[64, 120]]            [64, 84]           10,164     
   Linear-24         [[64, 84]]            [64, 10]             850      
===========================================================================
Total params: 105,214
Trainable params: 105,214
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 0.10
Params size (MB): 0.40
Estimated Total Size (MB): 0.70
---------------------------------------------------------------------------

{'total_params': 105214, 'trainable_params': 105214}
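
The size figures in the dense-network summary above can be reproduced with simple arithmetic. The sketch below assumes float32 values (4 bytes each), 1 MB = 1024 x 1024 bytes, and that the forward/backward size counts each layer's output twice; that last assumption is inferred from the numbers in the table rather than taken from Paddle's documentation. The layer widths 784, 120, 84 and 10 are read off the summary.

```python
BYTES_PER_FLOAT32 = 4
MB = 1024 ** 2


def dense_summary(batch, widths):
    """Parameter count and size estimates (in MB) for a stack of Linear layers."""
    params = sum(i * o + o for i, o in zip(widths, widths[1:]))
    input_mb = batch * widths[0] * BYTES_PER_FLOAT32 / MB
    # Assumption: each layer output is stored once for forward and once for backward
    fb_mb = 2 * batch * sum(widths[1:]) * BYTES_PER_FLOAT32 / MB
    params_mb = params * BYTES_PER_FLOAT32 / MB
    return params, input_mb, fb_mb, params_mb


params, input_mb, fb_mb, params_mb = dense_summary(64, [784, 120, 84, 10])
print(params)
print(f"{input_mb:.2f} {fb_mb:.2f} {params_mb:.2f} {input_mb + fb_mb + params_mb:.2f}")
```

Compared with the LeNet summary, the dense network has more parameters (105,214 versus 61,706) but a much smaller forward/backward footprint, because it produces no large convolutional feature maps.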

 

Data Summary ※


The image datasets Cifar10 and FashionMNIST in Paddle's vision module were displayed and tested.

