Learning takes more than a day, and my brain works slowly, so I'll just learn step by step!
Starting from the first video series, by [Xiaotudui]!
Materials provided by [Xiaotudui] (series completed 2021-05-31):
Personal WeChat official account: 土堆碎念
All kinds of materials, help yourself.
Code: https://github.com/xiaotudui/PyTorch-Tutorial
Ants/bees practice dataset: https://pan.baidu.com/s/1jZoTmoFzaTLWh4lKBHVbEA (password: 5suq)
Course resources: https://pan.baidu.com/s/1CvTIjuXT4tMonG0WltF-vQ?pwd=jnnp (extraction code: jnnp)
Useful notes:
Image data is either read with cv2.imread(img_path), or obtained by converting with np.array(PIL.Image).
The dir() function opens the toolbox (e.g. PyTorch) and lists its compartments (sub-packages).
The help() function shows how to use a specific tool (function) in the toolbox, i.e. its manual.
(1) See what sub-packages the torch package contains:
dir(torch)
# ['AVG', 'AggregationType', 'AnyType', 'Argument', 'ArgumentSpec', 'BFloat16Storage', 'BFloat16Tensor',...]
(2) See what sub-packages torch.cuda contains:
dir(torch.cuda)
# ['Any', 'BFloat16Storage', 'BFloat16Tensor', 'BoolStorage', 'BoolTensor', 'ByteStorage', ...]
(3) See what dir() shows for torch.cuda.is_available():
dir(torch.cuda.is_available())  # dropping the trailing () gives a similar listing (with () you inspect the returned bool, without it the function object)
# ['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', ...]
This time the names all start and end with double underscores: __. These are fixed built-in attributes, which tells us that torch.cuda.is_available is no longer a sub-package but a function, so we can call help() to see what it does.
help(torch.cuda.is_available)  # note: no () after the function name here
# The printed help shows that this function returns a bool:
# Help on function is_available in module torch.cuda:
# is_available() -> bool
# Returns a bool indicating if CUDA is currently available.
This tutorial also gave me a small tip for opening Jupyter inside a specific conda environment:
open cmd, type the following two commands in turn, then paste the URL that cmd prints into a browser:
activate yolov5
jupyter notebook
---------------------------------------------------------------------------
Oh! From this point on I started writing another post: a few tips on using Jupyter.
Keep learning!!!
---------------------------------------------------------------------------
If you're a beginner, I really recommend watching this video! The uploader uses PyCharm together with the Python Console to watch the results as they are generated, which is excellent!
Main content:
The three magic methods of a Dataset class: __init__(), __getitem__() and __len__().
Concatenating two Dataset objects (handy when data is scarce: two datasets can simply be added together and used as one).
new_path = os.path.join(path1, path2, ...) joins all the path segments and returns one combined path (str).
file_name_list = os.listdir(path) reads all the file names under path and returns a list of names (list).
read_data.py:
from torch.utils.data import Dataset
from PIL import Image
import os

# A Dataset class for one sub-folder of the data
class MyData(Dataset):
    def __init__(self, root_dir, label_dir):
        # root_dir is the root of the whole dataset; label_dir is the sub-folder of one class
        # __init__ defines the class-wide members, i.e. the variables hung on self.
        self.root_dir = root_dir
        self.label_dir = label_dir
        self.path = os.path.join(self.root_dir, self.label_dir)
        self.img_list = os.listdir(self.path)

    def __getitem__(self, index):  # fetch one element by index
        img_name = self.img_list[index]
        img_item_path = os.path.join(self.path, img_name)
        img = Image.open(img_item_path)
        label = self.label_dir
        return img, label[:-6]  # returns a tuple
        # the slice drops the trailing '_image' (6 characters) of label_dir

    def __len__(self):
        return len(self.img_list)

# -------- instantiate ants_data and bees_data -------- #
root_dir = 'dataset/train'
ants_dir = 'ants_image'
bees_dir = 'bees_image'
ants_data = MyData(root_dir, ants_dir)
bees_data = MyData(root_dir, bees_dir)

# ------ a tuple is returned, unpacked into img and label ------ #
img, label = ants_data[0]

# ---- since it is a tuple, [0]/[1] also extract img/label directly ---- #
print(label == ants_data[0][1])  # True

# -------- add ants_data and bees_data together -------- #
y = ants_data + bees_data
len_ants = len(ants_data)  # 124
len_bees = len(bees_data)  # 121
len_y = len(y)             # 245
print(len_y == len_ants + len_bees)  # True
print(y[123][1])  # ants
print(y[124][1])  # bees
I wrote a post about this before that may help a little: tensorboard初体验 (a first taste of TensorBoard).
Main content:
from torch.utils.tensorboard import SummaryWriter
SummaryWriter writes entries directly to event files in the log_dir to be consumed by TensorBoard. The SummaryWriter class provides a high-level API to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously, which allows a training program to call methods to add data to the file directly from the training loop, without slowing down training.
If SummaryWriter is instantiated without a log_dir argument, it creates a runs folder under the current directory by default to hold the event files produced during training. (The other SummaryWriter parameters are rarely needed.)
The official examples:
(1) Create a SummaryWriter with an auto-generated folder name under runs.
writer = SummaryWriter()
# folder location: runs/May04_22-14-54_s-MacBook-Pro.local/
(2) Create a SummaryWriter with the specified folder name my_experiment.
writer = SummaryWriter("my_experiment")
# folder location: my_experiment
(3) Create a SummaryWriter with a comment appended to the default name.
writer = SummaryWriter(comment="LR_0.1_BATCH_16")
# folder location: runs/May04_22-14-54_s-MacBook-Pro.localLR_0.1_BATCH_16/
writer.add_image(tag, tensor, step)  # add an image (e.g. to inspect training results)
writer.add_scalar(tag, tensor, step)  # add a scalar (a curve of some value, e.g. the loss)
writer.add_graph(model, input)  # view the model's computation graph (used in P22)
(1) writer.add_image(), for adding an image:
def add_image(self, tag, img_tensor, global_step=None, walltime=None, dataformats='CHW'):
    """
    Note that this requires the ``pillow`` package.

    Args:
        tag (string): Data identifier  # the title of the chart
        img_tensor (torch.Tensor, numpy.array, or string/blobname): Image data
            # only torch.Tensor, numpy.array or string are accepted
        global_step (int): Global step value to record  # the x axis
        walltime (float): Optional override default walltime (time.time())
            seconds after epoch of event

    Shape:
        img_tensor: Default is (3, H, W). You can use ``torchvision.utils.make_grid()``
        to convert a batch of tensor into 3xHxW format or call ``add_images`` and
        let us do the job. Tensor with (1, H, W), (H, W), (H, W, 3) is also suitable
        as long as corresponding ``dataformats`` argument is passed, e.g. ``CHW``,
        ``HWC``, ``HW``.

    Examples::

        from torch.utils.tensorboard import SummaryWriter
        import numpy as np
        img = np.zeros((3, 100, 100))
        img[0] = np.arange(0, 10000).reshape(100, 100) / 10000
        img[1] = 1 - np.arange(0, 10000).reshape(100, 100) / 10000

        img_HWC = np.zeros((100, 100, 3))
        img_HWC[:, :, 0] = np.arange(0, 10000).reshape(100, 100) / 10000
        img_HWC[:, :, 1] = 1 - np.arange(0, 10000).reshape(100, 100) / 10000

        writer = SummaryWriter()
        writer.add_image('my_image', img, 0)
        # If you have non-default dimension setting, set the dataformats argument.
        writer.add_image('my_image_HWC', img_HWC, 0, dataformats='HWC')
        writer.close()
    """
(2) writer.add_scalar(), for adding a scalar (a curve of some value, e.g. the loss):

def add_scalar(self, tag, scalar_value, global_step=None, walltime=None):
    """
    Args:
        tag (string): Data identifier  # the title of the chart
        scalar_value (float or string/blobname): Value to save  # the y axis
        global_step (int): Global step value to record  # the x axis
        walltime (float): Optional override default walltime (time.time())
            with seconds after epoch of event

    Examples::

        from torch.utils.tensorboard import SummaryWriter
        writer = SummaryWriter()
        x = range(100)
        for i in x:
            writer.add_scalar('y=2x', i * 2, i)
        writer.close()
    """
writer.close()
tensorboard --logdir=logs --port=6007
(Specifying the port at the end is optional; here it avoids the congestion that occurs when several people train on the same server and all use the default port.) This lesson's example only uses writer.add_scalar():
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter('logs')  # instantiate a SummaryWriter and save the events under logs
for i in range(10):
writer.add_scalar('y=2x', 2 * i, i)
writer.close()  # remember to close the writer at the end
Main content:
Using writer.add_image(). From P8 we know the image types add_image accepts are torch.Tensor, numpy.array and string.
(The type produced by reading with PIL.Image in P7 is PIL.JpegImagePlugin.JpegImageFile, so it has to be converted to numpy.array before it can go into add_image. In this lesson the video instead reads numpy data directly with OpenCV.)
Use numpy.array() to convert a PIL image to numpy.ndarray:
from PIL import Image
image_path = 'dataset/train/ants_image/0013035.jpg'
img = Image.open(image_path)
print(type(img)) # <class 'PIL.JpegImagePlugin.JpegImageFile'>
import numpy as np
img_array = np.array(img)
print(type(img_array)) # <class 'numpy.ndarray'>
img_tensor: Default is (3, H, W). You can use torchvision.utils.make_grid() to convert a batch of tensor into 3xHxW format or call add_images and let us do the job. Tensor with (1, H, W), (H, W), (H, W, 3) is also suitable as long as corresponding dataformats argument is passed, e.g. CHW, HWC, HW.
Requirements:
The default shape of img_tensor is (3, H, W); any other layout must be declared via dataformats, i.e. dataformats='CHW', dataformats='HWC' or dataformats='HW'.
Converting PIL to numpy as in method 2 satisfies add_image's data type requirement, but not its default shape requirement: the resulting numpy array has shape (H, W, C), i.e. the channel dimension (channel=3) comes last, so add_image() needs the extra argument dataformats='HWC'.
(Alternatively, rearrange the dimensions by hand with img_array = img_array.transpose(2, 0, 1), after which the dataformats argument is no longer needed.)
print(img_array.shape) # (512, 768, 3)
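As a quick check of the dimension fix described above, here is a minimal sketch (using a dummy all-zeros array in place of a real image) of the transpose from HWC to CHW:

```python
import numpy as np

# dummy array standing in for np.array(PIL.Image), which is laid out as (H, W, C)
img_array = np.zeros((512, 768, 3))
img_chw = img_array.transpose(2, 0, 1)  # move the channel axis to dim 0
print(img_array.shape)  # (512, 768, 3)
print(img_chw.shape)    # (3, 512, 768)
```

After transposing like this, add_image() can be called without the dataformats argument.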
OpenCV reads images in BGR order; remember to convert to RGB: cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
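The same BGR-to-RGB conversion can also be sketched without OpenCV by reversing the channel axis in numpy (the array below is a dummy stand-in for a cv2.imread result, not a real image):

```python
import numpy as np

# dummy (H, W, 3) array standing in for cv2.imread output, which is in BGR order
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255       # fill the blue channel (channel 0 in BGR order)
rgb = bgr[..., ::-1]    # reverse the last axis: BGR -> RGB
print(rgb[0, 0])        # [  0   0 255], blue is now the last channel
```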
The complete code:
from torch.utils.tensorboard import SummaryWriter
from PIL import Image
import numpy as np

writer = SummaryWriter('logs_3')  # instantiate a SummaryWriter and save the events under logs_3
image_path1 = 'dataset/train/ants_image/0013035.jpg'
image_path2 = 'dataset/train/bees_image/16838648_415acd9e3f.jpg'
img = Image.open(image_path2)  # or image_path1
img_array = np.array(img)
print(type(img))        # <class 'PIL.JpegImagePlugin.JpegImageFile'>
print(type(img_array))  # <class 'numpy.ndarray'>
print(img_array.shape)

# The tag 'test_image' stays the same, so in tensorboard both images
# can be viewed under one tag by dragging the slider
# writer.add_image('test_image', img_array, 1, dataformats='HWC')
writer.add_image('test_image', img_array, 2, dataformats='HWC')

for i in range(10):
    # This add_scalar was left alone; the tag is unchanged, but since the data
    # written is always y=3x the curve does not get scrambled
    writer.add_scalar('y=3x', 3 * i, i)

writer.close()  # remember to close the writer at the end
(1) One tag showing several images (drag the slider)
(2) Several tags shown side by side
Main content:
from torchvision import transforms
(In PyCharm, Alt+7 brings up the Structure panel on the left.) The classes in transforms include: Compose, ToTensor, PILToTensor, ConvertImageDtype, ToPILImage, Normalize, Resize, Scale, CenterCrop.
ToTensor: convert a PIL Image or numpy.ndarray to tensor.
ToPILImage: convert a tensor or an ndarray to PIL Image.
Normalize: normalize a tensor image with mean and standard deviation; this transform does not support PIL Image. (Given mean (mean[1], ..., mean[n]) and std (std[1], ..., std[n]) for n channels, it normalizes each channel.)
Resize: resize the input image (PIL Image or Tensor) to the given size and return the rescaled image. size (sequence or int): a sequence resizes to the given (h, w); an int resizes the original's min(h, w) to size and scales the other side proportionally.
RandomCrop: crop the given image (PIL Image or Tensor) at a random location to the given size (size takes the same inputs as in Resize).
Two questions, examined through transforms.ToTensor:
(1) How transforms are used (in Python).
(2) Why we need the Tensor data type: because a tensor wraps many parameters used in training neural networks, e.g. requires_grad.
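A minimal sketch of the requires_grad point: a tensor can track gradients through autograd, which plain Python numbers and numpy arrays cannot:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)  # the tensor records operations on it
y = (x * 2).sum()  # y = 2*x1 + 2*x2
y.backward()       # autograd computes dy/dx
print(x.grad)      # tensor([2., 2.])
```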
(1) Convert a PIL Image to a tensor with ToTensor()
ToTensor() can also convert a numpy.ndarray to a tensor (data read with OpenCV is numpy.ndarray).
import numpy as np
from torchvision import transforms
from PIL import Image
image_path = 'dataset/train/ants_image/0013035.jpg'
image = Image.open(image_path)
# 1. how to use transforms (python)
tensor_trans = transforms.ToTensor()  # ToTensor() takes no arguments
tensor_img = tensor_trans(image)  # you cannot write transforms.ToTensor(image) directly
print(np.array(image).shape)  # (512, 768, 3)
print(tensor_img.shape)  # torch.Size([3, 512, 768]), the channel dim moved to dim 0
(2) Using ToTensor together with TensorBoard
import numpy as np
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
from PIL import Image

image_path = 'dataset/train/ants_image/0013035.jpg'
image = Image.open(image_path)

# 1. how to use transforms (python)
tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(image)
print(np.array(image).shape)
print(tensor_img.shape)

# write into tensorboard
writer = SummaryWriter('logs')
writer.add_image('tag', tensor_img, 1)
writer.close()
This diagram is great! The data type of an image often differs between scenarios, which makes mistakes easy; the data must be converted to the right format before use!
Main content:
Usage of __call__ (the double underscores mark it as a built-in method). What __call__() does: it turns an instance of a class into a callable object, so that calling the instance executes the code inside __call__(). Use callable to check whether something is a callable object; e.g. for an object p, print(callable(p)) prints True or False.
CallTest.py
class Person:
    def __call__(self, name):
        print('__call__' + ' Hello ' + name)

    def hello(self, name):
        print('hello ' + name)

person = Person()               # instantiate a Person object
person('zhangsan')              # call the object like a function
person.__call__('zhangshan_2')  # calling __call__ explicitly works too
person.hello('wangwu')          # call an ordinary method

# __call__ Hello zhangsan
# __call__ Hello zhangshan_2
# hello wangwu
A quick recap of the common transforms:
ToTensor: convert a PIL Image or numpy.ndarray to tensor.
ToPILImage: convert a tensor or an ndarray to PIL Image.
Normalize: normalize a tensor image with mean and standard deviation (does not support PIL Image); given a per-channel mean and std, each channel is normalized.
Resize: resize the input image (PIL Image or Tensor) to the given size; size (sequence or int) behaves as described earlier.
RandomCrop: crop the given image (PIL Image or Tensor) at a random location (size takes the same inputs as in Resize).
Summary of how to use them (jump into a class with Ctrl+Click): focus mainly on what data formats the input and output are, which parameters are required, and what the transform does.
use_transforms.py
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
from PIL import Image

image_path = 'images/cat2.jpg'
image = Image.open(image_path)
writer = SummaryWriter('logs_2')

# 1. ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(image)
writer.add_image('ToTensor', img_tensor)
# only tag and image_tensor are passed; without global_step it defaults to step 0

# 2. Normalize, which can shift the colour tone
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
writer.add_image('Normalize', img_norm)
trans_norm = transforms.Normalize([1, 3, 5], [3, 2, 1])
img_norm_2 = trans_norm(img_tensor)
writer.add_image('Normalize', img_norm_2, 1)
trans_norm = transforms.Normalize([2, 0.5, 3], [5, 2.6, 1.5])
img_norm_3 = trans_norm(img_tensor)
writer.add_image('Normalize', img_norm_3, 2)

# 3. Resize: rescale a PIL image or tensor to the given size
w, h = image.size  # PIL.Image's size is (width, height)
trans_resize = transforms.Resize(min(w, h) // 2)  # scale to 1/2
img_resize = trans_resize(image)  # resize the PIL image
writer.add_image('Resize', trans_totensor(img_resize))
# tensorboard needs tensor or numpy data, hence the conversion
trans_resize = transforms.Resize(min(w, h) // 4)  # scale to 1/4
img_resize_tensor = trans_resize(img_tensor)
writer.add_image('Resize', img_resize_tensor, 1)

# 4. Compose: chain the operations together
trans_compose = transforms.Compose([
    transforms.Resize(min(w, h) // 2),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
img_compose = trans_compose(image)  # image is in PIL.Image format
writer.add_image('Compose', img_compose)

# 5. RandomCrop: random cropping
trans_randomcrop = transforms.RandomCrop(min(w, h) // 4)  # crop 1/4 at a random spot
for i in range(10):
    img_randomcrop = trans_randomcrop(img_tensor)
    writer.add_image('RandomCrop', img_randomcrop, i)

# never forget close()!
writer.close()
Main content:
In the previous lessons transforms processed a single image at a time, whereas building a dataset requires processing images in batches. This lesson therefore combines torchvision's datasets and transforms to preprocess a whole dataset.
torchvision.datasets provides built-in datasets as well as the functions needed for custom datasets (DatasetFolder, ImageFolder, VisionDataset). (Official docs: https://pytorch.org/vision/stable/datasets.html)
torchvision.models contains already trained neural network models for image classification, image segmentation and object detection. (Official docs: https://pytorch.org/vision/stable/models.html)
torchvision.transforms transforms and augments images. (Official docs: https://pytorch.org/vision/stable/transforms.html)
torchvision.utils contains various utilities, mainly for visualization (tensorboard itself lives in torch.utils.tensorboard). (Official docs: https://pytorch.org/vision/stable/utils.html)
What a treasure of an uploader, he even teaches downloading with Thunder! The source code contains the datasets' download links. For example, Ctrl+Click on CIFAR10 to jump into its source; scrolling up a little reveals url = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz". Paste that link into Thunder for a fast download!
import torchvision
from torch.utils.tensorboard import SummaryWriter
from torchvision.transforms import transforms

# 1. Set up the image transforms
data_transform = transforms.Compose([  # Compose collects all transform steps
    transforms.ToTensor()  # CIFAR10 images are small (32x32), so ToTensor alone is enough
])

# 2. Load the built-in CIFAR10 dataset with the transforms (keep download=True)
#    - root: the directory the dataset is stored in (and downloaded to)
#    - train=True/False: build the training set train_set or the test set test_set
#    - transform=data_transform: preprocess every image with our data_transform
#    - download=True/False: whether to download the dataset into root
#      (if root already holds the data, nothing is re-downloaded even with True,
#       so download is best left True)
train_set = torchvision.datasets.CIFAR10('./dataset', train=True,
                                         transform=data_transform, download=True)
test_set = torchvision.datasets.CIFAR10('./dataset', train=False,
                                        transform=data_transform, download=True)

# 3. Write into tensorboard for inspection
writer = SummaryWriter('CIFAR10')
for i in range(10):
    img, label = test_set[i]  # test_set[i] returns the image and the class (int) in turn
    writer.add_image('test_set', img, i)
writer.close()
Official docs: torch.utils.data.DataLoader
CLASS torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False,
sampler=None, batch_sampler=None, num_workers=0, collate_fn=None,
pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None,
multiprocessing_context=None, generator=None, *, prefetch_factor=2,
persistent_workers=False)
Every parameter except dataset (which says where the data comes from) has a default value.
The parameters of torch.utils.data.DataLoader worth paying attention to are:
dataset: the dataset to load from (e.g. train_set).
batch_size: how many samples to load per batch.
shuffle: whether to reshuffle the data at every epoch.
num_workers: how many subprocesses to use for loading; 0 means loading in the main process. (On Windows it can only be 0, otherwise errors occur! Even though the default is 0, it is safest to set num_workers=0 explicitly.)
drop_last: whether to drop the last incomplete batch (default False, i.e. the incomplete final batch is kept).
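A tiny sketch of the drop_last behaviour, using a toy dataset of 10 samples with batch_size=4 (the dataset contents here are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10).float())  # 10 toy samples
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0, drop_last=False)
sizes = [batch[0].shape[0] for batch in loader]
print(sizes)  # [4, 4, 2], the incomplete last batch is kept

loader_drop = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0, drop_last=True)
sizes_drop = [batch[0].shape[0] for batch in loader_drop]
print(sizes_drop)  # [4, 4], the incomplete last batch is dropped
```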
Main content:
torch.nn, official docs: https://pytorch.org/docs/stable/nn.html. Within it, torch.nn.Module is very important: it is the base class of all neural network modules (i.e. any network you build must inherit from torch.nn.Module), official docs: https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module. The two methods to override are __init__() and forward().
The official example:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
Main content:
torch.nn wraps torch.nn.functional; both contain Conv, Pool and other layer operations, with the same usage and effect (though the concrete parameters differ somewhat). This lesson uses torch.nn.functional.conv2d as the example, but in later practice torch.nn.Conv2d is the more common choice.
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor
CLASS torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
In torch.nn.functional.conv2d, both input and weight (i.e. the kernel) must be 4-dimensional tensors with dimensions [batch_size, C, H, W]; when necessary, use reshape() or unsqueeze() to add dimensions.
(1) reshape changes a tensor's shape; the product of the new dimensions must equal the original's.
(2) unsqueeze inserts one dimension of size 1 at the specified position.
import torch
x = torch.arange(15)
x2 = torch.reshape(x, [3, 5])  # a list or a tuple both work for the shape
y1_reshape = torch.reshape(x, [1, 1, 3, 5])  # reshape: any number of dimensions works, as long as their product stays the same
y2_unsqueeze = torch.unsqueeze(x2, 2)  # unsqueeze: the second argument is an int, so exactly one size-1 dimension is inserted at that position (adds a dim)
c_squeeze = torch.squeeze(y1_reshape)  # squeeze: takes only the tensor and removes all its size-1 dimensions (drops dims)
print('x.shape:{}'.format(x.shape))
print('x2.shape:{}'.format(x2.shape))
print('y1_reshape.shape:{}'.format(y1_reshape.shape))
print('y2_unsqueeze.shape:{}'.format(y2_unsqueeze.shape))
print('c_squeeze.shape:{}'.format(c_squeeze.shape))
import torch
import torch.nn.functional as F

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])
print(input.shape)
print(kernel.shape)

# expand both input and kernel to 4 dimensions
input = torch.reshape(input, (1, 1, 5, 5))
kernel = torch.reshape(kernel, (1, 1, 3, 3))

out = F.conv2d(input, kernel, stride=1)
print('out={}'.format(out))
out2 = F.conv2d(input, kernel, stride=2)
print('out2={}'.format(out2))
out3 = F.conv2d(input, kernel, stride=1, padding=1)
print('out3={}'.format(out3))
Official docs for torch.nn.Conv2d
CLASS torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
Animations of the convolution arithmetic: https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md
Note:
The default is bias=True, which means PyTorch's Conv2d adds a bias to the convolution by default.

import torch
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchvision import datasets
from torchvision.transforms import transforms

# 1. Load the data
dataset = datasets.CIFAR10('./dataset', train=False,
                           transform=transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True,
                        num_workers=0, drop_last=False)

# 2. Build the model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1)

    def forward(self, x):
        return self.conv1(x)

writer = SummaryWriter('./logs/Conv2d')

# 3. Instantiate a model object and run the convolution
model = Model()
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images('imgs_ch3', imgs, step)
    # 4. Inspect the images in tensorboard. Note that add_images only accepts
    #    3-channel images, so for channel counts > 3 Xiaotudui's (admittedly
    #    not rigorous) reshape below still lets us preview the output
    outputs = model(imgs)
    outputs = torch.reshape(outputs, (-1, 3, 30, 30))
    writer.add_images('imgs_ch6', outputs, step)
    step += 1
writer.close()
Pooling is also called downsampling (it shrinks the spatial size of the input image without changing its channel count). Common layers are MaxPool2d, AvgPool2d, etc.; the reverse, upsampling, has MaxUnpool2d.
MaxPool2d的官方文档地址:https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d
CLASS torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Note:
The output size N after pooling is computed with the same formula as after convolution:
N = (W - F + 2*P) / S + 1
(W: input size, F: kernel size, P: padding, S: stride), and in both cases N is rounded down by default.
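The formula can be sketched as a small helper (the function name conv_out_size is my own, not from the tutorial):

```python
def conv_out_size(W, F, P=0, S=1):
    """Output size after a conv/pool layer: N = (W - F + 2*P) / S + 1, rounded down."""
    return (W - F + 2 * P) // S + 1  # floor division implements the default rounding down

# 32x32 CIFAR10 image, 5x5 kernel, padding 2, stride 1: the size is preserved
print(conv_out_size(32, 5, P=2, S=1))  # 32
# 5x5 input, 3x3 max-pool (stride defaults to kernel_size, i.e. 3)
print(conv_out_size(5, 3, P=0, S=3))   # 1  (ceil_mode=True would give 2 instead)
```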
Main content:
Build the input tensor with dtype=torch.float32, so that some of the later operations don't fail.

import torch
import torchvision.datasets
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.maxpool1 = nn.MaxPool2d(kernel_size=3)  # defaults: stride=kernel_size, ceil_mode=False
        self.maxpool2 = nn.MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, x):
        return self.maxpool1(x), self.maxpool2(x)

model = Model()

# ---- 1. The example above: compare pooling with ceil_mode True vs False ---- #
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)
input = torch.reshape(input, (-1, 1, 5, 5))
out1, out2 = model(input)
print('out1={}\nout2={}'.format(out1, out2))

# ---- 2. Load the dataset and inspect the images in tensorboard ---- #
dataset = torchvision.datasets.CIFAR10('dataset', train=False,
                                       transform=torchvision.transforms.ToTensor(),
                                       download=True)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)
writer = SummaryWriter('./logs/maxpool')
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images('imgs', imgs, step)
    imgs, _ = model(imgs)
    writer.add_images('imgs_maxpool', imgs, step)
    step += 1
writer.close()
官方文档地址:https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity
The parameter inplace defaults to False and controls whether the input is modified in place: with True the input is changed directly and nothing extra is returned; with False the input is left untouched and a new tensor is returned (inplace=False is recommended).

import torch
from torch import nn

input = torch.tensor([[3, -1],
                      [-0.5, 1]])
input = torch.reshape(input, (1, 1, 2, 2))
relu = nn.ReLU()
input_relu = relu(input)
print('input={}\ninput_relu:{}'.format(input, input_relu))
# input=tensor([[[[ 3.0000, -1.0000],
#                 [-0.5000,  1.0000]]]])
# input_relu:tensor([[[[3., 0.],
#                      [0., 1.]]]])
Main content:
torch.nn.Linear(in_features, out_features, bias=True) from the Linear Layers category; the default is bias=True. It applies a linear transformation to the incoming data: y = x A^T + b.
Parameters:
in_features: size of each input sample.
out_features: size of each output sample.
bias: if set to False, the layer will not learn an additive bias. Default: True.
Shape: (in effect, H_in and H_out each only look at the size of the last dimension of the input and output; during training, nn.Linear usually serves as the final fully connected layers after flattening to one dimension, so at that point only the channel count matters and the inputs/outputs are effectively one-dimensional)
Input: (*, H_in), where * means any number of dimensions (including none) and H_in = in_features.
Output: (*, H_out), where all but the last dimension have the same shape as the input and H_out = out_features.
The "flatten to one dimension" step is usually done with torch.nn.Flatten(start_dim=1, end_dim=-1). A note on start_dim: it means "flatten everything from start_dim onwards into one dimension". It defaults to 1, and in actual training the flattening indeed starts at start_dim=1, because a training tensor is 4-dimensional, [batch_size, C, H, W], and the 0th dimension batch_size must not be touched.
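A minimal sketch of the shape rule above: nn.Linear only transforms the last dimension, and any leading dimensions pass through unchanged (the sizes below are arbitrary):

```python
import torch
from torch import nn

fc = nn.Linear(1024, 64)      # H_in = 1024, H_out = 64
x2d = torch.ones(8, 1024)     # (*, H_in) with a batch of 8
x3d = torch.ones(8, 5, 1024)  # extra leading dimensions are fine too
print(fc(x2d).shape)  # torch.Size([8, 64])
print(fc(x3d).shape)  # torch.Size([8, 5, 64])
```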
Loss Functions (covered later). The other categories, Transformer Layers and Recurrent Layers, are not used very often.

import torch

# flatten a 4-D tensor, start_dim=1
input = torch.arange(54)
input = torch.reshape(input, (2, 3, 3, 3))
y_0 = torch.flatten(input)
y_1 = torch.flatten(input, start_dim=1)
print(input.shape)
print(y_0.shape)
print(y_1.shape)
# torch.Size([2, 3, 3, 3])
# torch.Size([54])
# torch.Size([2, 27])
Main content:
Official docs for torch.nn.Sequential: modules will be added in the order they are passed in the constructor.
Version 1: without Sequential
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # 3,32,32 ---> 32,32,32
        self.conv1 = Conv2d(in_channels=3, out_channels=32, kernel_size=5, stride=1, padding=2)
        # 32,32,32 ---> 32,16,16
        self.maxpool1 = MaxPool2d(kernel_size=2, stride=2)
        # 32,16,16 ---> 32,16,16
        self.conv2 = Conv2d(in_channels=32, out_channels=32, kernel_size=5, stride=1, padding=2)
        # 32,16,16 ---> 32,8,8
        self.maxpool2 = MaxPool2d(kernel_size=2, stride=2)
        # 32,8,8 ---> 64,8,8
        self.conv3 = Conv2d(in_channels=32, out_channels=64, kernel_size=5, stride=1, padding=2)
        # 64,8,8 ---> 64,4,4
        self.maxpool3 = MaxPool2d(kernel_size=2, stride=2)
        # 64,4,4 ---> 1024
        self.flatten = Flatten()  # start_dim defaults to 1, so nothing else to set
        # 1024 ---> 64
        self.linear1 = Linear(1024, 64)
        # 64 ---> 10
        self.linear2 = Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        x = self.maxpool3(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.linear2(x)
        return x

model = Model()
print(model)
input = torch.ones((64, 3, 32, 32))
out = model(input)
print(out.shape)  # torch.Size([64, 10])
Version 2: with Sequential
The code is more concise, and each layer is automatically numbered starting from 0.
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.model = Sequential(
            Conv2d(in_channels=3, out_channels=32, kernel_size=5, stride=1, padding=2),
            MaxPool2d(kernel_size=2, stride=2),
            Conv2d(in_channels=32, out_channels=32, kernel_size=5, stride=1, padding=2),
            MaxPool2d(kernel_size=2, stride=2),
            Conv2d(in_channels=32, out_channels=64, kernel_size=5, stride=1, padding=2),
            MaxPool2d(kernel_size=2, stride=2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        return self.model(x)

model = Model()
print(model)
input = torch.ones((64, 3, 32, 32))
out = model(input)
print(out.shape)  # torch.Size([64, 10])
Adding writer.add_graph(model, input) at the very end of the code displays the model's computation graph, which can be zoomed in on.
writer = SummaryWriter('./logs/Seq')
writer.add_graph(model, input)
writer.close()
Ugh, I can't really follow the computation of each loss function yet, so for now let me just put one down