Reposted from AI Studio. Original post: SOTA模型飞入寻常百姓家-BEiT模型在AIStudio动手实践 (Bringing SOTA models within everyone's reach: hands-on BEiT on AI Studio) - 飞桨 AI Studio
As everyone knows, Transformer models are highly accurate but slow and costly to train: without several days on 8 or 16 GPUs, you will not reproduce the results in the papers. Worse, many SOTA repositories never even considered releasing single-GPU code; they assume multi-GPU parallel training from the start, so the barrier to hands-on practice is very high.
The best way to learn is to learn by doing: code you can actually run is easier to read and to understand deeply. This project is dedicated to **hands-on BEiT training on the AIStudio platform, bringing SOTA models within everyone's reach.** SOTA models, no longer out of reach.
Thanks to PaddleViT, an algorithm development and experimentation platform that provides Visual Transformer (ViT) SOTA models and related tools.
Thanks to PASSL, the PaddlePaddle self-supervised learning library: a vision library for state-of-the-art self-supervised visual learning research with PaddlePaddle, designed to accelerate the self-supervised research cycle, from designing a new self-supervised task to evaluating the learned representations.
Thanks to the AIStudio platform for providing V100 compute.
Thanks to the paper BEiT: BERT Pre-Training of Image Transformers (arXiv), the original authors' code, and the original README (included in this project).
Parts of this project draw on the article BeiT:当BERT用于图像任务——超越ViT新范式 (BeiT: when BERT is applied to image tasks, a new paradigm beyond ViT). Many thanks!
Contents: a piece-by-piece walkthrough of the Paddle BEiT code, plus the original paper's BEiT readme.
Our GitHub homepage: https://github.com/BR-IDL/PaddleViT
PaddlePaddle Vision Transformers (PaddleViT, or PPViT) is a collection of vision models and tools built on the latest deep learning techniques. We provide state-of-the-art deep learning algorithms and models based on visual Transformers, visual attention mechanisms, and MLP techniques. PaddleViT also integrates related layers, utilities, optimizers, schedulers, data augmentations, training/validation scripts, and other components built on PaddlePaddle 2.1+.
The starting point of the PaddleViT project is to provide complete training/validation pipelines that reproduce a variety of state-of-the-art ViT and MLP models. We are passionate about making the state of the art available to everyone in the simplest, easiest-to-use way.
PaddleViT provides models and tools for multiple vision tasks, such as image classification, object detection, semantic segmentation, and GANs. Each model architecture is defined in a standalone Python module, making it easy to modify and to quickly start experiments and research. We also provide downloadable pretrained weights that you can finetune on your own datasets. PaddleViT further integrates popular tools and modules such as custom datasets, data preprocessing, performance metrics, and DDP.
Enough with the preamble. BEiT's accuracy baseline on ImageNet at a 224 training image size is Acc@1 85.2%, essentially the most accurate model around!
Models | Model Size | Image Size | ImageNet Acc@1 |
---|---|---|---|
BEIT-B | 86M | 224^2 | 82.8 |
BEIT384-B | 86M | 384^2 | 84.6 |
BEIT-L | 307M | 224^2 | 85.2 |
BEIT384-L | 307M | 384^2 | 86.3 |
See the architecture diagram (figure in the original post).
BEiT is BERT for images. It is similar to ViT, except that during training random masking is applied to the image patches; through this masking scheme, the model must correctly predict the visual tokens corresponding to an image even when the input image is corrupted. BERT's innovation was self-supervised learning via self-masking, and BEiT carries that idea over. As a warm-up, a small illustrative sketch of patch masking follows.
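The cell below is my own minimal sketch of BEiT-style masked patch corruption, not code from this project (the real BEiT uses blockwise masking of roughly 40% of patches and a learnable [MASK] embedding; plain random masking and a zero token stand in here for illustration):
In [ ]
- import paddle
- 
- batch, num_patches, dim = 2, 196, 768
- patches = paddle.randn([batch, num_patches, dim]) # patch embeddings
- mask_token = paddle.zeros([1, 1, dim]) # stand-in for the learnable [MASK] embedding
- 
- mask_ratio = 0.4 # BEiT masks roughly 40% of patches
- mask = paddle.rand([batch, num_patches]) < mask_ratio # bool mask: True = masked
- mask_f = mask.astype('float32').unsqueeze(-1) # [batch, num_patches, 1]
- 
- corrupted = patches * (1 - mask_f) + mask_token * mask_f # masked positions get the mask token
- print(corrupted.shape) # [2, 196, 768]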
For the concrete structure, let's move on to the code practice in Section 3.
The original authors trained for about five days on 16 GPUs with a 2k batch size for 800 epochs. The pre-training runs for about 500k steps (i.e., 800 epochs) with 2k batch size. Adam (Kingma and Ba, 2015) with β1 = 0.9, β2 = 0.999 is employed for optimization. The learning rate is set to 1.5e-3, with a warmup of 10 epochs, and cosine learning rate decay. The weight decay is 0.05. We employ stochastic depth (Huang et al., 2016) with a 0.1 rate, and disable dropout. The 500k training steps take about five days using 16 Nvidia Tesla V100 32GB GPU cards.
Our goal is to get it running on AIStudio (on a single GPU), which means solving two things:
Paddle's multi-GPU BEiT code can in principle run on a single GPU (a notable Paddle feature: multi-GPU programs run unmodified on a single GPU), but this particular BEiT code errors when run on one GPU, so I modified it into a single-machine, single-GPU program.
If everyone ran the full dataset, each run would cost roughly 576 hours on a single V100. That is far too much compute; even though AIStudio now offers a 4x V100 environment, it would still take about 6 days and put too much pressure on the platform. Since our goal is learning, we reduce the data volume accordingly.
We use two datasets: the official Cifar100 dataset, which takes about 24 hours to train on a single GPU, and a 10-class food dataset, which finishes 100 epochs in only about 2 hours.
We mainly need the yacs library; if you also want to generate the training file-list txt files, you additionally need the jikuai package.
In [ ]
- !pip install pip -Uq
- !pip install yacs
- !pip install jikuai
We have prepared a 10-class food dataset here. You can also test with your own dataset.
The usual Paddle convention for image classification is to split the dataset into two parts and create a training file list train_list.txt and a validation file list val_list.txt. This project already ships with the split file lists; a sample of the expected format follows.
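For reference, each line of these list files follows the common PaddleClas-style convention: an image path and a label index separated by a space (the paths below are made up for illustration):
- images/class_a/0001.jpg 0
- images/class_b/0002.jpg 1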
If you use your own dataset, you can split it with the jikuai package. Install it with pip install jikuai, then run the following commands in whatever directory you want the dataset lists written to.
- from jikuai.dataset import Dataset
- dataset = Dataset("/home/aistudio/BEiT/aifood/images") # argument: the dataset location, i.e. the parent directory of the per-class folders
- dataset.paddleclasout(0.8) # generate the train and eval list files; the argument is the train/eval split ratio
The generated files are named train.txt and eval.txt by default; rename them by hand to the train_list.txt and val_list.txt that the BEiT model expects, for instance as in the cell below.
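A minimal rename sketch (assuming the lists were generated under ~/BEiT/aifood):
In [ ]
- !cd ~/BEiT/aifood && mv train.txt train_list.txt && mv eval.txt val_list.txt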
In [ ]
- print("开始解包数据集...")
- !cd ~/BEiT && tar -xzf /home/aistudio/data/data21994/aifood.tar.gz
- print("解包数据集完成")
-
- %cd ~/BEiT/aifood
- from jikuai.dataset import Dataset
- dataset = Dataset("/home/aistudio/BEiT/aifood/images") # 参数为数据集所在的位置,是分类目录的上一级目录
- dataset.paddleclastxt(0.8) # 生成训练集和测试集列表,参数为两者划分的比例值
- %cd ~/
- print("数据集列表生成完成")
Change the checkpoint-saving interval from 10 to 15 to save fewer checkpoints, so that disk usage stays under 10 GB and a background run can be imported back into the notebook:
_C.SAVE_FREQ = 15
The dataset has 10 classes, and the configuration was changed accordingly. The code needs a matching change as well, because one spot hard-codes the default 1000 classes and errors otherwise. Both changes have already been made in this project, so you can use it as-is.
To use your own dataset with your own number of classes, just change _C.MODEL.NUM_CLASSES = 10 in config.py to your class count. The dataset location can be overridden via a command-line argument such as -data_path='/home/aistudio/BEiT/aifood/'; the directory only needs to contain the train_list.txt and val_list.txt files.
Using the food dataset, 100 epochs took 2.15 hours in total.
As quoted earlier, the original authors trained on ImageNet for five days with 16 GPUs, a 2k batch size, and 800 epochs.
We are only getting a taste of training here. Paddle does not provide the pre-training program for BEiT, so we use the finetuning program, simply without loading a pretrained model. A sketch of what the training script presumably runs follows.
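For reference, run_train.sh presumably wraps a command along these lines (a hypothetical reconstruction pieced together from the evaluation commands later in this project; the actual script may differ):
In [ ]
- # hypothetical equivalent of run_train.sh: the finetune entry point without -pretrained
- !cd ~/BEiT/ && python main_gpu_finetune.py \
- -cfg='./configs/beit_base_patch16_224.yaml' \
- -dataset='imagenet2012' \
- -batch_size=64 \
- -data_path='/home/aistudio/BEiT/aifood/' \
- -amp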
In [ ]
- print("开始训练,预计时间2.2小时...")
- !cd ~/BEiT/ && sh run_train.sh
To finetune from a pretrained model, about 10 more epochs are usually enough. On this food dataset, accuracy reached Avg Acc@1: 0.9531 after only 5 epochs, and 0.9860 after 20 epochs! BEiT really is a weapon for competitions!
2022-05-11 09:04:45,478 MASTER_LOG Step[0000/0016], Avg Loss: 0.3924, Avg Acc@1: 0.9531, Avg Acc@5: 1.0000
2022-05-11 09:54:03,719 MASTER_LOG ----- Epoch[020/020], Validation Loss: 0.2302, Validation Acc@1: 0.9860, Validation Acc@5: 1.0000, time: 9.06
In [ ]
- !cd ~/BEiT/ && python main_gpu_finetune.py \
- -cfg='./configs/finetunebeit_base_patch16_224.yaml' \
- -dataset='imagenet2012' \
- -batch_size=64 \
- -data_path='/home/aistudio/BEiT/aifood/' \
- -pretrained="/home/aistudio/data/data144298/beit_base_patch16_224_ft22kto1k.pdparams" \
- -amp
Loading the model we trained for 100 epochs and testing it gives Validation Acc@1: 0.5330, Validation Acc@5: 0.9350.
Testing with the official pretrained model gives Validation Acc@1: 0.0690, Validation Acc@5: 0.3630.
Testing with the model we finetuned ourselves gives Validation Acc@1: 0.1130, Validation Acc@5: 0.5470.
Such low accuracy after finetuning is hard to believe; the cause is still under investigation.
Because the checkpoint files are too large, they are not shipped here; you need to run training yourself to generate them!
In [ ]
- # validate the model we trained ourselves for 100 epochs
- !cd ~/BEiT/ && python main_gpu_finetune.py \
- -cfg='./configs/beit_base_patch16_224.yaml' \
- -dataset='imagenet2012' \
- -batch_size=256 \
- -data_path='/home/aistudio/BEiT/aifood/' \
- -eval \
- -pretrained='/home/aistudio/BEiT/output/train-20220511-00-46/Epoch-100-Loss-0.9632747001647949.pdparams' \
- -amp
In [ ]
- # validate the official pretrained model
- !cd ~/BEiT/ && python main_gpu_finetune.py \
- -cfg='./configs/beit_base_patch16_224.yaml' \
- -dataset='imagenet2012' \
- -batch_size=256 \
- -data_path='/home/aistudio/BEiT/aifood/' \
- -eval \
- -pretrained='/home/aistudio/data/data144298/beit_base_patch16_224_ft22kto1k.pdparams' \
- -amp
In [ ]
- # validate the model after finetuning
- !cd ~/BEiT/ && python main_gpu_finetune.py \
- -cfg='./configs/beit_base_patch16_224.yaml' \
- -dataset='imagenet2012' \
- -batch_size=256 \
- -data_path='/home/aistudio/BEiT/aifood/' \
- -eval \
- -pretrained='/home/aistudio/BEiT/output/train-20220511-09-34/Epoch-15-Loss-0.2563522930145264.pdparams' \
- -amp
If you have read this far, you truly love BEiT!
When we run BEiT training from a terminal with PaddleViT, or with PASSL, Paddle's self-supervised library, it always feels like glimpsing the dragon's head but never its tail: what exactly is this thing, and how are the ideas of the paper implemented in Paddle code? We can neither see nor touch any of it.
To make the code easy to browse and study, the BEiT code is split into notebook cells below, each followed by a small piece of verification code. Observing the output shapes helps us understand the code.
In [ ]
- import numpy as np
- np.random.seed(42)
In [ ]
- # Copyright (c) 2021 PPViT Authors. All Rights Reserved.
- #
- # Licensed under the Apache License, Version 2.0 (the "License");
- # you may not use this file except in compliance with the License.
- # You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
-
- """
- Droppath, reimplement from https://github.com/yueatsprograms/Stochastic_Depth
- """
- import paddle
- import paddle.nn as nn
-
-
- class DropPath(nn.Layer):
- """DropPath class"""
- def __init__(self, drop_prob=None):
- super().__init__()
- self.drop_prob = drop_prob
-
- def drop_path(self, inputs):
- """drop path op, using self.drop_prob and self.training
- Args:
- inputs: tensor with arbitrary shape
- Returns:
- output: output tensor after drop path
- """
- # if prob is 0 or eval mode, return original input
- if self.drop_prob == 0. or not self.training:
- return inputs
- keep_prob = 1 - self.drop_prob
- keep_prob = paddle.to_tensor(keep_prob, dtype='float32')
- shape = (inputs.shape[0], ) + (1, ) * (inputs.ndim - 1) # shape=(N, 1, 1, 1)
- random_tensor = keep_prob + paddle.rand(shape, dtype=inputs.dtype)
- random_tensor = random_tensor.floor() # mask
- output = inputs.divide(keep_prob) * random_tensor # divide to keep same output expectation
- return output
-
- def forward(self, inputs):
- return self.drop_path(inputs)
-
-
- def main():
- tmp = paddle.to_tensor(np.random.rand(8, 16, 8, 8), dtype='float32')
- dp = DropPath(0.5)
- out = dp(tmp)
- print(out.shape)
-
- if __name__ == "__main__":
- main()
-
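A quick sanity check on DropPath's scaling: with drop_prob=0.5, roughly half the samples in the batch are zeroed, and the survivors are divided by keep_prob=0.5 (i.e. doubled), so the expected value of the output equals the input. That is what the comment "divide to keep same output expectation" refers to.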
In [ ]
- # Copyright (c) 2021 PPViT Authors. All Rights Reserved.
- #
- # Licensed under the Apache License, Version 2.0 (the "License");
- # you may not use this file except in compliance with the License.
- # You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
-
- """
- BEiT in Paddle
- A Paddle Implementation of BEiT as described in:
- "BEiT: BERT Pre-Training of Image Transformers"
- - Paper Link: https://arxiv.org/abs/2106.08254
- """
- import math
- import copy
- from functools import partial
- import paddle
- import paddle.nn as nn
- import paddle.nn.functional as F
- # from droppath import DropPath
-
- trunc_normal_ = nn.initializer.TruncatedNormal(std=0.02)
- zeros_ = nn.initializer.Constant(value=0.0)
- ones_ = nn.initializer.Constant(value=1.0)
-
-
- class Mlp(nn.Layer):
- """MLP module
- MLP using nn.Linear and activation is GELU, dropout is applied.
- Ops: fc1 -> act -> dropout -> fc2 -> dropout
- """
-
- def __init__(self,
- in_features,
- hidden_features=None,
- out_features=None,
- act_layer=nn.GELU,
- drop=0.0):
- super().__init__()
- out_features = out_features or in_features
- hidden_features = hidden_features or in_features
- self.fc1 = nn.Linear(in_features, hidden_features)
- self.act = act_layer()
- self.fc2 = nn.Linear(hidden_features, out_features)
- self.drop = nn.Dropout(drop)
-
- def forward(self, x):
- x = self.fc1(x)
- x = self.act(x)
- x = self.drop(x)
- x = self.fc2(x)
- x = self.drop(x)
- return x
-
- def main():
- tmp = paddle.to_tensor(np.random.rand(8, 16), dtype='float32')
- mlp = Mlp(16, 32, 512)
- out = mlp(tmp)
- print(out.shape)
-
- if __name__ == "__main__":
- main()
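The test feeds an [8, 16] tensor through Mlp(16, 32, 512), i.e. in_features=16, hidden_features=32, out_features=512, so the printed shape should be [8, 512].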
In [ ]
-
- class PatchEmbed(nn.Layer):
- """2D Image to Patch Embedding
- Apply patch embeddings on input images. Embeddings is implemented using a Conv2D op.
- """
- def __init__(self,
- img_size=224,
- patch_size=16,
- in_chans=3,
- embed_dim=768,
- norm_layer=None,
- flatten=True):
- super().__init__()
- img_size = (img_size, img_size)
- patch_size = (patch_size, patch_size)
- self.img_size = img_size
- self.patch_size = patch_size
- self.grid_size = (img_size[0] // patch_size[0], img_size[1] // patch_size[1])
- self.num_patches = self.grid_size[0] * self.grid_size[1]
- self.flatten = flatten
-
- self.proj = nn.Conv2D(
- in_chans, embed_dim, kernel_size=patch_size, stride=patch_size
- )
- self.norm = norm_layer(embed_dim) if norm_layer else Identity()
-
- def forward(self, x):
- B, C, H, W = x.shape
- assert (
- H == self.img_size[0] and W == self.img_size[1]
- ), f"Input image size ({H}*{W}) doesn't match model ({self.img_size[0]}*{self.img_size[1]})"
- x = self.proj(x)
- # print(x.shape)
- if self.flatten:
- x = x.flatten(2).transpose((0, 2, 1)) # BCHW -> BNC
- # print(x.shape)
- x = self.norm(x)
- return x
-
-
- class Identity(nn.Layer):
- """Identity layer
- The output of this layer is the input without any change.
- Use this layer to avoid if condition in some forward methods
- """
- def forward(self, inputs):
- return inputs
-
- def main():
- import numpy as np
- tmp = paddle.to_tensor(np.random.rand(16, 3, 224, 224), dtype=paddle.float32)
- # print(tmp.shape, tmp.size)
- patchembed = PatchEmbed(flatten=True)
- out = patchembed(tmp)
- print(out.shape)
-
- if __name__ == "__main__":
- main()
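With the defaults, a 224x224 image and 16x16 patches give a 14x14 grid, i.e. 196 patches, so the printed shape should be [16, 196, 768].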
In [ ]
-
- class Attention(nn.Layer):
- """Attention Layer"""
- def __init__(self,
- dim,
- num_heads=8,
- qkv_bias=False,
- attn_drop=0.0,
- proj_drop=0.0,
- window_size=None,
- attn_head_dim=None):
- super().__init__()
- self.num_heads = num_heads
- head_dim = dim // num_heads
- if attn_head_dim is not None:
- head_dim = attn_head_dim
- all_head_dim = head_dim * self.num_heads
- self.scale = head_dim ** -0.5
-
- self.qkv = nn.Linear(dim, all_head_dim * 3, bias_attr=False)
- if qkv_bias:
- self.q_bias = paddle.create_parameter(
- shape=[all_head_dim], dtype="float32", default_initializer=zeros_
- )
-
- self.v_bias = paddle.create_parameter(
- shape=[all_head_dim], dtype="float32", default_initializer=zeros_
- )
- else:
- self.q_bias = None
- self.v_bias = None
-
- if window_size:
- self.window_size = window_size
- self.num_relative_distance = (2 * window_size[0] - 1) * (
- 2 * window_size[1] - 1
- ) + 3
-
- self.relative_position_bias_table = paddle.create_parameter(
- shape=[self.num_relative_distance, num_heads],
- dtype="float32",
- default_initializer=zeros_,
- ) # 2*Wh-1 * 2*Ww-1, nH
- # cls to token & token 2 cls & cls to cls
-
- # get pair-wise relative position index for each token inside the window
- coords_h = paddle.arange(window_size[0])
- coords_w = paddle.arange(window_size[1])
- coords = paddle.stack(paddle.meshgrid([coords_h, coords_w])) # 2, Wh, Ww
- coords_flatten = paddle.flatten(coords, 1) # 2, Wh*Ww
- relative_coords = coords_flatten.unsqueeze(
- axis=2
- ) - coords_flatten.unsqueeze(
- axis=1
- ) # 2, Wh*Ww, Wh*Ww #??
- relative_coords = relative_coords.transpose([1, 2, 0]) # Wh*Ww, Wh*Ww, 2
- # print(f"relative_coords[:, :, 0] relative_coords.shape{relative_coords.shape}window_size[0] - 1{window_size[0] - 1}")
- # print(f"==relative_coords type:{relative_coords.dtype}")
- relative_coords[:, :, 0] += window_size[0] - 1 # shift to start from 0
- relative_coords[:, :, 1] += window_size[1] - 1
- relative_coords[:, :, 0] *= 2 * window_size[1] - 1
- relative_position_index = paddle.zeros(
- [
- window_size[0] * window_size[1] + 1,
- window_size[0] * window_size[1] + 1,
- ],
- dtype=relative_coords.dtype,
- )
- # Wh*Ww, Wh*Ww
- relative_position_index[1:, 1:] = relative_coords.sum(-1)
- relative_position_index[0, 0:] = self.num_relative_distance - 3
- relative_position_index[0:, 0] = self.num_relative_distance - 2
- relative_position_index[0, 0] = self.num_relative_distance - 1
- # print(f"==relative_position_index .stop_gradient:{relative_position_index.stop_gradient}")
- self.register_buffer("relative_position_index", relative_position_index)
- # print(f"==relative_position_index .stop_gradient:{relative_position_index.stop_gradient}")
-
- else:
- self.window_size = None
- self.relative_position_bias_table = None
- self.relative_position_index = None
-
- self.attn_drop = nn.Dropout(attn_drop)
- self.proj = nn.Linear(all_head_dim, dim)
- self.proj_drop = nn.Dropout(proj_drop)
-
- def forward(self, x, rel_pos_bias):
- B, N, C = x.shape
- qkv_bias = None
- if self.q_bias is not None:
- # print(f"==concat {self.q_bias.shape, paddle.zeros_like(self.v_bias).shape, self.v_bias.shape}")
- qkv_bias = paddle.concat(
- (self.q_bias, paddle.zeros_like(self.v_bias), self.v_bias)
- )
- # print(f"==qkv = mslinear {x.shape, self.qkv.weight.shape}")
- qkv = F.linear(x=x, weight=self.qkv.weight, bias=qkv_bias)
- # print(f"==paddle.shape(x)[0]{paddle.shape(x), paddle.shape(x)[0]}")
- qkv = qkv.reshape([paddle.shape(x)[0], paddle.shape(x)[1], 3, self.num_heads, -1]).transpose([2, 0, 3, 1, 4])
- #qkv = qkv.reshape([B, N, 3, self.num_heads, -1]).transpose([2, 0, 3, 1, 4])
- # make torchscript happy (cannot use tensor as tuple)
- q, k, v = qkv[0], qkv[1], qkv[2]
-
- q = q * self.scale
- # print("==q k:", q.shape, k.shape)
- attn = q @ k.transpose([0, 1, 3, 2])
-
- if self.relative_position_bias_table is not None:
- relative_position_bias = self.relative_position_bias_table[
- self.relative_position_index.reshape([-1])
- ].reshape(
- [
- self.window_size[0] * self.window_size[1] + 1,
- self.window_size[0] * self.window_size[1] + 1,
- -1,
- ]
- ) # Wh*Ww,Wh*Ww,nH
- relative_position_bias = relative_position_bias.transpose(
- [2, 0, 1]
- ) # nH, Wh*Ww, Wh*Ww
-
- attn = attn + relative_position_bias.unsqueeze(axis=0)
-
- if rel_pos_bias is not None:
- attn = attn + rel_pos_bias
-
- attn = F.softmax(attn, axis=-1)
- attn = self.attn_drop(attn)
-
- x = (attn @ v).transpose([0, 2, 1, 3]).reshape([paddle.shape(x)[0], paddle.shape(x)[1], -1])
- x = self.proj(x)
- x = self.proj_drop(x)
- return x
-
- def main():
- import numpy as np
- tmp = paddle.to_tensor(np.random.rand(196, 16, 768), dtype=paddle.float32)
- # print(tmp.shape, tmp.size)
- attention = Attention(dim=768 )
- out = attention(tmp, rel_pos_bias=0.1)
- print(out.shape)
-
- if __name__ == "__main__":
- main()
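A note on the size of the relative position bias table: for the default 224/16 setup, the window is 14x14, so num_relative_distance = (2*14-1)*(2*14-1) + 3 = 729 + 3 = 732; the three extra entries hold the cls-to-token, token-to-cls, and cls-to-cls biases mentioned in the comments above.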
In [ ]
- class Block(nn.Layer):
- def __init__(self,
- dim,
- num_heads,
- mlp_ratio=4.0,
- qkv_bias=False,
- drop=0.0,
- attn_drop=0.0,
- drop_path=0.0,
- init_values=None,
- act_layer=nn.GELU,
- norm_layer=nn.LayerNorm,
- window_size=None,
- attn_head_dim=None):
- super().__init__()
- self.norm1 = norm_layer(dim)
- self.attn = Attention(
- dim,
- num_heads=num_heads,
- qkv_bias=qkv_bias,
- attn_drop=attn_drop,
- proj_drop=drop,
- window_size=window_size,
- attn_head_dim=attn_head_dim,
- )
- self.drop_path = DropPath(drop_path) if drop_path > 0.0 else Identity()
- self.norm2 = norm_layer(dim)
- mlp_hidden_dim = int(dim * mlp_ratio)
- self.mlp = Mlp(
- in_features=dim,
- hidden_features=mlp_hidden_dim,
- act_layer=act_layer,
- drop=drop,
- )
-
- if init_values:
- self.gamma_1 = paddle.create_parameter(
- shape=[dim],
- dtype="float32",
- default_initializer=nn.initializer.Constant(value=init_values),
- )
- self.gamma_2 = paddle.create_parameter(
- shape=[dim],
- dtype="float32",
- default_initializer=nn.initializer.Constant(value=init_values),
- )
- else:
- self.gamma_1, self.gamma_2 = None, None
-
- def forward(self, x, rel_pos_bias):
- if self.gamma_1 is None:
- x = x + self.drop_path(self.attn(self.norm1(x), rel_pos_bias=rel_pos_bias))
- x = x + self.drop_path(self.mlp(self.norm2(x)))
- else:
- x = x + self.drop_path(
- self.gamma_1 * self.attn(self.norm1(x), rel_pos_bias=rel_pos_bias)
- )
- x = x + self.drop_path(self.gamma_2 * self.mlp(self.norm2(x)))
- return x
-
- def main():
- import numpy as np
- tmp = paddle.to_tensor(np.random.rand(196, 16, 768), dtype=paddle.float32)
- # print(tmp.shape, tmp.size)
- block = Block(dim=768, num_heads=12 )
- out = block(tmp, rel_pos_bias=0.1)
- print(out.shape)
-
- if __name__ == "__main__":
- main()
This class is only used when use_shared_rel_pos_bias is enabled; in this project it is not called.
In [ ]
-
- class RelativePositionBias(nn.Layer):
- def __init__(self, window_size, num_heads):
- super().__init__()
- self.window_size = window_size
- self.num_relative_distance = (2 * window_size[0] - 1) * (
- 2 * window_size[1] - 1
- ) + 3
-
- self.relative_position_bias_table = paddle.create_parameter(
- shape=[self.num_relative_distance, num_heads],
- dtype="float32",
- default_initializer=zeros_,
- ) # 2*Wh-1 * 2*Ww-1, nH
- # cls to token & token 2 cls & cls to cls
-
- # get pair-wise relative position index for each token inside the window
- coords_h = paddle.arange(window_size[0])
- coords_w = paddle.arange(window_size[1])
- coords = paddle.stack(paddle.meshgrid([coords_h, coords_w])) # 2, Wh, Ww
- coords_flatten = paddle.flatten(coords, 1) # 2, Wh*Ww
- relative_coords = coords_flatten.unsqueeze(axis=2) - coords_flatten.unsqueeze(
- axis=1
- ) # 2, Wh*Ww, Wh*Ww
- relative_coords = relative_coords.transpose([1, 2, 0]) # Wh*Ww, Wh*Ww, 2
- relative_coords[:, :, 0] += window_size[0] - 1 # shift to start from 0
- relative_coords[:, :, 1] += window_size[1] - 1
- relative_coords[:, :, 0] *= 2 * window_size[1] - 1
- relative_position_index = paddle.zeros(
- [window_size[0] * window_size[1] + 1, window_size[0] * window_size[1] + 1],
- dtype=relative_coords.dtype, # integer dtype, as in the Attention class above, so it can be used as an index
- )
- relative_position_index[1:, 1:] = relative_coords.sum(-1) # Wh*Ww, Wh*Ww
- relative_position_index[0, 0:] = self.num_relative_distance - 3
- relative_position_index[0:, 0] = self.num_relative_distance - 2
- relative_position_index[0, 0] = self.num_relative_distance - 1
-
- self.register_buffer("relative_position_index", relative_position_index)
-
- # trunc_normal_(self.relative_position_bias_table, std=.02)
-
- def forward(self):
- relative_position_bias = self.relative_position_bias_table[
- self.relative_position_index.reshape([-1])].reshape(
- [self.window_size[0] * self.window_size[1] + 1,
- self.window_size[0] * self.window_size[1] + 1, -1]) # Wh*Ww,Wh*Ww,nH; reshape takes a list
- return relative_position_bias.transpose([2, 0, 1]) # nH, Wh*Ww, Wh*Ww
In [ ]
- class Beit(nn.Layer):
- """Beit Layer"""
- def __init__(self,
- img_size=224,
- patch_size=16,
- in_chans=3,
- num_classes=1000,
- embed_dim=768,
- depth=12,
- num_heads=12,
- mlp_ratio=4.0,
- qkv_bias=True,
- drop_rate=0.0,
- attn_drop_rate=0.0,
- drop_path_rate=0.0,
- norm_layer=partial(nn.LayerNorm, epsilon=1e-6),
- init_values=None,
- use_abs_pos_emb=True,
- use_rel_pos_bias=False,
- use_shared_rel_pos_bias=False,
- use_mean_pooling=True,
- init_scale=0.001):
- super().__init__()
- self.num_classes = num_classes
- # num_features for consistency with other models
- self.num_features = self.embed_dim = embed_dim
-
- self.patch_embed = PatchEmbed(
- img_size=img_size,
- patch_size=patch_size,
- in_chans=in_chans,
- embed_dim=embed_dim,
- )
- num_patches = self.patch_embed.num_patches
-
- self.cls_token = paddle.create_parameter(
- shape=[1, 1, embed_dim],
- dtype="float32",
- default_initializer=trunc_normal_,
- )
-
- if use_abs_pos_emb:
- self.pos_embed = paddle.create_parameter(
- shape=[1, num_patches + 1, embed_dim],
- dtype="float32",
- default_initializer=trunc_normal_,
- )
- else:
- self.pos_embed = None
- self.pos_drop = nn.Dropout(p=drop_rate)
-
- if use_shared_rel_pos_bias:
- self.rel_pos_bias = RelativePositionBias(
- window_size=self.patch_embed.grid_size, num_heads=num_heads
- )
- else:
- self.rel_pos_bias = None
-
- # stochastic depth decay rule
- dpr = [x.item() for x in paddle.linspace(0, drop_path_rate, depth)]
- self.use_rel_pos_bias = use_rel_pos_bias
- self.blocks = nn.LayerList(
- [
- Block(
- dim=embed_dim,
- num_heads=num_heads,
- mlp_ratio=mlp_ratio,
- qkv_bias=qkv_bias,
- drop=drop_rate,
- attn_drop=attn_drop_rate,
- drop_path=dpr[i],
- norm_layer=norm_layer,
- init_values=init_values,
- window_size=self.patch_embed.grid_size if use_rel_pos_bias else None,
- )
- for i in range(depth)
- ]
- )
- self.norm = Identity() if use_mean_pooling else norm_layer(embed_dim)
- self.fc_norm = norm_layer(embed_dim) if use_mean_pooling else None
- self.head = nn.Linear(embed_dim, num_classes) if num_classes > 0 else Identity()
-
- self.apply(self._init_weights)
- self.fix_init_weight()
- if isinstance(self.head, nn.Linear):
- trunc_normal_(self.head.weight)
- self.head.weight.set_value(
- self.head.weight.multiply(paddle.to_tensor(init_scale))
- )
- self.head.bias.set_value(
- self.head.bias.multiply(paddle.to_tensor(init_scale))
- )
-
- def fix_init_weight(self):
- def rescale(param, layer_id):
- param.set_value(param.divide(paddle.to_tensor(math.sqrt(2.0 * layer_id))))
-
- for layer_id, layer in enumerate(self.blocks):
- rescale(layer.attn.proj.weight, layer_id + 1)
- rescale(layer.mlp.fc2.weight, layer_id + 1)
-
- def _init_weights(self, m):
- if isinstance(m, nn.Linear):
- trunc_normal_(m.weight)
- if isinstance(m, nn.Linear) and m.bias is not None:
- zeros_(m.bias)
- elif isinstance(m, nn.LayerNorm):
- zeros_(m.bias)
- ones_(m.weight)
-
- def get_num_layers(self):
- return len(self.blocks)
-
- def get_classifier(self):
- return self.head
-
- def reset_classifier(self, num_classes):
- self.num_classes = num_classes
- self.head = (
- nn.Linear(self.embed_dim, num_classes) if num_classes > 0 else Identity()
- )
-
- def forward_features(self, x):
- x = self.patch_embed(x)
- batch_size, seq_len, _ = x.shape
-
- #cls_tokens = self.cls_token.expand([batch_size, 1, self.embed_dim])
- cls_tokens = self.cls_token.expand([paddle.shape(x)[0], 1, self.embed_dim])
- #cls_tokens = self.cls_token.expand([batch_size, -1, -1])
-
- x = paddle.concat((cls_tokens, x), axis=1)
-
- if self.pos_embed is not None:
- x = x + self.pos_embed
- x = self.pos_drop(x)
-
- rel_pos_bias = self.rel_pos_bias() if self.rel_pos_bias is not None else None
- for blk in self.blocks:
- x = blk(x, rel_pos_bias=rel_pos_bias)
-
- x = self.norm(x)
- if self.fc_norm is not None:
- t = x[:, 1:, :]
- return self.fc_norm(t.mean(1))
-
- return x[:, 0]
-
- def forward(self, x):
- x = self.forward_features(x)
- x = self.head(x)
- return x
-
-
- def build_beit(config):
- """ build beit from config"""
- model = Beit(
- img_size=config.DATA.IMAGE_SIZE,
- num_classes=config.MODEL.NUM_CLASSES,
- patch_size=config.MODEL.PATCH_SIZE,
- embed_dim=config.MODEL.EMBED_DIM,
- depth=config.MODEL.DEPTH,
- num_heads=config.MODEL.NUM_HEADS,
- mlp_ratio=config.MODEL.MLP_RATIO,
- use_abs_pos_emb=config.MODEL.USE_ABS_POS_EMB,
- use_rel_pos_bias=config.MODEL.USE_REL_POS_BIAS,
- init_values=config.MODEL.INIT_VALUES,
- qkv_bias=config.MODEL.QKV_BIAS,
- )
- return model
-
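One detail worth noticing in forward_features: with use_mean_pooling=True (the default), the classification feature is fc_norm applied to the mean of the patch tokens (x[:, 1:, :].mean(1)); otherwise the normalized cls token x[:, 0] is used.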
In [ ]
!pip install yacs -q
In [ ]
- # Copyright (c) 2021 PPViT Authors. All Rights Reserved.
- #
- # Licensed under the Apache License, Version 2.0 (the "License");
- # you may not use this file except in compliance with the License.
- # You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
-
- """Configuration
- Configurations for (1) data processing, (2) model archtecture, and (3) training settings, etc.
- Config can be set by .yaml file or by argparser
- """
- import os
- from yacs.config import CfgNode as CN
- import yaml
-
- _C = CN()
- _C.BASE = ['']
-
- # data settings
- _C.DATA = CN()
- _C.DATA.BATCH_SIZE = 256 # train batch_size on single GPU
- _C.DATA.BATCH_SIZE_EVAL = None # (disabled in update_config) val batch_size on single GPU
- _C.DATA.DATA_PATH = '/dataset/imagenet/' # path to dataset
- _C.DATA.DATASET = 'imagenet2012' # dataset name, currently only support imagenet2012
- _C.DATA.IMAGE_SIZE = 224 # input image size e.g., 224
- _C.DATA.SECOND_IMAGE_SIZE = 112 # 2nd input image size e.g., 112
- _C.DATA.IMAGE_CHANNELS = 3 # input image channels: e.g., 3
- _C.DATA.CROP_PCT = 0.875 # input image scale ratio, scale is applied before centercrop in eval mode
- _C.DATA.NUM_WORKERS = 1 # number of data loading threads
- _C.DATA.IMAGENET_MEAN = [0.5, 0.5, 0.5] # [0.485, 0.456, 0.406] # imagenet mean values
- _C.DATA.IMAGENET_STD = [0.5, 0.5, 0.5] # [0.229, 0.224, 0.225] # imagenet std values
-
- # model general settings
- _C.MODEL = CN()
- _C.MODEL.TYPE = 'beit'
- _C.MODEL.VAE_TYPE = 'dall-e'
- _C.MODEL.NAME = 'beit'
- _C.MODEL.RESUME = None # full model path for resume training
- _C.MODEL.PRETRAINED = None # full model path for finetuning
- _C.MODEL.NUM_CLASSES = 10 # num of classes for classifier # 1000
- _C.MODEL.DROPOUT = 0.0
- _C.MODEL.ATTENTION_DROPOUT = 0.0
- _C.MODEL.DROPPATH = 0.1
- # model transformer settings
- _C.MODEL.PATCH_SIZE = 16
- _C.MODEL.EMBED_DIM = 768
- _C.MODEL.NUM_HEADS = 12
- _C.MODEL.ATTN_HEAD_SIZE = None # if None, use embed_dim // num_heads as head dim
- _C.MODEL.DEPTH = 12
- _C.MODEL.QK_SCALE = None
- _C.MODEL.QKV_BIAS = True
- _C.MODEL.MLP_RATIO = 4.0 # for cait class_token ratio also set to MLP_RATIO
- _C.MODEL.USE_ABS_POS_EMB = False
- _C.MODEL.USE_REL_POS_BIAS = True
- _C.MODEL.INIT_VALUES = 1e-4
-
-
- # training settings
- _C.TRAIN = CN()
- _C.TRAIN.LAST_EPOCH = 0
- _C.TRAIN.NUM_EPOCHS = 100
- _C.TRAIN.WARMUP_EPOCHS = 20
- _C.TRAIN.WEIGHT_DECAY = 0.05
- _C.TRAIN.LAYER_DECAY = 0.65
- _C.TRAIN.BASE_LR = 4e-3
- _C.TRAIN.WARMUP_START_LR = 0.0
- _C.TRAIN.END_LR = 1e-6
- _C.TRAIN.GRAD_CLIP = None
- _C.TRAIN.ACCUM_ITER = 1
- _C.TRAIN.LINEAR_SCALED_LR = 512
-
- # optimizer
- _C.TRAIN.OPTIMIZER = CN()
- _C.TRAIN.OPTIMIZER.NAME = 'AdamWDL'
- _C.TRAIN.OPTIMIZER.EPS = 1e-8
- _C.TRAIN.OPTIMIZER.BETAS = (0.9, 0.999)
-
- # model ema
- _C.TRAIN.MODEL_EMA = True
- _C.TRAIN.MODEL_EMA_DECAY = 0.9999
- _C.TRAIN.MODEL_EMA_FORCE_CPU = False
-
- # data augmentation (optional, check datasets.py)
- _C.TRAIN.SMOOTHING = 0.1
- _C.TRAIN.COLOR_JITTER = 0.4 # if both auto augment and rand augment are False, use color jitter
- _C.TRAIN.AUTO_AUGMENT = False # rand augment is used if both rand and auto augment are set True
- _C.TRAIN.RAND_AUGMENT = True
- _C.TRAIN.RAND_AUGMENT_LAYERS = 2
- _C.TRAIN.RAND_AUGMENT_MAGNITUDE = 9 # scale from 0 to 9
- # mixup params (optional, check datasets.py)
- _C.TRAIN.MIXUP_ALPHA = 0.8
- _C.TRAIN.MIXUP_PROB = 1.0
- _C.TRAIN.MIXUP_SWITCH_PROB = 0.5
- _C.TRAIN.MIXUP_MODE = 'batch'
- _C.TRAIN.CUTMIX_ALPHA = 1.0
- _C.TRAIN.CUTMIX_MINMAX = None
- # random erase params (optional, check datasets.py)
- _C.TRAIN.RANDOM_ERASE_PROB = 0.25
- _C.TRAIN.RANDOM_ERASE_MODE = 'pixel'
- _C.TRAIN.RANDOM_ERASE_COUNT = 1
- _C.TRAIN.RANDOM_ERASE_SPLIT = False
-
- # misc
- _C.SAVE = "./output" # output folder, saves logs and weights
- _C.SAVE_FREQ = 15 # freq to save chpt
- _C.REPORT_FREQ = 20 # freq to logging info
- _C.VALIDATE_FREQ = 1 # freq to do validation
- _C.SEED = 0 # random seed
- _C.EVAL = False # run evaluation only
- _C.AMP = False # auto mix precision training
-
-
- def _update_config_from_file(config, cfg_file):
- """Load cfg file (.yaml) and update config object
- Args:
- config: config object
- cfg_file: config file (.yaml)
- Return:
- None
- """
- config.defrost()
- with open(cfg_file, 'r') as infile:
- yaml_cfg = yaml.load(infile, Loader=yaml.FullLoader)
- for cfg in yaml_cfg.setdefault('BASE', ['']):
- if cfg:
- _update_config_from_file(
- config, os.path.join(os.path.dirname(cfg_file), cfg)
- )
- config.merge_from_file(cfg_file)
- config.freeze()
-
-
- def update_config(config, args):
- """Update config by ArgumentParser
- Configs that are often used can be updated from arguments
- Args:
- args: ArgumentParser contains options
- Return:
- config: updated config
- """
- if args.cfg:
- _update_config_from_file(config, args.cfg)
- config.defrost()
- if args.dataset:
- config.DATA.DATASET = args.dataset
- if args.batch_size:
- config.DATA.BATCH_SIZE = args.batch_size
- config.DATA.BATCH_SIZE_EVAL = args.batch_size
- if args.batch_size_eval:
- config.DATA.BATCH_SIZE_EVAL = args.batch_size_eval
- if args.image_size:
- config.DATA.IMAGE_SIZE = args.image_size
- if args.accum_iter:
- config.TRAIN.ACCUM_ITER = args.accum_iter
- if args.data_path:
- config.DATA.DATA_PATH = args.data_path
- if args.output:
- config.SAVE = args.output
- if args.eval:
- config.EVAL = True
- if args.pretrained:
- config.MODEL.PRETRAINED = args.pretrained
- if args.resume:
- config.MODEL.RESUME = args.resume
- if args.last_epoch:
- config.TRAIN.LAST_EPOCH = args.last_epoch
- if args.amp: # only for training
- config.AMP = not config.EVAL
- # config.freeze()
- return config
-
-
- def get_config(cfg_file=None):
- """Return a clone of config and optionally overwrite it from yaml file"""
- config = _C.clone()
- if cfg_file:
- _update_config_from_file(config, cfg_file)
- return config
Build the model according to the args, modifying the argparse code so that it can run inside a notebook.
The change: pass at least one argument into the parsing call, e.g. arguments = parser.parse_args(['-cfg', "beit_base_patch16_224.yaml"]).
In [ ]
- import argparse
- def get_arguments():
- """return argumeents, this will overwrite the config by (1) yaml file (2) argument values"""
- parser = argparse.ArgumentParser('BEiT finetune')
- parser.add_argument('-cfg', type=str, default=None)
- parser.add_argument('-dataset', type=str, default=None)
- parser.add_argument('-data_path', type=str, default=None)
- parser.add_argument('-output', type=str, default=None)
- parser.add_argument('-batch_size', type=int, default=None)
- parser.add_argument('-batch_size_eval', type=int, default=None)
- parser.add_argument('-image_size', type=int, default=None)
- parser.add_argument('-accum_iter', type=int, default=None)
- parser.add_argument('-pretrained', type=str, default=None)
- parser.add_argument('-resume', type=str, default=None)
- parser.add_argument('-last_epoch', type=int, default=None)
- parser.add_argument('-eval', action='store_true')
- parser.add_argument('-amp', action='store_true')
- arguments = parser.parse_args(['-cfg', "BEiT/beit_base_patch16_224.yaml"])
- return arguments
- 
- config = update_config(get_config(), get_arguments())
- build_model = build_beit
- model = build_model(config)
Feeding a random tensor into the model, we can see that the output shape is [8, 1000], where 8 is the batch_size and 1000 is the number of classes.
One last test below, and our walkthrough of the code is complete!
In [ ]
- images = paddle.randn([8, 3, 224, 224])
- label = 2
-
- output = model(images)
- print(output.shape)
That wraps up our study of the BEiT code!
Thank you all for your hard work!
To close, a few notes on the Paddle ops used above. paddle.linspace returns a Tensor whose values are num evenly spaced points over the interval from start to stop; the output Tensor has length num.
In [ ]
- drop_path_rate=0.5
- depth = 8
- tmp = paddle.linspace(0, drop_path_rate, depth)
- print(tmp)
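Here the 8 values run from 0 to 0.5 in equal steps of 0.5/7 ≈ 0.0714. This is exactly how the Beit class builds dpr, assigning a linearly increasing drop-path rate to deeper blocks (the stochastic depth decay rule).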
paddle.nn.functional.linear is defined as paddle.matmul(x, weight) + bias.
The code below shows that the two computations are equivalent.
In [ ]
- import paddle
- 
- x = paddle.ones([3, 2]) * 2 # x: all 2.0, shape [3, 2]
- weight = paddle.full(shape=[2, 4], fill_value=0.5, dtype="float32", name="weight")
- weight = weight * 4 # weight: all 2.0, shape [2, 4]
- bias = paddle.ones(shape=[4], dtype="float32", name="bias")
- bias = bias + 0.88 # bias: all 1.88, shape [4]
- y = paddle.nn.functional.linear(x, weight, bias)
- # y: all 9.88 (= 2*2 + 2*2 + 1.88), shape [3, 4]
- print(x.shape, y.shape)
- print(y == paddle.matmul(x, weight) + bias)
meshgrid and stack build, from two index ranges, coordinate tensors of the window size; these are stacked together and the last two dimensions are then flattened with flatten.
In [ ]
- window_size = [3, 4]
- coords_h = paddle.arange(window_size[0])
- coords_w = paddle.arange(window_size[1])
- # print(coords_h, coords_w)
- coords = paddle.stack(paddle.meshgrid([coords_h, coords_w])) # 2, Wh, Ww
- print(coords)
- coords_flatten = paddle.flatten(coords, 1) # 2, Wh*Ww
- print(coords_flatten)
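With window_size = [3, 4], coords has shape [2, 3, 4] (the stacked row-index and column-index grids) and coords_flatten has shape [2, 12].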
The coordinate tensor is unsqueezed along axis=2 and axis=1 respectively; subtracting the two and letting broadcasting do the work yields a 3-D tensor of relative coordinates of shape [2, Wh*Ww, Wh*Ww] (here [2, 12, 12]).
In [ ]
- relative_coords = coords_flatten.unsqueeze(
- axis=2
- ) - coords_flatten.unsqueeze(
- axis=1
- )
- # relative_coords = coords_flatten.unsqueeze(axis=2 )
- relative_coords
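nn.Layer.apply recursively applies a given function to every sublayer; the Beit class above uses exactly this mechanism for weight initialization (self.apply(self._init_weights)). The next cell demonstrates it on a small Sequential network.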
In [ ]
- import paddle
- import paddle.nn as nn
-
- net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
-
- def init_weights(layer):
- if type(layer) == nn.Linear:
- print('before init weight:', layer.weight.numpy())
- new_weight = paddle.full(shape=layer.weight.shape, dtype=layer.weight.dtype, fill_value=0.9)
- layer.weight.set_value(new_weight)
- print('after init weight:', layer.weight.numpy())
-
- net.apply(init_weights)
-
- print(net.state_dict())
paddle.expand broadcasts x to the shape specified by shape; after expansion, x's shape matches the given shape.
In [ ]
- import paddle
-
- data = paddle.to_tensor([1, 2, 3], dtype='int32')
- out = paddle.expand(data, shape=[2, 3])
- print(out)
- # [[1, 2, 3], [1, 2, 3]]
A few debugging notes from getting all this to run. The training code wants the 'AdamWDL' optimizer (see _C.TRAIN.OPTIMIZER.NAME in the config). My first reaction was to upgrade PaddleNLP to the latest version; the new version does have 'AdamWDL', but it throws the error below:
- [2022-05-05 22:35:44,247] [ WARNING] - Detected that datasets module was imported before paddlenlp. This may cause PaddleNLP datasets to be unavalible in intranetPlease import paddlenlp before datasets module to avoid download issues
- ...
- File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlenlp/datasets/dataset.py", line 48, in <module>
- from datasets import load_dataset as origin_load_dataset
- ImportError: cannot import name 'load_dataset' from 'datasets' (/home/aistudio/BEiT/datasets.py)
I couldn't get past it (the traceback shows the project's local BEiT/datasets.py shadowing the datasets package that paddlenlp tries to import), so I stopped importing paddlenlp altogether: I copied the needed functions into a tmpadam directory, did import tmpadam, and at training time use optimizer = tmpadam.AdamWDL.
Another snag: when run as a background task, the job seemed to stall; I couldn't tell whether it was a display problem or a genuine hang. Substituting the unzip command also hung. Baffled, I gave up on background-task mode and ran everything in the notebook; since the whole run only takes about 2 hours, skipping the background task costs little.
Training errored out complaining that shapes didn't line up. I checked the configuration carefully and found nothing wrong. It turned out that the Mixup function defaults to num_classes=1000; adding num_classes=config.TRAIN.NUM_CLASSES to the call, as below, solved the problem.
- if (config.TRAIN.MIXUP_PROB > 0 or config.TRAIN.CUTMIX_ALPHA > 0 or
- config.TRAIN.CUTMIX_MINMAX is not None):
- mixup_fn = Mixup(mixup_alpha=config.TRAIN.MIXUP_ALPHA,
- cutmix_alpha=config.TRAIN.CUTMIX_ALPHA,
- cutmix_minmax=config.TRAIN.CUTMIX_MINMAX,
- prob=config.TRAIN.MIXUP_PROB,
- switch_prob=config.TRAIN.MIXUP_SWITCH_PROB,
- mode=config.TRAIN.MIXUP_MODE,
- label_smoothing=config.TRAIN.SMOOTHING,
- num_classes=config.TRAIN.NUM_CLASSES)#
Citation for the BEiT paper:
- @article{beit,
- title={{BEiT}: {BERT} Pre-Training of Image Transformers},
- author={Hangbo Bao and Li Dong and Furu Wei},
- year={2021},
- eprint={2106.08254},
- archivePrefix={arXiv},
- primaryClass={cs.CV}
- }
Use PaddlePaddle and ride the new era! Let's row our paddles and brave the wind and waves on the ocean of AI!
PaddlePaddle official site: https://www.paddlepaddle.org.cn
My skill is limited and shortcomings are inevitable; I'd appreciate everyone's help.
Author: Duan Chunhua, online handle skywalk (天马行空), AI architect at Jining Jikuai Software Technology Co., Ltd. and a Baidu PaddlePaddle PPDE.
I've reached the Supreme level on AI Studio and lit up 11 badges; come follow me! https://aistudio.baidu.com/aistudio/personalcenter/thirdview/141218