
A simple and fun Transformer network: Vision Transformer (ViT), plug-and-play with your own dataset (network structure explained + fully commented code + core ideas), implemented in PyTorch


Paper: An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale
Original paper: https://arxiv.org/abs/2010.11929
The code in this post also plots the loss and accuracy curves for the training and test sets, which is convenient when writing a paper.

The Transformer was first applied in NLP, where it was hugely successful. NLP and CV are the two most active application areas of deep learning, and techniques constantly flow between them, so the Transformer's success in NLP naturally led researchers to ask whether it could also work for vision. Vision Transformer (ViT) is the answer, and, as expected, it made a big splash in CV as well: unlike traditional convolutional neural networks, ViT brings a new line of thinking to vision research with its distinctive architecture.

In this post we look at Vision Transformer. In principle it can outperform traditional convolutional networks, but only in principle; the actual result depends on the dataset and on how the model hyperparameters are tuned.

First, let's look at how it actually performs on common datasets.


Comparison with the state of the art on popular image classification benchmarks (mean and standard deviation of accuracy, averaged over three fine-tuning runs). The ViT models pre-trained on the JFT-300M dataset outperform the ResNet-based baselines on all datasets, while requiring substantially less compute to pre-train; ViT pre-trained on the smaller public ImageNet-21k dataset also performs well.

The network structure of Vision Transformer is shown below:

The overall structure of Vision Transformer is similar to a traditional convolutional network: features are extracted first and then classified. The difference lies in how the features are extracted. The input image is first split into patches, much like a sliding window stepping across the image at a fixed stride, which produces non-overlapping blocks.

The patch features are then flattened into a sequence of tokens, a learnable class token is prepended, position information is added to every token, the sequence is passed through a stack of Transformer encoder blocks, and finally the class token is used for classification.
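To make the shapes concrete, here is a minimal sketch of that front end (the tensor and layer names here are illustrative only; the real implementation is the PatchEmbed class in the code below). A 224x224 RGB image becomes 14x14 = 196 patch tokens of dimension 768, a class token is prepended, and position embeddings of the same shape are added:

import torch
from torch import nn

img = torch.randn(1, 3, 224, 224)                          # one RGB image
patchify = nn.Conv2d(3, 768, kernel_size=16, stride=16)    # 16x16 non-overlapping patches -> 768-dim embeddings
tokens = patchify(img).flatten(2).transpose(1, 2)          # [1, 768, 14, 14] -> [1, 196, 768]
cls_token = torch.zeros(1, 1, 768)                         # learnable class token (zeros here only to show the shape)
pos_embed = torch.zeros(1, 197, 768)                       # learnable position embeddings (zeros here only to show the shape)
x = torch.cat((cls_token, tokens), dim=1) + pos_embed      # [1, 197, 768] sequence fed into the Transformer encoder blocks
print(x.shape)                                             # torch.Size([1, 197, 768])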

I will cover Self-Attention and Multi-Head Attention in detail in a separate post; it is a bit involved and needs some preparation.
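Until that post is ready, the core of Self-Attention fits in a few lines. The following is only a simplified single-head sketch with random weight matrices of my own naming; the Attention class in the code below is the real multi-head version:

import torch

x = torch.randn(1, 197, 768)                      # token sequence from the patch embedding
Wq, Wk, Wv = (torch.randn(768, 768) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv                  # queries, keys, values
attn = (q @ k.transpose(-2, -1)) / 768 ** 0.5     # scaled dot-product similarity, [1, 197, 197]
attn = attn.softmax(dim=-1)                       # each token's weights over all tokens sum to 1
out = attn @ v                                    # weighted sum of the values, [1, 197, 768]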

This is a comparison of different Vision Transformer variants against ResNet on different datasets; with large-scale pre-training, Vision Transformer clearly outperforms ResNet.

Training code:

import torch
import torchvision.models
from matplotlib import pyplot as plt
from tqdm import tqdm
from torch import nn
from torch.utils.data import DataLoader
from torchvision.transforms import transforms
from functools import partial
from collections import OrderedDict
from typing import Optional, Callable

data_transform = {
    "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                 transforms.RandomHorizontalFlip(),
                                 transforms.ToTensor(),
                                 transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
    "val": transforms.Compose([transforms.Resize((224, 224)),  # must be (224, 224); a single 224 would only resize the shorter side
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

train_data = torchvision.datasets.ImageFolder(root="./data/train", transform=data_transform["train"])
traindata = DataLoader(dataset=train_data, batch_size=32, shuffle=True, num_workers=0)  # load the training data in batches of 32 images
test_data = torchvision.datasets.ImageFolder(root="./data/val", transform=data_transform["val"])
train_size = len(train_data)  # size of the training set
test_size = len(test_data)    # size of the test set
print(train_size)  # print the size of the training set, i.e. how many images it contains
print(test_size)   # print the size of the test set, i.e. how many images it contains
testdata = DataLoader(dataset=test_data, batch_size=32, shuffle=True, num_workers=0)  # load the test data in batches of 32 images

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("using {} device.".format(device))
def drop_path(x, drop_prob: float = 0., training: bool = False):
    """
    Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output


class DropPath(nn.Module):
    """
    Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    """
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)


class PatchEmbed(nn.Module):
    """
    2D Image to Patch Embedding
    """
    def __init__(self, img_size=224, patch_size=16, in_c=3, embed_dim=768, norm_layer=None):
        super().__init__()
        img_size = (img_size, img_size)
        patch_size = (patch_size, patch_size)
        self.img_size = img_size
        self.patch_size = patch_size
        self.grid_size = (img_size[0] // patch_size[0], img_size[1] // patch_size[1])
        self.num_patches = self.grid_size[0] * self.grid_size[1]
        self.proj = nn.Conv2d(in_c, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.norm = norm_layer(embed_dim) if norm_layer else nn.Identity()

    def forward(self, x):
        B, C, H, W = x.shape
        assert H == self.img_size[0] and W == self.img_size[1], \
            f"Input image size ({H}*{W}) doesn't match model ({self.img_size[0]}*{self.img_size[1]})."
        # flatten: [B, C, H, W] -> [B, C, HW]
        # transpose: [B, C, HW] -> [B, HW, C]
        x = self.proj(x).flatten(2).transpose(1, 2)
        x = self.norm(x)
        return x


class Attention(nn.Module):
    def __init__(self,
                 dim,   # embedding dimension of the input tokens
                 num_heads=8,
                 qkv_bias=False,
                 qk_scale=None,
                 attn_drop_ratio=0.,
                 proj_drop_ratio=0.):
        super(Attention, self).__init__()
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = qk_scale or head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop_ratio)
        self.proj = nn.Linear(dim, dim)
        self.proj_drop = nn.Dropout(proj_drop_ratio)

    def forward(self, x):
        # [batch_size, num_patches + 1, total_embed_dim]
        B, N, C = x.shape
        # qkv(): -> [batch_size, num_patches + 1, 3 * total_embed_dim]
        # reshape: -> [batch_size, num_patches + 1, 3, num_heads, embed_dim_per_head]
        # permute: -> [3, batch_size, num_heads, num_patches + 1, embed_dim_per_head]
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
        # [batch_size, num_heads, num_patches + 1, embed_dim_per_head]
        q, k, v = qkv[0], qkv[1], qkv[2]  # make torchscript happy (cannot use tensor as tuple)
        # transpose: -> [batch_size, num_heads, embed_dim_per_head, num_patches + 1]
        # @: multiply -> [batch_size, num_heads, num_patches + 1, num_patches + 1]
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        attn = self.attn_drop(attn)
        # @: multiply -> [batch_size, num_heads, num_patches + 1, embed_dim_per_head]
        # transpose: -> [batch_size, num_patches + 1, num_heads, embed_dim_per_head]
        # reshape: -> [batch_size, num_patches + 1, total_embed_dim]
        x = (attn @ v).transpose(1, 2).reshape(B, N, C)
        x = self.proj(x)
        x = self.proj_drop(x)
        return x
class Mlp(nn.Module):
    """
    MLP as used in Vision Transformer, MLP-Mixer and related networks
    """
    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = nn.Linear(in_features, hidden_features)
        self.act = act_layer()
        self.fc2 = nn.Linear(hidden_features, out_features)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


class Block(nn.Module):
    def __init__(self,
                 dim,
                 num_heads,
                 mlp_ratio=4.,
                 qkv_bias=False,
                 qk_scale=None,
                 drop_ratio=0.,
                 attn_drop_ratio=0.,
                 drop_path_ratio=0.,
                 act_layer=nn.GELU,
                 norm_layer=nn.LayerNorm):
        super(Block, self).__init__()
        self.norm1 = norm_layer(dim)
        self.attn = Attention(dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale,
                              attn_drop_ratio=attn_drop_ratio, proj_drop_ratio=drop_ratio)
        # NOTE: drop path for stochastic depth, we shall see if this is better than dropout here
        self.drop_path = DropPath(drop_path_ratio) if drop_path_ratio > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop_ratio)

    def forward(self, x):
        x = x + self.drop_path(self.attn(self.norm1(x)))
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x
class VisionTransformer(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_c=3, num_classes=1000,
                 embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.0, qkv_bias=True,
                 qk_scale=None, representation_size=None, distilled=False, drop_ratio=0.,
                 attn_drop_ratio=0., drop_path_ratio=0., embed_layer=PatchEmbed, norm_layer=None,
                 act_layer=None):
        """
        Args:
            img_size (int, tuple): input image size
            patch_size (int, tuple): patch size
            in_c (int): number of input channels
            num_classes (int): number of classes for classification head
            embed_dim (int): embedding dimension
            depth (int): depth of transformer
            num_heads (int): number of attention heads
            mlp_ratio (float): ratio of mlp hidden dim to embedding dim
            qkv_bias (bool): enable bias for qkv if True
            qk_scale (float): override default qk scale of head_dim ** -0.5 if set
            representation_size (Optional[int]): enable and set representation layer (pre-logits) to this value if set
            distilled (bool): model includes a distillation token and head as in DeiT models
            drop_ratio (float): dropout rate
            attn_drop_ratio (float): attention dropout rate
            drop_path_ratio (float): stochastic depth rate
            embed_layer (nn.Module): patch embedding layer
            norm_layer (nn.Module): normalization layer
        """
        super(VisionTransformer, self).__init__()
        self.num_classes = num_classes
        self.num_features = self.embed_dim = embed_dim  # num_features for consistency with other models
        self.num_tokens = 2 if distilled else 1
        norm_layer = norm_layer or partial(nn.LayerNorm, eps=1e-6)
        act_layer = act_layer or nn.GELU
        self.patch_embed = embed_layer(img_size=img_size, patch_size=patch_size, in_c=in_c, embed_dim=embed_dim)
        num_patches = self.patch_embed.num_patches
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.dist_token = nn.Parameter(torch.zeros(1, 1, embed_dim)) if distilled else None
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + self.num_tokens, embed_dim))
        self.pos_drop = nn.Dropout(p=drop_ratio)
        dpr = [x.item() for x in torch.linspace(0, drop_path_ratio, depth)]  # stochastic depth decay rule
        self.blocks = nn.Sequential(*[
            Block(dim=embed_dim, num_heads=num_heads, mlp_ratio=mlp_ratio, qkv_bias=qkv_bias, qk_scale=qk_scale,
                  drop_ratio=drop_ratio, attn_drop_ratio=attn_drop_ratio, drop_path_ratio=dpr[i],
                  norm_layer=norm_layer, act_layer=act_layer)
            for i in range(depth)
        ])
        self.norm = norm_layer(embed_dim)
        # Representation layer
        if representation_size and not distilled:
            self.has_logits = True
            self.num_features = representation_size
            self.pre_logits = nn.Sequential(OrderedDict([
                ("fc", nn.Linear(embed_dim, representation_size)),
                ("act", nn.Tanh())
            ]))
        else:
            self.has_logits = False
            self.pre_logits = nn.Identity()
        # Classifier head(s)
        self.head = nn.Linear(self.num_features, num_classes) if num_classes > 0 else nn.Identity()
        self.head_dist = None
        if distilled:
            self.head_dist = nn.Linear(self.embed_dim, self.num_classes) if num_classes > 0 else nn.Identity()
        # Weight init
        nn.init.trunc_normal_(self.pos_embed, std=0.02)
        if self.dist_token is not None:
            nn.init.trunc_normal_(self.dist_token, std=0.02)
        nn.init.trunc_normal_(self.cls_token, std=0.02)
        self.apply(_init_vit_weights)

    def forward_features(self, x):
        # [B, C, H, W] -> [B, num_patches, embed_dim]
        x = self.patch_embed(x)  # [B, 196, 768]
        # [1, 1, 768] -> [B, 1, 768]
        cls_token = self.cls_token.expand(x.shape[0], -1, -1)
        if self.dist_token is None:
            x = torch.cat((cls_token, x), dim=1)  # [B, 197, 768]
        else:
            x = torch.cat((cls_token, self.dist_token.expand(x.shape[0], -1, -1), x), dim=1)
        x = self.pos_drop(x + self.pos_embed)
        x = self.blocks(x)
        x = self.norm(x)
        if self.dist_token is None:
            return self.pre_logits(x[:, 0])
        else:
            return x[:, 0], x[:, 1]

    def forward(self, x):
        x = self.forward_features(x)
        if self.head_dist is not None:
            x, x_dist = self.head(x[0]), self.head_dist(x[1])
            if self.training and not torch.jit.is_scripting():
                # during training, return both classifier predictions
                return x, x_dist
            else:
                # during inference, return the average of both classifier predictions
                return (x + x_dist) / 2
        else:
            x = self.head(x)
        return x


def _init_vit_weights(m):
    """
    ViT weight initialization
    :param m: module
    """
    if isinstance(m, nn.Linear):
        nn.init.trunc_normal_(m.weight, std=.01)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode="fan_out")
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.LayerNorm):
        nn.init.zeros_(m.bias)
        nn.init.ones_(m.weight)
def vit_base_patch16_224(num_classes: int = 1000):
    """
    ViT-Base model (ViT-B/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-1k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    link: https://pan.baidu.com/s/1zqb08naP0RPqqfSXfkB2EA  password: eu9f
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=None,
                              num_classes=num_classes)
    return model


def vit_base_patch16_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Base model (ViT-B/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_patch16_224_in21k-e5005f0a.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=768 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_base_patch32_224(num_classes: int = 1000):
    """
    ViT-Base model (ViT-B/32) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-1k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    link: https://pan.baidu.com/s/1hCv0U8pQomwAtHBYc4hmZg  password: s5hl
    """
    model = VisionTransformer(img_size=224,
                              patch_size=32,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=None,
                              num_classes=num_classes)
    return model


def vit_base_patch32_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Base model (ViT-B/32) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_patch32_224_in21k-8db57226.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=32,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=768 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_large_patch16_224(num_classes: int = 1000):
    """
    ViT-Large model (ViT-L/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-1k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    link: https://pan.baidu.com/s/1cxBgZJJ6qUWPSBNcE4TdRQ  password: qqt8
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=1024,
                              depth=24,
                              num_heads=16,
                              representation_size=None,
                              num_classes=num_classes)
    return model


def vit_large_patch16_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Large model (ViT-L/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_large_patch16_224_in21k-606da67d.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=1024,
                              depth=24,
                              num_heads=16,
                              representation_size=1024 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_large_patch32_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Large model (ViT-L/32) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_large_patch32_224_in21k-9046d2e7.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=32,
                              embed_dim=1024,
                              depth=24,
                              num_heads=16,
                              representation_size=1024 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_huge_patch14_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Huge model (ViT-H/14) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    NOTE: converted weights not currently available, too large for github release hosting.
    """
    model = VisionTransformer(img_size=224,
                              patch_size=14,
                              embed_dim=1280,
                              depth=32,
                              num_heads=16,
                              representation_size=1280 if has_logits else None,
                              num_classes=num_classes)
    return model
vision_transformer = vit_base_patch16_224(num_classes=2)  # build the model; num_classes is the number of classes in your dataset.
# I use a two-class cat/dog dataset, so num_classes=2 -- set it to the number of classes in your own dataset.
# To try another ViT variant, just swap the factory function above (e.g. vit_base_patch32_224 or vit_large_patch16_224).
vision_transformer.to(device)
print(vision_transformer.to(device))  # print the model structure
#
# test1 = torch.ones(32, 3, 224, 224)  # sanity-check the output shape with a dummy batch of shape [32, 3, 224, 224]
#
# test1 = vision_transformer(test1.to(device))  # push the dummy batch through the network
# print(test1.shape)  # inspect the output shape

epoch = 10  # number of training epochs
learning = 0.0001  # learning rate
optimizer = torch.optim.Adam(vision_transformer.parameters(), lr=learning)  # Adam optimizer -- worth reading about if you are writing a paper
loss = nn.CrossEntropyLoss()  # loss function: cross-entropy
train_loss_all = []   # training-set loss per epoch
train_accur_all = []  # training-set accuracy per epoch
test_loss_all = []    # test-set loss per epoch
test_accur_all = []   # test-set accuracy per epoch

for i in range(epoch):  # training loop over epochs
    train_loss = 0        # running training loss for this epoch
    train_num = 0.0       # number of training samples seen this epoch
    train_accuracy = 0.0  # running count of correct training predictions
    vision_transformer.train()  # switch the model to training mode
    train_bar = tqdm(traindata)  # progress bar, purely cosmetic
    for step, data in enumerate(train_bar):  # iterate over batches; enumerate yields (batch index, batch)
        img, target = data  # unpack the batch into images and labels
        optimizer.zero_grad()  # clear the gradients from the previous step
        outputs = vision_transformer(img.to(device))  # forward pass through the network
        loss1 = loss(outputs, target.to(device))  # cross-entropy between the predictions and the true labels
        outputs = torch.argmax(outputs, 1)  # the index of the largest logit is the predicted class
        loss1.backward()  # backpropagation
        optimizer.step()  # parameter update with Adam
        train_loss = train_loss + loss1.item()  # accumulate the loss
        accuracy = torch.sum(outputs == target.to(device))  # number of correct predictions in this batch
        train_accuracy = train_accuracy + accuracy  # accumulate correct predictions for the accuracy
        train_num += img.size(0)  # accumulate the number of samples

    print("epoch:{} , train-Loss:{} , train-accuracy:{}".format(i + 1, train_loss / train_num,
                                                                train_accuracy / train_num))
    train_loss_all.append(train_loss / train_num)  # store the epoch loss for plotting later
    train_accur_all.append(train_accuracy.double().item() / train_num)  # store the epoch accuracy

    test_loss = 0        # same bookkeeping for the test set
    test_accuracy = 0.0
    test_num = 0
    vision_transformer.eval()  # switch the model to evaluation mode
    with torch.no_grad():  # disable gradient tracking; the key difference from training is that there is no backward pass
        test_bar = tqdm(testdata)
        for data in test_bar:
            img, target = data
            outputs = vision_transformer(img.to(device))
            loss2 = loss(outputs, target.to(device)).cpu()
            outputs = torch.argmax(outputs, 1)
            test_loss = test_loss + loss2.item()
            accuracy = torch.sum(outputs == target.to(device))
            test_accuracy = test_accuracy + accuracy
            test_num += img.size(0)

    print("test-Loss:{} , test-accuracy:{}".format(test_loss / test_num, test_accuracy / test_num))
    test_loss_all.append(test_loss / test_num)
    test_accur_all.append(test_accuracy.double().item() / test_num)

# Plot the stored lists: loss and accuracy curves for the training and test sets.
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(range(epoch), train_loss_all,
         "ro-", label="Train loss")
plt.plot(range(epoch), test_loss_all,
         "bs-", label="test loss")
plt.legend()
plt.xlabel("epoch")
plt.ylabel("Loss")
plt.subplot(1, 2, 2)
plt.plot(range(epoch), train_accur_all,
         "ro-", label="Train accur")
plt.plot(range(epoch), test_accur_all,
         "bs-", label="test accur")
plt.xlabel("epoch")
plt.ylabel("acc")
plt.legend()
plt.show()

torch.save(vision_transformer, "vision_transformer.pth")
print("model saved")
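One practical note on the data loading above: ImageFolder assigns label indices by sorting the class subfolder names under ./data/train, so it is worth printing the mapping once and keeping the classes list used later in the prediction script in the same order. A small sketch, assuming the train_data object defined above:

# e.g. ./data/train/cat and ./data/train/dog -> {'cat': 0, 'dog': 1}
print(train_data.class_to_idx)
classes = [name for name, idx in sorted(train_data.class_to_idx.items(), key=lambda kv: kv[1])]
print(classes)  # e.g. ['cat', 'dog']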

Prediction code:

import torch
from PIL import Image
from torch import nn
from torchvision.transforms import transforms
from typing import Callable, List, Optional
from torch import nn, Tensor
from torch.nn import functional as F
from functools import partial          # needed by the VisionTransformer definition below
from collections import OrderedDict    # needed by the VisionTransformer definition below

image_path = "1.jpg"  # relative path of the image to classify
trans = transforms.Compose([transforms.Resize((224, 224)),   # resize to the same size as the training images
                            transforms.ToTensor(),            # convert the image to a tensor
                            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])  # same normalization as in training
image = Image.open(image_path)  # open the image
# print(image)  # inspect the image object
image = image.convert("RGB")  # make sure the image is in RGB format
image = trans(image)  # apply the resize / to-tensor / normalize transforms defined above
# print(image)  # inspect the transformed tensor
image = torch.unsqueeze(image, dim=0)  # add a batch dimension: [3, 224, 224] -> [1, 3, 224, 224]
classes = ["cat", "dog"]  # class names; I use a cat/dog dataset, so there are two -- replace them with your own classes
def drop_path(x, drop_prob: float = 0., training: bool = False):
    """
    Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output


class DropPath(nn.Module):
    """
    Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    """
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)


class PatchEmbed(nn.Module):
    """
    2D Image to Patch Embedding
    """
    def __init__(self, img_size=224, patch_size=16, in_c=3, embed_dim=768, norm_layer=None):
        super().__init__()
        img_size = (img_size, img_size)
        patch_size = (patch_size, patch_size)
        self.img_size = img_size
        self.patch_size = patch_size
        self.grid_size = (img_size[0] // patch_size[0], img_size[1] // patch_size[1])
        self.num_patches = self.grid_size[0] * self.grid_size[1]
        self.proj = nn.Conv2d(in_c, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.norm = norm_layer(embed_dim) if norm_layer else nn.Identity()

    def forward(self, x):
        B, C, H, W = x.shape
        assert H == self.img_size[0] and W == self.img_size[1], \
            f"Input image size ({H}*{W}) doesn't match model ({self.img_size[0]}*{self.img_size[1]})."
        # flatten: [B, C, H, W] -> [B, C, HW]
        # transpose: [B, C, HW] -> [B, HW, C]
        x = self.proj(x).flatten(2).transpose(1, 2)
        x = self.norm(x)
        return x


class Attention(nn.Module):
    def __init__(self,
                 dim,   # embedding dimension of the input tokens
                 num_heads=8,
                 qkv_bias=False,
                 qk_scale=None,
                 attn_drop_ratio=0.,
                 proj_drop_ratio=0.):
        super(Attention, self).__init__()
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = qk_scale or head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop_ratio)
        self.proj = nn.Linear(dim, dim)
        self.proj_drop = nn.Dropout(proj_drop_ratio)

    def forward(self, x):
        # [batch_size, num_patches + 1, total_embed_dim]
        B, N, C = x.shape
        # qkv(): -> [batch_size, num_patches + 1, 3 * total_embed_dim]
        # reshape: -> [batch_size, num_patches + 1, 3, num_heads, embed_dim_per_head]
        # permute: -> [3, batch_size, num_heads, num_patches + 1, embed_dim_per_head]
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
        # [batch_size, num_heads, num_patches + 1, embed_dim_per_head]
        q, k, v = qkv[0], qkv[1], qkv[2]  # make torchscript happy (cannot use tensor as tuple)
        # transpose: -> [batch_size, num_heads, embed_dim_per_head, num_patches + 1]
        # @: multiply -> [batch_size, num_heads, num_patches + 1, num_patches + 1]
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        attn = self.attn_drop(attn)
        # @: multiply -> [batch_size, num_heads, num_patches + 1, embed_dim_per_head]
        # transpose: -> [batch_size, num_patches + 1, num_heads, embed_dim_per_head]
        # reshape: -> [batch_size, num_patches + 1, total_embed_dim]
        x = (attn @ v).transpose(1, 2).reshape(B, N, C)
        x = self.proj(x)
        x = self.proj_drop(x)
        return x
class Mlp(nn.Module):
    """
    MLP as used in Vision Transformer, MLP-Mixer and related networks
    """
    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = nn.Linear(in_features, hidden_features)
        self.act = act_layer()
        self.fc2 = nn.Linear(hidden_features, out_features)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


class Block(nn.Module):
    def __init__(self,
                 dim,
                 num_heads,
                 mlp_ratio=4.,
                 qkv_bias=False,
                 qk_scale=None,
                 drop_ratio=0.,
                 attn_drop_ratio=0.,
                 drop_path_ratio=0.,
                 act_layer=nn.GELU,
                 norm_layer=nn.LayerNorm):
        super(Block, self).__init__()
        self.norm1 = norm_layer(dim)
        self.attn = Attention(dim, num_heads=num_heads, qkv_bias=qkv_bias, qk_scale=qk_scale,
                              attn_drop_ratio=attn_drop_ratio, proj_drop_ratio=drop_ratio)
        # NOTE: drop path for stochastic depth, we shall see if this is better than dropout here
        self.drop_path = DropPath(drop_path_ratio) if drop_path_ratio > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop_ratio)

    def forward(self, x):
        x = x + self.drop_path(self.attn(self.norm1(x)))
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x
class VisionTransformer(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_c=3, num_classes=1000,
                 embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.0, qkv_bias=True,
                 qk_scale=None, representation_size=None, distilled=False, drop_ratio=0.,
                 attn_drop_ratio=0., drop_path_ratio=0., embed_layer=PatchEmbed, norm_layer=None,
                 act_layer=None):
        """
        Args:
            img_size (int, tuple): input image size
            patch_size (int, tuple): patch size
            in_c (int): number of input channels
            num_classes (int): number of classes for classification head
            embed_dim (int): embedding dimension
            depth (int): depth of transformer
            num_heads (int): number of attention heads
            mlp_ratio (float): ratio of mlp hidden dim to embedding dim
            qkv_bias (bool): enable bias for qkv if True
            qk_scale (float): override default qk scale of head_dim ** -0.5 if set
            representation_size (Optional[int]): enable and set representation layer (pre-logits) to this value if set
            distilled (bool): model includes a distillation token and head as in DeiT models
            drop_ratio (float): dropout rate
            attn_drop_ratio (float): attention dropout rate
            drop_path_ratio (float): stochastic depth rate
            embed_layer (nn.Module): patch embedding layer
            norm_layer (nn.Module): normalization layer
        """
        super(VisionTransformer, self).__init__()
        self.num_classes = num_classes
        self.num_features = self.embed_dim = embed_dim  # num_features for consistency with other models
        self.num_tokens = 2 if distilled else 1
        norm_layer = norm_layer or partial(nn.LayerNorm, eps=1e-6)
        act_layer = act_layer or nn.GELU
        self.patch_embed = embed_layer(img_size=img_size, patch_size=patch_size, in_c=in_c, embed_dim=embed_dim)
        num_patches = self.patch_embed.num_patches
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.dist_token = nn.Parameter(torch.zeros(1, 1, embed_dim)) if distilled else None
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + self.num_tokens, embed_dim))
        self.pos_drop = nn.Dropout(p=drop_ratio)
        dpr = [x.item() for x in torch.linspace(0, drop_path_ratio, depth)]  # stochastic depth decay rule
        self.blocks = nn.Sequential(*[
            Block(dim=embed_dim, num_heads=num_heads, mlp_ratio=mlp_ratio, qkv_bias=qkv_bias, qk_scale=qk_scale,
                  drop_ratio=drop_ratio, attn_drop_ratio=attn_drop_ratio, drop_path_ratio=dpr[i],
                  norm_layer=norm_layer, act_layer=act_layer)
            for i in range(depth)
        ])
        self.norm = norm_layer(embed_dim)
        # Representation layer
        if representation_size and not distilled:
            self.has_logits = True
            self.num_features = representation_size
            self.pre_logits = nn.Sequential(OrderedDict([
                ("fc", nn.Linear(embed_dim, representation_size)),
                ("act", nn.Tanh())
            ]))
        else:
            self.has_logits = False
            self.pre_logits = nn.Identity()
        # Classifier head(s)
        self.head = nn.Linear(self.num_features, num_classes) if num_classes > 0 else nn.Identity()
        self.head_dist = None
        if distilled:
            self.head_dist = nn.Linear(self.embed_dim, self.num_classes) if num_classes > 0 else nn.Identity()
        # Weight init
        nn.init.trunc_normal_(self.pos_embed, std=0.02)
        if self.dist_token is not None:
            nn.init.trunc_normal_(self.dist_token, std=0.02)
        nn.init.trunc_normal_(self.cls_token, std=0.02)
        self.apply(_init_vit_weights)

    def forward_features(self, x):
        # [B, C, H, W] -> [B, num_patches, embed_dim]
        x = self.patch_embed(x)  # [B, 196, 768]
        # [1, 1, 768] -> [B, 1, 768]
        cls_token = self.cls_token.expand(x.shape[0], -1, -1)
        if self.dist_token is None:
            x = torch.cat((cls_token, x), dim=1)  # [B, 197, 768]
        else:
            x = torch.cat((cls_token, self.dist_token.expand(x.shape[0], -1, -1), x), dim=1)
        x = self.pos_drop(x + self.pos_embed)
        x = self.blocks(x)
        x = self.norm(x)
        if self.dist_token is None:
            return self.pre_logits(x[:, 0])
        else:
            return x[:, 0], x[:, 1]

    def forward(self, x):
        x = self.forward_features(x)
        if self.head_dist is not None:
            x, x_dist = self.head(x[0]), self.head_dist(x[1])
            if self.training and not torch.jit.is_scripting():
                # during training, return both classifier predictions
                return x, x_dist
            else:
                # during inference, return the average of both classifier predictions
                return (x + x_dist) / 2
        else:
            x = self.head(x)
        return x


def _init_vit_weights(m):
    """
    ViT weight initialization
    :param m: module
    """
    if isinstance(m, nn.Linear):
        nn.init.trunc_normal_(m.weight, std=.01)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode="fan_out")
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.LayerNorm):
        nn.init.zeros_(m.bias)
        nn.init.ones_(m.weight)
def vit_base_patch16_224(num_classes: int = 1000):
    """
    ViT-Base model (ViT-B/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-1k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    link: https://pan.baidu.com/s/1zqb08naP0RPqqfSXfkB2EA  password: eu9f
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=None,
                              num_classes=num_classes)
    return model


def vit_base_patch16_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Base model (ViT-B/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_patch16_224_in21k-e5005f0a.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=768 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_base_patch32_224(num_classes: int = 1000):
    """
    ViT-Base model (ViT-B/32) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-1k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    link: https://pan.baidu.com/s/1hCv0U8pQomwAtHBYc4hmZg  password: s5hl
    """
    model = VisionTransformer(img_size=224,
                              patch_size=32,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=None,
                              num_classes=num_classes)
    return model


def vit_base_patch32_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Base model (ViT-B/32) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_patch32_224_in21k-8db57226.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=32,
                              embed_dim=768,
                              depth=12,
                              num_heads=12,
                              representation_size=768 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_large_patch16_224(num_classes: int = 1000):
    """
    ViT-Large model (ViT-L/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-1k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    link: https://pan.baidu.com/s/1cxBgZJJ6qUWPSBNcE4TdRQ  password: qqt8
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=1024,
                              depth=24,
                              num_heads=16,
                              representation_size=None,
                              num_classes=num_classes)
    return model


def vit_large_patch16_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Large model (ViT-L/16) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_large_patch16_224_in21k-606da67d.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=16,
                              embed_dim=1024,
                              depth=24,
                              num_heads=16,
                              representation_size=1024 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_large_patch32_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Large model (ViT-L/32) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    weights ported from official Google JAX impl:
    https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_large_patch32_224_in21k-9046d2e7.pth
    """
    model = VisionTransformer(img_size=224,
                              patch_size=32,
                              embed_dim=1024,
                              depth=24,
                              num_heads=16,
                              representation_size=1024 if has_logits else None,
                              num_classes=num_classes)
    return model


def vit_huge_patch14_224_in21k(num_classes: int = 21843, has_logits: bool = True):
    """
    ViT-Huge model (ViT-H/14) from original paper (https://arxiv.org/abs/2010.11929).
    ImageNet-21k weights @ 224x224, source https://github.com/google-research/vision_transformer.
    NOTE: converted weights not currently available, too large for github release hosting.
    """
    model = VisionTransformer(img_size=224,
                              patch_size=14,
                              embed_dim=1280,
                              depth=32,
                              num_heads=16,
                              representation_size=1280 if has_logits else None,
                              num_classes=num_classes)
    return model
# The network definition above is required: torch.load below reconstructs the saved model object,
# so the class definitions must be available in this script for prediction to work.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # run on the GPU if one is available
print("using {} device.".format(device))
model = torch.load("vision_transformer.pth", map_location=device)  # load the trained model (map_location lets a GPU-trained model load on a CPU-only machine)
model.eval()  # switch the model to evaluation mode
with torch.no_grad():  # disable gradient tracking for inference
    outputs = model(image.to(device))  # push the image through the network
    # print(model)    # print the model structure
    # print(outputs)  # print the raw output logits
    ans = (outputs.argmax(1)).item()  # the index of the largest logit is the predicted class;
    # use that index to look up the class name in the classes list
print("The predicted class of this image is:", classes[ans])
# print(classes[ans])
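A small design note: torch.save(model, ...) pickles the whole model object, which is why all the class definitions had to be repeated in this prediction script. An alternative (not what the code above does) is to save only the weights and rebuild the model from the factory function; a sketch under that assumption, with a hypothetical file name:

# In the training script: torch.save(vision_transformer.state_dict(), "vision_transformer_weights.pth")
# Here, rebuild the architecture and load the weights into it:
model = vit_base_patch16_224(num_classes=2)
model.load_state_dict(torch.load("vision_transformer_weights.pth", map_location="cpu"))
model.eval()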

The network-building code is commented in detail. If you run into problems, feel free to point them out in the comments. Thanks!

If you are not sure how to use this code, my earlier open-source post is more detailed: 手撕Resnet卷积神经网络-pytorch-详细注释版 (a hand-built ResNet in PyTorch with detailed comments, also plug-and-play with your own dataset, on my CSDN blog). It includes both training and prediction code; if something does not run, leave a comment and I will reply when I see it.

The dataset used in this post is a cat/dog dataset, already split into training and test sets; the download links are below.

Dataset link: https://pan.baidu.com/s/1_gUznMQnzI0UhvsV7wPgzw
Extraction code: 3ixd

Vision Transformer code download link: https://pan.baidu.com/s/11ViyEvr8Wcj-ubfGlz_LUg
Extraction code: eg43
