
PyTorch Example: DCGAN Image Generation


Background

       Deep learning needs far more data than traditional machine-learning methods. When the amount of real data is small, or too small for the network to converge, data augmentation is used to generate additional training images: random crops, flips and similar transforms increase the variety of the inputs. Augmentation, however, essentially only re-transforms the original images and cannot produce content that is not already in them. In such situations, a GAN (Generative Adversarial Network) can generate richer data from a limited dataset and thereby make training more effective.
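For comparison, a typical torchvision augmentation pipeline only recombines what is already in the picture; a minimal sketch (the crop/flip choices here are illustrative, not taken from the original post):

import torchvision.transforms as T

# Classic augmentation: every output is a transformed view of the same source image
augment = T.Compose([
    T.RandomResizedCrop(96, scale=(0.8, 1.0)),  # random crop, resized back to 96 x 96
    T.RandomHorizontalFlip(p=0.5),              # random left-right flip
    T.ToTensor(),
])
# Calling augment(img) repeatedly yields different views of img,
# but never content that the original image does not contain.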

DCGAN replaces the multi-layer perceptrons (MLPs) of the original GAN with convolutional networks and applies a few architectural adjustments, which significantly improves the quality of the generated images.

Paper: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

DCGAN design guidelines

  1. Replace all pooling layers with strided convolutions (discriminator) and fractionally-strided convolutions (generator); a minimal sketch of this follows the list.
  2. Use batch normalization in both the generator and the discriminator.
  3. Remove fully connected hidden layers for deeper architectures.
  4. Use ReLU activations in all layers of the generator except the output layer, which uses Tanh.
  5. Use LeakyReLU activations in all layers of the discriminator.
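To make the first guideline concrete, the sketch below (illustrative, not part of the original code) shows a strided convolution halving the spatial resolution in place of pooling, and a fractionally-strided (transposed) convolution doubling it again:

import torch
import torch.nn as nn

x = torch.randn(1, 64, 16, 16)
down = nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1)         # replaces a pooling layer (discriminator)
up = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)  # fractionally-strided conv (generator)
print(down(x).shape)      # torch.Size([1, 128, 8, 8])
print(up(down(x)).shape)  # torch.Size([1, 64, 16, 16])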

DCGAN-based data augmentation with PyTorch:

Generator (NetG):

class NetG(nn.Module):
    def __init__(self, ngf, nz):
        super(NetG, self).__init__()
        # layer1: input is an nz x 1 x 1 random noise vector (nz = 100), output (ngf*8) x 4 x 4
        self.layer1 = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, kernel_size=4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(inplace=True)
        )
        # layer2: output (ngf*4) x 8 x 8
        self.layer2 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True)
        )
        # layer3: output (ngf*2) x 16 x 16
        self.layer3 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(inplace=True)
        )
        # layer4: output (ngf) x 32 x 32
        self.layer4 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(inplace=True)
        )
        # layer5: kernel 5, stride 3 -> output 3 x 96 x 96
        self.layer5 = nn.Sequential(
            nn.ConvTranspose2d(ngf, 3, 5, 3, 1, bias=False),
            nn.Tanh()
        )

    # Forward pass of NetG
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.layer5(out)
        return out
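A quick shape check (not part of the original post), assuming the usual defaults ngf=64 and nz=100, confirms that a batch of noise vectors comes out as 3 x 96 x 96 images:

import torch

netG = NetG(ngf=64, nz=100)
z = torch.randn(8, 100, 1, 1)  # 8 random noise vectors
print(netG(z).shape)           # torch.Size([8, 3, 96, 96]); values in [-1, 1] because of Tanh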

 Discriminator (NetD), roughly the generator run in reverse:

# Discriminator network D
class NetD(nn.Module):
    def __init__(self, ndf):
        super(NetD, self).__init__()
        # layer1: input 3 x 96 x 96, output (ndf) x 32 x 32 (kernel 5, stride 3)
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, ndf, kernel_size=5, stride=3, padding=1, bias=False),
            nn.BatchNorm2d(ndf),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer2: output (ndf*2) x 16 x 16
        self.layer2 = nn.Sequential(
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer3: output (ndf*4) x 8 x 8
        self.layer3 = nn.Sequential(
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer4: output (ndf*8) x 4 x 4
        self.layer4 = nn.Sequential(
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer5: output a single number, the probability that the input is real
        self.layer5 = nn.Sequential(
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    # Forward pass of NetD
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.layer5(out)
        return out
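The mirror check for the discriminator (again a sketch, assuming ndf=64): a 3 x 96 x 96 image is reduced to a single sigmoid score per sample, which is why the training code below flattens the output with .view(-1) before feeding it to BCELoss:

import torch

netD = NetD(ndf=64)
imgs = torch.randn(8, 3, 96, 96)
print(netD(imgs).shape)           # torch.Size([8, 1, 1, 1])
print(netD(imgs).view(-1).shape)  # torch.Size([8]), one probability per image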

Loss function and training step:

criterion = nn.BCELoss()
optimizerG = torch.optim.Adam(netG.parameters(), lr=opt.lr, betas=(opt.beta1, 0.999))
optimizerD = torch.optim.Adam(netD.parameters(), lr=opt.lr, betas=(opt.beta1, 0.999))
# In every training iteration (label, real_label, fake_label and dataloader are set up in train.py below)
for epoch in range(1, opt.epoch + 1):
    for i, (imgs, _) in enumerate(dataloader):
        # Freeze the generator G and train the discriminator D
        optimizerD.zero_grad()
        ## D should classify real images as 1
        imgs = imgs.to(device)
        output = netD(imgs).view(-1)  # flatten to match the label shape expected by BCELoss
        label.data.fill_(real_label)
        label = label.to(device)
        # loss between the predictions on real images and the "real" label
        errD_real = criterion(output, label)
        errD_real.backward()
        ## D should classify fake images as 0
        label.data.fill_(fake_label)
        noise = torch.randn(opt.batchSize, opt.nz, 1, 1)
        noise = noise.to(device)
        fake = netG(noise)  # generate fake images
        output = netD(fake.detach()).view(-1)  # detach so no gradients flow into G; G is not updated here
        errD_fake = criterion(output, label)
        errD_fake.backward()
        errD = errD_fake + errD_real
        optimizerD.step()
        # Freeze the discriminator D and train the generator G
        optimizerG.zero_grad()
        # G should make D classify its fakes as 1
        label.data.fill_(real_label)
        label = label.to(device)
        output = netD(fake).view(-1)
        errG = criterion(output, label)
        errG.backward()
        optimizerG.step()
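With the label tensor filled with ones or zeros, nn.BCELoss reduces to -log D(x) and -log(1 - D(G(z))) averaged over the batch, i.e. the standard GAN objective; the generator step reuses the real label so that errG becomes -log D(G(z)). A tiny numerical illustration (not from the original post):

import torch
import torch.nn as nn

criterion = nn.BCELoss()
p = torch.tensor([0.9, 0.2, 0.7])    # pretend these are the discriminator's outputs
print(criterion(p, torch.ones(3)))   # equals (-torch.log(p)).mean()
print(criterion(p, torch.zeros(3)))  # equals (-torch.log(1 - p)).mean()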

 

Model: model.py

import torch.nn as nn

# Generator network G
class NetG(nn.Module):
    def __init__(self, ngf, nz):
        super(NetG, self).__init__()
        # layer1: input is an nz x 1 x 1 random noise vector (nz = 100), output (ngf*8) x 4 x 4
        self.layer1 = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, kernel_size=4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(inplace=True)
        )
        # layer2: output (ngf*4) x 8 x 8
        self.layer2 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True)
        )
        # layer3: output (ngf*2) x 16 x 16
        self.layer3 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(inplace=True)
        )
        # layer4: output (ngf) x 32 x 32
        self.layer4 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(inplace=True)
        )
        # layer5: kernel 5, stride 3 -> output 3 x 96 x 96
        self.layer5 = nn.Sequential(
            nn.ConvTranspose2d(ngf, 3, 5, 3, 1, bias=False),
            nn.Tanh()
        )

    # Forward pass of NetG
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.layer5(out)
        return out

# Discriminator network D
class NetD(nn.Module):
    def __init__(self, ndf):
        super(NetD, self).__init__()
        # layer1: input 3 x 96 x 96, output (ndf) x 32 x 32 (kernel 5, stride 3)
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, ndf, kernel_size=5, stride=3, padding=1, bias=False),
            nn.BatchNorm2d(ndf),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer2: output (ndf*2) x 16 x 16
        self.layer2 = nn.Sequential(
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer3: output (ndf*4) x 8 x 8
        self.layer3 = nn.Sequential(
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer4: output (ndf*8) x 4 x 4
        self.layer4 = nn.Sequential(
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True)
        )
        # layer5: output a single number, the probability that the input is real
        self.layer5 = nn.Sequential(
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    # Forward pass of NetD
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.layer5(out)
        return out
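One detail from the DCGAN paper that this model.py leaves out is weight initialization: all weights are drawn from a zero-centered normal distribution with standard deviation 0.02. A possible way to add it on top of the classes above (the weights_init helper is not part of the original code):

import torch.nn as nn

def weights_init(m):
    # Initialize conv and batch-norm layers as recommended in the DCGAN paper
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)

# Usage after building the networks: netG.apply(weights_init); netD.apply(weights_init)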

Training: train.py

import os
import argparse
import torch
import torchvision
import torchvision.utils as vutils
import torch.nn as nn
from model import NetD, NetG

parser = argparse.ArgumentParser()
parser.add_argument('--batchSize', type=int, default=32)
parser.add_argument('--imageSize', type=int, default=96)
parser.add_argument('--nz', type=int, default=100, help='size of the latent z vector')
parser.add_argument('--ngf', type=int, default=64)
parser.add_argument('--ndf', type=int, default=64)
parser.add_argument('--epoch', type=int, default=40000, help='number of epochs to train for')
parser.add_argument('--lr', type=float, default=0.0002, help='learning rate, default=0.0002')
parser.add_argument('--beta1', type=float, default=0.5, help='beta1 for adam. default=0.5')
parser.add_argument('--data_path', default='data/', help='folder to train data')
parser.add_argument('--outf', default='imgs/', help='folder to output images and model checkpoints')
opt = parser.parse_args()

# Select GPU/CPU (the original code pins GPU index 3; adjust to your machine)
device = torch.device("cuda:3" if torch.cuda.is_available() else "cpu")

# Image loading and preprocessing
transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(opt.imageSize),  # Scale was renamed to Resize in torchvision
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
dataset = torchvision.datasets.ImageFolder(opt.data_path, transform=transform)
dataloader = torch.utils.data.DataLoader(
    dataset=dataset,
    batch_size=opt.batchSize,
    shuffle=True,
    drop_last=True,
)

netG = NetG(opt.ngf, opt.nz).to(device)
netD = NetD(opt.ndf).to(device)

criterion = nn.BCELoss()
optimizerG = torch.optim.Adam(netG.parameters(), lr=opt.lr, betas=(opt.beta1, 0.999))
optimizerD = torch.optim.Adam(netD.parameters(), lr=opt.lr, betas=(opt.beta1, 0.999))

label = torch.FloatTensor(opt.batchSize)
real_label = 1
fake_label = 0

save_path = './imgs/epoch{:s}'
os.makedirs(opt.outf, exist_ok=True)
j = 0
for epoch in range(1, opt.epoch + 1):
    for i, (imgs, _) in enumerate(dataloader):
        # Freeze the generator G and train the discriminator D
        optimizerD.zero_grad()
        ## D should classify real images as 1
        imgs = imgs.to(device)
        output = netD(imgs).view(-1)  # flatten to match the label shape expected by BCELoss
        label.data.fill_(real_label)
        label = label.to(device)
        errD_real = criterion(output, label)
        errD_real.backward()
        ## D should classify fake images as 0
        label.data.fill_(fake_label)
        noise = torch.randn(opt.batchSize, opt.nz, 1, 1)
        noise = noise.to(device)
        fake = netG(noise)  # generate fake images
        output = netD(fake.detach()).view(-1)  # detach so no gradients flow into G; G is not updated here
        errD_fake = criterion(output, label)
        errD_fake.backward()
        errD = errD_fake + errD_real
        optimizerD.step()
        # Freeze the discriminator D and train the generator G
        optimizerG.zero_grad()
        # G should make D classify its fakes as 1
        label.data.fill_(real_label)
        label = label.to(device)
        output = netD(fake).view(-1)
        errG = criterion(output, label)
        errG.backward()
        optimizerG.step()
        print('[%d/%d][%d/%d] Loss_D: %.3f Loss_G %.3f'
              % (epoch, opt.epoch, i, len(dataloader), errD.item(), errG.item()))
    # Every 1000 epochs: dump the latest batch of generated images and save checkpoints
    if epoch % 1000 == 0:
        os.makedirs(save_path.format(str(j)), exist_ok=True)
        for i in range(len(fake.data)):
            vutils.save_image(fake.data[i],
                              '%s/%d.png' % (save_path.format(str(j)), i),
                              normalize=True)
        torch.save(netG.state_dict(), '%s/netG_%03d.pth' % (opt.outf, epoch))
        torch.save(netD.state_dict(), '%s/netD_%03d.pth' % (opt.outf, epoch))
        j = j + 1
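Once a checkpoint exists, new images for data augmentation can be generated without the discriminator; a minimal sketch (the checkpoint filename is illustrative and should match whatever train.py actually saved):

import torch
import torchvision.utils as vutils
from model import NetG

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
netG = NetG(64, 100).to(device)
netG.load_state_dict(torch.load('imgs/netG_1000.pth', map_location=device))  # hypothetical checkpoint name
netG.eval()

with torch.no_grad():
    noise = torch.randn(64, 100, 1, 1, device=device)
    fake = netG(noise)
vutils.save_image(fake, 'augmented_samples.png', normalize=True)  # save a grid of 64 generated images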

Reference blog: Pytorch版DCGAN图像生成技术 (PyTorch DCGAN image generation)
