
Python: Image Inpainting (Image Completion)

I recently settled on a direction for my Python and deep learning studies: image processing, a field I genuinely enjoy. Lately I have been watching course videos and, following along with the code, built an image-completion program. I first watched this course two years ago, but my laptop at the time could not run such a resource-hungry project, so after starting work I put together a laptop with an RTX 3060 and a desktop with an RTX 2060 specifically for running experiments. Below is my understanding of the program.

1. Understanding the Model

This project is based on the paper Globally and Locally Consistent Image Completion, so to understand the model it helps to skim the paper first. Its central idea: first cut a region out of the image; run that image through the completion network and save the network weights; then, on top of completion, feed the globally completed image and the generated local patch into the Global Discriminator and the Local Discriminator respectively. The project's model is shown in the figure below. Note the discriminator inputs: the real branch sees the untouched original image together with a patch cropped from it, while the fake branch sees the image produced by completion together with the local patch cropped from that output. In this post we train both a completion-only model and a completion+discriminator model, so the two results can be compared.
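To make the hole-cutting and compositing concrete, here is a minimal NumPy sketch (the shapes and the hole location are illustrative, not taken from the project) of the two expressions that appear later in the Network class, x * (1 - mask) and imitation * mask + x * (1 - mask):

import numpy as np

# Illustrative shapes only: one 128x128 RGB image and a binary hole mask.
x = np.random.rand(1, 128, 128, 3).astype(np.float32)    # original image
mask = np.zeros((1, 128, 128, 1), dtype=np.float32)
mask[:, 32:64, 32:64, :] = 1.0                           # 1 marks the region to fill

hollowed = x * (1 - mask)                                # generator input: the hole is zeroed out
imitation = np.random.rand(*x.shape).astype(np.float32)  # stand-in for the generator output
completion = imitation * mask + x * (1 - mask)           # only the hole is replaced

Only the hole region is taken from the generator output; everything outside the mask stays pixel-identical to the input.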

2. Understanding the Network

The model contains two large networks: completion and discriminator, with the discriminator further split into a global part and a local part. The paper describes the layer composition of each network in detail, as the figure below shows:

Here dilated conv refers to dilated (atrous) convolution, whose purpose is to enlarge the receptive field, while deconv is transposed convolution (deconvolution), which upsamples the feature map back toward the original resolution. In TensorFlow a dilated convolution is written tf.nn.atrous_conv2d(x, filters, rate, padding='SAME') and a transposed convolution tf.nn.conv2d_transpose(x, filters, output_shape, [1, stride, stride, 1]), both quite convenient. A short standalone sketch of the two ops follows; the full network definition comes after it.
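As a quick sanity check of the shapes these two ops produce (a sketch assuming TF 1.x graph mode via tf.compat.v1; the filter shapes are illustrative):

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

x = tf.compat.v1.placeholder(tf.float32, [1, 32, 32, 256])

# Dilated convolution: 3x3 kernel with rate 4; SAME padding keeps the spatial size
w_dil = tf.compat.v1.get_variable('w_dil', [3, 3, 256, 256])
y = tf.nn.atrous_conv2d(x, w_dil, rate=4, padding='SAME')          # -> [1, 32, 32, 256]

# Transposed convolution: 4x4 kernel, stride 2 doubles height and width;
# note the filter layout [h, w, out_channels, in_channels]
w_dec = tf.compat.v1.get_variable('w_dec', [4, 4, 128, 256])
z = tf.nn.conv2d_transpose(y, w_dec, output_shape=[1, 64, 64, 128],
                           strides=[1, 2, 2, 1], padding='SAME')   # -> [1, 64, 64, 128]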

        

from layer import *
import tensorflow as tf


class Network:
    def __init__(self, x, mask, local_x, global_completion, local_completion, is_training, batch_size):
        self.batch_size = batch_size
        self.imitation = self.generator(x * (1 - mask), is_training)
        self.completion = self.imitation * mask + x * (1 - mask)
        # Real pair: the untouched image x and local_x cropped from it -> labeled True
        self.real = self.discriminator(x, local_x, reuse=False)
        # Fake pair: the image filled in by completion and the local patch cropped from it -> labeled Fake
        self.fake = self.discriminator(global_completion, local_completion, reuse=True)
        self.g_loss = self.calc_g_loss(x, self.completion)
        self.d_loss = self.calc_d_loss(self.real, self.fake)
        self.g_variables = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.TRAINABLE_VARIABLES, scope='generator')
        self.d_variables = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')

    def generator(self, x, is_training):
        with tf.compat.v1.variable_scope('generator'):
            with tf.compat.v1.variable_scope('conv1'):
                x = conv_layer(x, [5, 5, 3, 64], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv2'):
                x = conv_layer(x, [3, 3, 64, 128], 2)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv3'):
                x = conv_layer(x, [3, 3, 128, 128], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv4'):
                x = conv_layer(x, [3, 3, 128, 256], 2)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv5'):
                x = conv_layer(x, [3, 3, 256, 256], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv6'):
                x = conv_layer(x, [3, 3, 256, 256], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            # Dilated convolutions enlarge the receptive field without shrinking the feature map
            with tf.compat.v1.variable_scope('dilated1'):
                x = dilated_conv_layer(x, [3, 3, 256, 256], 2)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('dilated2'):
                x = dilated_conv_layer(x, [3, 3, 256, 256], 4)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('dilated3'):
                x = dilated_conv_layer(x, [3, 3, 256, 256], 8)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('dilated4'):
                x = dilated_conv_layer(x, [3, 3, 256, 256], 16)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv7'):
                x = conv_layer(x, [3, 3, 256, 256], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv8'):
                x = conv_layer(x, [3, 3, 256, 256], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            # Transposed convolutions upsample back to the input resolution
            with tf.compat.v1.variable_scope('deconv1'):
                x = deconv_layer(x, [4, 4, 128, 256], [self.batch_size, 64, 64, 128], 2)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv9'):
                x = conv_layer(x, [3, 3, 128, 128], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('deconv2'):
                x = deconv_layer(x, [4, 4, 64, 128], [self.batch_size, 128, 128, 64], 2)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv10'):
                x = conv_layer(x, [3, 3, 64, 32], 1)
                x = batch_normalize(x, is_training)
                x = tf.nn.relu(x)
            with tf.compat.v1.variable_scope('conv11'):
                x = conv_layer(x, [3, 3, 32, 3], 1)
                x = tf.nn.tanh(x)
        return x

    def discriminator(self, global_x, local_x, reuse):
        def global_discriminator(x):
            is_training = tf.constant(True)
            with tf.compat.v1.variable_scope('global'):
                with tf.compat.v1.variable_scope('conv1'):
                    x = conv_layer(x, [5, 5, 3, 64], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv2'):
                    x = conv_layer(x, [5, 5, 64, 128], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv3'):
                    x = conv_layer(x, [5, 5, 128, 256], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv4'):
                    x = conv_layer(x, [5, 5, 256, 512], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv5'):
                    x = conv_layer(x, [5, 5, 512, 512], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('fc'):
                    x = flatten_layer(x)
                    x = full_connection_layer(x, 1024)
            return x

        def local_discriminator(x):
            is_training = tf.constant(True)
            with tf.compat.v1.variable_scope('local'):
                with tf.compat.v1.variable_scope('conv1'):
                    x = conv_layer(x, [5, 5, 3, 64], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv2'):
                    x = conv_layer(x, [5, 5, 64, 128], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv3'):
                    x = conv_layer(x, [5, 5, 128, 256], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('conv4'):
                    x = conv_layer(x, [5, 5, 256, 512], 2)
                    x = batch_normalize(x, is_training)
                    x = tf.nn.relu(x)
                with tf.compat.v1.variable_scope('fc'):
                    x = flatten_layer(x)
                    x = full_connection_layer(x, 1024)
            return x

        with tf.compat.v1.variable_scope('discriminator', reuse=reuse):
            global_output = global_discriminator(global_x)
            local_output = local_discriminator(local_x)
            # Concatenate the two 1024-d feature vectors and map them to a single logit
            with tf.compat.v1.variable_scope('concatenation'):
                output = tf.compat.v1.concat((global_output, local_output), 1)
                output = full_connection_layer(output, 1)
        return output

    def calc_g_loss(self, x, completion):
        # tf.nn.l2_loss returns half the sum of squared differences (a scalar)
        loss = tf.compat.v1.nn.l2_loss(x - completion)
        return tf.reduce_mean(loss)

    def calc_d_loss(self, real, fake):
        alpha = 4e-4
        # Real pairs are pushed toward label 1, completed (fake) pairs toward label 0
        d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=real, labels=tf.ones_like(real)))
        d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=fake, labels=tf.zeros_like(fake)))
        return tf.add(d_loss_real, d_loss_fake) * alpha

On the choice of losses: the completion network's is simple, an MSE-style loss computed from the difference between the generated image and the real image (tf.nn.l2_loss actually returns half the sum of squared differences). The discriminator's is trickier, and it took me a while to understand, because the paper speaks of discriminating fake from real, computed via tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=fake, labels=tf.zeros_like(fake))), yet which input counts as fake and which as real is not spelled out clearly. After analyzing the code I concluded:

- local_x cropped from a real image, together with the original image x: the discriminator output is labeled True (real).
- The image filled in by the completion network, together with the local patch cropped from that completed output: labeled Fake.

In the program they are defined as:

self.real = self.discriminator(x, local_x, reuse=False)
self.fake = self.discriminator(global_completion, local_completion, reuse=True)

3. Program Analysis

The program's structure is: data processing -> network definition -> model construction -> training -> result display.

Data processing: the images to be completed are face portraits, and the full dataset holds 200,000+ images. Loading all of them on every run would waste time and resources, so the program first downscales the images and converts them to .npy format; to save further resources, only 5,000 images are used, with 95% of them going to x_train.
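A minimal sketch of this preprocessing step (the directory layout, file pattern, and use of OpenCV are my assumptions; the project's own conversion script may differ):

import glob
import numpy as np
import cv2

IMAGE_SIZE = 128
N_IMAGES = 5000  # only a subset of the 200k+ portraits is used

paths = sorted(glob.glob('data/portraits/*.jpg'))[:N_IMAGES]  # hypothetical path
images = []
for p in paths:
    img = cv2.imread(p)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
    images.append(img)

images = np.array(images, dtype=np.uint8)
split = int(len(images) * 0.95)          # 95% for training
np.save('npy/x_train.npy', images[:split])
np.save('npy/x_test.npy', images[split:])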

Network definition: covered above, so not repeated here.

Model construction: building the model mainly means defining the intermediate tensors. The mask deserves an explanation: it is a binary map, with the region to be filled set to 1 and everything else set to 0. It is produced by the get_points function, which computes the size and coordinates of the local window and then places the hole inside it, as the code below shows.

x = tf.compat.v1.placeholder(tf.float32, [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3])
mask = tf.compat.v1.placeholder(tf.float32, [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 1])
local_x = tf.compat.v1.placeholder(tf.float32, [BATCH_SIZE, LOCAL_SIZE, LOCAL_SIZE, 3])
global_completion = tf.compat.v1.placeholder(tf.float32, [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3])
local_completion = tf.compat.v1.placeholder(tf.float32, [BATCH_SIZE, LOCAL_SIZE, LOCAL_SIZE, 3])
is_training = tf.compat.v1.placeholder(tf.bool, [])

model = Network(x, mask, local_x, global_completion, local_completion, is_training, batch_size=BATCH_SIZE)
sess = tf.compat.v1.Session()
global_step = tf.compat.v1.Variable(0, name='global_step', trainable=False)
epoch = tf.compat.v1.Variable(0, name='epoch', trainable=False)

opt = tf.compat.v1.train.AdamOptimizer(learning_rate=LEARNING_RATE)
# var_list defaults to GraphKeys.TRAINABLE_VARIABLES; here each optimizer is restricted to its own subnetwork
g_train_op = opt.minimize(model.g_loss, global_step=global_step, var_list=model.g_variables)
d_train_op = opt.minimize(model.d_loss, global_step=global_step, var_list=model.d_variables)
init_opt = tf.compat.v1.global_variables_initializer()
sess.run(init_opt)

def get_points():
    points = []
    mask = []
    for i in range(BATCH_SIZE):
        # top-left corner of the local window
        x1, y1 = np.random.randint(0, IMAGE_SIZE - LOCAL_SIZE + 1, 2)
        x2, y2 = np.array([x1, y1]) + LOCAL_SIZE
        points.append([x1, y1, x2, y2])
        # width and height of the hole
        w, h = np.random.randint(HOLE_MIN, HOLE_MAX + 1, 2)
        p1 = x1 + np.random.randint(0, LOCAL_SIZE - w)
        q1 = y1 + np.random.randint(0, LOCAL_SIZE - h)
        p2 = p1 + w
        q2 = q1 + h
        m = np.zeros((IMAGE_SIZE, IMAGE_SIZE, 1), dtype=np.uint8)
        # the region to fill is set to 1; everything else stays 0
        m[q1:q2 + 1, p1:p2 + 1] = 1
        mask.append(m)
    return np.array(points), np.array(mask)
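To show how points and mask are actually consumed, here is a hedged sketch of one training step's data plumbing, continuing from the placeholders and ops defined above (x_batch and the crop logic are illustrative; the original train script may differ in detail):

import numpy as np

points_batch, mask_batch = get_points()

# x_batch: a [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3] slice of x_train, scaled to match the tanh output range
_, g_loss_val = sess.run([g_train_op, model.g_loss],
                         feed_dict={x: x_batch, mask: mask_batch, is_training: True})

# For the discriminator phase, crop the local windows from both the real and the completed images
completion_batch = sess.run(model.completion,
                            feed_dict={x: x_batch, mask: mask_batch, is_training: False})
local_x_batch, local_completion_batch = [], []
for i in range(BATCH_SIZE):
    x1, y1, x2, y2 = points_batch[i]
    local_x_batch.append(x_batch[i][y1:y2, x1:x2, :])
    local_completion_batch.append(completion_batch[i][y1:y2, x1:x2, :])
local_x_batch = np.array(local_x_batch)
local_completion_batch = np.array(local_completion_batch)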

Training: given the model definition, we set a PRETRAIN_EPOCH value; while epoch is below it only the completion network is trained, and once it is reached the completion weights are saved and discriminator training begins. The source code provides no stopping condition for this second phase, so I added a stop_loss of 1e-4: when the loss drops below it, the model is saved and training exits. This is by far the longest-running part; if batch_size is too large the machine runs out of resources. On my desktop with a 6 GB RTX 2060 I can only run batch_size=16, otherwise training cannot proceed. A condensed sketch of this two-phase schedule follows.
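This sketch continues the variables from the previous one; PRETRAIN_EPOCH, the checkpoint paths, and the placement of stop_loss reflect my setup and should be treated as illustrative:

PRETRAIN_EPOCH = 100   # illustrative value: completion-only pretraining epochs
stop_loss = 1e-4       # my added stopping threshold for the discriminator phase
saver = tf.compat.v1.train.Saver()

epoch_val = 0
while True:
    epoch_val += 1
    # x_batch, mask_batch and the local crops are rebuilt each epoch as in the previous sketch
    if epoch_val <= PRETRAIN_EPOCH:
        # Phase 1: train the completion network alone on the MSE loss
        _, g_loss_val = sess.run([g_train_op, model.g_loss],
                                 feed_dict={x: x_batch, mask: mask_batch, is_training: True})
        if epoch_val == PRETRAIN_EPOCH:
            saver.save(sess, 'backup/pretrained', write_meta_graph=False)
    else:
        # Phase 2: alternate completion and discriminator updates
        _, g_loss_val = sess.run([g_train_op, model.g_loss],
                                 feed_dict={x: x_batch, mask: mask_batch, is_training: True})
        _, d_loss_val = sess.run([d_train_op, model.d_loss],
                                 feed_dict={x: x_batch, mask: mask_batch,
                                            local_x: local_x_batch,
                                            global_completion: completion_batch,
                                            local_completion: local_completion_batch,
                                            is_training: True})
        if d_loss_val < stop_loss:  # my added exit condition
            saver.save(sess, 'backup/latest', write_meta_graph=False)
            break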

Result display: finally we run the model on x_test to inspect the results. There are two sets: one produced by completion alone, and one by completion+discriminator.

The figure below shows the output of completion alone; the result is acceptable, though the filled region looks somewhat like a mosaic.

The figure below compares the original images with the model's output. The results are also decent, and training longer would improve them; the figures in the paper were trained for several days:

        

The figure below contrasts completion alone (top) with completion+discriminator (bottom); the skin tone in the bottom image is slightly paler than in the top one.

 

Code download link: python图像缺失弥补源码资源-CSDN文库
