1. GAN Basic Principles

A GAN consists of a generative network and a discriminative network. The generative network takes a random sample from a latent space as its input, and its output should resemble the real samples as closely as possible; the discriminative network takes either a real sample or the generator's output as input, and its goal is to tell the generated samples apart from the real ones as reliably as possible. During training the two networks compete against each other, until the generated samples can no longer be distinguished from the real ones.

The generative network is denoted G (Generator) and the discriminative network is denoted D (Discriminator):

G takes a random noise vector z and generates an image from it, denoted G(z).
D judges whether an image is real: its input is an image x, and D(x) is the probability that x is a real image. Ideally, training ends with D(G(z)) = 0.5, meaning D can no longer tell whether an image is real or generated.

The basic mathematical principle of GAN is the following minimax objective:

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

Here x denotes a real image and z the random noise fed to the G network; D(x) is the probability that the D network judges a real sample to be real, and D(G(z)) is the probability that the D network judges a generated sample to be real. D wants D(x) to be as large as possible and D(G(z)) to be as small as possible, which makes V(D, G) larger, so D solves a max problem; G wants D(G(z)) to be as large as possible, which makes V(D, G) smaller, so G solves a min problem. Accordingly, when training the D network, D wants V(D, G) to be as large as possible, so the gradient is added (ascending); when training the G network, G wants V(D, G) to become smaller, so the gradient is subtracted (descending).
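To make the ascent/descent scheme concrete, here is a minimal sketch (an addition to the original post) written in TensorFlow 1.x style; the tiny fully-connected networks, the scope names 'G' and 'D', and the learning rate 2e-4 are purely illustrative assumptions:

import tensorflow as tf

z_dim, image_dim = 100, 784
noise = tf.placeholder(tf.float32, [None, z_dim])
real_images = tf.placeholder(tf.float32, [None, image_dim])

def generator(z):
    # Illustrative generator: noise -> image-sized vector in [0, 1].
    with tf.variable_scope('G', reuse=tf.AUTO_REUSE):
        h = tf.layers.dense(z, 256, activation=tf.nn.relu)
        return tf.layers.dense(h, image_dim, activation=tf.nn.sigmoid)

def discriminator(x):
    # Illustrative discriminator: image -> probability of being real.
    with tf.variable_scope('D', reuse=tf.AUTO_REUSE):
        h = tf.layers.dense(x, 256, activation=tf.nn.relu)
        return tf.layers.dense(h, 1, activation=tf.nn.sigmoid)

# V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
V = tf.reduce_mean(tf.log(discriminator(real_images)) +
                   tf.log(1. - discriminator(generator(noise))))

d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='D')
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='G')

# D does gradient ascent on V (implemented as descent on -V);
# G does gradient descent on V itself.
d_step = tf.train.AdamOptimizer(2e-4).minimize(-V, var_list=d_vars)
g_step = tf.train.AdamOptimizer(2e-4).minimize(V, var_list=g_vars)

In practice (and in the TFLearn code later in this post), G usually minimizes -log D(G(z)) instead of log(1 - D(G(z))); the goal is the same, but the gradients are stronger early in training.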
2. DCGAN Basic Principles

Combining a CNN with a GAN yields a well-performing algorithm called DCGAN: both the D network and the G network are replaced with CNN structures. DCGAN, however, modifies the plain CNN in several ways to improve sample quality and convergence speed:
- Remove the pooling layers: the G network uses transposed convolutions for upsampling, and the D network uses strided convolutions instead of pooling.
- Use Batch Normalization in both D and G.
- In the G network, use ReLU as the activation function, with tanh for the last layer.
- In the D network, use LeakyReLU as the activation function.

The structure of the DCGAN generator (DCGAN-G) is shown in the figure: [figure omitted]
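Before moving on to the code, here is a small illustrative sketch (an addition to the original post) of a generator built by following these guidelines literally, i.e. with transposed convolutions, batch normalization, ReLU and a tanh output, in TensorFlow 1.x; the 7x7x128 starting shape and the filter sizes are assumptions chosen so that the output is a 28x28 MNIST-sized image:

import tensorflow as tf

def dcgan_generator(z, training=True):
    with tf.variable_scope('DCGAN_G', reuse=tf.AUTO_REUSE):
        # Project the noise and reshape it into a small feature map.
        h = tf.layers.dense(z, 7 * 7 * 128)
        h = tf.layers.batch_normalization(h, training=training)
        h = tf.nn.relu(h)
        h = tf.reshape(h, [-1, 7, 7, 128])
        # Upsample with strided transposed convolutions instead of pooling.
        h = tf.layers.conv2d_transpose(h, 64, 5, strides=2, padding='same')
        h = tf.layers.batch_normalization(h, training=training)
        h = tf.nn.relu(h)  # ReLU inside G
        h = tf.layers.conv2d_transpose(h, 1, 5, strides=2, padding='same')
        return tf.nn.tanh(h)  # tanh on the last layer

z = tf.placeholder(tf.float32, [None, 100])
fake_images = dcgan_generator(z)  # shape [None, 28, 28, 1], values in [-1, 1]

Note that the TFLearn DCGAN code in section 4 below approximates the upsampling with upsample_2d followed by conv_2d, and uses a sigmoid output (MNIST pixels lie in [0, 1]) rather than tanh.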
3. GAN Code Implementation

(1) Import the key packages and load the data
import tensorflow as tf
import tflearn
import numpy as np
import tflearn.datasets.mnist as mnist
X, Y, X_test, Y_test = mnist.load_data()
image_dim = 784
z_dim = 200
total_sample = len(X)
(2) Build the generator and the discriminator
# Generator: map a noise vector to a 784-dimensional (28x28) image in [0, 1].
def generate(x, reuse=tf.AUTO_REUSE):
    with tf.variable_scope('Generate', reuse=reuse):
        x = tflearn.fully_connected(x, 256, activation='relu')
        x = tflearn.fully_connected(x, image_dim, activation='sigmoid')
    return x

# Discriminator: map an image to the probability that it is a real sample.
def discriminator(x, reuse=tf.AUTO_REUSE):
    with tf.variable_scope('Discriminator', reuse=reuse):
        x = tflearn.fully_connected(x, 256, activation='relu')
        x = tflearn.fully_connected(x, 1, activation='sigmoid')
    return x
(3) Build the network
gen_input = tflearn.input_data(shape=[None, z_dim], name='input_noise')
disc_input = tflearn.input_data(shape=[None, 784], name='disc_input')

# Build the generator, then apply the discriminator to both real and generated samples.
gen_sample = generate(gen_input)
disc_real = discriminator(disc_input)
disc_fake = discriminator(gen_sample)

# D minimizes -[log D(x) + log(1 - D(G(z)))]; G minimizes -log D(G(z)).
disc_loss = -tf.reduce_mean(tf.log(disc_real) + tf.log(1. - disc_fake))
gen_loss = -tf.reduce_mean(tf.log(disc_fake))

# Optimize the generator and the discriminator parameters separately.
gen_vars = tflearn.get_layer_variables_by_scope('Generate')
gen_model = tflearn.regression(gen_sample, placeholder=None, optimizer='adam',
                               loss=gen_loss, trainable_vars=gen_vars,
                               batch_size=64, name='target_gen', op_name='GEN')
disc_vars = tflearn.get_layer_variables_by_scope('Discriminator')
disc_model = tflearn.regression(disc_real, placeholder=None, optimizer='adam',
                                loss=disc_loss, trainable_vars=disc_vars,
                                batch_size=64, name='target_disc', op_name='DISC')
gan = tflearn.DNN(gen_model)
z_dim is the dimensionality of the input noise vector, and generate maps this low-dimensional noise up to an image-sized sample. During training the parameters of the G network and the D network must be optimized separately: get_layer_variables_by_scope collects all layer parameters inside a given scope, placeholder is set to None (the losses are already fully defined tensors, so no target placeholder is needed), and trainable_vars restricts each regression op to its own variables. When constructing the DNN, only the generator model needs to be passed in. Also note that the output activation of generate should not be tanh here; use sigmoid instead, since the MNIST pixels lie in [0, 1].
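As a quick sanity check (an addition to the original code), you can verify that the variable lists collected from the two scopes are disjoint, so each regression op really only updates its own sub-network:

# gen_vars and disc_vars are the lists collected above with
# get_layer_variables_by_scope; they must not overlap.
gen_names = {v.name for v in gen_vars}
disc_names = {v.name for v in disc_vars}
assert not gen_names & disc_names, "G and D must not share trainable variables"
print(len(gen_names), "generator variables,", len(disc_names), "discriminator variables")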
(4) Train and plot the generated images
z = np.random.uniform(-1., 1., [total_sample, z_dim])
gan.fit(X_inputs={gen_input: z, disc_input: X},
        Y_targets=None,
        n_epoch=200)

import matplotlib.pyplot as plt

f, a = plt.subplots(2, 10, figsize=(10, 4))
for i in range(10):
    for j in range(2):
        # Noise input.
        z = np.random.uniform(-1., 1., size=[1, z_dim])
        # Generate an image from noise. Extend to 3 channels for the matplotlib figure.
        temp = [[ii, ii, ii] for ii in list(gan.predict([z])[0])]
        a[j][i].imshow(np.reshape(temp, (28, 28, 3)))
f.show()
plt.show()
4. DCGAN Code Implementation

(1) Import packages and data
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tflearn
import tflearn.datasets.mnist as mnist
X, Y, testX, testY = mnist.load_data()
X = np.reshape(X, newshape=[-1, 28, 28, 1])
z_dim = 200 # Noise data points
total_samples = len(X)
(2) Build the generator and the discriminator
def generator(x, reuse=False):
    with tf.variable_scope('Generator', reuse=reuse):
        # Project the noise to a 7x7x128 feature map, then upsample twice to 28x28.
        x = tflearn.fully_connected(x, n_units=7 * 7 * 128)
        x = tflearn.batch_normalization(x)
        x = tf.nn.tanh(x)
        x = tf.reshape(x, shape=[-1, 7, 7, 128])
        x = tflearn.upsample_2d(x, 2)
        x = tflearn.conv_2d(x, 64, 5, activation='tanh')
        x = tflearn.upsample_2d(x, 2)
        x = tflearn.conv_2d(x, 1, 5, activation='sigmoid')
        return x

def discriminator(x, reuse=False):
    with tf.variable_scope('Discriminator', reuse=reuse):
        # Two conv + pooling stages, then a 2-way softmax (real vs. generated).
        x = tflearn.conv_2d(x, 64, 5, activation='tanh')
        x = tflearn.avg_pool_2d(x, 2)
        x = tflearn.conv_2d(x, 128, 5, activation='tanh')
        x = tflearn.avg_pool_2d(x, 2)
        x = tflearn.fully_connected(x, 1024, activation='tanh')
        x = tflearn.fully_connected(x, 2)
        x = tf.nn.softmax(x)
        return x
(3) Build the network
# Input data
gen_input = tflearn.input_data(shape=[None, z_dim], name='input_gen_noise')
input_disc_noise = tflearn.input_data(shape=[None, z_dim], name='input_disc_noise')
input_disc_real = tflearn.input_data(shape=[None, 28, 28, 1], name='input_disc_real')

disc_fake = discriminator(generator(input_disc_noise))
disc_real = discriminator(input_disc_real, reuse=True)
disc_net = tf.concat([disc_fake, disc_real], axis=0)
gen_net = generator(gen_input, reuse=True)
stacked_gan_net = discriminator(gen_net, reuse=True)

disc_vars = tflearn.get_layer_variables_by_scope('Discriminator')
disc_target = tflearn.multi_target_data(['target_disc_fake', 'target_disc_real'],
                                        shape=[None, 2])
disc_model = tflearn.regression(disc_net, optimizer='adam',
                                placeholder=disc_target,
                                loss='categorical_crossentropy',
                                trainable_vars=disc_vars,
                                batch_size=64, name='target_disc',
                                op_name='DISC')
gen_vars = tflearn.get_layer_variables_by_scope('Generator')
gan_model = tflearn.regression(stacked_gan_net, optimizer='adam',
                               loss='categorical_crossentropy',
                               trainable_vars=gen_vars,
                               batch_size=64, name='target_gen',
                               op_name='GEN')
gan = tflearn.DNN(gan_model)
(4) Train the network
disc_noise = np.random.uniform(-1., 1., size=[total_samples, z_dim])
# One-hot labels: generated samples -> class 0, real samples -> class 1.
y_disc_fake = np.zeros(shape=[total_samples])
y_disc_real = np.ones(shape=[total_samples])
y_disc_fake = tflearn.data_utils.to_categorical(y_disc_fake, 2)
y_disc_real = tflearn.data_utils.to_categorical(y_disc_real, 2)

gen_noise = np.random.uniform(-1., 1., size=[total_samples, z_dim])
# The generator is trained to make D label its samples as "real" (class 1).
y_gen = np.ones(shape=[total_samples])
y_gen = tflearn.data_utils.to_categorical(y_gen, 2)

# Start training, feeding both noise and real images.
gan.fit(X_inputs={'input_gen_noise': gen_noise,
                  'input_disc_noise': disc_noise,
                  'input_disc_real': X},
        Y_targets={'target_gen': y_gen,
                   'target_disc_fake': y_disc_fake,
                   'target_disc_real': y_disc_real},
        n_epoch=10)

# Reuse the trained session to build a DNN around the generator alone.
gen = tflearn.DNN(gen_net, session=gan.session)
(5) Plot the generated images
f, a = plt.subplots(4, 10, figsize=(10, 4))
for i in range(10):
    # Noise input.
    z = np.random.uniform(-1., 1., size=[4, z_dim])
    g = np.array(gen.predict({'input_gen_noise': z}))
    for j in range(4):
        # Generate an image from noise. Extend to 3 channels for the matplotlib figure.
        img = np.reshape(np.repeat(g[j][:, :, np.newaxis], 3, axis=2),
                         newshape=(28, 28, 3))
        a[j][i].imshow(img)
f.show()
plt.show()
References:
https://zhuanlan.zhihu.com/p/24767059
https://zh.wikipedia.org/wiki/%E7%94%9F%E6%88%90%E5%AF%B9%E6%8A%97%E7%BD%91%E7%BB%9C