Glow: Generative Flow with Invertible 1×1 Convolutions
paper: https://arxiv.org/pdf/1807.03039.pdf
Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis.
In short, flow-based generative models offer exact, tractable log-likelihoods and latent-variable inference, and both training and synthesis can be parallelized.
The discipline of generative modeling has experienced enormous leaps in capabilities in recent years, mostly with likelihood-based methods and generative adversarial networks (GANs). Likelihood-based methods can be divided into three categories:
Generative models mainly comprise likelihood-based methods and GANs (note: diffusion models are another family). Likelihood-based methods fall into the following three categories:
① Autoregressive models
② Variational autoencoders (VAEs)
③ Flow-based generative models
Note: adding ④ GANs and ⑤ diffusion models gives five common families of generative models in total.
We collect an i.i.d. dataset D, and choose a model pθ(x) with parameters θ. In case of discrete data x, the log-likelihood objective is then equivalent to minimizing:
Here x follows the distribution pθ(x), where pθ is a neural network model with learnable parameters θ. The log-likelihood objective to minimize is:
$$\mathcal{L}(\mathcal{D}) = \frac{1}{N}\sum_{i=1}^{N} -\log p_\theta\!\left(x^{(i)}\right)$$
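As a concrete illustration, here is a minimal sketch of this objective in PyTorch, assuming a hypothetical `model.log_prob(x)` that returns $\log p_\theta(x)$ per sample:

```python
import torch

# Average negative log-likelihood over an i.i.d. dataset D.
# `model.log_prob` is a hypothetical method returning log p_theta(x) per sample.
def nll_objective(model, data_loader):
    total_nll, total_n = 0.0, 0
    for x in data_loader:                   # x: a batch of datapoints
        log_px = model.log_prob(x)          # shape: (batch,)
        total_nll += (-log_px).sum().item()
        total_n += x.shape[0]
    return total_nll / total_n              # (1/N) * sum_i -log p_theta(x^(i))
```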
In most flow-based generative models, the generative process is defined as
In most flow-based generative models, the generative process is: sample $z \sim p_\theta(z)$ and set $x = g_\theta(z)$. Here z is the latent variable and $p_\theta(z)$ is a tractable density, such as a multivariate Gaussian; the function $g_\theta(z)$ is invertible, i.e. bijective. Given a datapoint x, the latent is inferred via:
$$z = f_\theta(x) = g_\theta^{-1}(x)$$
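To make the bijectivity concrete, here is a toy sketch (my own, not from the paper) of a scalar affine map, showing $g_\theta$ and its inverse $f_\theta = g_\theta^{-1}$:

```python
import torch

# Toy bijection x = g(z) = a*z + b with inverse z = f(x) = (x - b)/a.
class AffineBijection:
    def __init__(self, a: float, b: float):
        assert a != 0.0, "a must be nonzero so the map is invertible"
        self.a, self.b = a, b

    def g(self, z: torch.Tensor) -> torch.Tensor:  # generative direction: x = g_theta(z)
        return self.a * z + self.b

    def f(self, x: torch.Tensor) -> torch.Tensor:  # inference direction: z = f_theta(x)
        return (x - self.b) / self.a

bij = AffineBijection(a=2.0, b=1.0)
x = torch.randn(4)
assert torch.allclose(bij.g(bij.f(x)), x)          # g(f(x)) recovers x exactly
```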
We focus on functions where f (and, likewise, g) is composed of a sequence of transformations: f = f1 ◦ f2 ◦ · · · ◦ fK, such that the relationship between x and z can be written as:
The function f is composed of a sequence of transformations, f = f1 ◦ f2 ◦ · · · ◦ fK, so the relationship between x and z can be written as the chain $x \stackrel{f_1}{\longleftrightarrow} h_1 \stackrel{f_2}{\longleftrightarrow} h_2 \cdots \stackrel{f_K}{\longleftrightarrow} z$.
Such a sequence of invertible transformations is also called a (normalizing) flow.
Substituting $x = g_\theta(z)$ into $\log p_\theta(x)$ yields the change-of-variables formula (Eqs. 5–6 in the paper):

$$\log p_\theta(x) = \log p_\theta(z) + \sum_{i=1}^{K} \log\left|\det\!\left(\frac{d h_i}{d h_{i-1}}\right)\right|$$

where $h_0 \triangleq x$ and $h_K \triangleq z$.
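Here is a minimal sketch of evaluating this formula (assumptions mine: each step is an elementwise affine map with hypothetical per-dimension scales and shifts, and the prior on z is a standard normal):

```python
import torch

# log p(x) = log p(z) + sum_i log|det(dh_i / dh_{i-1})| for a sequence of
# elementwise affine steps h_i = s_i * h_{i-1} + b_i with h_0 = x, h_K = z.
# Each step's Jacobian is diag(s_i), so its log|det| = sum_d log|s_i[d]|.
def flow_log_prob(x, scales, shifts):
    h, logdet = x, torch.zeros(x.shape[0])
    for s, b in zip(scales, shifts):
        h = s * h + b
        logdet = logdet + torch.log(torch.abs(s)).sum()
    z = h                                            # h_K = z
    log_pz = -0.5 * (z**2 + torch.log(torch.tensor(2 * torch.pi))).sum(dim=1)
    return log_pz + logdet

x = torch.randn(4, 3)
scales = [torch.full((3,), 2.0), torch.full((3,), 0.5)]
shifts = [torch.zeros(3), torch.ones(3)]
print(flow_log_prob(x, scales, shifts))              # one log-density per sample
```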
Here det(dz/dx) is a Jacobian determinant; to keep it easy to evaluate, we can choose transformations z = f(x) whose Jacobians have a simple form (e.g., triangular).
The following is my own intuitive explanation of Equation (6).
In the one-dimensional case, det(dz/dx) = dz/dx:
$$p_\theta(x) = p_\theta(z)\cdot\frac{dz}{dx}$$
Multiplying both sides by dx, the identity clearly holds (probability mass is conserved):
$$p_\theta(x)\,dx = p_\theta(z)\,dz$$
In the two-dimensional case, det(dz/dx) is an area ratio: it measures how the area element spanned by (dz₁, dz₂) changes relative to the one spanned by (dx₁, dx₂).
$$\frac{p_\theta(x)}{p_\theta(z)} = \det\!\left(\frac{dz}{dx}\right)$$

that is, the change in the area of the z element produced by a change Δx in x.
In the three-dimensional case, it is likewise the change in the volume of z induced by a change in x.
This is only an intuition; see the related literature for a rigorous proof.
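To back this intuition numerically, here is a small check (my own addition, not from the paper) that for a linear map z = Ax the factor |det A| exactly compensates the area change, so p_x(x) = p_z(Ax)·|det A| still integrates to 1:

```python
import numpy as np

A = np.array([[2.0, 0.5],
              [0.0, 1.5]])                      # invertible 2x2 map, det A = 3.0

def p_z(z0, z1):                                # standard normal density in 2-D
    return np.exp(-0.5 * (z0**2 + z1**2)) / (2 * np.pi)

# Integrate p_x(x) = p_z(Ax) * |det A| on a grid covering most of the mass.
xs = np.linspace(-6.0, 6.0, 601)
dx = xs[1] - xs[0]
X0, X1 = np.meshgrid(xs, xs)
Z0 = A[0, 0] * X0 + A[0, 1] * X1                # z = A x, applied pointwise
Z1 = A[1, 0] * X0 + A[1, 1] * X1
p_x = p_z(Z0, Z1) * abs(np.linalg.det(A))
print(p_x.sum() * dx * dx)                      # ≈ 1.0: |det A| is the area-change factor
```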
The model architecture is shown above. The left diagram depicts one step of flow, composed of an actnorm layer, an invertible 1×1 convolution layer, and an affine coupling layer. The right diagram shows the multi-scale architecture built by stacking many such steps; see the table below for details.
Table 1: the three main components of our proposed flow, each with its forward function, reverse function, and log-determinant.
We propose to replace this fixed permutation with a (learned) invertible 1 × 1 convolution, where the weight matrix is initialized as a random rotation matrix. Note that a 1×1 convolution with equal number of input and output channels is a generalization of a permutation operation.
The log-determinant of an invertible 1 × 1 convolution can be computed directly. For an input of spatial size h × w with c channels and a c × c weight matrix W, the paper gives:

$$\log\left|\det\!\left(\frac{d\,\mathrm{conv2D}(h; W)}{d h}\right)\right| = h \cdot w \cdot \log\left|\det(W)\right|$$
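A quick sketch of this identity (shapes and names are example values of mine; `torch.slogdet` returns the sign and log|det|):

```python
import torch

c, h, w = 8, 32, 32                             # channels and spatial size (example values)
W = torch.linalg.qr(torch.randn(c, c)).Q        # init as a random orthogonal matrix, |det| = 1
logdet = h * w * torch.slogdet(W)[1]            # h * w * log|det W|
print(logdet.item())                            # ≈ 0 here, since |det W| = 1 at init
```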
The cost of computing the determinant can be reduced by using an LU decomposition.
Writing W = PL(U + diag(s)), where P is a permutation matrix, L is a lower-triangular matrix with ones on the diagonal, U is an upper-triangular matrix with zeros on the diagonal, and s is a vector, the log-determinant reduces to:

$$\log\left|\det(W)\right| = \sum \log\left|s\right|$$
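A sketch of this LU parameterization (variable names mine; `scipy.linalg.lu` factors a matrix as P @ L @ U):

```python
import numpy as np
import scipy.linalg

c = 8
W0 = np.linalg.qr(np.random.randn(c, c))[0]     # initial weight: a random orthogonal matrix
P, L, U = scipy.linalg.lu(W0)                   # P permutation, L unit lower-, U upper-triangular
s = np.diag(U).copy()                           # the diagonal of U becomes the vector s
U = np.triu(U, k=1)                             # keep U strictly upper-triangular

W = P @ L @ (U + np.diag(s))                    # reconstruct W = P L (U + diag(s))
assert np.allclose(W, W0)
log_det = np.sum(np.log(np.abs(s)))             # log|det W| in O(c) instead of O(c^3)
print(log_det)                                  # ≈ 0 here, since |det W0| = 1
```

In training, P stays fixed while L, U, and s are learned directly, so the log-determinant stays cheap to evaluate.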
A powerful reversible transformation where the forward function, the reverse function and the log-determinant are computationally efficient, is the affine coupling layer introduced in (Dinh et al., 2014, 2016). See Table 1. An additive coupling layer is a special case with s = 1 and a log-determinant of 0.
The affine coupling layer introduced by Dinh et al. is a powerful reversible transformation whose forward function, reverse function, and log-determinant can all be computed efficiently. The formulas are given in Table 1; the structure is shown below, and the detailed computation appears in the code-implementation section:
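Below is a minimal sketch of an affine coupling layer (my own simplification on vectors rather than images; the small network `nn` is arbitrary and never needs to be inverted, and the `sigmoid(· + 2)` scale is one common stabilization trick):

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # This network can be arbitrarily complex: it is only ever run forward.
        self.nn = nn.Sequential(
            nn.Linear(channels // 2, 64), nn.ReLU(),
            nn.Linear(64, channels),               # outputs (log_s, t)
        )

    def forward(self, x):
        xa, xb = x.chunk(2, dim=1)                 # split channels in half
        log_s, t = self.nn(xb).chunk(2, dim=1)
        s = torch.sigmoid(log_s + 2.0)             # keep the scale positive
        ya = s * xa + t                            # only the xa half is transformed
        logdet = torch.log(s).sum(dim=1)           # log-determinant = sum log s
        return torch.cat([ya, xb], dim=1), logdet

    def reverse(self, y):
        ya, xb = y.chunk(2, dim=1)                 # xb passed through unchanged
        log_s, t = self.nn(xb).chunk(2, dim=1)
        s = torch.sigmoid(log_s + 2.0)
        xa = (ya - t) / s                          # exact inverse of the forward pass
        return torch.cat([xa, xb], dim=1)

layer = AffineCoupling(8)
x = torch.randn(4, 8)
y, logdet = layer(x)
assert torch.allclose(layer.reverse(y), x, atol=1e-6)
```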