Paper: DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation
Code: GitHub
Computer graphics has proposed many ways to represent 3D geometry. This paper introduces DeepSDF, a learned continuous signed distance function (SDF) representation of 3D shapes, which can generate high-quality shapes from partial and noisy 3D input data.
Idea: the surface of a shape is represented by a continuous volumetric field. The magnitude of the field at a point is its distance to the surface, and the sign indicates whether the point is outside (+) or inside (-) the shape. A traditional SDF, whether analytic or discretized on a voxel grid, typically represents a single shape, whereas DeepSDF can represent a whole class of shapes. Taking hands as an example: an ordinary SDF encodes one particular hand in one particular pose, while DeepSDF can represent all configurations of hands.
This section obtains the SDF with a neural network. The SDF itself is a function that maps a point in space to a distance value, whose sign tells whether the point lies inside or outside the surface; the surface is conventionally the level set SDF = 0. The method therefore uses a deep network to learn a mapping from points to a continuous signed distance function, with no voxelization step. Discretized representations are constrained by the number of convolution parameters and the grid resolution, so DeepSDF is more flexible by comparison.
$$SDF(\boldsymbol{x}) = s, \quad \boldsymbol{x} \in \mathbb{R}^{3},\ s \in \mathbb{R}$$
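To make the sign convention concrete, here is a minimal sketch (not from the paper) of the analytic SDF of a unit sphere; `sphere_sdf` and its arguments are purely illustrative:

```python
import torch

def sphere_sdf(x, center=torch.zeros(3), radius=1.0):
    # Signed distance to a sphere: negative inside, 0 on the surface, positive outside.
    return torch.linalg.norm(x - center, dim=-1) - radius

pts = torch.tensor([[0.0, 0.0, 0.0],   # inside  -> -1.0
                    [1.0, 0.0, 0.0],   # surface ->  0.0
                    [2.0, 0.0, 0.0]])  # outside ->  1.0
print(sphere_sdf(pts))
```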
The key idea is to use a deep neural network to regress the continuous SDF directly from point samples: the network outputs the SDF value at a given position, with a value of 0 on the surface.
In theory, a deep feed-forward network can learn a fully continuous function to arbitrary accuracy, as shown in Fig. 3a.
Given a target shape, pairs of 3D point positions $\boldsymbol{x}$ and SDF values $s$ form the training set $X$:
$$X = \{(\boldsymbol{x}, s) : SDF(\boldsymbol{x}) = s\}$$
The parameters $\theta$ of a multi-layer fully-connected network $f_{\theta}$ are trained on $X$ so that $f_{\theta}$ becomes a good approximator of the given SDF over the target domain $\Omega$, i.e.:
$$f_{\theta}(\boldsymbol{x}) \approx SDF(\boldsymbol{x}), \quad \forall \boldsymbol{x} \in \Omega$$
Training minimizes a clamped L1 loss:

$$\mathcal{L}\left(f_{\theta}(\boldsymbol{x}), s\right) = \left|\operatorname{clamp}\left(f_{\theta}(\boldsymbol{x}), \delta\right) - \operatorname{clamp}(s, \delta)\right|$$

where $\operatorname{clamp}(x, \delta) := \min(\delta, \max(-\delta, x))$; the parameter $\delta$ controls the distance from the surface over which the SDF values are supervised.
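A minimal PyTorch sketch of this clamped L1 loss; the function name and the default value of $\delta$ are illustrative:

```python
import torch

def clamped_l1_loss(pred_sdf, gt_sdf, delta=0.1):
    # clamp(x, delta) = min(delta, max(-delta, x)): only distances within delta of the
    # surface contribute, concentrating the network's capacity near SDF = 0.
    pred = torch.clamp(pred_sdf, -delta, delta)
    gt = torch.clamp(gt_sdf, -delta, delta)
    return torch.abs(pred - gt).mean()
```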
This part learns a latent space of shapes, i.e. a code per shape. Since training one network per shape is impractical, the paper introduces a latent vector $\boldsymbol{z}$ as the latent code of the target shape. As shown in Fig. 3b, the network now takes both the 3D point position $\boldsymbol{x}$ and the latent code $\boldsymbol{z}$ as input, and, analogously to the single-shape case, we require:
$$f_{\theta}\left(\boldsymbol{z}_{i}, \boldsymbol{x}\right) \approx SDF^{i}(\boldsymbol{x})$$

where $i$ indexes the $i$-th target shape.
The latent code $\boldsymbol{z}_{i}$ is obtained with an auto-decoder. Given $N$ shapes, each sampled at $K$ points, the input for shape $i$ is $X_{i} = \{(\boldsymbol{x}_{j}, s_{j}) : s_{j} = SDF^{i}(\boldsymbol{x}_{j})\}$. The posterior over the code then factorizes as:
$$p_{\theta}\left(\boldsymbol{z}_{i} \mid X_{i}\right) = p\left(\boldsymbol{z}_{i}\right) \prod_{(\boldsymbol{x}_{j}, s_{j}) \in X_{i}} p_{\theta}\left(s_{j} \mid \boldsymbol{z}_{i}; \boldsymbol{x}_{j}\right)$$
$$p_{\theta}\left(s_{j} \mid \boldsymbol{z}_{i}; \boldsymbol{x}_{j}\right) = \exp\left(-\mathcal{L}\left(f_{\theta}\left(\boldsymbol{z}_{i}, \boldsymbol{x}_{j}\right), s_{j}\right)\right)$$
During training, $\theta$ and the codes $\boldsymbol{z}_{i}$ are optimized jointly via:
$$\underset{\theta,\{\boldsymbol{z}_{i}\}_{i=1}^{N}}{\arg\min} \sum_{i=1}^{N}\left(\sum_{j=1}^{K} \mathcal{L}\left(f_{\theta}\left(\boldsymbol{z}_{i}, \boldsymbol{x}_{j}\right), s_{j}\right) + \frac{1}{\sigma^{2}}\left\|\boldsymbol{z}_{i}\right\|_{2}^{2}\right)$$
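This joint optimization can be sketched as an auto-decoder training loop in which the network weights and one latent code per shape are updated together. A minimal sketch only: `decoder` is assumed to be an instance of the `Decoder` module shown further below, `dataset[i]` is assumed to yield `(xyz, sdf)` tensors of shape `(K, 3)` and `(K, 1)` for shape `i`, `clamped_l1_loss` is the loss sketched above, and all sizes, learning rates, and iteration counts are illustrative:

```python
import torch

N, latent_size, sigma = len(dataset), 256, 1e2   # illustrative hyperparameters
num_epochs = 2000                                # illustrative

lat_vecs = torch.nn.Embedding(N, latent_size)    # one latent code z_i per training shape
torch.nn.init.normal_(lat_vecs.weight, mean=0.0, std=0.01)

optimizer = torch.optim.Adam([
    {"params": decoder.parameters(), "lr": 5e-4},
    {"params": lat_vecs.parameters(), "lr": 1e-3},
])

for epoch in range(num_epochs):
    for i in range(N):
        xyz, sdf_gt = dataset[i]                              # (K, 3), (K, 1)
        z_i = lat_vecs(torch.tensor([i]))                     # (1, latent_size)
        inputs = torch.cat([z_i.expand(xyz.shape[0], -1), xyz], dim=1)
        pred = decoder(inputs)
        # Data term plus the code regularizer (1/sigma^2) * ||z_i||_2^2 from the objective above.
        loss = clamped_l1_loss(pred, sdf_gt) + (1.0 / sigma**2) * z_i.pow(2).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```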
At inference time $\theta$ is fixed, and the code $\hat{\boldsymbol{z}}$ of an observed shape is obtained by Maximum-a-Posteriori (MAP) estimation:
$$\hat{\boldsymbol{z}} = \underset{\boldsymbol{z}}{\arg\min} \sum_{(\boldsymbol{x}_{j}, s_{j}) \in X} \mathcal{L}\left(f_{\theta}\left(\boldsymbol{z}, \boldsymbol{x}_{j}\right), s_{j}\right) + \frac{1}{\sigma^{2}}\|\boldsymbol{z}\|_{2}^{2}$$
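The corresponding inference-time sketch minimizes the same objective over $\boldsymbol{z}$ alone, with the decoder frozen. It reuses `decoder`, `latent_size`, `sigma`, and `clamped_l1_loss` from the sketches above; `xyz_obs` / `sdf_obs` stand for the observed (possibly partial and noisy) SDF samples and, like the step count and learning rate, are assumptions:

```python
import torch

# Latent code initialized near zero; only z receives gradients, the decoder stays fixed.
z = torch.zeros(1, latent_size).normal_(mean=0.0, std=0.01).requires_grad_(True)
optimizer = torch.optim.Adam([z], lr=5e-3)

for step in range(800):                          # number of iterations is illustrative
    inputs = torch.cat([z.expand(xyz_obs.shape[0], -1), xyz_obs], dim=1)
    pred = decoder(inputs)
    loss = clamped_l1_loss(pred, sdf_obs) + (1.0 / sigma**2) * z.pow(2).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

z_hat = z.detach()   # MAP estimate of the shape code; decode with f_theta(z_hat, x) as needed
```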
The decoder network $f_{\theta}$ from the official implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Decoder(nn.Module):
    def __init__(
        self,
        latent_size,
        dims,
        dropout=None,
        dropout_prob=0.0,
        norm_layers=(),
        latent_in=(),
        weight_norm=False,
        xyz_in_all=None,
        use_tanh=False,
        latent_dropout=False,
    ):
        super(Decoder, self).__init__()

        def make_sequence():
            return []

        # Input is the latent code concatenated with the 3D query point; output is a scalar SDF value.
        dims = [latent_size + 3] + dims + [1]

        self.num_layers = len(dims)
        self.norm_layers = norm_layers
        self.latent_in = latent_in
        self.latent_dropout = latent_dropout
        if self.latent_dropout:
            self.lat_dp = nn.Dropout(0.2)

        self.xyz_in_all = xyz_in_all
        self.weight_norm = weight_norm

        for layer in range(0, self.num_layers - 1):
            # If the next layer re-injects the (z, xyz) input as a skip connection, shrink this
            # layer's output so the concatenated tensor matches the next layer's input size.
            if layer + 1 in latent_in:
                out_dim = dims[layer + 1] - dims[0]
            else:
                out_dim = dims[layer + 1]
                if self.xyz_in_all and layer != self.num_layers - 2:
                    out_dim -= 3

            if weight_norm and layer in self.norm_layers:
                setattr(
                    self,
                    "lin" + str(layer),
                    nn.utils.weight_norm(nn.Linear(dims[layer], out_dim)),
                )
            else:
                setattr(self, "lin" + str(layer), nn.Linear(dims[layer], out_dim))

            if (
                (not weight_norm)
                and self.norm_layers is not None
                and layer in self.norm_layers
            ):
                setattr(self, "bn" + str(layer), nn.LayerNorm(out_dim))

        self.use_tanh = use_tanh
        if use_tanh:
            self.tanh = nn.Tanh()
        self.relu = nn.ReLU()

        self.dropout_prob = dropout_prob
        self.dropout = dropout
        self.th = nn.Tanh()

    # input: N x (L+3)
    def forward(self, input):
        xyz = input[:, -3:]

        if input.shape[1] > 3 and self.latent_dropout:
            latent_vecs = input[:, :-3]
            latent_vecs = F.dropout(latent_vecs, p=0.2, training=self.training)
            x = torch.cat([latent_vecs, xyz], 1)
        else:
            x = input

        for layer in range(0, self.num_layers - 1):
            lin = getattr(self, "lin" + str(layer))
            # Skip connection: concatenate the original (latent, xyz) input back in.
            if layer in self.latent_in:
                x = torch.cat([x, input], 1)
            elif layer != 0 and self.xyz_in_all:
                x = torch.cat([x, xyz], 1)
            x = lin(x)
            # last layer Tanh
            if layer == self.num_layers - 2 and self.use_tanh:
                x = self.tanh(x)
            if layer < self.num_layers - 2:
                if (
                    self.norm_layers is not None
                    and layer in self.norm_layers
                    and not self.weight_norm
                ):
                    bn = getattr(self, "bn" + str(layer))
                    x = bn(x)
                x = self.relu(x)
                if self.dropout is not None and layer in self.dropout:
                    x = F.dropout(x, p=self.dropout_prob, training=self.training)

        # Final Tanh keeps the predicted SDF value in (-1, 1).
        if hasattr(self, "th"):
            x = self.th(x)

        return x
```
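A usage sketch of this module. The configuration below mirrors a typical DeepSDF setup (eight 512-wide hidden layers with the input re-injected at layer 4), but the exact values should be read as illustrative rather than as the repository's authoritative settings:

```python
import torch

latent_size = 256
decoder = Decoder(
    latent_size=latent_size,
    dims=[512] * 8,                      # eight 512-wide hidden layers
    dropout=list(range(8)),
    dropout_prob=0.2,
    norm_layers=list(range(8)),
    latent_in=[4],                       # re-inject the (z, xyz) input at layer 4
    weight_norm=True,
)

z = torch.randn(1, latent_size) * 0.01               # one code for the shape being queried
xyz = torch.rand(1024, 3) * 2 - 1                    # query points in [-1, 1]^3
sdf_pred = decoder(torch.cat([z.expand(1024, -1), xyz], dim=1))   # (1024, 1) SDF predictions
```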