What does this paper set out to do? (From the abstract and conclusion, summarized in one sentence.)
To satisfy the requirements of efficiency and quality at the same time, the paper starts from sparse SfM points represented as 3D Gaussians, which preserve desirable properties of continuous volumetric radiance fields; it then optimizes anisotropic covariances to achieve an accurate representation, and finally introduces a fast, visibility-aware rendering algorithm, thereby achieving state-of-the-art results in the field.
Under what conditions or needs was this research proposed (Intro)? What core problems or deficiencies does it address, what have others done, and what are the innovations? (From the Introduction and Related Work.)
Possible structure: Background, Question, Prior work, Innovation.
Three threads of related work frame this question.
Traditional reconstruction pipelines such as SfM and MVS re-project and blend the input images into the novel-view camera, using the reconstructed geometry to guide this re-projection (from 2D to 3D).
Sad: they cannot fully recover from unreconstructed regions, or from "over-reconstruction", where MVS generates geometry that does not exist.
Neural Rendering and Radiance Fields
Neural rendering is a broader category of techniques that leverage deep learning for image synthesis, while a radiance field is a specific representation within neural rendering that models light and color in 3D space.
Earlier deep-learning methods mostly built on MVS-based geometry, which is also their major drawback: they inherit the artifacts of the MVS reconstruction.
NeRF follows the volumetric-representation line of work; it introduced positional encoding and importance sampling.
Faster training methods focus on spatial data structures that store (neural) features interpolated during volumetric ray-marching, on different encodings, and on reduced MLP capacity.
Today, notable works include InstantNGP and Plenoxels; both rely on Spherical Harmonics (InstantNGP to encode the view-direction input to its network, Plenoxels to represent view-dependent color directly).
Understand Spherical Harmonics as a set of basis functions for fitting a function on the sphere in a 3D spherical coordinate system.
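To make the idea concrete, here is a minimal NumPy sketch that evaluates the real SH basis up to degree 1 and uses it to compute a view-dependent color. This is an illustration, not the paper's code; the constants are the standard normalization factors, but sign conventions for the degree-1 terms vary between implementations.

```python
import numpy as np

def sh_basis_deg1(d):
    """Real spherical-harmonic basis up to degree 1 at unit direction d = (x, y, z).

    3DGS uses up to degree 3 (16 coefficients); higher degrees follow
    the same pattern of polynomials in x, y, z times fixed constants.
    """
    x, y, z = d
    return np.array([
        0.28209479177387814,      # l=0: constant (view-independent) term
        0.4886025119029199 * y,   # l=1, m=-1
        0.4886025119029199 * z,   # l=1, m=0
        0.4886025119029199 * x,   # l=1, m=1
    ])

def sh_color(coeffs, d):
    """Color = SH coefficients (4 x 3 for RGB) contracted with the basis."""
    return sh_basis_deg1(d) @ coeffs

# Only the degree-0 coefficient set: a purely view-independent base color.
coeffs = np.zeros((4, 3))
coeffs[0] = [1.0, 0.5, 0.2]
rgb = sh_color(coeffs, np.array([0.0, 0.0, 1.0]))
```

With only the degree-0 coefficient populated, the resulting color is the same from every direction; nonzero degree-1 coefficients would tilt the color with the viewing direction.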
Through the Gradient Flow in this paper’s pipeline, we are trying to connect Part4, 5, and 6 in this paper.
Firstly, start from the loss function, which combines an ${\mathcal L}_{1}$ loss with a D-SSIM term, as shown below:

$${\mathcal L}=(1-\lambda)\,{\mathcal L}_{1}+\lambda\,{\mathcal L}_{\mathrm{D\text{-}SSIM}}.\tag{1}$$
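A minimal NumPy sketch of this loss, assuming ${\mathcal L}_{\mathrm{D\text{-}SSIM}} = 1 - \mathrm{SSIM}$. Note the SSIM here is a simplified global version for illustration; the paper computes SSIM over local windows, and `ssim_global` / `total_loss` are hypothetical names, not the paper's code.

```python
import numpy as np

def l1_loss(pred, gt):
    return np.abs(pred - gt).mean()

def ssim_global(pred, gt, c1=0.01**2, c2=0.03**2):
    # Simplified *global* SSIM (no sliding window), for illustration only.
    mu_x, mu_y = pred.mean(), gt.mean()
    var_x, var_y = pred.var(), gt.var()
    cov = ((pred - mu_x) * (gt - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))

def total_loss(pred, gt, lam=0.2):
    # Eq. (1): L = (1 - λ) L1 + λ L_D-SSIM, with L_D-SSIM = 1 - SSIM.
    # The paper uses λ = 0.2.
    return (1 - lam) * l1_loss(pred, gt) + lam * (1 - ssim_global(pred, gt))

rng = np.random.default_rng(0)
gt = rng.random((8, 8))
loss_same = total_loss(gt, gt)  # identical images: both terms vanish
```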
This relates the rendered image to the ground-truth image. So, to complete the optimization, we need to dive into the rendering process. From the related-work chapter, we know that point-based $\alpha$-blending and NeRF-style volumetric rendering share essentially the same image formation model. That is,
$$C=\sum_{i=1}^{N}T_{i}\,(1-\exp(-\sigma_{i}\delta_{i}))\,c_{i}\quad\mathrm{with}\quad T_{i}=\exp\!\left(-\sum_{j=1}^{i-1}\sigma_{j}\delta_{j}\right).\tag{2}$$
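Equation (2) can be checked numerically with a short NumPy sketch (sample values are made up for illustration):

```python
import numpy as np

def render_volumetric(sigmas, deltas, colors):
    """Eq. (2): C = Σ_i T_i (1 - exp(-σ_i δ_i)) c_i,
    with transmittance T_i = exp(-Σ_{j<i} σ_j δ_j)."""
    tau = sigmas * deltas
    # Transmittance *before* each sample: exclusive cumulative sum of τ.
    T = np.exp(-np.concatenate([[0.0], np.cumsum(tau)[:-1]]))
    weights = T * (1.0 - np.exp(-tau))
    return weights @ colors

# Three samples along a ray, each with a one-hot color so the output
# directly exposes the per-sample blending weights.
sigmas = np.array([0.5, 1.0, 2.0])
deltas = np.array([0.1, 0.1, 0.1])
colors = np.eye(3)
C = render_volumetric(sigmas, deltas, colors)
```

Because the weights telescope, they sum to $1-\exp(-\sum_i \sigma_i\delta_i)$, i.e. one minus the transmittance left after the last sample.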
And this paper actually uses a typical neural point-based approach in the same spirit as (2), which can be written as:
$$C=\sum_{i\in N}c_{i}\,\alpha_{i}\prod_{j=1}^{i-1}(1-\alpha_{j}).\tag{3}$$
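The equivalence of the two formulations can be verified directly: setting $\alpha_i = 1-\exp(-\sigma_i\delta_i)$ makes the product in (3) equal the transmittance $T_i$ in (2). A self-contained NumPy sketch (toy values, not the paper's code):

```python
import numpy as np

def render_alpha_blend(alphas, colors):
    """Eq. (3): C = Σ_i c_i α_i Π_{j<i} (1 - α_j)."""
    # Accumulated transparency *before* each point: exclusive product.
    T = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    return (T * alphas) @ colors

# Toy samples along one ray, with one-hot colors to expose the weights.
sigmas = np.array([0.5, 1.0, 2.0])
deltas = np.array([0.1, 0.1, 0.1])
colors = np.eye(3)

# Eq. (2), volumetric form.
tau = sigmas * deltas
T_vol = np.exp(-np.concatenate([[0.0], np.cumsum(tau)[:-1]]))
C_vol = (T_vol * (1.0 - np.exp(-tau))) @ colors

# Eq. (3), alpha-blending form, with α_i = 1 - exp(-σ_i δ_i).
alphas = 1.0 - np.exp(-tau)
C_alpha = render_alpha_blend(alphas, colors)
```

Since $\prod_{j<i}(1-\alpha_j)=\prod_{j<i}e^{-\sigma_j\delta_j}=T_i$, the two renderers agree term by term, not just in the final sum.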
From this formulation, we can see that the volume representation should contain color $c$ and opacity $\alpha$. These are attached to each Gaussian, where Spherical Harmonics are used to represent color, just like in Plenoxels. The other attributes used are the position and the covariance matrix. So we have now introduced the four attributes that represent the scene: positions, anisotropic covariances, SH color coefficients, and opacities $\alpha$.
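The four attributes above can be sketched as a simple per-Gaussian container. This is a hypothetical illustration of the data layout, with made-up names; it is not the paper's actual parameterization (which stores the covariance factored into scaling and rotation for optimization).

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    """One scene primitive: the four optimizable attributes (illustrative names)."""
    position: np.ndarray    # mean μ ∈ R^3
    covariance: np.ndarray  # 3x3 anisotropic covariance Σ
    sh_coeffs: np.ndarray   # spherical-harmonic color coefficients
    opacity: float          # α ∈ (0, 1)

g = Gaussian3D(
    position=np.zeros(3),
    covariance=np.eye(3) * 0.01,
    sh_coeffs=np.zeros((16, 3)),  # degree-3 SH: 16 coefficients per RGB channel
    opacity=0.5,
)
```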