CVPR2024 | A Roundup of 3D Visual Perception Work

Author: IT小白 | 2024-05-06 14:57:05

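CVPR2024 papers are coming out one after another, so today we round up some of the excellent work related to 3D visual perception. Bookmark it for later! If you are interested in 3D visual perception, follow the WeChat official account 【3D视觉之心】, which regularly shares content on SLAM, 3D reconstruction, NeRF, Gaussian Splatting, sensor calibration and fusion, and more.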

0）3D Reconstruction

3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface

Paper：https://arxiv.org/abs/2403.08768

BiTT: Bi-directional Texture Reconstruction of Interacting Two Hands from a Single Image

Paper：https://arxiv.org/abs/2403.08262

Bayesian Diffusion Models for 3D Shape Reconstruction

Paper：https://arxiv.org/abs/2403.06973

UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and UnFavOrable Sets

Paper：https://arxiv.org/abs/2403.05086

DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction

Paper：https://arxiv.org/abs/2403.05005

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions

Paper：https://arxiv.org/abs/2403.03447

G3DR: Generative 3D Reconstruction in ImageNet

Paper：https://arxiv.org/abs/2403.00939

1）Semantic Scene Completion

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

Paper: https://arxiv.org/pdf/2306.15670.pdf
Code: https://github.com/hustvl/Symphonies

PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

Paper: https://arxiv.org/pdf/2312.02158.pdf
Code: https://github.com/astra-vision/PaSCo

2）Occupancy

SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction

Paper: https://arxiv.org/pdf/2311.12754.pdf
Code: https://github.com/huang-yh/SelfOcc

Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications

Paper: https://arxiv.org/pdf/2311.17663.pdf
Code: https://github.com/haomo-ai/Cam4DOcc

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

Paper: https://arxiv.org/pdf/2306.10013.pdf
Code: https://github.com/Robertwyq/PanoOcc

3）3D Object Detection

PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection

Paper: https://arxiv.org/pdf/2312.08371.pdf
Code: https://github.com/KuanchihHuang/PTT

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

Code: https://github.com/skmhrk1209/VSRD

CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection

Code: https://github.com/zhnxjtu/CaKDP

CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images

Paper：https://arxiv.org/abs/2403.04198
Code：https://github.com/SerCharles/CN-RMA

UniMODE: Unified Monocular 3D Object Detection

Paper：https://arxiv.org/abs/2402.18573

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

Paper：https://arxiv.org/abs/2403.06093
Code：https://github.com/nullmax-vision/QAF2D

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

Paper：https://arxiv.org/abs/2403.05817
Code：https://github.com/zhanggang001/HEDNet

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Paper：https://arxiv.org/pdf/2403.05061

4）Stereo

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

Code: https://github.com/ZYangChen/MoCha-Stereo

Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching

Paper：https://arxiv.org/abs/2402.19270
Code：https://github.com/DFSDDDDD1199/ICGNet

Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching

Paper：https://arxiv.org/abs/2403.00486
Code：https://github.com/Windsrain/Selective-Stereo

Robust Synthetic-to-Real Transfer for Stereo Matching

Paper：https://arxiv.org/abs/2403.07705

5）SLAM and Navigation

SNI-SLAM: Semantic Neural Implicit SLAM

Paper: https://arxiv.org/pdf/2311.11016.pdf

CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

Paper：https://arxiv.org/abs/2402.19231
Code：https://github.com/Lu-Feng/CricaVPR

MemoNav: Working Memory Model for Visual Navigation

Paper：https://arxiv.org/abs/2402.19161

6）Point Cloud

Point Transformer V3: Simpler, Faster, Stronger

Paper: https://arxiv.org/pdf/2312.10035.pdf
Code: https://github.com/Pointcept/PointTransformerV3

Rethinking Few-shot 3D Point Cloud Semantic Segmentation

Paper: https://arxiv.org/pdf/2403.00592.pdf
Code: https://github.com/ZhaochongAn/COSeg

PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation

Code: https://github.com/JinfengX/PointCloudPDF

Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds

Paper：https://arxiv.org/abs/2403.05247

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

Paper：https://arxiv.org/abs/2403.01439

Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching

Paper：https://arxiv.org/abs/2402.17372

7）Depth Estimation

Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving

Paper：https://arxiv.org/abs/2403.07535

8）3D Understanding

GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding

Paper：https://arxiv.org/abs/2403.09639

TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding

Paper：https://arxiv.org/abs/2402.18490

9）6D Pose

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

Paper：https://arxiv.org/abs/2311.15707

MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation

Paper：https://arxiv.org/abs/2403.08019

FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation

Paper：https://arxiv.org/abs/2403.03221

10）NeRF and Gaussian Splatting

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Paper: https://arxiv.org/pdf/2312.05247.pdf
Code: https://github.com/prs-eth/Dynamic-LiDAR-Resimulation

GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding

Paper：https://arxiv.org/abs/2403.03608

NARUTO: Neural Active Reconstruction from Uncertain Target Observations

Paper：https://arxiv.org/abs/2402.18771

DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization

Paper：https://arxiv.org/abs/2403.06912

S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes

Paper：https://arxiv.org/pdf/2403.06205

DaReNeRF: Direction-aware Representation for Dynamic Scenes

Paper：https://arxiv.org/pdf/2403.02265

Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?

Paper：https://arxiv.org/abs/2403.06092

NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors

Paper：https://arxiv.org/abs/2403.03122

3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos

Paper：https://arxiv.org/abs/2403.01444

Neural Video Compression with Feature Modulation

Paper：https://arxiv.org/abs/2402.17414

11）Other

Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

Paper：https://arxiv.org/abs/2403.09140

FSC: Few-point Shape Completion

Paper：https://arxiv.org/abs/2403.07359

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Paper：https://arxiv.org/abs/2403.01807

DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior

Paper：https://arxiv.org/abs/2312.06439
