Editor | 自动驾驶之心
The CVPR 2024 papers are gradually being released, and 自动驾驶Daily has been tracking them closely. Today we round up the standout works from the conference, covering end-to-end autonomous driving, large language models, occupancy, SLAM, lane detection, 3D detection, cooperative perception, point cloud processing, MOT, millimeter-wave radar, NeRF, Gaussian Splatting, and more.
We also recommend our CVPR 2024 repository: https://github.com/autodriving-heart/CVPR-2024-Papers-Autonomous-Driving. Star and bookmark it to catch the latest additions as soon as they land.
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?
Paper: https://arxiv.org/pdf/2312.03031.pdf
Code: https://github.com/NVlabs/BEV-Planner
Visual Point Cloud Forecasting enables Scalable Autonomous Driving
Paper: https://arxiv.org/pdf/2312.17655.pdf
Code: https://github.com/OpenDriveLab/ViDAR
PlanKD: Compressing End-to-End Motion Planner for Autonomous Driving
Paper: https://arxiv.org/pdf/2403.01238.pdf
Code: https://github.com/tulerfeng/PlanKD
VLP: Vision Language Planning for Autonomous Driving
Paper: https://arxiv.org/abs/2401.05577
ChatSim: Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration
Paper: https://arxiv.org/pdf/2402.05746.pdf
Code: https://github.com/yifanlu0227/ChatSim
LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Paper: https://arxiv.org/pdf/2312.07488.pdf
Code: https://github.com/opendilab/LMDrive
MAPLM: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding
Code: https://github.com/LLVM-AD/MAPLM
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
Paper: https://arxiv.org/pdf/2403.01849.pdf
Code: https://github.com/TreeLLi/APT
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
Paper: https://arxiv.org/pdf/2403.02781
RegionGPT: Towards Region Understanding Vision Language Model
Paper: https://arxiv.org/pdf/2403.02330
Symphonize 3D Semantic Scene Completion with Contextual Instance Queries
Paper: https://arxiv.org/pdf/2306.15670.pdf
Code: https://github.com/hustvl/Symphonies
PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness
Paper: https://arxiv.org/pdf/2312.02158.pdf
Code: https://github.com/astra-vision/PaSCo
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
Paper: https://arxiv.org/pdf/2311.12754.pdf
Code: https://github.com/huang-yh/SelfOcc
Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications
Paper: https://arxiv.org/pdf/2311.17663.pdf
Code: https://github.com/haomo-ai/Cam4DOcc
PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation
Paper: https://arxiv.org/pdf/2306.10013.pdf
Code: https://github.com/Robertwyq/PanoOcc
Lane2Seq: Towards Unified Lane Detection via Sequence Generation
Paper: https://arxiv.org/abs/2402.17172
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
Paper: https://arxiv.org/pdf/2310.08370.pdf
Code: https://github.com/Nightmare-n/UniPAD
Panacea: Panoramic and Controllable Video Generation for Autonomous Driving
Paper: https://arxiv.org/pdf/2311.16813.pdf
Code: https://github.com/wenyuqing/panacea
SemCity: Semantic Scene Generation with Triplane Diffusion
Paper:
Code: https://github.com/zoomin-lee/SemCity
BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation
Paper: https://arxiv.org/pdf/2312.02136.pdf
Code: https://github.com/zqh0253/BerfScene
PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection
Paper: https://arxiv.org/pdf/2312.08371.pdf
Code: https://github.com/KuanchihHuang/PTT
VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection
Code: https://github.com/skmhrk1209/VSRD
CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection
Code: https://github.com/zhnxjtu/CaKDP
CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images
Paper: https://arxiv.org/abs/2403.04198
Code: https://github.com/SerCharles/CN-RMA
UniMODE: Unified Monocular 3D Object Detection
Paper: https://arxiv.org/abs/2402.18573
Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors
Paper: https://arxiv.org/abs/2403.06093
Code: https://github.com/nullmax-vision/QAF2D
SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection
Paper: https://arxiv.org/abs/2403.05817
Code: https://github.com/zhanggang001/HEDNet
RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features
Paper: https://arxiv.org/pdf/2403.05061
MoCha-Stereo: Motif Channel Attention Network for Stereo Matching
Code: https://github.com/ZYangChen/MoCha-Stereo
Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching
Paper: https://arxiv.org/abs/2402.19270
Code: https://github.com/DFSDDDDD1199/ICGNet
Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching
Paper: https://arxiv.org/abs/2403.00486
Code: https://github.com/Windsrain/Selective-Stereo
RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception
Code: https://github.com/ryhnhao/RCooper
SNI-SLAM: Semantic Neural Implicit SLAM
Paper: https://arxiv.org/pdf/2311.11016.pdf
CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition
Paper: https://arxiv.org/abs/2402.19231
Code: https://github.com/Lu-Feng/CricaVPR
DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Iterative Diffusion-Based Refinement
Paper: https://arxiv.org/pdf/2311.17456.pdf
Code: https://github.com/IRMVLab/DifFlow3D
3DSFLabeling: Boosting 3D Scene Flow Estimation by Pseudo Auto Labeling
Paper: https://arxiv.org/pdf/2402.18146.pdf
Code: https://github.com/jiangchaokang/3DSFLabelling
Regularizing Self-supervised 3D Scene Flows with Surface Awareness and Cyclic Consistency
Paper: https://arxiv.org/pdf/2312.08879.pdf
Code: https://github.com/vacany/sac-flow
Point Transformer V3: Simpler, Faster, Stronger
Paper: https://arxiv.org/pdf/2312.10035.pdf
Code: https://github.com/Pointcept/PointTransformerV3
Rethinking Few-shot 3D Point Cloud Semantic Segmentation
Paper: https://arxiv.org/pdf/2403.00592.pdf
Code: https://github.com/ZhaochongAn/COSeg
PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation
Code: https://github.com/JinfengX/PointCloudPDF
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Paper: https://arxiv.org/pdf/2401.06197.pdf
RepViT: Revisiting Mobile CNN From ViT Perspective
Paper: https://arxiv.org/pdf/2307.09283.pdf
Code: https://github.com/THU-MIG/RepViT
OMG-Seg: Is One Model Good Enough For All Segmentation?
Paper: https://arxiv.org/pdf/2401.10229.pdf
Code: https://github.com/lxtGH/OMG-Seg
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
Paper: https://arxiv.org/pdf/2312.04265.pdf
Code: https://github.com/w1oves/Rein
SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation
Paper: https://arxiv.org/abs/2311.15707
SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
Paper: https://arxiv.org/abs/2311.15537
Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning
Paper: https://arxiv.org/abs/2403.06122
DART: Doppler-Aided Radar Tomography
Code: https://github.com/thetianshuhuang/dart
Dynamic LiDAR Re-simulation using Compositional Neural Fields
Paper: https://arxiv.org/pdf/2312.05247.pdf
Code: https://github.com/prs-eth/Dynamic-LiDAR-Resimulation
GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding
Paper: https://arxiv.org/abs/2403.03608
NARUTO: Neural Active Reconstruction from Uncertain Target Observations
Paper: https://arxiv.org/abs/2402.18771
DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization
Paper: https://arxiv.org/abs/2403.06912
S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes
Paper: https://arxiv.org/pdf/2403.06205
SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting
Paper: https://arxiv.org/pdf/2403.05087
DaReNeRF: Direction-aware Representation for Dynamic Scenes
Paper: https://arxiv.org/pdf/2403.02265
Delving into the Trajectory Long-tail Distribution for Multi-object Tracking
Code: https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT
DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking
Paper: https://arxiv.org/abs/2403.02767
Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes
Paper: https://arxiv.org/pdf/2311.17948.pdf
Code: https://github.com/HCIS-Lab/Action-slot
SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction
Code: https://github.com/opendilab/SmartRefine
CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective
Paper: https://arxiv.org/abs/2403.06676
Code: https://github.com/snskysk/CAM-Back-Again