Computer Vision Daily Open-Source Code Roundup (Papers with Code) - 2023.10.31

1.【Backbone Architecture】(NeurIPS2023)Fast Trainable Projection for Robust Fine-Tuning

2.【Backbone Architecture: Transformer】TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition

3.【Image Classification】(NeurIPS2023)Analyzing Vision Transformers for Image Classification in Class Embedding Space

4.【Object Detection】RGB-X Object Detection via Scene-Specific Fusion Modules

5.【Object Detection】A High-Resolution Dataset for Instance Detection with Multi-View Instance Capture

6.【Object Detection】PrObeD: Proactive Object Detection Wrapper

7.【Anomaly Detection】Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection

8.【Anomaly Detection】AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection

9.【Semantic Segmentation】(NeurIPS2023)Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union

10.【Semantic Segmentation】(NeurIPS2023)Switching Temporary Teachers for Semi-Supervised Semantic Segmentation

11.【Open-Vocabulary Segmentation】(NeurIPS2023)Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation

12.【Video Semantic Segmentation】(NeurIPS2023)Mask Propagation for Efficient Video Semantic Segmentation

13.【Super-Resolution】(NeurIPS2023)Efficient Test-Time Adaptation for Super-Resolution with Second-Order Degradation and Reconstruction

14.【Super-Resolution】EDiffSR: An Efficient Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution

15.【Domain Generalization】(NeurIPS2023)SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

16.【Domain Generalization】(WACV2024)Domain Generalisation via Risk Distribution Matching

17.【Multimodal】Harvest Video Foundation Models via Efficient Post-Pretraining

18.【Multimodal】IterInv: Iterative Inversion for Pixel-Level T2I Models

19.【Multimodal】Generating Context-Aware Natural Answers for Questions in 3D Scenes

20.【Multimodal】Text-to-3D with Classifier Score Distillation

21.【Multimodal】Dynamic Task and Weight Prioritization Curriculum Learning for Multimodal Imagery

22.【Multimodal】TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding

23.【Multimodal】Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models

24.【Multimodal】ROME: Evaluating Pre-trained Vision-Language Models on Reasoning beyond Visual Common Sense

25.【Multimodal】Apollo: Zero-shot MultiModal Reasoning with Multiple Experts

26.【Self-Supervised Learning】Local-Global Self-Supervised Visual Representation Learning

27.【Semi-Supervised Learning】(NeurIPS2023)InstanT: Semi-supervised Learning with Instance-dependent Thresholds

28.【Monocular 3D Object Detection】ODM3D: Alleviating Foreground Sparsity for Enhanced Semi-Supervised Monocular 3D Object Detection

29.【Autonomous Driving: Cooperative Perception】Dynamic V2X Autonomous Perception from Road-to-Vehicle Vision

30.【Autonomous Driving: Depth Estimation】(NeurIPS2023)Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes

31.【Image Editing】(EMNLP2023)Learning to Follow Object-Centric Image Editing Instructions Faithfully

32.【Video Generation】VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

33.【Knowledge Distillation】One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

34.【Continual Learning】(NeurIPS2023)NPCL: Neural Processes for Uncertainty-Aware Continual Learning







