[Paper Collection] Awesome Diffusion Models 3

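This collection lists papers that use diffusion models for multi-modal learning, 3D vision, and adversarial attacks, as well as for audio tasks such as generation and enhancement.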

Source: https://github.com/diff-usion/Awesome-Diffusion-Models

Contents

Multi-modal Learning

3D Vision

Adversarial Attack

Miscellany

Audio

Generation

Conversion

Enhancement

Separation

Text-to-Speech


Multi-modal Learning

Generating Realistic Images from In-the-wild Sounds
Taegyeong Lee, Jeonghun Kang, Hyeonyu Kim, Taehwan Kim
ICCV 2023. [Paper]
5 Sep 2023

Generative-based Fusion Mechanism for Multi-Modal Tracking
Zhangyong Tang, Tianyang Xu, Xuefeng Zhu, Xiao-Jun Wu, Josef Kittler
arXiv 2023. [Paper]
4 Sep 2023

VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders
Xuyang Liu, Siteng Huang, Yachen Kang, Honggang Chen, Donglin Wang
arXiv 2023. [Paper]
3 Sep 2023

Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities
Shanyuan Liu, Dawei Leng, Yuhui Yin
arXiv 2023. [Paper]
2 Sep 2023

MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation
Hanshu Yan, Jun Hao Liew, Long Mai, Shanchuan Lin, Jiashi Feng
arXiv 2023. [Paper]
2 Sep 2023

Iterative Multi-granular Image Editing using Diffusion Models
K J Joseph, Prateksha Udhayanan, Tripti Shukla, Aishwarya Agarwal, Srikrishna Karanam, Koustava Goswami, Balaji Vasan Srinivasan
arXiv 2023. [Paper]
1 Sep 2023

DiffuGen: Adaptable Approach for Generating Labeled Image Datasets using Stable Diffusion Models
Michael Shenoda, Edward Kim
arXiv 2023. [Paper]
1 Sep 2023

PathLDM: Text conditioned Latent Diffusion Model for Histopathology
Srikar Yellapragada, Alexandros Graikos, Prateek Prasanna, Tahsin Kurc, Joel Saltz, Dimitris Samaras
arXiv 2023. [Paper]
1 Sep 2023

VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li, Wenqing Chu, Ye Wu, Weihang Yuan, Fanglong Liu, Qi Zhang, Fu Li, Haocheng Feng, Errui Ding, Jingdong Wang
arXiv 2023. [Paper]
1 Sep 2023

Detecting Out-of-Context Image-Caption Pairs in News: A Counter-Intuitive Method
Eivind Moholdt, Sohail Ahmed Khan, Duc-Tien Dang-Nguyen
CBMI 2023. [Paper]
31 Aug 2023

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
Qingping Zheng, Yuanfan Guo, Jiankang Deng, Jianhua Han, Ying Li, Songcen Xu, Hang Xu
arXiv 2023. [Paper]
31 Aug 2023

MVDream: Multi-view Diffusion for 3D Generation
Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang
arXiv 2023. [Paper]
31 Aug 2023

Intriguing Properties of Diffusion Models: A Large-Scale Dataset for Evaluating Natural Attack Capability in Text-to-Image Generative Models
Takami Sato, Justin Yue, Nanze Chen, Ningfei Wang, Qi Alfred Chen
arXiv 2023. [Paper]
30 Aug 2023

DiffusionVMR: Diffusion Model for Video Moment Retrieval
Henghao Zhao, Kevin Qinghong Lin, Rui Yan, Zechao Li
ACM MM 2023. [Paper]
29 Aug 2023

C2G2: Controllable Co-speech Gesture Generation with Latent Diffusion Model
Longbin Ji, Pengfei Wei, Yi Ren, Jinglin Liu, Chen Zhang, Xiang Yin
arXiv 2023. [Paper]
29 Aug 2023

360-Degree Panorama Generation from Few Unregistered NFoV Images
Jionghao Wang, Ziyu Chen, Jun Ling, Rong Xie, Li Song
ACM MM 2023. [Paper] [Github]
28 Aug 2023

Priority-Centric Human Motion Generation in Discrete Latent Space
Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang
arXiv 2023. [Paper]
28 Aug 2023

SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation
Zhiyu Qu, Tao Xiang, Yi-Zhe Song
BMVC 2023. [Paper] [Github]
27 Aug 2023

Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models
Hao Fei, Shengqiong Wu, Wei Ji, Hanwang Zhang, Tat-Seng Chua
arXiv 2023. [Paper] [Project]
26 Aug 2023

ORES: Open-vocabulary Responsible Visual Synthesis
Minheng Ni, Chenfei Wu, Xiaodong Wang, Shengming Yin, Lijuan Wang, Zicheng Liu, Nan Duan
arXiv 2023. [Paper]
26 Aug 2023

The DiffuseStyleGesture+ entry to the GENEA Challenge 2023
Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai
ICMI 2023. [Paper] [Github]
26 Aug 2023

EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior
Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Zhipeng Hu, Changjie Fan, Xin Yu
arXiv 2023. [Paper]
25 Aug 2023

Unified Concept Editing in Diffusion Models
Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, David Bau
arXiv 2023. [Paper] [Project] [Github]
25 Aug 2023

Dense Text-to-Image Generation with Attention Modulation
Yunji Kim, Jiyoung Lee, Jin-Hwa Kim, Jung-Woo Ha, Jun-Yan Zhu
ICCV 2023. [Paper] [Github]
24 Aug 2023

APLA: Additional Perturbation for Latent Noise with Adversarial Training Enables Consistency
Yupu Yao, Shangqi Deng, Zihan Cao, Harry Zhang, Liang-Jian Deng
arXiv 2023. [Paper]
24 Aug 2023

Manipulating Embeddings of Stable Diffusion Prompts
Niklas Deckers, Julia Peters, Martin Potthast
arXiv 2023. [Paper]
23 Aug 2023

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
Yiwen Chen, Chi Zhang, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin
arXiv 2023. [Paper] [Github]
22 Aug 2023

DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
Xujie Zhang, Binbin Yang, Michael C. Kampffmeyer, Wenqing Zhang, Shiyue Zhang, Guansong Lu, Liang Lin, Hang Xu, Xiaodan Liang
arXiv 2023. [Paper]
22 Aug 2023

MusicJam: Visualizing Music Insights via Generated Narrative Illustrations
Chuer Chen, Nan Cao, Jiani Hou, Yi Guo, Yulei Zhang, Yang Shi
arXiv 2023. [Paper]
22 Aug 2023

TADA! Text to Animatable Digital Avatars
Tingting Liao, Hongwei Yi, Yuliang Xiu, Jiaxiang Tang, Yangyi Huang, Justus Thies, Michael J. Black
arXiv 2023. [Paper]
21 Aug 2023

EVE: Efficient zero-shot text-based Video Editing with Depth Map Guidance and Temporal Consistency Constraints
Yutao Chen, Xingning Dong, Tian Gan, Chunluan Zhou, Ming Yang, Qingpei Guo
arXiv 2023. [Paper]
21 Aug 2023

Backdooring Textual Inversion for Concept Censorship
Yutong Wu, Jie Zhang, Florian Kerschbaum, Tianwei Zhang
arXiv 2023. [Paper] [Project] [Github]
21 Aug 2023

AltDiffusion: A Multilingual Text-to-Image Diffusion Model
Fulong Ye, Guang Liu, Xinya Wu, Ledell Wu
AAAI 2024. [Paper] [Github]
19 Aug 2023

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
Runhui Huang, Jianhua Han, Guansong Lu, Xiaodan Liang, Yihan Zeng, Wei Zhang, Hang Xu
ICCV 2023. [Paper]
18 Aug 2023

MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR
Xudong Xu, Zhaoyang Lyu, Xingang Pan, Bo Dai
arXiv 2023. [Paper] [Project]
18 Aug 2023

Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization
Soumik Mukhopadhyay, Saksham Suri, Ravi Teja Gadde, Abhinav Shrivastava
arXiv 2023. [Paper] [Project] [Github]
18 Aug 2023

Guide3D: Create 3D Avatars from Text and Image Guidance
Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
arXiv 2023. [Paper]
18 Aug 2023

Language-Guided Diffusion Model for Visual Grounding
Sijia Chen, Baochun Li
arXiv 2023. [Paper]
18 Aug 2023

SimDA: Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing, Qi Dai, Han Hu, Zuxuan Wu, Yu-Gang Jiang
arXiv 2023. [Paper] [Project]
18 Aug 2023

StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Wenhao Chai, Xun Guo, Gaoang Wang, Yan Lu
ICCV 2023. [Paper] [Github]
18 Aug 2023

Edit Temporal-Consistent Videos with Image Diffusion Model
Yuanzhi Wang, Yong Li, Xin Liu, Anbo Dai, Antoni Chan, Zhen Cui
arXiv 2023. [Paper]
17 Aug 2023

Watch Your Steps: Local Image and Scene Editing by Text Instructions
Ashkan Mirzaei, Tristan Aumentado-Armstrong, Marcus A. Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G. Derpanis, Igor Gilitschenski
arXiv 2023. [Paper] [Project]
17 Aug 2023

Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis
Minho Park, Jooyeol Yun, Seunghwan Choi, Jaegul Choo
ICCV 2023. [Paper] [Project] [Github]
16 Aug 2023

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
Shengming Yin, Chenfei Wu, Jian Liang, Jie Shi, Houqiang Li, Gong Ming, Nan Duan
arXiv 2023. [Paper] [Project]
16 Aug 2023

Dual-Stream Diffusion Net for Text-to-Video Generation
Binhui Liu, Xin Liu, Anbo Dai, Zhiyong Zeng, Zhen Cui, Jian Yang
arXiv 2023. [Paper]
16 Aug 2023

DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
Jeongsoo Choi, Joanna Hong, Yong Man Ro
arXiv 2023. [Paper]
15 Aug 2023

SGDiff: A Style Guided Diffusion Model for Fashion Synthesis
Zhengwentai Sun, Yanghong Zhou, Honghong He, P. Y. Mok
ACM MM 2023. [Paper]
15 Aug 2023

Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model
Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang
arXiv 2023. [Paper]
15 Aug 2023

Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage
Dario Cioni, Lorenzo Berlincioni, Federico Becattini, Alberto del Bimbo
ICCV Workshop 2023. [Paper]
14 Aug 2023

Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation
Alexander Martin, Haitian Zheng, Jie An, Jiebo Luo
ACM MM 2023. [Paper]
14 Aug 2023

UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity
Weijian Mai, Zhijun Zhang
arXiv 2023. [Paper]
14 Aug 2023

Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks
David Junhao Zhang, Mutian Xu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou
arXiv 2023. [Paper]
13 Aug 2023

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
arXiv 2023. [Paper] [Project] [Github]
13 Aug 2023

LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts
Binbin Yang, Yi Luo, Ziliang Chen, Guangrun Wang, Xiaodan Liang, Liang Lin
arXiv 2023. [Paper]
13 Aug 2023

ModelScope Text-to-Video Technical Report
Jiuniu Wang, Hangjie Yuan, Dayou Chen, Yingya Zhang, Xiang Wang, Shiwei Zhang
arXiv 2023. [Paper]
12 Aug 2023

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models
Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen
arXiv 2023. [Paper] [Project] [Github]
11 Aug 2023

Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning
Chun-Mei Feng, Kai Yu, Yong Liu, Salman Khan, Wangmeng Zuo
ICCV 2023. [Paper] [Github]
11 Aug 2023

Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation
Yuki Endo
arXiv 2023. [Paper]
11 Aug 2023

Audio is all in one: speech-driven gesture synthetics using WavLM pre-trained model
Fan Zhang, Naye Ji, Fuxing Gao, Siyuan Zhao, Zhaohan Wang, Shunman Li
arXiv 2023. [Paper]
11 Aug 2023

Zero-shot Text-driven Physically Interpretable Face Editing
Yapeng Meng, Songru Yang, Xu Hu, Rui Zhao, Lincheng Li, Zhenwei Shi, Zhengxia Zou
arXiv 2023. [Paper]
11 Aug 2023

PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions
John Joon Young Chung, Eytan Adar
UIST 2023. [Paper]
9 Aug 2023

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation
Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua
arXiv 2023. [Paper] [Project]
9 Aug 2023

Cloth2Tex: A Customized Cloth Texture Generation Pipeline for 3D Virtual Try-On
Daiheng Gao, Xu Chen, Xindi Zhang, Qi Wang, Ke Sun, Bang Zhang, Liefeng Bo, Qixing Huang
arXiv 2023. [Paper]
8 Aug 2023

MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion
Yizhuo Lu, Changde Du, Qiongyi Zhou, Dianpeng Wang, Huiguang He
arXiv 2023. [Paper]
8 Aug 2023

FLIRT: Feedback Loop In-context Red Teaming
Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
arXiv 2023. [Paper]
8 Aug 2023

DiffSynth: Latent In-Iteration Deflickering for Realistic Video Synthesis
Zhongjie Duan, Lizhou You, Chengyu Wang, Cen Chen, Ziheng Wu, Weining Qian, Jun Huang
arXiv 2023. [Paper] [Project] [Github]
7 Aug 2023

AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose
Huichao Zhang, Bowen Chen, Hao Yang, Liao Qu, Xu Wang, Li Chen, Chao Long, Feida Zhu, Kang Du, Min Zheng
arXiv 2023. [Paper] [Project]
7 Aug 2023

Towards Scene-Text to Scene-Text Translation
Onkar Susladkar, Prajwal Gatti, Anand Mishra
arXiv 2023. [Paper]
6 Aug 2023

Sketch and Text Guided Diffusion Model for Colored Point Cloud Generation
Zijie Wu, Yaonan Wang, Mingtao Feng, He Xie, Ajmal Mian
arXiv 2023. [Paper]
5 Aug 2023

ConceptLab: Creative Generation using Diffusion Prior Constraints
Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or
arXiv 2023. [Paper] [Project] [Github]
3 Aug 2023

DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models
Jianxin Lin, Peng Xiao, Yijun Wang, Rongju Zhang, Xiangxiang Zeng
arXiv 2023. [Paper]
3 Aug 2023

Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling
Zhao Yang, Bing Su, Ji-Rong Wen
ACM MM 2023. [Paper] [Github]
3 Aug 2023

Reverse Stable Diffusion: What prompt was used to generate this image?
Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah
arXiv 2023. [Paper]
2 Aug 2023

Degeneration-Tuning: Using Scrambled Grid shield Unwanted Concepts from Stable Diffusion
Zixuan Ni, Longhui Wei, Jiacheng Li, Siliang Tang, Yueting Zhuang, Qi Tian
arXiv 2023. [Paper]
2 Aug 2023

ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation
Yasheng Sun, Yifan Yang, Houwen Peng, Yifei Shen, Yuqing Yang, Han Hu, Lili Qiu, Hideki Koike
arXiv 2023. [Paper]
2 Aug 2023

The Bias Amplification Paradox in Text-to-Image Generation
Preethi Seshadri, Sameer Singh, Yanai Elazar
arXiv 2023. [Paper]
1 Aug 2023

BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models
Jordan Vice, Naveed Akhtar, Richard Hartley, Ajmal Mian
arXiv 2023. [Paper] [Github] [Dataset]
31 Jul 2023

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text
Junchen Zhu, Huan Yang, Wenjing Wang, Huiguo He, Zixi Tuo, Yongsheng Yu, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu, Jiebo Luo
arXiv 2023. [Paper]
31 Jul 2023

DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models
Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu
arXiv 2023. [Paper]
31 Jul 2023

Contrastive Conditional Latent Diffusion for Audio-visual Segmentation
Yuxin Mao, Jing Zhang, Mochu Xiang, Yunqiu Lv, Yiran Zhong, Yuchao Dai
arXiv 2023. [Paper]
31 Jul 2023

HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation
Jinbo Wu, Xiaobo Gao, Xing Liu, Zhengyang Shen, Chen Zhao, Haocheng Feng, Jingtuo Liu, Errui Ding
arXiv 2023. [Paper]
30 Jul 2023

Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals
Yu-Ting Lan, Kan Ren, Yansen Wang, Wei-Long Zheng, Dongsheng Li, Bao-Liang Lu, Lili Qiu
arXiv 2023. [Paper]
27 Jul 2023

VideoControlNet: A Motion-Guided Video-to-Video Translation Framework by Using Diffusion Model with ControlNet
Zhihao Hu, Dong Xu
arXiv 2023. [Paper] [Project]
26 Jul 2023

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation
Chaohui Yu, Qiang Zhou, Jingliang Li, Zhe Zhang, Zhibin Wang, Fan Wang
arXiv 2023. [Paper]
26 Jul 2023

Visual Instruction Inversion: Image Editing via Visual Prompting
Thao Nguyen, Yuheng Li, Utkarsh Ojha, Yong Jae Lee
arXiv 2023. [Paper] [Project] [Github]
26 Jul 2023

Composite Diffusion | whole >= \Sigma parts
Vikram Jamwal, Ramaneswaran S
arXiv 2023. [Paper]
25 Jul 2023

Fashion Matrix: Editing Photos by Just Talking
Zheng Chong, Xujie Zhang, Fuwei Zhao, Zhenyu Xie, Xiaodan Liang
arXiv 2023. [Paper] [Project] [Github]
25 Jul 2023

Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry
Yong-Hyun Park, Mingi Kwon, Jaewoong Choi, Junghyo Jo, Youngjung Uh
arXiv 2023. [Paper]
24 Jul 2023

InFusion: Inject and Attention Fusion for Multi Concept Zero-Shot Text-based Video Editing
Anant Khandelwal
ICCV Workshop 2023. [Paper]
22 Jul 2023

Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
Jian Ma, Junhao Liang, Chen Chen, Haonan Lu
arXiv 2023. [Paper] [Project] [Github]
21 Jul 2023

Divide & Bind Your Attention for Improved Generative Semantic Nursing
Yumeng Li, Margret Keuper, Dan Zhang, Anna Khoreva
arXiv 2023. [Paper] [Project]
20 Jul 2023

AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models
Jiachun Pan, Jun Hao Liew, Vincent Y. F. Tan, Jiashi Feng, Hanshu Yan
arXiv 2023. [Paper]
20 Jul 2023

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou
arXiv 2023. [Paper] [Github]
20 Jul 2023

Text2Layer: Layered Image Generation using Latent Diffusion Model
Xinyang Zhang, Wentian Zhao, Xin Lu, Jeff Chien
arXiv 2023. [Paper]
19 Jul 2023

FABRIC: Personalizing Diffusion Models with Iterative Feedback
Dimitri von Rütte, Elisabetta Fedele, Jonathan Thomm, Lukas Wolf
arXiv 2023. [Paper]
19 Jul 2023

TokenFlow: Consistent Diffusion Features for Consistent Video Editing
Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel
arXiv 2023. [Paper] [Project] [Github]
19 Jul 2023

Multimodal Diffusion Segmentation Model for Object Segmentation from Manipulation Instructions
Yui Iioka, Yu Yoshida, Yuiga Wada, Shumpei Hatanaka, Komei Sugiura
arXiv 2023. [Paper]
17 Jul 2023

Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation
Luozhou Wang, Shuai Yang, Shu Liu, Ying-cong Chen
ICCV 2023. [Paper] [Github]
17 Jul 2023

Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection
Alessandro Flaborea, Luca Collorone, Guido D'Amely, Stefano D'Arrigo, Bardh Prenkaj, Fabio Galasso
arXiv 2023. [Paper]
14 Jul 2023

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman
arXiv 2023. [Paper] [Project] [Github]
13 Jul 2023

Exact Diffusion Inversion via Bi-directional Integration Approximation
Guoqiang Zhang, J. P. Lewis, W. Bastiaan Kleijn
arXiv 2023. [Paper]
10 Jul 2023

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Yuwei Guo, Ceyuan Yang, Anyi Rao, Yaohui Wang, Yu Qiao, Dahua Lin, Bo Dai
arXiv 2023. [Paper] [Project] [Github]
10 Jul 2023

Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback
Jaskirat Singh, Liang Zheng
arXiv 2023. [Paper] [Project] [Github]
10 Jul 2023

Augmenters at SemEval-2023 Task 1: Enhancing CLIP in Handling Compositionality and Ambiguity for Zero-Shot Visual WSD through Prompt Augmentation and Text-To-Image Diffusion
Jie S. Li, Yow-Ting Shiue, Yong-Siang Shih, Jonas Geiping
arXiv 2023. [Paper]
9 Jul 2023

Measuring the Success of Diffusion Models at Imitating Human Artists
Stephen Casper, Zifan Guo, Shreya Mogulothu, Zachary Marinov, Chinmay Deshpande, Rui-Jie Yew, Zheng Dai, Dylan Hadfield-Menell
ICML Workshop 2023. [Paper]
8 Jul 2023

How to Detect Unauthorized Data Usages in Text-to-image Diffusion Models
Zhenting Wang, Chen Chen, Yuchen Liu, Lingjuan Lyu, Dimitris Metaxas, Shiqing Ma
arXiv 2023. [Paper]
6 Jul 2023

Collaborative Score Distillation for Consistent Visual Synthesis
Subin Kim, Kyungmin Lee, June Suk Choi, Jongheon Jeong, Kihyuk Sohn, Jinwoo Shin
arXiv 2023. [Paper] [Project] [Github]
4 Jul 2023

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, Robin Rombach
arXiv 2023. [Paper] [Github]
4 Jul 2023

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, Yasutaka Furukawa
arXiv 2023. [Paper] [Project]
3 Jul 2023

Counting Guidance for High Fidelity Text-to-Image Synthesis
Wonjun Kang, Kevin Galim, Hyung Il Koo
arXiv 2023. [Paper]
30 Jun 2023

Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, Shenghua Gao
arXiv 2023. [Paper]
29 Jun 2023

Generate Anything Anywhere in Any Scene
Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee
arXiv 2023. [Paper] [Project]
29 Jun 2023

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Simian Luo, Chuanhao Yan, Chenxu Hu, Hang Zhao
arXiv 2023. [Paper] [Github]
29 Jun 2023

PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing
Wenjing Huang, Shikui Tu, Lei Xu
arXiv 2023. [Paper]
28 Jun 2023

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
Ximing Xing, Chuang Wang, Haitao Zhou, Jing Zhang, Qian Yu, Dong Xu
arXiv 2023. [Paper]
26 Jun 2023

A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis
Aishwarya Agarwal, Srikrishna Karanam, K J Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan
arXiv 2023. [Paper]
26 Jun 2023

Decompose and Realign: Tackling Condition Misalignment in Text-to-Image Diffusion Models
Luozhou Wang, Guibao Shen, Yijun Li, Ying-cong Chen
arXiv 2023. [Paper]
26 Jun 2023

Zero-shot spatial layout conditioning for text-to-image diffusion models
Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek
arXiv 2023. [Paper]
23 Jun 2023

DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation
Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang
arXiv 2023. [Paper]
21 Jun 2023

Align, Adapt and Inject: Sound-guided Unified Image Generation
Yue Yang, Kaipeng Zhang, Yuying Ge, Wenqi Shao, Zeyue Xue, Yu Qiao, Ping Luo
arXiv 2023. [Paper]
20 Jun 2023

EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model
Lianying Yin, Yijun Wang, Tianyu He, Jinming Liu, Wei Zhao, Bohan Li, Xin Jin, Jianxin Lin
arXiv 2023. [Paper]
20 Jun 2023

RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model
Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin
arXiv 2023. [Paper]
20 Jun 2023

Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions
Yuqi Sun, Reian He, Weimin Tan, Bo Yan
arXiv 2023. [Paper]
19 Jun 2023

Conditional Text Image Generation with Diffusion Models
Yuanzhi Zhu, Zhaohai Li, Tianwei Wang, Mengchao He, Cong Yao
arXiv 2023. [Paper]
19 Jun 2023

Point-Cloud Completion with Pretrained Text-to-image Diffusion Models
Yoni Kasten, Ohad Rahamim, Gal Chechik
arXiv 2023. [Paper]
18 Jun 2023

Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models
Geon Yeong Park, Jeongsol Kim, Beomsu Kim, Sang Wan Lee, Jong Chul Ye
arXiv 2023. [Paper]
16 Jun 2023

Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks
Hongcheng Gao, Hao Zhang, Yinpeng Dong, Zhijie Deng
arXiv 2023. [Paper]
16 Jun 2023

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models
Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian McAuley
arXiv 2023. [Paper]
16 Jun 2023

Taming Diffusion Models for Music-driven Conducting Motion Generation
Zhuoran Zhao, Jinbin Bai, Delong Chen, Debang Wang, Yubo Pan
arXiv 2023. [Paper]
15 Jun 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
arXiv 2023. [Paper]
15 Jun 2023

Diffusion Models for Zero-Shot Open-Vocabulary Segmentation
Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht
arXiv 2023. [Paper]
15 Jun 2023

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik
arXiv 2023. [Paper]
15 Jun 2023

Training Multimedia Event Extraction With Generated Images and Captions
Zilin Du, Yunxin Li, Xu Guo, Yidan Sun, Boyang Li
arXiv 2023. [Paper]
15 Jun 2023

VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing
Paul Couairon, Clément Rambour, Jean-Emmanuel Haugeard, Nicolas Thome
arXiv 2023. [Paper]
14 Jun 2023

Norm-guided latent space exploration for text-to-image generation
Dvir Samuel, Rami Ben-Ari, Nir Darshan, Haggai Maron, Gal Chechik
arXiv 2023. [Paper]
14 Jun 2023

Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue
arXiv 2023. [Paper]
14 Jun 2023

GBSD: Generative Bokeh with Stage Diffusion
Jieren Deng, Xin Zhou, Hao Tian, Zhihong Pan, Derek Aguiar
arXiv 2023. [Paper]
14 Jun 2023

Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
Yongqi Yang, Ruoyu Wang, Zhihao Qian, Ye Zhu, Yu Wu
arXiv 2023. [Paper]
14 Jun 2023

Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy
arXiv 2023. [Paper]
13 Jun 2023

Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model
Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa
arXiv 2023. [Paper]
13 Jun 2023

Controlling Text-to-Image Diffusion by Orthogonal Finetuning
Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf
arXiv 2023. [Paper]
12 Jun 2023

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images
Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu
arXiv 2023. [Paper]
12 Jun 2023

InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions
Jiale Xu, Xintao Wang, Yan-Pei Cao, Weihao Cheng, Ying Shan, Shenghua Gao
arXiv 2023. [Paper]
12 Jun 2023

Language-Guided Traffic Simulation via Scene-Level Diffusion
Ziyuan Zhong, Davis Rempe, Yuxiao Chen, Boris Ivanovic, Yulong Cao, Danfei Xu, Marco Pavone, Baishakhi Ray
arXiv 2023. [Paper]
10 Jun 2023

BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping
Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Lingjie Liu, Josh Susskind
arXiv 2023. [Paper]
8 Jun 2023

Grounded Text-to-Image Synthesis with Attention Refocusing
Quynh Phung, Songwei Ge, Jia-Bin Huang
arXiv 2023. [Paper]
8 Jun 2023

SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions
Yuseung Lee, Kunho Kim, Hyunjin Kim, Minhyuk Sung
arXiv 2023. [Paper] [Project] [Github]
8 Jun 2023

Improving Tuning-Free Real Image Editing with Proximal Guidance
Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Yuxiao Chen, Di Liu, Qilong Zhangli, Anastasis Stathopoulos, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas
arXiv 2023. [Paper]
8 Jun 2023

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
Changhoon Kim, Kyle Min, Maitreya Patel, Sheng Cheng, Yezhou Yang
arXiv 2023. [Paper]
7 Jun 2023

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang
arXiv 2023. [Paper]
7 Jun 2023

Designing a Better Asymmetric VQGAN for StableDiffusion
Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua
arXiv 2023. [Paper] [Github]
7 Jun 2023

Multi-modal Latent Diffusion
Mustapha Bounoua, Giulio Franzese, Pietro Michiardi
arXiv 2023. [Paper]
7 Jun 2023

Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt
Kai Chen, Enze Xie, Zhe Chen, Lanqing Hong, Zhenguo Li, Dit-Yan Yeung
arXiv 2023. [Paper]
7 Jun 2023

Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance
Gihyun Kwon, Jong Chul Ye
arXiv 2023. [Paper]
7 Jun 2023

Stable Diffusion is Unstable
Chengbin Du, Yanxi Li, Zhongwei Qiu, Chang Xu
arXiv 2023. [Paper]
5 Jun 2023

LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
Yochai Yemini, Aviv Shamsian, Lior Bracha, Sharon Gannot, Ethan Fetaya
arXiv 2023. [Paper] [Project]
5 Jun 2023

HeadSculpt: Crafting 3D Head Avatars with Text
Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong
arXiv 2023. [Paper] [Project]
5 Jun 2023

Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions
Shaoxu Li
arXiv 2023. [Paper]
5 Jun 2023

Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
Shuyu Yang, Yinan Zhou, Yaxiong Wang, Yujiao Wu, Li Zhu, Zhedong Zheng
arXiv 2023. [Paper]
5 Jun 2023

User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques
Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee
arXiv 2023. [Paper]
5 Jun 2023

Detector Guidance for Multi-Object Text-to-Image Generation
Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao
arXiv 2023. [Paper]
4 Jun 2023

Word-Level Explanations for Analyzing Bias in Text-to-Image Models
Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju
arXiv 2023. [Paper]
3 Jun 2023

Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution
Yiji Cheng, Fei Yin, Xiaoke Huang, Xintong Yu, Jiaxiang Liu, Shikun Feng, Yujiu Yang, Yansong Tang
arXiv 2023. [Paper]
3 Jun 2023

Probabilistic Adaptation of Text-to-Video Models
Mengjiao Yang, Yilun Du, Bo Dai, Dale Schuurmans, Joshua B. Tenenbaum, Pieter Abbeel
arXiv 2023. [Paper] [Project]
2 Jun 2023

Video Colorization with Pre-trained Text-to-Image Diffusion Models
Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
arXiv 2023. [Paper]
2 Jun 2023

Audio-Visual Speech Enhancement with Score-Based Generative Models
Julius Richter, Simone Frintrop, Timo Gerkmann
arXiv 2023. [Paper]
2 Jun 2023

Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models
Virginia Fernandez, Pedro Sanchez, Walter Hugo Lopez Pinaya, Grzegorz Jacenków, Sotirios A. Tsaftaris, Jorge Cardoso
arXiv 2023. [Paper]
2 Jun 2023

StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
Yonglong Tian, Lijie Fan, Phillip Isola, Huiwen Chang, Dilip Krishnan
arXiv 2023. [Paper]
1 Jun 2023

Diffusion Self-Guidance for Controllable Image Generation
Dave Epstein, Allan Jabri, Ben Poole, Alexei A. Efros, Aleksander Holynski
arXiv 2023. [Paper] [Project]
1 Jun 2023

StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan
arXiv 2023. [Paper] [Project]
1 Jun 2023

Intriguing Properties of Text-guided Diffusion Models
Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille
arXiv 2023. [Paper]
1 Jun 2023

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models
Chang Liu, Haoning Wu, Yujie Zhong, Xiaoyun Zhang, Weidi Xie
arXiv 2023. [Paper] [Project]
1 Jun 2023

ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation
Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong
arXiv 2023. [Paper] [Github]
1 Jun 2023

The Hidden Language of Diffusion Models
Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf
arXiv 2023. [Paper] [Project]
1 Jun 2023

Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation
Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao, Tat-Jen Cham
arXiv 2023. [Paper] [Project] [Github]
1 Jun 2023

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
arXiv 2023. [Paper] [Project]
1 Jun 2023

Inserting Anybody in Diffusion Models via Celeb Basis
Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng
arXiv 2023. [Paper] [Project]
1 Jun 2023

Wuerstchen: Efficient Pretraining of Text-to-Image Models
Pablo Pernias, Dominic Rampas, Marc Aubreville
arXiv 2023. [Paper]
1 Jun 2023

UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
Xiao Dong, Runhui Huang, Xiaoyong Wei, Zequn Jie, Jianxing Yu, Jian Yin, Xiaodan Liang
arXiv 2023. [Paper]
1 Jun 2023

FigGen: Text to Scientific Figure Generation
Juan A. Rodriguez, David Vazquez, Issam Laradji, Marco Pedersoli, Pau Rodriguez
ICLR 2023. [Paper]
1 Jun 2023

Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images
Peyman Gholami, Robert Xiao
arXiv 2023. [Paper]
31 May 2023

Understanding and Mitigating Copying in Diffusion Models
Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
CVPR 2023. [Paper] [Github]
31 May 2023

Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor
Ruizhi Shao, Jingxiang Sun, Cheng Peng, Zerong Zheng, Boyao Zhou, Hongwen Zhang, Yebin Liu
arXiv 2023. [Paper] [Project]
31 May 2023

Boosting Text-to-Image Diffusion Models with Fine-Grained Semantic Rewards
Guian Fang, Zutao Jiang, Jianhua Han, Guansong Lu, Hang Xu, Xiaodan Liang
arXiv 2023. [Paper] [Github]
31 May 2023

Perturbation-Assisted Sample Synthesis: A Novel Approach for Uncertainty Quantification
Yifei Liu, Rex Shen, Xiaotong Shen
arXiv 2023. [Paper]
30 May 2023

PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
Jialu Li, Mohit Bansal
arXiv 2023. [Paper] [Project] [Github]
30 May 2023

Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models
Ernie Chu, Shuo-Yen Lin, Jun-Cheng Chen
arXiv 2023. [Paper]
30 May 2023

Nested Diffusion Processes for Anytime Image Generation
Noam Elata, Bahjat Kawar, Tomer Michaeli, Michael Elad
arXiv 2023. [Paper]
30 May 2023

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
arXiv 2023. [Paper]
30 May 2023

HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance
Junzhe Zhu, Peiye Zhuang
arXiv 2023. [Paper]
30 May 2023

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models
Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li
arXiv 2023. [Paper]
30 May 2023

Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang
arXiv 2023. [Paper]
29 May 2023

Cognitively Inspired Cross-Modal Data Generation Using Diffusion Models
Zizhao Hu, Mohammad Rostami
NeurIPS 2023. [Paper]
28 May 2023

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths
Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu, Ping Luo
arXiv 2023. [Paper]
29 May 2023

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou
arXiv 2023. [Paper] [Project]
29 May 2023

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu-Yun Wang, Wenshuo Chen, Guanglu Song, Han-Jia Ye, Yu Liu, Hongsheng Li
arXiv 2023. [Paper] [Github]
29 May 2023

Text-Only Image Captioning with Multi-Context Data Generation
Feipeng Ma, Yizhou Zhou, Fengyun Rao, Yueyi Zhang, Xiaoyan Sun
arXiv 2023. [Paper]
29 May 2023

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka
arXiv 2023. [Paper]
29 May 2023

Conditional Score Guidance for Text-Driven Image-to-Image Translation
Hyunsoo Lee, Minsoo Kang, Bohyung Han
arXiv 2023. [Paper]
29 May 2023

Text-to-image Editing by Image Information Removal
Zhongping Zhang, Jian Zheng, Jacob Zhiyuan Fang, Bryan A. Plummer
arXiv 2023. [Paper]
27 May 2023

Towards Consistent Video Editing with Text-to-Image Diffusion Models
Zicheng Zhang, Bonan Li, Xuecheng Nie, Congying Han, Tiande Guo, Luoqi Liu
arXiv 2023. [Paper]
27 May 2023

FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference
Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui
arXiv 2023. [Paper]
27 May 2023

ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing
Min Zhao, Rongzhen Wang, Fan Bao, Chongxuan Li, Jun Zhu
arXiv 2023. [Paper] [Project]
26 May 2023

Improved Visual Story Generation with Adaptive Context Modeling
Zhangyin Feng, Yuchen Ren, Xinmiao Yu, Xiaocheng Feng, Duyu Tang, Shuming Shi, Bing Qin
arXiv 2023. [Paper]
26 May 2023

Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models
Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka
arXiv 2023. [Paper]
26 May 2023

Are Diffusion Models Vision-And-Language Reasoners?
Benno Krojer, Elinor Poole-Dayan, Vikram Voleti, Christopher Pal, Siva Reddy
arXiv 2023. [Paper] [Github]
25 May 2023

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee
arXiv 2023. [Paper]
25 May 2023

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, Kwan-Yee K. Wong
arXiv 2023. [Paper] [Project] [Github]
25 May 2023

Parallel Sampling of Diffusion Models
Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari
arXiv 2023. [Paper] [Github]
25 May 2023

Break-A-Scene: Extracting Multiple Concepts from a Single Image
Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen-Or, Dani Lischinski
arXiv 2023. [Paper] [Project]
25 May 2023

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell
arXiv 2023. [Paper] [Github]
25 May 2023

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi
arXiv 2023. [Paper] [Github]
25 May 2023

ProSpect: Expanded Conditioning for the Personalization of Attribute-aware Image Generation
Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu
arXiv 2023. [Paper]
25 May 2023

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation
Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu
arXiv 2023. [Paper] [Project]
25 May 2023

On Architectural Compression of Text-to-Image Diffusion Models
Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi
arXiv 2023. [Paper]
25 May 2023

Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models
Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon
arXiv 2023. [Paper]
25 May 2023

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach
arXiv 2023. [Paper]
24 May 2023

ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation
Dongxu Yue, Qin Guo, Munan Ning, Jiaxi Cui, Yuesheng Zhu, Li Yuan
arXiv 2023. [Paper]
24 May 2023

DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models
Sungnyun Kim, Junsoo Lee, Kibeom Hong, Daesik Kim, Namhyuk Ahn
arXiv 2023. [Paper] [Github]
24 May 2023

I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
Tuhin Chakrabarty, Arkadiy Saakyan, Olivia Winn, Artemis Panagopoulou, Yue Yang, Marianna Apidianaki, Smaranda Muresan
arXiv 2023. [Paper]
24 May 2023

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
Dongxu Li, Junnan Li, Steven C. H. Hoi
arXiv 2023. [Paper]
24 May 2023

Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models
Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo
arXiv 2023. [Paper]
22 May 2023

Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Ruichen Wang, Zekang Chen, Chen Chen, Jian Ma, Haonan Lu, Xiaodong Lin
arXiv 2023. [Paper]
23 May 2023

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang
arXiv 2023. [Paper]
23 May 2023

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen, Jie Wu, Pan Xie, Hefeng Wu, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin
arXiv 2023. [Paper]
23 May 2023

Understanding Text-driven Motion Synthesis with Keyframe Collaboration via Diffusion Models
Dong Wei, Xiaoning Sun, Huaijiang Sun, Bin Li, Shengxiang Hu, Weiqing Li, Jianfeng Lu
arXiv 2023. [Paper]
23 May 2023

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian, Boyi Li, Adam Yala, Trevor Darrell
arXiv 2023. [Paper]
23 May 2023

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On
Davide Morelli, Alberto Baldrati, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
arXiv 2023. [Paper]
22 May 2023

FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability through 5W Question-Answering
Megha Chakraborty, Khusbu Pahwa, Anku Rani, Adarsh Mahor, Aditya Pakala, Arghya Sarkar, Harshit Dave, Ishan Paul, Janvita Reddy, Preethi Gurumurthy, Ritvik G, Samahriti Mukherjee, Shreyas Chatterjee, Kinjal Sensharma, Dwip Dalal, Suryavardan S, Shreyash Mishra, Parth Patwa, Aman Chadha, Amit Sheth, Amitava Das
arXiv 2023. [Paper]
22 May 2023

Training Diffusion Models with Reinforcement Learning
Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine
arXiv 2023. [Paper]
22 May 2023

If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata
arXiv 2023. [Paper] [Project]
22 May 2023

ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, Wangmeng Zuo, Qi Tian
arXiv 2023. [Paper] [Github]
22 May 2023

AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz
arXiv 2023. [Paper]
22 May 2023

The CLIP Model is Secretly an Image-to-Prompt Converter
Yuxuan Ding, Chunna Tian, Haoxuan Ding, Lingqiao Liu
arXiv 2023. [Paper]
22 May 2023

InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
Bosheng Qin, Juncheng Li, Siliang Tang, Tat-Seng Chua, Yueting Zhuang
arXiv 2023. [Paper]
21 May 2023

SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters
Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, Yinzhi Cao
arXiv 2023. [Paper]
20 May 2023

Late-Constraint Diffusion Guidance for Controllable Image Synthesis
Chang Liu, Dong Liu
arXiv 2023. [Paper] [Project] [Github]
19 May 2023

Any-to-Any Generation via Composable Diffusion
Zineng Tang, Ziyi Yang, Chenguang Zhu, Michael Zeng, Mohit Bansal
arXiv 2023. [Paper] [Project] [Github]
19 May 2023

Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields
Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, Jing Liao
arXiv 2023. [Paper]
19 May 2023

Brain Captioning: Decoding human brain activity into images and text
Matteo Ferrante, Furkan Ozcelik, Tommaso Boccato, Rufin VanRullen, Nicola Toschi
arXiv 2023. [Paper]
19 May 2023

Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots
Jinyi Hu, Xu Han, Xiaoyuan Yi, Yutong Chen, Wenhao Li, Zhiyuan Liu, Maosong Sun
arXiv 2023. [Paper]
19 May 2023

Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
arXiv 2023. [Paper]
18 May 2023

Zero-Day Backdoor Attack against Text-to-Image Diffusion Models via Personalization
Yihao Huang, Qing Guo, Felix Juefei-Xu
arXiv 2023. [Paper]
18 May 2023

AIwriting: Relations Between Image Generation and Digital Writing
Scott Rettberg, Talan Memmott, Jill Walker Rettberg, Jason Nelson, Patrick Lichty
ISEA 2023. [Paper]
18 May 2023

TextDiffuser: Diffusion Models as Text Painters
Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei
arXiv 2023. [Paper]
18 May 2023

VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu
arXiv 2023. [Paper]
18 May 2023

LDM3D: Latent Diffusion Model for 3D
Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, Jean Yu, Estelle Aflalo, Shao-Yen Tseng, Fabio Nonato, Matthias Muller, Vasudev Lal
arXiv 2023. [Paper]
18 May 2023

X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models
Yixiong Chen
arXiv 2023. [Paper] [Github]
18 May 2023

Inspecting the Geographical Representativeness of Images from Text-to-Image Models
Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi
arXiv 2023. [Paper]
18 May 2023

Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji
arXiv 2023. [Paper] [Project]
17 May 2023

AMD: Autoregressive Motion Diffusion
Bo Han, Hao Peng, Minjing Dong, Chang Xu, Yi Ren, Yixuan Shen, Yuheng Li
arXiv 2023. [Paper]
16 May 2023

Generating coherent comic with rich story using ChatGPT and Stable Diffusion
Ze Jin, Zorina Song
arXiv 2023. [Paper]
16 May 2023

Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
Samaneh Azadi, Akbar Shah, Thomas Hayes, Devi Parikh, Sonal Gupta
arXiv 2023. [Paper] [Project]
16 May 2023

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee
arXiv 2023. [Paper] [Project] [Github]
15 May 2023

Common Diffusion Noise Schedules and Sample Steps are Flawed
Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang
arXiv 2023. [Paper]
15 May 2023

Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models
Krishna Sri Ipsit Mantri, Nevasini Sasikumar
arXiv 2023. [Paper]
15 May 2023

Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator
Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wanrong Huang, Wenjing Yang
arXiv 2023. [Paper] [Project] [Github]
11 May 2023

iEdit: Localised Text-guided Image Editing with Weak Supervision
Rumeysa Bodur, Erhan Gundogdu, Binod Bhattarai, Tae-Kyun Kim, Michael Donoser, Loris Bazzani
arXiv 2023. [Paper]
10 May 2023

SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models
Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
arXiv 2023. [Paper] [Github]
9 May 2023

Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
Nisha Huang, Yuxin Zhang, Weiming Dong
arXiv 2023. [Paper]
9 May 2023

DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models
Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao
IJCAI 2023. [Paper] [Github]
8 May 2023

IIITD-20K: Dense captioning for Text-Image ReID
A V Subramanyam, Niranjan Sundararajan, Vibhu Dubey, Brejesh Lall
arXiv 2023. [Paper]
8 May 2023

ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation
Yupei Lin, Sen Zhang, Xiaojun Yang, Xiao Wang, Yukai Shi
arXiv 2023. [Paper] [Project]
8 May 2023

Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
Wenkai Dong, Song Xue, Xiaoyue Duan, Shumin Han
arXiv 2023. [Paper]
8 May 2023

Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shi Pu, Yuejian Fang, Hang Su
arXiv 2023. [Paper]
7 May 2023

AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
Seungwoo Lee, Chaerin Kong, Donghyeon Jeon, Nojun Kwak
arXiv 2023. [Paper]
6 May 2023

Data Curation for Image Captioning with Text-to-Image Generative Models
Wenyan Li, Jonas F. Lotz, Chen Qiu, Desmond Elliott
arXiv 2023. [Paper]
5 May 2023

DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
Hong Chen, Yipeng Zhang, Xin Wang, Xuguang Duan, Yuwei Zhou, Wenwu Zhu
arXiv 2023. [Paper] [Project]
5 May 2023

Guided Image Synthesis via Initial Image Editing in Diffusion Model
Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa
arXiv 2023. [Paper]
5 May 2023

Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion
Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Duen Horng Chau
arXiv 2023. [Paper] [Project]
4 May 2023

Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model
Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong Liu
arXiv 2023. [Paper]
4 May 2023

Multimodal Data Augmentation for Image Captioning using Diffusion Models
Changrong Xiao, Sean Xin Xu, Kunpeng Zhang
arXiv 2023. [Paper]
3 May 2023

In-Context Learning Unlocked for Diffusion Models
Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou
arXiv 2023. [Paper] [Project] [Github]
1 May 2023

SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis
Azade Farshad, Yousef Yeganeh, Yu Chi, Chengzhi Shen, Björn Ommer, Nassir Navab
arXiv 2023. [Paper]
28 Apr 2023

It is all about where you start: Text-to-image generation with seed selection
Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik
arXiv 2023. [Paper]
27 Apr 2023

Edit Everything: A Text-Guided Generative System for Images Editing
Defeng Xie, Ruichen Wang, Jian Ma, Chen Chen, Haonan Lu, Dong Yang, Fobo Shi, Xiaodong Lin
arXiv 2023. [Paper] [Github]
27 Apr 2023

Training-Free Location-Aware Text-to-Image Synthesis
Jiafeng Mao, Xueting Wang
arXiv 2023. [Paper]
26 Apr 2023

TextMesh: Generation of Realistic 3D Meshes From Text Prompts
Christina Tsalicoglou, Fabian Manhardt, Alessio Tonioni, Michael Niemeyer, Federico Tombari
arXiv 2023. [Paper]
24 Apr 2023

Using Text-to-Image Generation for Architectural Design Ideation
Ville Paananen, Jonas Oppenlaender, Aku Visuri
arXiv 2023. [Paper]
20 Apr 2023

Anything-3D: Towards Single-view Anything Reconstruction in the Wild
Qiuhong Shen, Xingyi Yang, Xinchao Wang
arXiv 2023. [Paper] [Github]
19 Apr 2023

UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
Soon Yau Cheong, Armin Mustafa, Andrew Gilbert
ICCV Workshop 2023. [Paper] [Github]
18 Apr 2023

TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models
Yuwei Yin, Jean Kaddour, Xiang Zhang, Yixin Nie, Zhenguang Liu, Lingpeng Kong, Qi Liu
arXiv 2023. [Paper]
18 Apr 2023

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis
CVPR 2023. [Paper] [Project]
18 Apr 2023

Text2Performer: Text-Driven Human Video Generation
Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu
arXiv 2023. [Paper] [Project]
17 Apr 2023

Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation
Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin
arXiv 2023. [Paper] [Project]
17 Apr 2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng
arXiv 2023. [Paper] [Github]
17 Apr 2023

Text-Conditional Contextualized Avatars For Zero-Shot Personalization
Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta
arXiv 2023. [Paper]
14 Apr 2023

Delta Denoising Score
Amir Hertz, Kfir Aberman, Daniel Cohen-Or
arXiv 2023. [Paper] [Project]
14 Apr 2023

Expressive Text-to-Image Generation with Rich Text
Songwei Ge, Taesung Park, Jun-Yan Zhu, Jia-Bin Huang
arXiv 2023. [Paper] [Project] [Github]
13 Apr 2023

Soundini: Sound-Guided Diffusion for Natural Video Editing
Seung Hyun Lee, Sieun Kim, Innfarn Yoo, Feng Yang, Donghyeon Cho, Youngseo Kim, Huiwen Chang, Jinkyu Kim, Sangpil Kim
arXiv 2023. [Paper] [Project]
13 Apr 2023

Improving Diffusion Models for Scene Text Editing with Dual Encoders
Jiabao Ji, Guanhua Zhang, Zhaowen Wang, Bairu Hou, Zhifei Zhang, Brian Price, Shiyu Chang
arXiv 2023. [Paper] [Github]
12 Apr 2023

An Edit Friendly DDPM Noise Space: Inversion and Manipulations
Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli
arXiv 2023. [Paper]
12 Apr 2023

Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA
James Seale Smith, Yen-Chang Hsu, Lingyu Zhang, Ting Hua, Zsolt Kira, Yilin Shen, Hongxia Jin
arXiv 2023. [Paper] [Project]
12 Apr 2023

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, Mohamed Elhoseiny
arXiv 2023. [Paper] [Project]
11 Apr 2023

Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond
Mohammadreza Armandpour, Huangjie Zheng, Ali Sadeghian, Amir Sadeghian, Mingyuan Zhou
arXiv 2023. [Paper]
11 Apr 2023

Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models
Nikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem Babenko
arXiv 2023. [Paper]
10 Apr 2023

HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
Xuan Ju, Ailing Zeng, Chenchen Zhao, Jianan Wang, Lei Zhang, Qiang Xu
arXiv 2023. [Paper] [Github]
9 Apr 2023

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
Qiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang
arXiv 2023. [Paper] [Github]
7 Apr 2023

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
Jiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang
CVPR 2023. [Paper] [Github]
6 Apr 2023

Training-Free Layout Control with Cross-Attention Guidance
Minghao Chen, Iro Laina, Andrea Vedaldi
arXiv 2023. [Paper] [Project] [Github]
6 Apr 2023

Benchmarking Robustness to Text-Guided Corruptions
Mohammadreza Mofayezi, Yasamin Medghalchi
arXiv 2023. [Paper]
6 Apr 2023

DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
Hoigi Seo, Hayeon Kim, Gwanghyun Kim, Se Young Chun
arXiv 2023. [Paper] [Project]
6 Apr 2023

Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
Xuhui Jia, Yang Zhao, Kelvin C.K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su
arXiv 2023. [Paper]
5 Apr 2023

A Diffusion-based Method for Multi-turn Compositional Image Generation
Chao Wang, Xiaoyu Yang, Jinmiao Huang, Kevin Ferreira
arXiv 2023. [Paper]
5 Apr 2023

viz2viz: Prompt-driven stylized visualization generation using a diffusion model
Jiaqi Wu, John Joon Young Chung, Eytan Adar
arXiv 2023. [Paper]
4 Apr 2023

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
arXiv 2023. [Paper]
4 Apr 2023

PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion
Gwanghyun Kim, Ji Ha Jang, Se Young Chun
arXiv 2023. [Paper] [Project]
4 Apr 2023

Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
Jaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju Hwang
arXiv 2023. [Paper]
4 Apr 2023

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu
arXiv 2023. [Paper] [Project] [Github]
3 Apr 2023

DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models
Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
arXiv 2023. [Paper]
3 Apr 2023

DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
Longwen Zhang, Qiwei Qiu, Hongyang Lin, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang, Lan Xu, Jingyi Yu
arXiv 2023. [Paper] [Project]
1 Apr 2023

GlyphDraw: Learning to Draw Chinese Characters in Image Synthesis Models Coherently
Jian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin
arXiv 2023. [Paper] [Project]
31 Mar 2023

AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
Ruixiang Jiang, Can Wang, Jingbo Zhang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao
arXiv 2023. [Paper] [Project] [Github]
30 Mar 2023

PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models
Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi
arXiv 2023. [Paper] [Github]
30 Mar 2023

Social Biases through the Text-to-Image Generation Lens
Ranjita Naik, Besmira Nushi
arXiv 2023. [Paper]
30 Mar 2023

Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi
arXiv 2023. [Paper] [Github]
30 Mar 2023

DiffCollage: Parallel Generation of Large Content with Diffusion Models
Qinsheng Zhang, Jiaming Song, Xun Huang, Yongxin Chen, Ming-Yu Liu
CVPR 2023. [Paper] [Project]
30 Mar 2023

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Wen Wang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen
arXiv 2023. [Paper]
30 Mar 2023

Discriminative Class Tokens for Text-to-Image Diffusion Models
Idan Schwartz, Vésteinn Snæbjarnarson, Sagie Benaim, Hila Chefer, Ryan Cotterell, Lior Wolf, Serge Belongie
arXiv 2023. [Paper]
30 Mar 2023

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder
Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian
arXiv 2023. [Paper]
30 Mar 2023

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li
CVPR 2023. [Paper] [Github]
30 Mar 2023

4D Facial Expression Diffusion Model
Kaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo
arXiv 2023. [Paper] [Github]
29 Mar 2023

MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path
Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka
arXiv 2023. [Paper] [Github]
29 Mar 2023

Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion
Hiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira
arXiv 2023. [Paper] [Github]
28 Mar 2023

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang
arXiv 2023. [Paper]
28 Mar 2023

Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu, Chuan Wen, Jiaming Song, Yang Gao
CVPR Workshop 2023. [Paper]
27 Mar 2023

Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation
Susung Hong, Donghoon Ahn, Seungryong Kim
arXiv 2023. [Paper]
27 Mar 2023

Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
Thanh Van Le, Hao Phung, Thuan Hoang Nguyen, Quan Dao, Ngoc Tran, Anh Tran
SIGGRAPH 2023. [Paper] [Github]
27 Mar 2023

GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
Tenglong Ao, Zeyi Zhang, Libin Liu
arXiv 2023. [Paper]
26 Mar 2023

Better Aligning Text-to-Image Models with Human Preference
Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li
arXiv 2023. [Paper] [Github]
25 Mar 2023

ISS++: Image as Stepping Stone for Text-Guided 3D Shape Generation
Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu
ICLR 2023. [Paper]
24 Mar 2023

DiffuScene: Scene Graph Denoising Diffusion Probabilistic Model for Generative Indoor Scene Synthesis
Jiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies, Matthias Nießner
arXiv 2023. [Paper] [Project]
24 Mar 2023

CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout
Yiqi Lin, Haotian Bai, Sijia Li, Haonan Lu, Xiaodong Lin, Hui Xiong, Lin Wang
arXiv 2023. [Paper] [Project]
24 Mar 2023

Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
Rui Chen, Yongwei Chen, Ningxin Jiao, Kui Jia
arXiv 2023. [Paper]
24 Mar 2023

ReVersion: Diffusion-Based Relation Inversion from Images
Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C.K. Chan, Ziwei Liu
arXiv 2023. [Paper] [Project] [Github]
23 Mar 2023

Ablating Concepts in Text-to-Image Diffusion Models
Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu
arXiv 2023. [Paper] [Project] [Github]
23 Mar 2023

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
arXiv 2023. [Paper] [Github]
23 Mar 2023

MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wenjing Yang
arXiv 2023. [Paper] [Project] [Github]
23 Mar 2023

Pix2Video: Video Editing using Image Diffusion
Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra
arXiv 2023. [Paper] [Project]
22 Mar 2023

Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa
arXiv 2023. [Paper] [Project]
22 Mar 2023

SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation
Juil Koo, Seungwoo Yoo, Minh Hieu Nguyen, Minhyuk Sung
arXiv 2023. [Paper] [Project]
21 Mar 2023

Vox-E: Text-guided Voxel Editing of 3D Objects
Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor
arXiv 2023. [Paper] [Project]
21 Mar 2023

CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion
Geonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun
arXiv 2023. [Paper]
21 Mar 2023

3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion
Yu-Jhe Li, Kris Kitani
arXiv 2023. [Paper]
21 Mar 2023

Text2Tex: Text-driven Texture Synthesis via Diffusion Models
Dave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner
arXiv 2023. [Paper] [Project]
20 Mar 2023

Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or
arXiv 2023. [Paper] [Project]
20 Mar 2023

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang
arXiv 2023. [Paper]
20 Mar 2023

Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models
René Haas, Inbar Huberman-Spiegelglas, Rotem Mulayoff, Tomer Michaeli
arXiv 2023. [Paper]
20 Mar 2023

SKED: Sketch-guided Text-based 3D Editing
Aryan Mikaeili, Or Perel, Daniel Cohen-Or, Ali Mahdavi-Amiri
arXiv 2023. [Paper]
19 Mar 2023

DialogPaint: A Dialog-based Image Editing Model
Jingxuan Wei, Shiyu Wu, Xin Jiang, Yequan Wang
arXiv 2023. [Paper]
17 Mar 2023

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, Ran Xu
arXiv 2023. [Paper]
17 Mar 2023

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen
arXiv 2023. [Paper]
17 Mar 2023

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang
arXiv 2023. [Paper] [Github]
17 Mar 2023

Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation
Yiyang Ma, Huan Yang, Wenjing Wang, Jianlong Fu, Jiaying Liu
arXiv 2023. [Paper]
16 Mar 2023

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen
arXiv 2023. [Paper] [Project] [Github]
16 Mar 2023

HIVE: Harnessing Human Feedback for Instructional Visual Editing
Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, Ran Xu
arXiv 2023. [Paper]
16 Mar 2023

P+: Extended Textual Conditioning in Text-to-Image Generation
Andrey Voynov, Qinghao Chu, Daniel Cohen-Or, Kfir Aberman
arXiv 2023. [Paper] [Project]
16 Mar 2023

Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion
Inhwa Han, Serin Yang, Taesung Kwon, Jong Chul Ye
arXiv 2023. [Paper]
15 Mar 2023

Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
arXiv 2023. [Paper] [Github]
15 Mar 2023

Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
Serin Yang, Hyunmin Hwang, Jong Chul Ye
arXiv 2023. [Paper]
15 Mar 2023

Edit-A-Video: Single Video Editing with Object-Aware Consistency
Chaehun Shin, Heeseung Kim, Che Hyun Lee, Sang-gil Lee, Sungroh Yoon
arXiv 2023. [Paper] [Project]
14 Mar 2023

Editing Implicit Assumptions in Text-to-Image Diffusion Models
Hadas Orgad, Bahjat Kawar, Yonatan Belinkov
arXiv 2023. [Paper] [Project] [Github]
14 Mar 2023

Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Jaehoon Ko, Hyeonsu Kim, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim
arXiv 2023. [Paper]
14 Mar 2023

Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Chenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan
arXiv 2023. [Paper] [Github]
8 Mar 2023

Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia
arXiv 2023. [Paper] [Project]
8 Mar 2023

Erasing Concepts from Diffusion Models
Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau
arXiv 2023. [Paper] [Project] [Github]
13 Mar 2023

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
Fan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu
arXiv 2023. [Paper] [Github]
12 Mar 2023

Cones: Concept Neurons in Diffusion Models for Customized Generation
Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
arXiv 2023. [Paper]
9 Mar 2023

A Prompt Log Analysis of Text-to-Image Generation Systems
Yutong Xie, Zhaoying Pan, Jinge Ma, Jie Luo, Qiaozhu Mei
arXiv 2023. [Paper]
8 Mar 2023

Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles
Zhiwei Tang, Dmitry Rybin, Tsung-Hui Chang
arXiv 2023. [Paper] [Github]
7 Mar 2023

Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie Zhou, Jiwen Lu
arXiv 2023. [Paper] [Github]
3 Mar 2023

Collage Diffusion
Vishnu Sarukkai, Linden Li, Arden Ma, Christopher Ré, Kayvon Fatahalian
arXiv 2023. [Paper]
1 Mar 2023

Towards Enhanced Controllability of Diffusion Models
Wonwoong Cho, Hareesh Ravi, Midhun Harikumar, Vinh Khuc, Krishna Kumar Singh, Jingwan Lu, David I. Inouye, Ajinkya Kale
arXiv 2023. [Paper]
28 Feb 2023

Directed Diffusion: Direct Control of Object Placement through Attention Guidance
Wan-Duo Kurt Ma, J.P. Lewis, W. Bastiaan Kleijn, Thomas Leung
arXiv 2023. [Paper]
25 Feb 2023

Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
Cusuh Ham, James Hays, Jingwan Lu, Krishna Kumar Singh, Zhifei Zhang, Tobias Hinz
arXiv 2023. [Paper]
24 Feb 2023

Region-Aware Diffusion for Zero-shot Text-driven Image Editing
Nisha Huang, Fan Tang, Weiming Dong, Tong-Yee Lee, Changsheng Xu
arXiv 2023. [Paper] [Github]
23 Feb 2023

Controlled and Conditional Text to Image Generation with Diffusion Prior
Pranav Aggarwal, Hareesh Ravi, Naveen Marri, Sachin Kelkar, Fengbin Chen, Vinh Khuc, Midhun Harikumar, Ritiz Tambi, Sudharshan Reddy Kakumanu, Purvak Lapsiya, Alvin Ghouas, Sarah Saber, Malavika Ramprasad, Baldo Faieta, Ajinkya Kale
arXiv 2023. [Paper]
23 Feb 2023

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Yilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Grathwohl
arXiv 2023. [Paper] [Project]
22 Feb 2023

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images
Xiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan
arXiv 2023. [Paper]
21 Feb 2023

Exploring the Representation Manifolds of Stable Diffusion Through the Lens of Intrinsic Dimension
Henry Kvinge, Davis Brown, Charles Godfrey
arXiv 2023. [Paper]
16 Feb 2023

Text-driven Visual Synthesis with Latent Diffusion Prior
Ting-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang
arXiv 2023. [Paper] [Project]
16 Feb 2023

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
Chong Mou, Xintao Wang, Liangbin Xie, Jian Zhang, Zhongang Qi, Ying Shan, Xiaohu Qie
arXiv 2023. [Paper] [Github]
16 Feb 2023

MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel
arXiv 2023. [Paper] [Project] [Github]
16 Feb 2023

Boundary Guided Mixing Trajectory for Semantic Control with Diffusion Models
Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan Yan
arXiv 2023. [Paper]
16 Feb 2023

Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation
Joshua Vendrow, Saachi Jain, Logan Engstrom, Aleksander Madry
arXiv 2023. [Paper] [Github]
15 Feb 2023

PRedItOR: Text Guided Image Editing with Diffusion Prior
Hareesh Ravi, Sachin Kelkar, Midhun Harikumar, Ajinkya Kale
arXiv 2023. [Paper]
15 Feb 2023

Text-Guided Scene Sketch-to-Photo Synthesis
AprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio Okura
arXiv 2023. [Paper]
14 Feb 2023

Universal Guidance for Diffusion Models
Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein
arXiv 2023. [Paper] [Github]
14 Feb 2023

Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang, Maneesh Agrawala
arXiv 2023. [Paper] [Github]
10 Feb 2023

Analyzing Multimodal Objectives Through the Lens of Generative Diffusion Guidance
Chaerin Kong, Nojun Kwak
arXiv 2023. [Paper]
10 Feb 2023

Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation
Anton Voronov, Mikhail Khoroshikh, Artem Babenko, Max Ryabinin
arXiv 2023. [Paper]
9 Feb 2023

Q-Diffusion: Quantizing Diffusion Models
Xiuyu Li, Long Lian, Yijiang Liu, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
arXiv 2023. [Paper] [Github]
8 Feb 2023

GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models
Shawn Shan, Jenna Cryan, Emily Wenger, Haitao Zheng, Rana Hanocka, Ben Y. Zhao
arXiv 2023. [Paper]
8 Feb 2023

Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
Hyeonho Jeong, Gihyun Kwon, Jong Chul Ye
arXiv 2023. [Paper]
8 Feb 2023

Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
Felix Friedrich, Patrick Schramowski, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Sasha Luccioni, Kristian Kersting
arXiv 2023. [Paper]
7 Feb 2023

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein
arXiv 2023. [Paper] [Github]
7 Feb 2023

Zero-shot Image-to-Image Translation
Gaurav Parmar, Krishna Kumar Singh, Richard Zhang, Yijun Li, Jingwan Lu, Jun-Yan Zhu
arXiv 2023. [Paper]
6 Feb 2023

Structure and Content-Guided Video Synthesis with Diffusion Models
Patrick Esser, Johnathan Chiu, Parmida Atighehchian, Jonathan Granskog, Anastasis Germanidis
arXiv 2023. [Paper] [Project]
6 Feb 2023

Mixture of Diffusers for scene composition and high resolution image generation
Álvaro Barbero Jiménez
arXiv 2023. [Paper] [Github]
5 Feb 2023

ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval
Kexun Zhang, Xianjun Yang, William Yang Wang, Lei Li
arXiv 2023. [Paper]
5 Feb 2023

Eliminating Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion
Zuopeng Yang, Tianshu Chu, Xin Lin, Erdun Gao, Daqing Liu, Jie Yang, Chaoyue Wang
arXiv 2023. [Paper]
5 Feb 2023

Semantic-Guided Image Augmentation with Pre-trained Models
Bohan Li, Xinghao Wang, Xiao Xu, Yutai Hou, Yunlong Feng, Feng Wang, Wanxiang Che
SIGGRAPH 2023. [Paper] [Project]
4 Feb 2023

TEXTure: Text-Guided Texturing of 3D Shapes
Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or
arXiv 2023. [Paper] [Project] [Github]
3 Feb 2023

Dreamix: Video Diffusion Models are General Video Editors
Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen
arXiv 2023. [Paper] [Project]
2 Feb 2023

Trash to Treasure: Using text-to-image models to inform the design of physical artefacts
Amy Smith, Hope Schroeder, Ziv Epstein, Michael Cook, Simon Colton, Andrew Lippman
AAAI 2023. [Paper]
1 Feb 2023

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or
SIGGRAPH 2023. [Paper] [Project] [Github]
31 Jan 2023

Zero3D: Semantic-Driven Multi-Category 3D Shape Generation
Bo Han, Yitong Liu, Yixuan Shen
arXiv 2023. [Paper]
31 Jan 2023

Shape-aware Text-driven Layered Video Editing
Yao-Chih Lee, Ji-Ze Genevieve Jang, Yi-Ting Chen, Elizabeth Qiu, Jia-Bin Huang
arXiv 2023. [Paper] [Project]
30 Jan 2023

PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks
Arian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis
arXiv 2023. [Paper] [Github]
30 Jan 2023

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao, Bing-Kun Bao, Hao Tang, Changsheng Xu
CVPR 2023. [Paper] [Github]
30 Jan 2023

SEGA: Instructing Diffusion using Semantic Dimensions
Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting
arXiv 2023. [Paper]
28 Jan 2023

Towards Equitable Representation in Text-to-Image Synthesis Models with the Cross-Cultural Understanding Benchmark (CCUB) Dataset
Zhixuan Liu, Youeun Shin, Beverley-Claire Okogwu, Youngsik Yun, Lia Coleman, Peter Schaldenbrand, Jihie Kim, Jean Oh
arXiv 2023. [Paper]
28 Jan 2023

Text-To-4D Dynamic Scene Generation
Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman
arXiv 2023. [Paper]
26 Jan 2023

Guiding Text-to-Image Diffusion Model Towards Grounded Generation
Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
arXiv 2023. [Paper] [Project]
12 Jan 2023

Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi, Shubhajit Basak, Hugh Jordan, Rachel McDonnell, Peter Corcoran
arXiv 2023. [Paper] [Project] [Github]
10 Jan 2023

Visual Story Generation Based on Emotion and Keywords
Yuetian Chen, Ruohua Li, Bowen Shi, Peiru Liu, Mei Si
AIIDE INT 2022. [Paper]
7 Jan 2023

DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis
Shuai Shen, Wenliang Zhao, Zibin Meng, Wanhua Li, Zheng Zhu, Jie Zhou, Jiwen Lu
arXiv 2023. [Paper]
10 Jan 2023

Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation
Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic
arXiv 2023. [Paper] [Project]
6 Jan 2023

Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan
arXiv 2023. [Paper] [Project]
2 Jan 2023

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
Jiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao
CVPR 2023. [Paper] [Project]
28 Dec 2022

Exploring Vision Transformers as Diffusion Learners
He Cao, Jianan Wang, Tianhe Ren, Xianbiao Qi, Yihao Chen, Yuan Yao, Lei Zhang
arXiv 2022. [Paper]
28 Dec 2022

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou
arXiv 2022. [Paper] [Project]
22 Dec 2022

Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias
Robert Wolfe, Yiwei Yang, Bill Howe, Aylin Caliskan
arXiv 2022. [Paper]
21 Dec 2022

Optimizing Prompts for Text-to-Image Generation
Yaru Hao, Zewen Chi, Li Dong, Furu Wei
arXiv 2022. [Paper] [Project] [Github]
19 Dec 2022

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang
arXiv 2022. [Paper] [Github]
16 Dec 2022

TeTIm-Eval: a novel curated evaluation data set for comparing text-to-image models
Federico A. Galatolo, Mario G. C. A. Cimino, Edoardo Cogotti
arXiv 2022. [Paper]
15 Dec 2022

The Infinite Index: Information Retrieval on Generative Text-To-Image Models
Niklas Deckers, Maik Fröbe, Johannes Kiesel, Gianluca Pandolfo, Christopher Schröder, Benno Stein, Martin Potthast
CHIIR 2023. [Paper]
14 Dec 2022

LidarCLIP or: How I Learned to Talk to Point Clouds
Georg Hess, Adam Tonderski, Christoffer Petersson, Lennart Svensson, Kalle Åström
arXiv 2022. [Paper] [Github]
13 Dec 2022

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan
CVPR 2023. [Paper]
13 Dec 2022

The Stable Artist: Steering Semantics in Diffusion Latent Space
Manuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian Kersting
arXiv 2022. [Paper]
12 Dec 2022

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang
arXiv 2022. [Paper]
9 Dec 2022

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang
ICLR 2023. [Paper] [Github]
9 Dec 2022

MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Rishabh Dabral, Muhammad Hamza Mughal, Vladislav Golyanik, Christian Theobalt
arXiv 2022. [Paper] [Project]
8 Dec 2022

SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, Liangyan Gui
arXiv 2022. [Paper] [Project]
8 Dec 2022

SINE: SINgle Image Editing with Text-to-Image Diffusion Models
Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren
arXiv 2022. [Paper] [Project] [Github]
8 Dec 2022

Multi-Concept Customization of Text-to-Image Diffusion
Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu
arXiv 2022. [Paper] [Project]
8 Dec 2022

Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal
arXiv 2022. [Paper] [Project]
8 Dec 2022

Executing your Commands via Motion Diffusion in Latent Space
Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, Gang Yu
arXiv 2022. [Paper] [Project]
8 Dec 2022

Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
Zhentao Yu, Zixin Yin, Deyu Zhou, Duomin Wang, Finn Wong, Baoyuan Wang
arXiv 2022. [Paper] [Project]
7 Dec 2022

Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Xiu Li
arXiv 2022. [Paper]
7 Dec 2022

Judge, Localize, and Edit: Ensuring Visual Commonsense Morality for Text-to-Image Generation
Seongbeom Park, Suhong Moon, Jinkyu Kim
arXiv 2022. [Paper]
7 Dec 2022

NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
Congyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov
arXiv 2022. [Paper]
6 Dec 2022

Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei
CVPR 2023. [Paper] [Github]
6 Dec 2022

Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
Muheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu
CVPR 2023. [Paper] [Project] [Github]
6 Dec 2022

ADIR: Adaptive Diffusion for Image Reconstruction
Shady Abu-Hussein, Tom Tirer, Raja Giryes
arXiv 2022. [Paper] [Project]
6 Dec 2022

M-VADER: A Model for Diffusion with Multimodal Context
Samuel Weinbach, Marco Bellagente, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Björn Deiseroth, Koen Oostermeijer, Hannah Teufel, Andres Felipe Cruz-Salinas
arXiv 2022. [Paper]
6 Dec 2022

Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Gyeongman Kim, Hajin Shim, Hyunsu Kim, Yunjey Choi, Junho Kim, Eunho Yang
CVPR 2023. [Paper] [Project] [Github]
6 Dec 2022

Unite and Conquer: Cross Dataset Multimodal Synthesis using Diffusion Models
Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel
arXiv 2022. [Paper] [Project]
1 Dec 2022

Shape-Guided Diffusion with Inside-Outside Attention
Dong Huk Park, Grace Luo, Clayton Toste, Samaneh Azadi, Xihui Liu, Maka Karalashvili, Anna Rohrbach, Trevor Darrell
arXiv 2022. [Paper] [Project]
1 Dec 2022

SinDDM: A Single Image Denoising Diffusion Model
Vladimir Kulikov, Shahar Yadin, Matan Kleiner, Tomer Michaeli
arXiv 2022. [Paper] [Project]
29 Nov 2022

DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
Gwanghyun Kim, Se Young Chun
CVPR 2023. [Paper] [Github]
29 Nov 2022

Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning
Xian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye
arXiv 2022. [Paper] [Github]
28 Nov 2022

Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, Dacheng Tao, Ponnuthurai N. Suganthan
arXiv 2022. [Paper]
27 Nov 2022

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao
arXiv 2022. [Paper]
25 Nov 2022

SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin
CVPR 2023. [Paper] [Project]
25 Nov 2022

Sketch-Guided Text-to-Image Diffusion Models
Andrey Voynov, Kfir Aberman, Daniel Cohen-Or
arXiv 2022. [Paper] [Project]
24 Nov 2022

Shifted Diffusion for Text-to-image Generation
Yufan Zhou, Bingchen Liu, Yizhe Zhu, Xiao Yang, Changyou Chen, Jinhui Xu
CVPR 2023. [Paper]
24 Nov 2022

Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan, Leonid Sigal
CVPR 2023. [Paper]
23 Nov 2022

Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in Superposition
Jennifer C. White, Ryan Cotterell
arXiv 2022. [Paper]
23 Nov 2022

EDICT: Exact Diffusion Inversion via Coupled Transformations
Bram Wallace, Akash Gokul, Nikhil Naik
arXiv 2022. [Paper] [Github]
22 Nov 2022

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel
CVPR 2023. [Paper] [Github]
22 Nov 2022

Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark
Vitali Petsiuk, Alexander E. Siemenn, Saisamrit Surbehera, Zad Chin, Keith Tyser, Gregory Hunter, Arvind Raghavan, Yann Hicke, Bryan A. Plummer, Ori Kerret, Tonio Buonassisi, Kate Saenko, Armando Solar-Lezama, Iddo Drori
NeurIPS Workshop 2022. [Paper]
22 Nov 2022

SinDiffusion: Learning a Diffusion Model from a Single Natural Image
Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li
arXiv 2022. [Paper] [Github]
22 Nov 2022

SinFusion: Training Diffusion Models on a Single Image or Video
Yaniv Nikankin, Niv Haim, Michal Irani
arXiv 2022. [Paper] [Github]
21 Nov 2022

Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu, Yixuan Wei, Jianfeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu
arXiv 2022. [Paper] [Github]
21 Nov 2022

Investigating Prompt Engineering in Diffusion Models
Sam Witteveen, Martin Andrews
NeurIPS Workshop 2022. [Paper]
21 Nov 2022

VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Ajay Jain, Amber Xie, Pieter Abbeel
arXiv 2022. [Paper] [Project]
21 Nov 2022

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen
arXiv 2022. [Paper] [Github]
20 Nov 2022

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, Weiming Dong, Changsheng Xu
arXiv 2022. [Paper]
19 Nov 2022

Magic3D: High-Resolution Text-to-3D Content Creation
Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin
CVPR 2023. [Paper] [Project]
18 Nov 2022

Invariant Learning via Diffusion Dreamed Distribution Shifts
Priyatham Kattakinda, Alexander Levine, Soheil Feizi
arXiv 2022. [Paper]
18 Nov 2022

Null-text Inversion for Editing Real Images using Guided Diffusion Models
Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, Daniel Cohen-Or
arXiv 2022. [Paper]
17 Nov 2022

InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks, Aleksander Holynski, Alexei A. Efros
CVPR 2023. [Paper] [Project] [Github]
17 Nov 2022

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi
arXiv 2022. [Paper] [Github]
15 Nov 2022

Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models
Adham Elarabawy, Harish Kamath, Samuel Denton
arXiv 2022. [Paper]
15 Nov 2022

Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
Zhihong Pan, Xin Zhou, Hao Tian
WACV 2023. [Paper]
14 Nov 2022

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting
CVPR 2023. [Paper] [Github]
9 Nov 2022

Rickrolling the Artist: Injecting Invisible Backdoors into Text-Guided Image Generation Models
Lukas Struppek, Dominik Hintersdorf, Kristian Kersting
arXiv 2022. [Paper] [Github]
4 Nov 2022

eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu
arXiv 2022. [Paper] [Github]
2 Nov 2022

MagicMix: Semantic Mixing with Diffusion Models
Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng
arXiv 2022. [Paper] [Project]
28 Oct 2022

UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Wei Li, Xue Xu, Xinyan Xiao, Jiachen Liu, Hu Yang, Guohao Li, Zhanpeng Wang, Zhifan Feng, Qiaoqiao She, Yajuan Lyu, Hua Wu
arXiv 2022. [Paper]
28 Oct 2022

How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
Hritik Bansal, Da Yin, Masoud Monajatipoor, Kai-Wei Chang
EMNLP 2022. [Paper] [Github]
27 Oct 2022

ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts
Zhida Feng, Zhenyu Zhang, Xintong Yu, Yewei Fang, Lanxin Li, Xuyi Chen, Yuxiang Lu, Jiaxiang Liu, Weichong Yin, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
CVPR 2023. [Paper]
27 Oct 2022

DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models
Zijie J. Wang, Evan Montoya, David Munechika, Haoyang Yang, Benjamin Hoover, Duen Horng Chau
arXiv 2022. [Paper] [Project] [Github]
26 Oct 2022

Lafite2: Few-shot Text-to-Image Generation
Yufan Zhou, Chunyuan Li, Changyou Chen, Jianfeng Gao, Jinhui Xu
arXiv 2022. [Paper]
25 Oct 2022

High-Resolution Image Editing via Multi-Stage Blended Diffusion
Johannes Ackermann, Minjun Li
NeurIPS Workshop 2022. [Paper] [Github]
24 Oct 2022

Conditional Diffusion with Less Explicit Guidance via Model Predictive Control
Max W. Shen, Ehsan Hajiramezanali, Gabriele Scalia, Alex Tseng, Nathaniel Diamant, Tommaso Biancalani, Andreas Loukas
arXiv 2022. [Paper]
21 Oct 2022

A Visual Tour Of Current Challenges In Multimodal Language Models
Shashank Sonkar, Naiming Liu, Richard G. Baraniuk
arXiv 2022. [Paper]
22 Oct 2022

DiffEdit: Diffusion-based semantic image editing with mask guidance
Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord
ICLR 2023. [Paper]
20 Oct 2022

Diffusion Models already have a Semantic Latent Space
Mingi Kwon, Jaeseok Jeong, Youngjung Uh
ICLR 2023. [Paper] [Project]
20 Oct 2022

UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image
Dani Valevski, Matan Kalman, Yossi Matias, Yaniv Leviathan
arXiv 2022. [Paper]
18 Oct 2022

Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
Ruijun Li, Weihua Li, Yi Yang, Hanyu Wei, Jianhua Jiang, Quan Bai
arXiv 2022. [Paper]
18 Oct 2022

Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, Michal Irani
CVPR 2023. [Paper] [Project]
17 Oct 2022

Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation
Chaerin Kong, DongHyeon Jeon, Ohjoon Kwon, Nojun Kwak
WACV 2023. [Paper]
12 Oct 2022

Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance
Chen Henry Wu, Fernando De la Torre
arXiv 2022. [Paper] [Github-1] [Github-2]
11 Oct 2022

Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans
arXiv 2022. [Paper]
5 Oct 2022

DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh, Vitalis Vosylius, Edward Johns
IEEE RA-L 2022. [Paper]
5 Oct 2022

LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models
Paramanand Chandramouli, Kanchana Vaishnavi Gandikota
BMVC 2022. [Paper]
5 Oct 2022

clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP
Justin N. M. Pinkney, Chuan Li
BMVC 2022. [Paper] [Github]
5 Oct 2022

Membership Inference Attacks Against Text-to-image Generation Models
Yixin Wu, Ning Yu, Zheng Li, Michael Backes, Yang Zhang
arXiv 2022. [Paper]
3 Oct 2022

Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman
arXiv 2022. [Paper]
29 Sep 2022

DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall
arXiv 2022. [Paper] [Github]
29 Sep 2022

Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen
arXiv 2022. [Paper]
29 Sep 2022

Creative Painting with Latent Diffusion Models
Xianchao Wu
arXiv 2022. [Paper]
29 Sep 2022

Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion
Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu
ACM MM 2022. [Paper] [Github]
27 Sep 2022

Personalizing Text-to-Image Generation via Aesthetic Gradients
Victor Gallego
NeurIPS Workshop 2022. [Paper] [Github]
25 Sep 2022

Best Prompts for Text-to-Image Models and How to Find Them
Nikita Pavlichenko, Dmitry Ustalov
NeurIPS Workshop 2022. [Paper]
23 Sep 2022

The Biased Artist: Exploiting Cultural Biases via Homoglyphs in Text-Guided Image Generation Models
Lukas Struppek, Dominik Hintersdorf, Kristian Kersting
arXiv 2022. [Paper] [Github]
19 Sep 2022

Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models
Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre
NeurIPS 2022. [Paper] [Github]
14 Sep 2022

ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation
Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu
ICLR 2023. [Paper] [Github]
9 Sep 2022

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman
CVPR 2023. [Paper] [Project] [Github]
25 Aug 2022

Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach, Andreas Blattmann, Björn Ommer
arXiv 2022. [Paper] [Github]
26 Jul 2022

Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation
Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan
ICLR 2023. [Paper] [Github]
15 Jun 2022

Blended Latent Diffusion
Omri Avrahami, Ohad Fried, Dani Lischinski
ACM 2022. [Paper] [Project] [Github]
6 Jun 2022

Compositional Visual Generation with Composable Diffusion Models
Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, Joshua B. Tenenbaum
ECCV 2022. [Paper] [Project] [Github]
3 Jun 2022

DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder
Jie Shi, Chenfei Wu, Jian Liang, Xiang Liu, Nan Duan
arXiv 2022. [Paper]
1 Jun 2022

Improved Vector Quantized Diffusion Models
Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
arXiv 2022. [Paper] [Github]
31 May 2022

Text2Human: Text-Driven Controllable Human Image Generation
Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
ACM 2022. [Paper] [Github]
31 May 2022

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, Mohammad Norouzi
NeurIPS 2022. [Paper] [Github]
23 May 2022

Retrieval-Augmented Diffusion Models
Andreas Blattmann, Robin Rombach, Kaan Oktay, Björn Ommer
NeurIPS 2022. [Paper] [Github]
25 Apr 2022

Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen
arXiv 2022. [Paper] [Github]
13 Apr 2022

KNN-Diffusion: Image Generation via Large-Scale Retrieval
Oron Ashual, Shelly Sheynin, Adam Polyak, Uriel Singer, Oran Gafni, Eliya Nachmani, Yaniv Taigman
ICLR 2023. [Paper]
6 Apr 2022

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer
CVPR 2022. [Paper] [Github]
20 Dec 2021

More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
WACV 2023. [Paper] [Project]
10 Dec 2021

Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo
CVPR 2022. [Paper] [Github]
29 Nov 2021

Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami, Dani Lischinski, Ohad Fried
CVPR 2022. [Paper] [Project] [Github]
29 Nov 2021

Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
Zhisheng Xiao, Karsten Kreis, Arash Vahdat
ICLR 2022 (Spotlight). [Paper] [Project]
15 Dec 2021

DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
Gwanghyun Kim, Jong Chul Ye
CVPR 2022. [Paper] [Github]
6 Oct 2021

3D Vision

DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion
Yunhong Lou, Linchao Zhu, Yaxiong Wang, Xiaohan Wang, Yi Yang
AAAI 2024. [Paper]
4 Sep 2023

BuilDiff: 3D Building Shape Generation using Single-Image Conditional Point Cloud Diffusion Models
Yao Wei, George Vosselman, Michael Ying Yang
ICCV Workshop 2023. [Paper]
31 Aug 2023

MVDream: Multi-view Diffusion for 3D Generation
Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang
arXiv 2023. [Paper]
31 Aug 2023

Diffusion Inertial Poser: Human Motion Reconstruction from Arbitrary Sparse IMU Configurations
Tom Van Wouwe, Seunghwan Lee, Antoine Falisse, Scott Delp, C. Karen Liu
arXiv 2023. [Paper]
31 Aug 2023

InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion
Sirui Xu, Zhengyuan Li, Yu-Xiong Wang, Liang-Yan Gui
ICCV 2023. [Paper] [Project] [Github]
31 Aug 2023

Priority-Centric Human Motion Generation in Discrete Latent Space
Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang
arXiv 2023. [Paper]
28 Aug 2023

HoloFusion: Towards Photo-realistic 3D Generative Modeling
Animesh Karnewar, Niloy J. Mitra, Andrea Vedaldi, David Novotny
ICCV 2023. [Paper] [Project]
28 Aug 2023

Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
Abril Corona-Figueroa, Sam Bond-Taylor, Neelanjan Bhowmik, Yona Falinie A. Gaus, Toby P. Breckon, Hubert P. H. Shum, Chris G. Willcocks
ICCV 2023. [Paper]
27 Aug 2023

Sparse3D: Distilling Multiview-Consistent Diffusion for Object Reconstruction from Sparse Views
Zi-Xin Zou, Weihao Cheng, Yan-Pei Cao, Shi-Sheng Huang, Ying Shan, Song-Hai Zhang
arXiv 2023. [Paper]
27 Aug 2023

Multi-plane denoising diffusion-based dimensionality expansion for 2D-to-3D reconstruction of microstructures with harmonized sampling
Kang-Hyun Lee, Gun Jin Yun
arXiv 2023. [Paper]
27 Aug 2023

The DiffuseStyleGesture+ entry to the GENEA Challenge 2023
Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai
ICMI 2023. [Paper] [Github]
26 Aug 2023

Distribution-Aligned Diffusion for Human Mesh Recovery
Lin Geng Foo, Jia Gong, Hossein Rahmani, Jun Liu
ICCV 2023. [Paper] [Project]
25 Aug 2023

EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior
Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Zhipeng Hu, Changjie Fan, Xin Yu
arXiv 2023. [Paper]
25 Aug 2023

LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model
Siqi Yang, Zejun Yang, Zhisheng Wang
arXiv 2023. [Paper]
23 Aug 2023

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
Yiwen Chen, Chi Zhang, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin
arXiv 2023. [Paper] [Github]
22 Aug 2023

Texture Generation on 3D Meshes with Point-UV Diffusion
Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi
ICCV 2023. [Paper]
21 Aug 2023

Physics-Guided Human Motion Capture with Pose Probability Modeling
Jingyi Ju, Buzhen Huang, Chen Zhu, Zhihao Li, Yangang Wang
IJCAI 2023. [Paper] [Github]
19 Aug 2023

Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling
Haorui Ji, Hui Deng, Yuchao Dai, Hongdong Li
arXiv 2023. [Paper]
18 Aug 2023

MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR
Xudong Xu, Zhaoyang Lyu, Xingang Pan, Bo Dai
arXiv 2023. [Paper] [Project]
18 Aug 2023

O²-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model
Yubin Hu, Sheng Ye, Wang Zhao, Matthieu Lin, Yuze He, Yu-Hui Wen, Ying He, Yong-Jin Liu
arXiv 2023. [Paper]
18 Aug 2023

Denoising Diffusion for 3D Hand Pose Estimation from Images
Maksym Ivashechkin, Oscar Mendez, Richard Bowden
arXiv 2023. [Paper]
18 Aug 2023

PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation
Hanbing Liu, Jun-Yan He, Zhi-Qi Cheng, Wangmeng Xiang, Qize Yang, Wenhao Chai, Gaoang Wang, Xu Bao, Bin Luo, Yifeng Geng, Xuansong Xie
ACM MM 2023. [Paper] [Github]
18 Aug 2023

Guide3D: Create 3D Avatars from Text and Image Guidance
Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
arXiv 2023. [Paper]
18 Aug 2023

DMCVR: Morphology-Guided Diffusion Model for 3D Cardiac Volume Reconstruction
Xiaoxiao He, Chaowei Tan, Ligong Han, Bo Liu, Leon Axel, Kang Li, Dimitris N. Metaxas
MICCAI 2023. [Paper] [Github]
18 Aug 2023

HumanLiff: Layer-wise 3D Human Generation with Diffusion Model
Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu
arXiv 2023. [Paper] [Project]
18 Aug 2023

Watch Your Steps: Local Image and Scene Editing by Text Instructions
Ashkan Mirzaei, Tristan Aumentado-Armstrong, Marcus A. Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G. Derpanis, Igor Gilitschenski
arXiv 2023. [Paper] [Project]
17 Aug 2023

TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
Yangyi Huang, Hongwei Yi, Yuliang Xiu, Tingting Liao, Jiaxiang Tang, Deng Cai, Justus Thies
arXiv 2023. [Paper] [Project] [Github]
16 Aug 2023

CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction
Yan Di, Chenyangguang Zhang, Pengyuan Wang, Guangyao Zhai, Ruida Zhang, Fabian Manhardt, Benjamin Busam, Xiangyang Ji, Federico Tombari
arXiv 2023. [Paper]
15 Aug 2023

Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model
Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang
arXiv 2023. [Paper]
15 Aug 2023

3D Scene Diffusion Guidance using Scene Graphs
Mohammad Naanaa, Katharina Schmid, Yinyu Nie
arXiv 2023. [Paper]
8 Aug 2023

Cloth2Tex: A Customized Cloth Texture Generation Pipeline for 3D Virtual Try-On
Daiheng Gao, Xu Chen, Xindi Zhang, Qi Wang, Ke Sun, Bang Zhang, Liefeng Bo, Qixing Huang
arXiv 2023. [Paper]
8 Aug 2023

AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose
Huichao Zhang, Bowen Chen, Hao Yang, Liao Qu, Xu Wang, Li Chen, Chao Long, Feida Zhu, Kang Du, Min Zheng
arXiv 2023. [Paper] [Project]
7 Aug 2023

Generative Approach for Probabilistic Human Mesh Recovery using Diffusion Models
Hanbyel Cho, Junmo Kim
ICCV Workshop 2023. [Paper] [Github]
5 Aug 2023

DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation
Qiaosong Qi, Le Zhuo, Aixi Zhang, Yue Liao, Fei Fang, Si Liu, Shuicheng Yan
ACM MM 2023. [Paper]
5 Aug 2023

Sketch and Text Guided Diffusion Model for Colored Point Cloud Generation
Zijie Wu, Yaonan Wang, Mingtao Feng, He Xie, Ajmal Mian
arXiv 2023. [Paper]
5 Aug 2023

On the Transition from Neural Representation to Symbolic Knowledge
Junyan Cheng, Peter Chin
arXiv 2023. [Paper]
3 Aug 2023

Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling
Zhao Yang, Bing Su, Ji-Rong Wen
ACM MM 2023. [Paper] [Github]
3 Aug 2023

HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation
Jinbo Wu, Xiaobo Gao, Xing Liu, Zhengyang Shen, Chen Zhao, Haocheng Feng, Jingtuo Liu, Errui Ding
arXiv 2023. [Paper]
30 Jul 2023

TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction
Sibo Tian, Minghui Zheng, Xiao Liang
arXiv 2023. [Paper]
30 Jul 2023

TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis
Zihan Zhang, Richard Liu, Kfir Aberman, Rana Hanocka
arXiv 2023. [Paper]
27 Jul 2023

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation
Chaohui Yu, Qiang Zhou, Jingliang Li, Zhe Zhang, Zhibin Wang, Fan Wang
arXiv 2023. [Paper]
26 Jul 2023

Fake It Without Making It: Conditioned Face Generation for Accurate 3D Face Shape Estimation
Will Rowan, Patrik Huber, Nick Pears, Andrew Keeling
arXiv 2023. [Paper]
25 Jul 2023

NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis
Nilesh Kulkarni, Davis Rempe, Kyle Genova, Abhijit Kundu, Justin Johnson, David Fouhey, Leonidas Guibas
arXiv 2023. [Paper] [Project]
14 Jul 2023

AvatarFusion: Zero-shot Generation of Clothing-Decoupled 3D Avatars Using 2D Diffusion
Shuo Huang, Zongxin Yang, Liangting Li, Yi Yang, Jia Jia
arXiv 2023. [Paper]
13 Jul 2023

Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models
Alexander W. Bergman, Wang Yifan, Gordon Wetzstein
arXiv 2023. [Paper]
10 Jul 2023

Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation
Zhongyu Jiang, Zhuoran Zhou, Lei Li, Wenhao Chai, Cheng-Yen Yang, Jenq-Neng Hwang
arXiv 2023. [Paper]
7 Jul 2023

AutoDecoding Latent 3D Diffusion Models
Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc Van Gool, Sergey Tulyakov
arXiv 2023. [Paper]
7 Jul 2023

SVDM: Single-View Diffusion Model for Pseudo-Stereo 3D Object Detection
Yuguang Shi
arXiv 2023. [Paper]
5 Jul 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation
Shentong Mo, Enze Xie, Ruihang Chu, Lewei Yao, Lanqing Hong, Matthias Nießner, Zhenguo Li
arXiv 2023. [Paper]
4 Jul 2023

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors
Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem
arXiv 2023. [Paper] [Project]
30 Jun 2023

Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, Shenghua Gao
arXiv 2023. [Paper]
29 Jun 2023

DiffComplete: Diffusion-based Generative 3D Shape Completion
Ruihang Chu, Enze Xie, Shentong Mo, Zhenguo Li, Matthias Nießner, Chi-Wing Fu, Jiaya Jia
arXiv 2023. [Paper]
28 Jun 2023

DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation
Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang
arXiv 2023. [Paper]
21 Jun 2023

EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model
Lianying Yin, Yijun Wang, Tianyu He, Jinming Liu, Wei Zhao, Bohan Li, Xin Jin, Jianxin Lin
arXiv 2023. [Paper]
20 Jun 2023

Point-Cloud Completion with Pretrained Text-to-image Diffusion Models
Yoni Kasten, Ohad Rahamim, Gal Chechik
arXiv 2023. [Paper]
18 Jun 2023

AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation
Yifei Zeng, Yuanxun Lu, Xinya Ji, Yao Yao, Hao Zhu, Xun Cao
arXiv 2023. [Paper]
16 Jun 2023

Edit-DiffNeRF: Editing 3D Neural Radiance Fields using 2D Diffusion Model
Lu Yu, Wei Xiang, Kang Han
arXiv 2023. [Paper]
15 Jun 2023

Adding 3D Geometry Control to Diffusion Models
Wufei Ma, Qihao Liu, Jiahao Wang, Angtian Wang, Yaoyao Liu, Adam Kortylewski, Alan Yuille
arXiv 2023. [Paper]
13 Jun 2023

Viewset Diffusion: (0-)Image-Conditioned 3D Generative Models from 2D Data
Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi
arXiv 2023. [Paper]
13 Jun 2023

3D molecule generation by denoising voxel grids
Pedro O. Pinheiro, Joshua Rackers, Joseph Kleinhenz, Michael Maser, Omar Mahmood, Andrew Martin Watkins, Stephen Ra, Vishnu Sresht, Saeed Saremi
arXiv 2023. [Paper]
13 Jun 2023

InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions
Jiale Xu, Xintao Wang, Yan-Pei Cao, Weihao Cheng, Ying Shan, Shenghua Gao
arXiv 2023. [Paper]
12 Jun 2023

RePaint-NeRF: NeRF Editting via Semantic Masks and Diffusion Models
Xingchen Zhou, Ying He, F. Richard Yu, Jianqiang Li, You Li
arXiv 2023. [Paper]
9 Jun 2023

Stochastic Multi-Person 3D Motion Forecasting
Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui
arXiv 2023. [Paper]
8 Jun 2023

ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections
Chun-Han Yao, Amit Raj, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani
arXiv 2023. [Paper]
7 Jun 2023

Synthesizing realistic sand assemblies with denoising diffusion in latent space
Nikolaos N. Vlassis, WaiChing Sun, Khalid A. Alshibli, Richard A. Regueiro
arXiv 2023. [Paper]
7 Jun 2023

AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars
Mohit Mendiratta, Xingang Pan, Mohamed Elgharib, Kartik Teotia, Mallikarjun B R, Ayush Tewari, Vladislav Golyanik, Adam Kortylewski, Christian Theobalt
arXiv 2023. [Paper]
1 Jun 2023

DiffRoom: Diffusion-based High-Quality 3D Room Reconstruction and Generation
Xiaoliang Ju, Zhaoyang Huang, Yijin Li, Guofeng Zhang, Yu Qiao, Hongsheng Li
arXiv 2023. [Paper]
1 Jun 2023

Controllable Motion Diffusion Model
Yi Shi, Jingbo Wang, Xuekun Jiang, Bo Dai
arXiv 2023. [Paper] [Project]
1 Jun 2023

FDNeRF: Semantics-Driven Face Reconstruction, Prompt Editing and Relighting with Diffusion Models
Hao Zhang, Yanbo Xu, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang
arXiv 2023. [Paper]
1 Jun 2023

Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images
Junxing Hu, Hongwen Zhang, Zerui Chen, Mengcheng Li, Yunlong Wang, Yebin Liu, Zhenan Sun
arXiv 2023. [Paper] [Project]
31 May 2023

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
arXiv 2023. [Paper]
30 May 2023

HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance
Junzhe Zhu, Peiye Zhuang
arXiv 2023. [Paper]
30 May 2023

Conditional Diffusion Models for Semantic 3D Medical Image Synthesis
Zolnamar Dorjsembe, Hsing-Kuo Pao, Sodtavilan Odonchimed, Furen Xiao
arXiv 2023. [Paper]
29 May 2023

ZeroAvatar: Zero-shot 3D Avatar Generation from a Single Image
Zhenzhen Weng, Zeyu Wang, Serena Yeung
arXiv 2023. [Paper]
25 May 2023

NAP: Neural 3D Articulation Prior
Jiahui Lei, Congyue Deng, Bokui Shen, Leonidas Guibas, Kostas Daniilidis
arXiv 2023. [Paper] [Project]
25 May 2023

CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graphs
Guangyao Zhai, Evin Pınar Örnek, Shun-Cheng Wu, Yan Di, Federico Tombari, Nassir Navab, Benjamin Busam
arXiv 2023. [Paper]
25 May 2023

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation
Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu
arXiv 2023. [Paper] [Project]
25 May 2023

DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification
Sitian Shen, Zilin Zhu, Linqian Fan, Harry Zhang, Xinxiao Wu
arXiv 2023. [Paper]
25 May 2023

Confronting Ambiguity in 6D Object Pose Estimation via Score-Based Diffusion on SE(3)
Tsu-Ching Hsiao, Hao-Wei Chen, Hsuan-Kung Yang, Chun-Yi Lee
arXiv 2023. [Paper]
25 May 2023

Deceptive-NeRF: Enhancing NeRF Reconstruction using Pseudo-Observations from Diffusion Models
Xinhang Liu, Shiu-hong Kao, Jiaben Chen, Yu-Wing Tai, Chi-Keung Tang
arXiv 2023. [Paper]
24 May 2023

Manifold Diffusion Fields
Ahmed A. Elhag, Joshua M. Susskind, Miguel Angel Bautista
arXiv 2023. [Paper]
24 May 2023

Sin3DM: Learning a Diffusion Model from a Single 3D Textured Shape
Rundi Wu, Ruoshi Liu, Carl Vondrick, Changxi Zheng
arXiv 2023. [Paper] [Project] [Github]
24 May 2023

Understanding Text-driven Motion Synthesis with Keyframe Collaboration via Diffusion Models
Dong Wei, Xiaoning Sun, Huaijiang Sun, Bin Li, Shengxiang Hu, Weiqing Li, Jianfeng Lu
arXiv 2023. [Paper]
23 May 2023

DiffHand: End-to-End Hand Mesh Reconstruction via Diffusion Models
Lijun Li, Li'an Zhuo, Bang Zhang, Liefeng Bo, Chen Chen
arXiv 2023. [Paper]
23 May 2023

GMD: Controllable Human Motion Synthesis via Guided Diffusion Models
Korrawe Karunratanakul, Konpat Preechakul, Supasorn Suwajanakorn, Siyu Tang
arXiv 2023. [Paper] [Project]
21 May 2023

Towards Globally Consistent Stochastic Human Motion Prediction via Motion Diffusion
Jiarui Sun, Girish Chowdhary
arXiv 2023. [Paper]
21 May 2023

Few-shot 3D Shape Generation
Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
arXiv 2023. [Paper]
19 May 2023

Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
Byungjun Kim, Patrick Kwon, Kwangho Lee, Myunggi Lee, Sookwan Han, Daesik Kim, Hanbyul Joo
arXiv 2023. [Paper] [Project]
19 May 2023

Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields
Jingbo Zhang, Xiaoyu Li, Ziyu Wan, Can Wang, Jing Liao
arXiv 2023. [Paper]
19 May 2023

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture
Liangchen Song, Liangliang Cao, Hongyu Xu, Kai Kang, Feng Tang, Junsong Yuan, Yang Zhao
arXiv 2023. [Paper]
18 May 2023

LDM3D: Latent Diffusion Model for 3D
Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, Jean Yu, Estelle Aflalo, Shao-Yen Tseng, Fabio Nonato, Matthias Muller, Vasudev Lal
arXiv 2023. [Paper]
18 May 2023

Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
Samaneh Azadi, Akbar Shah, Thomas Hayes, Devi Parikh, Sonal Gupta
arXiv 2023. [Paper] [Project]
16 May 2023

FitMe: Deep Photorealistic 3D Morphable Model Avatars
Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Baris Gecer, Jiankang Deng, Stefanos Zafeiriou
CVPR 2023. [Paper] [Project]
16 May 2023

AMD: Autoregressive Motion Diffusion
Bo Han, Hao Peng, Minjing Dong, Chang Xu, Yi Ren, Yixuan Shen, Yuheng Li
arXiv 2023. [Paper]
16 May 2023

Text-guided High-definition Consistency Texture Model
Zhibin Tang, Tiantong He
arXiv 2023. [Paper]
10 May 2023

Relightify: Relightable 3D Faces from a Single Image via Diffusion Models
Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou, Stefanos Zafeiriou
arXiv 2023. [Paper] [Project]
10 May 2023

CaloClouds: Fast Geometry-Independent Highly-Granular Calorimeter Simulation
Erik Buhmann, Sascha Diefenbacher, Engin Eren, Frank Gaede, Gregor Kasieczka, Anatolii Korol, William Korcari, Katja Krüger, Peter McKeown
arXiv 2023. [Paper]
8 May 2023

Locally Attentional SDF Diffusion for Controllable 3D Shape Generation
Xin-Yang Zheng, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu, Heung-Yeung Shum
SIGGRAPH 2023. [Paper]
8 May 2023

DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion
Kiyohiro Nakayama, Mikaela Angelina Uy, Jiahui Huang, Shi-Min Hu, Ke Li, Leonidas J. Guibas
arXiv 2023. [Paper] [Github]
4 May 2023

Shap-E: Generating Conditional 3D Implicit Functions
Heewoo Jun, Alex Nichol
arXiv 2023. [Paper] [Github]
3 May 2023

ContactArt: Learning 3D Interaction Priors for Category-level Articulated Object and Hand Poses Estimation
Zehao Zhu, Jiashun Wang, Yuzhe Qin, Deqing Sun, Varun Jampani, Xiaolong Wang
arXiv 2023. [Paper] [Project]
2 May 2023

DreamPaint: Few-Shot Inpainting of E-Commerce Items for Virtual Try-On without 3D Modeling
Mehmet Saygin Seyfioglu, Karim Bouyarmane, Suren Kumar, Amir Tavanaei, Ismail B. Tutar
arXiv 2023. [Paper]
2 May 2023

Learning a Diffusion Prior for NeRFs
Guandao Yang, Abhijit Kundu, Leonidas J. Guibas, Jonathan T. Barron, Ben Poole
ICLR Workshop 2023. [Paper]
27 Apr 2023

TextMesh: Generation of Realistic 3D Meshes From Text Prompts
Christina Tsalicoglou, Fabian Manhardt, Alessio Tonioni, Michael Niemeyer, Federico Tombari
arXiv 2023. [Paper]
24 Apr 2023

Nerfbusters: Removing Ghostly Artifacts from Casually Captured NeRFs
Frederik Warburg, Ethan Weber, Matthew Tancik, Aleksander Holynski, Angjoo Kanazawa
arXiv 2023. [Paper] [Project] [Github]
20 Apr 2023

Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion
Tomas Jakab, Ruining Li, Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi
arXiv 2023. [Paper] [Project]
20 Apr 2023

Anything-3D: Towards Single-view Anything Reconstruction in the Wild
Qiuhong Shen, Xingyi Yang, Xinchao Wang
arXiv 2023. [Paper]
19 Apr 2023

Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
Yuming Du, Robin Kips, Albert Pumarola, Sebastian Starke, Ali Thabet, Artsiom Sanakoyeu
CVPR 2023. [Paper] [Project] [Github]
17 Apr 2023

Towards Controllable Diffusion Models via Reward-Guided Exploration
Hengtong Zhang, Tingyang Xu
arXiv 2023. [Paper]
14 Apr 2023

Learning Controllable 3D Diffusion Models from Single-view Images
Jiatao Gu, Qingzhe Gao, Shuangfei Zhai, Baoquan Chen, Lingjie Liu, Josh Susskind
arXiv 2023. [Paper] [Project]
13 Apr 2023

Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction
Hansheng Chen, Jiatao Gu, Anpei Chen, Wei Tian, Zhuowen Tu, Lingjie Liu, Hao Su
arXiv 2023. [Paper] [Project]
13 Apr 2023

Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views
Siwei Zhang, Qianli Ma, Yan Zhang, Sadegh Aliakbarian, Darren Cosker, Siyu Tang
arXiv 2023. [Paper] [Project]
12 Apr 2023

InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions
Han Liang, Wenqian Zhang, Wenxuan Li, Jingyi Yu, Lan Xu
arXiv 2023. [Paper] [Github]
12 Apr 2023

Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond
Mohammadreza Armandpour, Huangjie Zheng, Ali Sadeghian, Amir Sadeghian, Mingyuan Zhou
arXiv 2023. [Paper] [Project]
11 Apr 2023

NeRF applied to satellite imagery for surface reconstruction
Federico Semeraro, Yi Zhang, Wenying Wu, Patrick Carroll
arXiv 2023. [Paper] [Github]
9 Apr 2023

DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
Hoigi Seo, Hayeon Kim, Gwanghyun Kim, Se Young Chun
arXiv 2023. [Paper] [Project]
6 Apr 2023

Generative Novel View Synthesis with 3D-Aware Diffusion Models
Eric R. Chan, Koki Nagano, Matthew A. Chan, Alexander W. Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein
arXiv 2023. [Paper] [Project]
5 Apr 2023

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
Davis Rempe, Zhengyi Luo, Xue Bin Peng, Ye Yuan, Kris Kitani, Karsten Kreis, Sanja Fidler, Or Litany
CVPR 2023. [Paper] [Github]
4 Apr 2023

PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion
Gwanghyun Kim, Ji Ha Jang, Se Young Chun
arXiv 2023. [Paper] [Project]
4 Apr 2023

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu
arXiv 2023. [Paper] [Project] [Github]
3 Apr 2023

Controllable Motion Synthesis and Reconstruction with Autoregressive Diffusion Models
Wenjie Yin, Ruibo Tu, Hang Yin, Danica Kragic, Hedvig Kjellström, Mårten Björkman
arXiv 2023. [Paper]
3 Apr 2023

DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models
Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong
arXiv 2023. [Paper]
3 Apr 2023

DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
Longwen Zhang, Qiwei Qiu, Hongyang Lin, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang, Lan Xu, Jingyi Yu
arXiv 2023. [Paper] [Project]
1 Apr 2023

AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
Ruixiang Jiang, Can Wang, Jingbo Zhang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao
arXiv 2023. [Paper] [Project] [Github]
30 Mar 2023

HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
Animesh Karnewar, Andrea Vedaldi, David Novotny, Niloy Mitra
CVPR 2023. [Paper] [Project]
29 Mar 2023

4D Facial Expression Diffusion Model
Kaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo
arXiv 2023. [Paper] [Github]
29 Mar 2023

Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion
Hiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira
arXiv 2023. [Paper] [Project] [Github]
28 Mar 2023

Novel View Synthesis of Humans using Differentiable Rendering
Guillaume Rochette, Chris Russell, Richard Bowden
IEEE T-BIOM 2023. [Paper] [Github]
28 Mar 2023

Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation
Susung Hong, Donghoon Ahn, Seungryong Kim
CVPR Workshop 2023. [Paper]
27 Mar 2023

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, Dong Chen
arXiv 2023. [Paper] [Project] [Github]
24 Mar 2023

ISS++: Image as Stepping Stone for Text-Guided 3D Shape Generation
Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu
ICLR 2023. [Paper]
24 Mar 2023

CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout
Yiqi Lin, Haotian Bai, Sijia Li, Haonan Lu, Xiaodong Lin, Hui Xiong, Lin Wang
arXiv 2023. [Paper] [Project]
24 Mar 2023

Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation
Rui Chen, Yongwei Chen, Ningxin Jiao, Kui Jia
arXiv 2023. [Paper] [Project] [Github]
24 Mar 2023

DDT: A Diffusion-Driven Transformer-based Framework for Human Mesh Recovery from a Video
Ce Zheng, Guo-Jun Qi, Chen Chen
arXiv 2023. [Paper]
23 Mar 2023

Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions
Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa
arXiv 2023. [Paper] [Project]
22 Mar 2023

FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models
Jianglong Ye, Naiyan Wang, Xiaolong Wang
arXiv 2023. [Paper] [Project]
22 Mar 2023

Vox-E: Text-guided Voxel Editing of 3D Objects
Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor
arXiv 2023. [Paper] [Project]
21 Mar 2023

Compositional 3D Scene Generation using Locally Conditioned Diffusion
Ryan Po, Gordon Wetzstein
arXiv 2023. [Paper] [Github]
21 Mar 2023

Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation
Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao
arXiv 2023. [Paper] [Github]
21 Mar 2023

3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion
Yu-Jhe Li, Kris Kitani
arXiv 2023. [Paper]
21 Mar 2023

Affordance Diffusion: Synthesizing Hand-Object Interactions
Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu
CVPR 2023. [Paper] [Project]
21 Mar 2023

SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation
Juil Koo, Seungwoo Yoo, Minh Hieu Nguyen, Minhyuk Sung
arXiv 2023. [Paper] [Project]
21 Mar 2023

Learning a 3D Morphable Face Reflectance Model from Low-cost Data
Yuxuan Han, Zhibo Wang, Feng Xu
CVPR 2023. [Paper] [Project]
21 Mar 2023

Text2Tex: Text-driven Texture Synthesis via Diffusion Models
Dave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner
arXiv 2023. [Paper] [Project]
20 Mar 2023

Zero-1-to-3: Zero-shot One Image to 3D Object
Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, Carl Vondrick
arXiv 2023. [Paper] [Project] [Github]
20 Mar 2023

SKED: Sketch-guided Text-based 3D Editing
Aryan Mikaeili, Or Perel, Daniel Cohen-Or, Ali Mahdavi-Amiri
arXiv 2023. [Paper]
19 Mar 2023

3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process
Yuhan Li, Yishun Dou, Xuanhong Chen, Bingbing Ni, Yilin Sun, Yutian Liu, Fuzhen Wang
CVPR 2023. [Paper] [Github]
18 Mar 2023

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Lingting Zhu, Xian Liu, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu
CVPR 2023. [Paper] [Github]
16 Mar 2023

Diffusion-HPC: Generating Synthetic Images with Realistic Humans
Zhenzhen Weng, Laura Bravo-Sánchez, Serena Yeung
arXiv 2023. [Paper] [Github]
16 Mar 2023

DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars
David Svitov, Dmitrii Gudkov, Renat Bashirov, Victor Lempitsky
arXiv 2023. [Paper]
16 Mar 2023

Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models
Suhyeon Lee, Hyungjin Chung, Minyoung Park, Jonghyuk Park, Wi-Sun Ryu, Jong Chul Ye
arXiv 2023. [Paper]
15 Mar 2023

Controllable Mesh Generation Through Sparse Latent Point Diffusion Models
Zhaoyang Lyu, Jinyi Wang, Yuwei An, Ya Zhang, Dahua Lin, Bo Dai
CVPR 2023. [Paper] [Project]
14 Mar 2023

MeshDiffusion: Score-based Generative 3D Mesh Modeling
Zhen Liu, Yao Feng, Michael J. Black, Derek Nowrouzezahrai, Liam Paull, Weiyang Liu
ICLR 2023. [Paper] [Project] [Github]
14 Mar 2023

Point Cloud Diffusion Models for Automatic Implant Generation
Paul Friedrich, Julia Wolleb, Florentin Bieder, Florian M. Thieringer, Philippe C. Cattin
arXiv 2023. [Paper]
14 Mar 2023

Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Jaehoon Ko, Hyeonsu Kim, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim
arXiv 2023. [Paper] [Github]
14 Mar 2023

GECCO: Geometrically-Conditioned Point Diffusion Models
Michał J. Tyszkiewicz, Pascal Fua, Eduard Trulls
arXiv 2023. [Paper]
10 Mar 2023

3DGen: Triplane Latent Diffusion for Textured Mesh Generation
Anchit Gupta, Wenhan Xiong, Yixin Nie, Ian Jones, Barlas Oğuz
arXiv 2023. [Paper]
9 Mar 2023

Human Motion Diffusion as a Generative Prior
Yonatan Shafir, Guy Tevet, Roy Kapon, Amit H. Bermano
arXiv 2023. [Paper]
2 Mar 2023

Can We Use Diffusion Probabilistic Models for 3D Motion Prediction?
Hyemin Ahn, Esteve Valls Mascaro, Dongheui Lee
ICRA 2023. [Paper] [Project] [Github]
28 Feb 2023

DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models
Jamie Wynn, Daniyar Turmukhambetov
CVPR 2023. [Paper] [Github] [Github]
23 Feb 2023

PC2: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction
Luke Melas-Kyriazi, Christian Rupprecht, Andrea Vedaldi
arXiv 2023. [Paper] [Project]
23 Feb 2023

NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion
Jiatao Gu, Alex Trevithick, Kai-En Lin, Josh Susskind, Christian Theobalt, Lingjie Liu, Ravi Ramamoorthi
ICML 2023. [Paper] [Github]
20 Feb 2023

SinMDM: Single Motion Diffusion
Sigal Raab, Inbal Leibovitch, Guy Tevet, Moab Arar, Amit H. Bermano, Daniel Cohen-Or
arXiv 2023. [Paper] [Project] [Github]
12 Feb 2023

3D Colored Shape Reconstruction from a Single RGB Image through Diffusion
Bo Li, Xiaolin Wei, Fengwei Chen, Bin Liu
arXiv 2023. [Paper]
11 Feb 2023

HumanMAC: Masked Motion Completion for Human Motion Prediction
Ling-Hao Chen, Jiawei Zhang, Yewen Li, Yiren Pang, Xiaobo Xia, Tongliang Liu
arXiv 2023. [Paper] [Project] [Github]
7 Feb 2023

TEXTure: Text-Guided Texturing of 3D Shapes
Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or
arXiv 2023. [Paper] [Project] [Github]
3 Feb 2023

Zero3D: Semantic-Driven Multi-Category 3D Shape Generation
Bo Han, Yitong Liu, Yixuan Shen
arXiv 2023. [Paper]
31 Jan 2023

Neural Wavelet-domain Diffusion for 3D Shape Generation, Inversion, and Manipulation
Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Ruihui Li, Chi-Wing Fu
SIGGRAPH ASIA 2023. [Paper] [Github]
1 Feb 2023

3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models
Biao Zhang, Jiapeng Tang, Matthias Niessner, Peter Wonka
SIGGRAPH 2023. [Paper] [Github] [Github]
26 Jan 2023

DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model
Fan Zhang, Naye Ji, Fuxing Gao, Yongping Li
arXiv 2023. [Paper]
24 Jan 2023

Bipartite Graph Diffusion Model for Human Interaction Generation
Baptiste Chopin, Hao Tang, Mohamed Daoudi
arXiv 2023. [Paper]
24 Jan 2023

Diffusion-based Generation, Optimization, and Planning in 3D Scenes
Siyuan Huang, Zan Wang, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu
arXiv 2023. [Paper] [Project] [Github]
15 Jan 2023

Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models
Mengyi Zhao, Mengyuan Liu, Bin Ren, Shuling Dai, Nicu Sebe
arXiv 2023. [Paper]
10 Jan 2023

Diffusion Probabilistic Models for Scene-Scale 3D Categorical Data
Jumin Lee, Woobin Im, Sebin Lee, Sung-Eui Yoon
arXiv 2023. [Paper] [Github]
2 Jan 2023

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
Jiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao
CVPR 2023. [Paper] [Project]
28 Dec 2022

Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol, Heewoo Jun, Prafulla Dhariwal, Pamela Mishkin, Mark Chen
arXiv 2022. [Paper] [Github]
16 Dec 2022

Real-Time Rendering of Arbitrary Surface Geometries using Learnt Transfer
Sirikonda Dhawal, Aakash KT, P.J. Narayanan
ICVGIP 2022. [Paper]
19 Dec 2022

Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models
Ziyi Chang, Edmund J. C. Findlay, Haozheng Zhang, Hubert P. H. Shum
arXiv 2022. [Paper]
16 Dec 2022

Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion
Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo
arXiv 2022. [Paper] [Project]
12 Dec 2022

Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models
Jiabao Lei, Jiapeng Tang, Kui Jia
CVPR 2023. [Paper] [Project] [Github]
12 Dec 2022

Ego-Body Pose Estimation via Ego-Head Pose Estimation
Jiaman Li, C. Karen Liu, Jiajun Wu
CVPR 2023. [Paper]
9 Dec 2022

MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Rishabh Dabral, Muhammad Hamza Mughal, Vladislav Golyanik, Christian Theobalt
CVPR 2023. [Paper] [Project]
8 Dec 2022

SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, Liangyan Gui
CVPR 2023. [Paper] [Project]
8 Dec 2022

Executing your Commands via Motion Diffusion in Latent Space
Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, Gang Yu
CVPR 2023. [Paper] [Project] [Github]
8 Dec 2022

Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Xiu Li
arXiv 2022. [Paper]
7 Dec 2022

NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
Congyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov
arXiv 2022. [Paper]
6 Dec 2022

Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
Muheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu
CVPR 2023. [Paper] [Github]
6 Dec 2022

Pretrained Diffusion Models for Unified Human Motion Synthesis
Jianxin Ma, Shuai Bai, Chang Zhou
arXiv 2022. [Paper] [Project]
6 Dec 2022

DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model
Jeongjun Choi, Dongseok Shim, H. Jin Kim
arXiv 2022. [Paper]
6 Dec 2022

PhysDiff: Physics-Guided Human Motion Diffusion Model
Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz
arXiv 2022. [Paper] [Project]
5 Dec 2022

Fast Point Cloud Generation with Straight Flows
Lemeng Wu, Dilin Wang, Chengyue Gong, Xingchao Liu, Yunyang Xiong, Rakesh Ranjan, Raghuraman Krishnamoorthi, Vikas Chandra, Qiang Liu
arXiv 2022. [Paper]
4 Dec 2022

DiffRF: Rendering-Guided 3D Radiance Field Diffusion
Norman Müller, Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Matthias Nießner
CVPR 2023. [Paper] [Project]
2 Dec 2022

3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
Gimin Nam, Mariem Khlifi, Andrew Rodriguez, Alberto Tono, Linqi Zhou, Paul Guerrero
arXiv 2022. [Paper]
1 Dec 2022

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A. Yeh, Greg Shakhnarovich
CVPR 2023. [Paper] [Project]
1 Dec 2022

SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction
Zhizhuo Zhou, Shubham Tulsiani
CVPR 2023. [Paper] [Project] [Github]
1 Dec 2022

3D Neural Field Generation using Triplane Diffusion
J. Ryan Shue, Eric Ryan Chan, Ryan Po, Zachary Ankner, Jiajun Wu, Gordon Wetzstein
arXiv 2022. [Paper] [Project]
30 Nov 2022

DiffPose: Toward More Reliable 3D Pose Estimation
Jia Gong, Lin Geng Foo, Zhipeng Fan, Qiuhong Ke, Hossein Rahmani, Jun Liu
CVPR 2023. [Paper] [Github]
30 Nov 2022

DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models
Karl Holmquist, Bastian Wandt
arXiv 2022. [Paper] [Github]
29 Nov 2022

DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
Gwanghyun Kim, Se Young Chun
CVPR 2023. [Paper] [Github]
29 Nov 2022

NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views
Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang
arXiv 2022. [Paper] [Project] [Github]
29 Nov 2022

Ada3Diff: Defending against 3D Adversarial Point Clouds via Adaptive Diffusion
Kui Zhang, Hang Zhou, Jie Zhang, Qidong Huang, Weiming Zhang, Nenghai Yu
arXiv 2022. [Paper]
29 Nov 2022

UDE: A Unified Driving Engine for Human Motion Generation
Zixiang Zhou, Baoyuan Wang
arXiv 2022. [Paper] [Project] [Github]
29 Nov 2022

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao
arXiv 2022. [Paper]
25 Nov 2022

DiffusionSDF: Conditional Generative Modeling of Signed Distance Functions
Gene Chou, Yuval Bahat, Felix Heide
arXiv 2022. [Paper] [Github]
24 Nov 2022

Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek, Torben Peters, Jan D. Wegner, Konrad Schindler
arXiv 2022. [Paper]
23 Nov 2022

IC3D: Image-Conditioned 3D Diffusion for Shape Generation
Cristian Sbrolli, Paolo Cudrano, Matteo Frosi, Matteo Matteucci
arXiv 2022. [Paper]
20 Nov 2022

Listen, denoise, action! Audio-driven motion synthesis with diffusion models
Simon Alexanderson, Rajmund Nagy, Jonas Beskow, Gustav Eje Henter
arXiv 2022. [Paper]
17 Nov 2022

RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevičius, Zexiang Xu, Matthew Fisher, Paul Henderson, Hakan Bilen, Niloy J. Mitra, Paul Guerrero
CVPR 2023. [Paper] [Github]
17 Nov 2022

Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Gal Metzer, Elad Richardson, Or Patashnik, Raja Giryes, Daniel Cohen-Or
arXiv 2022. [Paper] [Github]
14 Nov 2022

ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction
Gyumin Shim, Minsoo Lee, Jaegul Choo
ACM MM 2022. [Paper]
9 Nov 2022

StructDiffusion: Object-Centric Diffusion for Semantic Rearrangement of Novel Objects
Weiyu Liu, Tucker Hermans, Sonia Chernova, Chris Paxton
RSS 2023. [Paper]
8 Nov 2022

Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model
Zhiyuan Ren, Zhihong Pan, Xin Zhou, Le Kang
ICASSP 2023. [Paper]
22 Oct 2022

LION: Latent Point Diffusion Models for 3D Shape Generation
Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis
NeurIPS 2022. [Paper] [Project]
12 Oct 2022

Human Joint Kinematics Diffusion-Refinement for Stochastic Motion Prediction
Dong Wei, Huaijiang Sun, Bin Li, Jianfeng Lu, Weiqing Li, Xiaoning Sun, Shengxiang Hu
AAAI 2023. [Paper]
12 Oct 2022

A generic diffusion-based approach for 3D human pose prediction in the wild
Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi
ICRA 2023. [Paper]
11 Oct 2022

Novel View Synthesis with Diffusion Models
Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi, Mohammad Norouzi
ICLR 2023. [Paper]
6 Oct 2022

Neural Volumetric Mesh Generator
Yan Zheng, Lemeng Wu, Xingchao Liu, Zhen Chen, Qiang Liu, Qixing Huang
NeurIPS Workshop 2022. [Paper]
6 Oct 2022

Denoising Diffusion Probabilistic Models for Styled Walking Synthesis
Edmund J. C. Findlay, Haozheng Zhang, Ziyi Chang, Hubert P. H. Shum
ICLR 2023. [Paper]
29 Sep 2022

Human Motion Diffusion Model
Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Amit H. Bermano, Daniel Cohen-Or
arXiv 2022. [Paper] [Project]
29 Sep 2022

ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation
Zhengzhe Liu, Peng Dai, Ruihui Li, Xiaojuan Qi, Chi-Wing Fu
ICLR 2023. [Paper] [Github]
9 Sep 2022

SE(3)-DiffusionFields: Learning cost functions for joint grasp and motion optimization through diffusion
Julen Urain, Niklas Funk, Georgia Chalvatzaki, Jan Peters
arXiv 2022. [Paper] [Github]
8 Sep 2022

First Hitting Diffusion Models for Generating Manifold, Graph and Categorical Data
Mao Ye, Lemeng Wu, Qiang Liu
NeurIPS 2022. [Paper]
2 Sep 2022

FLAME: Free-form Language-based Motion Synthesis & Editing
Jihoon Kim, Jiseob Kim, Sungjoon Choi
AAAI 2023. [Paper]
1 Sep 2022

Let us Build Bridges: Understanding and Extending Diffusion Generative Models
Xingchao Liu, Lemeng Wu, Mao Ye, Qiang Liu
NeurIPS Workshop 2022. [Paper]
31 Aug 2022

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu
arXiv 2022. [Paper] [Project]
31 Aug 2022

A Diffusion Model Predicts 3D Shapes from 2D Microscopy Images
Dominik J. E. Waibel, Ernst Röell, Bastian Rieck, Raja Giryes, Carsten Marr
arXiv 2022. [Paper]
30 Aug 2022

PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition
Jiachen Sun, Weili Nie, Zhiding Yu, Z. Morley Mao, Chaowei Xiao
arXiv 2022. [Paper]
21 Aug 2022

A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion
Zhaoyang Lyu, Zhifeng Kong, Xudong Xu, Liang Pan, Dahua Lin
ICLR 2022. [Paper] [Github]
7 Dec 2021

Score-Based Point Cloud Denoising
Shitong Luo, Wei Hu
ICCV 2021. [Paper] [Github]
23 Jul 2021

DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras
Ruizhi Shao, Zerong Zheng, Hongwen Zhang, Jingxiang Sun, Yebin Liu
ECCV 2022. [Paper] [Project] [Github]
16 Jul 2022

3D Shape Generation and Completion through Point-Voxel Diffusion
Linqi Zhou, Yilun Du, Jiajun Wu
ICCV 2021. [Paper] [Project]
8 Apr 2021

Diffusion Probabilistic Models for 3D Point Cloud Generation
Shitong Luo, Wei Hu
CVPR 2021. [Paper] [Github]
2 Mar 2021

Adversarial Attack

Improving Visual Quality and Transferability of Adversarial Attacks on Face Recognition Simultaneously with Adversarial Restoration
Fengfan Zhou
arXiv 2023. [Paper]
4 Sep 2023

Intriguing Properties of Diffusion Models: A Large-Scale Dataset for Evaluating Natural Attack Capability in Text-to-Image Generative Models
Takami Sato, Justin Yue, Nanze Chen, Ningfei Wang, Qi Alfred Chen
arXiv 2023. [Paper]
30 Aug 2023

DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing
Jiawei Zhang, Zhongzhu Chen, Huan Zhang, Chaowei Xiao, Bo Li
USENIX Security 2023. [Paper]
28 Aug 2023

A Probabilistic Fluctuation based Membership Inference Attack for Diffusion Models
Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
arXiv 2023. [Paper]
23 Aug 2023

White-box Membership Inference Attacks against Diffusion Models
Yan Pang, Tianhao Wang, Xuhui Kang, Mengdi Huai, Yang Zhang
arXiv 2023. [Paper]
11 Aug 2023

BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models
Jordan Vice, Naveed Akhtar, Richard Hartley, Ajmal Mian
arXiv 2023. [Paper] [Github] [Dataset]
31 Jul 2023

Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models
Weikang Yu, Yonghao Xu, Pedram Ghamisi
arXiv 2023. [Paper]
31 Jul 2023

AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models
Xuelong Dai, Kaisheng Liang, Bin Xiao
arXiv 2023. [Paper]
24 Jul 2023

Enhancing Adversarial Robustness via Score-Based Optimization
Boya Zhang, Weijian Luo, Zhihua Zhang
arXiv 2023. [Paper]
10 Jul 2023

DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks in the Physical World
Caixin Kang, Yinpeng Dong, Zhengyi Wang, Shouwei Ruan, Hang Su, Xingxing Wei
arXiv 2023. [Paper]
15 Jun 2023

An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization
Fei Kong, Jinhao Duan, RuiPeng Ma, Hengtao Shen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu
arXiv 2023. [Paper]
26 May 2023

Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability
Haotian Xue, Alexandre Araujo, Bin Hu, Yongxin Chen
arXiv 2023. [Paper] [Github]
25 May 2023

Differentially Private Latent Diffusion Models
Saiyue Lyu, Margarita Vinaroz, Michael F. Liu, Mijung Park
arXiv 2023. [Paper]
25 May 2023

Latent Magic: An Investigation into Adversarial Examples Crafted in the Semantic Latent Space
BoYang Zheng
arXiv 2023. [Paper]
22 May 2023

Mist: Towards Improved Adversarial Examples for Diffusion Models
Chumeng Liang, Xiaoyu Wu
arXiv 2023. [Paper]
22 May 2023

Content-based Unrestricted Adversarial Attack
Zhaoyu Chen, Bo Li, Shuang Wu, Kaixun Jiang, Shouhong Ding, Wenqiang Zhang
arXiv 2023. [Paper]
18 May 2023

Zero-Day Backdoor Attack against Text-to-Image Diffusion Models via Personalization
Yihao Huang, Qing Guo, Felix Juefei-Xu
arXiv 2023. [Paper]
18 May 2023

Raising the Bar for Certified Adversarial Robustness with Diffusion Models
Thomas Altstidl, David Dobre, Björn Eskofier, Gauthier Gidel, Leo Schwinn
arXiv 2023. [Paper]
17 May 2023

Diffusion Models for Imperceptible and Transferable Adversarial Attack
Jianqi Chen, Hao Chen, Keyan Chen, Yilan Zhang, Zhengxia Zou, Zhenwei Shi
arXiv 2023. [Paper] [Github]
14 May 2023

On enhancing the robustness of Vision Transformers: Defensive Diffusion
Raza Imam, Muhammad Huzaifa, Mohammed El-Amine Azz
arXiv 2023. [Paper] [Github]
14 May 2023

Generative Steganography Diffusion
Ping Wei, Qing Zhou, Zichi Wang, Zhenxing Qian, Xinpeng Zhang, Sheng Li
arXiv 2023. [Paper]
5 May 2023

A Pilot Study of Query-Free Adversarial Attack against Stable Diffusion
Haomin Zhuang, Yihua Zhang, Sijia Liu
CVPR Workshop 2023. [Paper]
3 Apr 2023

Black-box Backdoor Defense via Zero-shot Image Purification
Yucheng Shi, Mengnan Du, Xuansheng Wu, Zihan Guan, Ninghao Liu
arXiv 2023. [Paper]
21 Mar 2023

Adversarial Counterfactual Visual Explanations
Guillaume Jeanneret, Loïc Simon, Frédéric Jurie
CVPR 2023. [Paper] [Github]
17 Mar 2023

Robust Evaluation of Diffusion-Based Adversarial Purification
Minjong Lee, Dongwoo Kim
ICLR 2023. [Paper]
16 Mar 2023

The Devil's Advocate: Shattering the Illusion of Unexploitable Data using Diffusion Models
Hadi M. Dolatabadi, Sarah Erfani, Christopher Leckie
arXiv 2023. [Paper]
15 Mar 2023

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
Weixin Chen, Dawn Song, Bo Li
CVPR 2023. [Paper] [Github]
10 Mar 2023

Generative Model-Based Attack on Learnable Image Encryption for Privacy-Preserving Deep Learning
AprilPyone MaungMaung, Hitoshi Kiya
arXiv 2023. [Paper]
9 Mar 2023

Differentially Private Diffusion Models Generate Useful Synthetic Images
Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L. Smith, Olivia Wiles, Borja Balle
arXiv 2023. [Paper]
27 Feb 2023

Data Forensics in Diffusion Models: A Systematic Analysis of Membership Privacy
Derui Zhu, Dingfan Chen, Jens Grossklags, Mario Fritz
arXiv 2023. [Paper]
15 Feb 2023

Raising the Cost of Malicious AI-Powered Image Editing
Hadi Salman, Alaa Khaddaj, Guillaume Leclerc, Andrew Ilyas, Aleksander Madry
arXiv 2023. [Paper] [Github]
13 Feb 2023

Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
Chumeng Liang, Xiaoyu Wu, Yang Hua, Jiaru Zhang, Yiming Xue, Tao Song, Zhengui Xue, Ruhui Ma, Haibing Guan
arXiv 2023. [Paper]
9 Feb 2023

Better Diffusion Models Further Improve Adversarial Training
Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan
arXiv 2023. [Paper] [Github]
9 Feb 2023

Membership Inference Attacks against Diffusion Models
Tomoya Matsumoto, Takayuki Miura, Naoto Yanai
arXiv 2023. [Paper]
7 Feb 2023

MorDIFF: Recognition Vulnerability and Attack Detectability of Face Morphing Attacks Created by Diffusion Autoencoders
Naser Damer, Meiling Fang, Patrick Siebke, Jan Niklas Kolf, Marco Huber, Fadi Boutros
IWBF 2023. [Paper] [Github]
3 Feb 2023

Are Diffusion Models Vulnerable to Membership Inference Attacks?
Jinhao Duan, Fei Kong, Shiqi Wang, Xiaoshuang Shi, Kaidi Xu
arXiv 2023. [Paper]
2 Feb 2023

Salient Conditional Diffusion for Defending Against Backdoor Attacks
Brandon B. May, N. Joseph Tatro, Piyush Kumar, Nathan Shnidman
ICLR Workshop 2023. [Paper]
31 Jan 2023

Extracting Training Data from Diffusion Models
Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace
arXiv 2023. [Paper]
30 Jan 2023

Membership Inference of Diffusion Models
Hailong Hu, Jun Pang
arXiv 2023. [Paper]
24 Jan 2023

Denoising Diffusion Probabilistic Models as a Defense against Adversarial Attacks
Lars Lien Ankile, Anna Midgley, Sebastian Weisshaar
arXiv 2023. [Paper] [Github]
17 Jan 2023

Fight Fire With Fire: Reversing Skin Adversarial Examples by Multiscale Diffusive and Denoising Aggregation Mechanism
Yongwei Wang, Yuan Li, Zhiqi Shen
arXiv 2022. [Paper]
22 Aug 2022

DensePure: Understanding Diffusion Models towards Adversarial Robustness
Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song
NeurIPS 2022. [Paper]
1 Nov 2022

Improving Adversarial Robustness by Contrastive Guided Diffusion Process
Yidong Ouyang, Liyan Xie, Guang Cheng
arXiv 2022. [Paper]
18 Oct 2022

Differentially Private Diffusion Models
Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis
arXiv 2022. [Paper] [Project]
18 Oct 2022

PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition
Jiachen Sun, Weili Nie, Zhiding Yu, Z. Morley Mao, Chaowei Xiao
arXiv 2022. [Paper]
21 Aug 2022

Threat Model-Agnostic Adversarial Defense using Diffusion Models
Tsachi Blau, Roy Ganz, Bahjat Kawar, Alex Bronstein, Michael Elad
arXiv 2022. [Paper] [Github]
17 Jul 2022

Back to the Source: Diffusion-Driven Test-Time Adaptation
Jin Gao, Jialing Zhang, Xihui Liu, Trevor Darrell, Evan Shelhamer, Dequan Wang
arXiv 2022. [Paper] [Github]
7 Jul 2022

Guided Diffusion Model for Adversarial Purification from Random Noise
Quanlin Wu, Hang Ye, Yuntian Gu
arXiv 2022. [Paper]
17 Jun 2022

(Certified!!) Adversarial Robustness for Free!
Nicholas Carlini, Florian Tramèr, Krishnamurthy (Dj) Dvijotham, J. Zico Kolter
ICLR 2023. [Paper]
21 Jun 2022

Guided Diffusion Model for Adversarial Purification
Jinyi Wang, Zhaoyang Lyu, Dahua Lin, Bo Dai, Hongfei Fu
ICML 2022. [Paper] [Github]
30 May 2022

Diffusion Models for Adversarial Purification
Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar
ICML 2022. [Paper] [Project] [Github]
16 May 2022

TFDPM: Attack detection for cyber-physical systems with diffusion probabilistic models
Tijin Yan, Tong Zhou, Yufeng Zhan, Yuanqing Xia
Elsevier Knowledge-Based Systems 2021. [Paper]
20 Dec 2021

Adversarial purification with Score-based generative models
Jongmin Yoon, Sung Ju Hwang, Juho Lee
ICML 2021. [Paper] [Github]
11 Jun 2021

Miscellany

Robustness and Generalizability of Deepfake Detection: A Study with Diffusion Models
Haixu Song, Shiyu Huang, Yinpeng Dong, Wei-Wei Tu
arXiv 2023. [Paper] [Github]
5 Sep 2023

ControlMat: A Controlled Generative Approach to Material Capture
Giuseppe Vecchio, Rosalie Martin, Arthur Roullier, Adrien Kaiser, Romain Rouffet, Valentin Deschaintre, Tamy Boubekeur
arXiv 2023. [Paper]
4 Sep 2023

Softmax Bias Correction for Quantized Generative Models
Nilesh Prasad Pandey, Marios Fournarakis, Chirag Patel, Markus Nagel
arXiv 2023. [Paper]
4 Sep 2023

DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion
Cédric Rommel, Eduardo Valle, Mickaël Chen, Souhaiel Khalfaoui, Renaud Marlet, Matthieu Cord, Patrick Pérez
arXiv 2023. [Paper]
4 Sep 2023

Diffusion Model with Clustering-based Conditioning for Food Image Generation
Yue Han, Jiangpeng He, Mridul Gupta, Edward J. Delp, Fengqing Zhu
MADiMa 2023. [Paper]
1 Sep 2023

Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps
Miguel Espinosa, Elliot J. Crowley
arXiv 2023. [Paper] [Github]
31 Aug 2023

Diffusion Models for Interferometric Satellite Aperture Radar
Alexandre Tuel, Thomas Kerdreux, Claudia Hulbert, Bertrand Rouet-Leduc
arXiv 2023. [Paper]
31 Aug 2023

MFR-Net: Multi-faceted Responsive Listening Head Generation via Denoising Diffusion Model
Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han
ACM MM 2023. [Paper]
31 Aug 2023

SignDiff: Learning Diffusion Models for American Sign Language Production
Sen Fang, Chunyu Sui, Xuedong Zhang, Yapeng Tian
arXiv 2023. [Paper]
30 Aug 2023

DiffuVolume: Diffusion Model for Volume based Stereo Matching
Dian Zheng, Xiao-Ming Wu, Zuhao Liu, Jingke Meng, Wei-shi Zheng
arXiv 2023. [Paper]
30 Aug 2023

Feature Attention Network (FA-Net): A Deep-Learning Based Approach for Underwater Single Image Enhancement
Muhammad Hamza, Ammar Hawbani, Sami Ul Rehman, Xingfu Wang, Liang Zhao
arXiv 2023. [Paper]
30 Aug 2023

Total Selfie: Generating Full-Body Selfies
Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz
arXiv 2023. [Paper] [Project]
28 Aug 2023

Unsupervised Domain Adaptation via Domain-Adaptive Diffusion
Duo Peng, Qiuhong Ke, Yinjie Lei, Jun Liu
arXiv 2023. [Paper]
26 Aug 2023

SDeMorph: Towards Better Facial De-morphing from Single Morph
Nitish Shukla
arXiv 2023. [Paper]
22 Aug 2023

Hey That's Mine Imperceptible Watermarks are Preserved in Diffusion Generated Outputs
Luke Ditria, Tom Drummond
arXiv 2023. [Paper]
22 Aug 2023

MatFuse: Controllable Material Generation with Diffusion Models
Giuseppe Vecchio, Renato Sortino, Simone Palazzo, Concetto Spampinato
arXiv 2023. [Paper]
22 Aug 2023

ControlCom: Controllable Image Composition using Diffusion Model
Bo Zhang, Yuxuan Duan, Jun Lan, Yan Hong, Huijia Zhu, Weiqiang Wang, Li Niu
arXiv 2023. [Paper]
19 Aug 2023

DUAW: Data-free Universal Adversarial Watermark against Stable Diffusion Customization
Xiaoyu Ye, Hao Huang, Jiaqi An, Yongtao Wang
arXiv 2023. [Paper]
19 Aug 2023

Diff-CAPTCHA: An Image-based CAPTCHA with Security Enhanced by Denoising Diffusion Model
Ran Jiang, Sanfeng Zhang, Linfeng Liu, Yanbing Peng
arXiv 2023. [Paper]
16 Aug 2023

U-Turn Diffusion
Hamidreza Behjoo, Michael Chertkov
arXiv 2023. [Paper]
14 Aug 2023

Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation
Philipp Vaeth, Alexander M. Fruehwald, Benjamin Paassen, Magda Gregorova
arXiv 2023. [Paper]
11 Aug 2023

DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images
Xuechao Zou, Kai Li, Junliang Xing, Yu Zhang, Shiying Wang, Lei Jin, Pin Tao
arXiv 2023. [Paper]
8 Aug 2023

Towards Personalized Prompt-Model Retrieval for Generative Recommendation
Yuanhe Guo, Haoming Liu, Hongyi Wen
arXiv 2023. [Paper] [Github]
4 Aug 2023

Training Data Protection with Compositional Diffusion Models
Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
arXiv 2023. [Paper]
2 Aug 2023

Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation
Guojin Zhong, Jin Yuan, Pan Wang, Kailun Yang, Weili Guan, Zhiyong Li
ACM MM 2023. [Paper]
2 Aug 2023

RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects
Sascha Kirch, Valeria Olyunina, Jan Ondřej, Rafael Pagés, Sergio Martin, Clara Pérez-Molina
arXiv 2023. [Paper]
29 Jul 2023

Not with my name! Inferring artists' names of input strings employed by Diffusion Models
Roberto Leotta, Oliver Giudice, Luca Guarnera, Sebastiano Battiato
arXiv 2023. [Paper] [Github]
25 Jul 2023

Data-free Black-box Attack based on Diffusion Model
Mingwen Shao, Lingzhuang Meng, Yuanjian Qiao, Lixu Zhang, Wangmeng Zuo
arXiv 2023. [Paper]
24 Jul 2023

Diffusion Models for Probabilistic Deconvolution of Galaxy Images
Zhiwei Xue, Yuhang Li, Yash Patel, Jeffrey Regier
arXiv 2023. [Paper] [Github]
20 Jul 2023

BSDM: Background Suppression Diffusion Model for Hyperspectral Anomaly Detection
Jitao Ma, Weiying Xie, Yunsong Li, Leyuan Fang
arXiv 2023. [Paper]
19 Jul 2023

Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model
Rongke Liu
arXiv 2023. [Paper]
17 Jul 2023

LafitE: Latent Diffusion Model with Feature Editing for Unsupervised Multi-class Anomaly Detection
Haonan Yin, Guanlong Jiao, Qianhui Wu, Borje F. Karlsson, Biqing Huang, Chin Yew Lin
arXiv 2023. [Paper]
16 Jul 2023

Improved Flood Insights: Diffusion-Based SAR to EO Image Translation
Minseok Seo, Youngtack Oh, Doyi Kim, Dongmin Kang, Yeji Choi
arXiv 2023. [Paper]
14 Jul 2023

Exposing the Fake: Effective Diffusion-Generated Images Detection
Ruipeng Ma, Jinhao Duan, Fei Kong, Xiaoshuang Shi, Kaidi Xu
ICML 2023. [Paper]
12 Jul 2023

On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models
Marija Ivanovska, Vitomir Štruc
arXiv 2023. [Paper]
11 Jul 2023

Unsupervised 3D out-of-distribution detection with latent diffusion models
Mark S. Graham, Walter Hugo Lopez Pinaya, Paul Wright, Petru-Daniel Tudosiu, Yee H. Mah, James T. Teo, H. Rolf Jäger, David Werring, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso
arXiv 2023. [Paper] [Github]
7 Jul 2023

Hyperspectral and Multispectral Image Fusion Using the Conditional Denoising Diffusion Probabilistic Model
Shuaikai Shi, Lijun Zhang, Jie Chen
arXiv 2023. [Paper]
7 Jul 2023

Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback
TaeHo Yoon, Kibeom Myoung, Keon Lee, Jaewoong Cho, Albert No, Ernest K. Ryu
arXiv 2023. [Paper] [Github]
6 Jul 2023

Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality
Peter Lorenz, Ricard Durall, Janis Keuper
arXiv 2023. [Paper]
5 Jul 2023

Diffusion Models for Computational Design at the Example of Floor Plans
Joern Ploennigs, Markus Berger
arXiv 2023. [Paper]
5 Jul 2023

RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation
Renato Sortino, Thomas Cecconello, Andrea DeMarco, Giuseppe Fiameni, Andrea Pilzer, Andrew M. Hopkins, Daniel Magro, Simone Riggi, Eva Sciacca, Adriano Ingallinera, Cristobal Bordiu, Filomena Bufano, Concetto Spampinato
arXiv 2023. [Paper]
5 Jul 2023

TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models
Marija Ivanovska, Vitomir Štruc, Janez Perš
arXiv 2023. [Paper]
3 Jul 2023

Squeezing Large-Scale Diffusion Models for Mobile
Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Dongwon Jo, Hyesung Jeon, Jae-Joon Kim, Hyungjun Kim
arXiv 2023. [Paper]
3 Jul 2023

Class-Incremental Learning using Diffusion Model for Distillation and Replay
Quentin Jodelet, Xin Liu, Yin Jun Phua, Tsuyoshi Murata
arXiv 2023. [Paper]
30 Jun 2023

ID-Pose: Sparse-view Camera Pose Estimation by Inverting Diffusion Models
Weihao Cheng, Yan-Pei Cao, Ying Shan
arXiv 2023. [Paper]
29 Jun 2023

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation
Zhongwei Qiu, Qiansheng Yang, Jian Wang, Xiyu Wang, Chang Xu, Dongmei Fu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
arXiv 2023. [Paper]
29 Jun 2023

DiffusionSTR: Diffusion Model for Scene Text Recognition
Masato Fujitake
arXiv 2023. [Paper]
29 Jun 2023

Filtered-Guided Diffusion: Fast Filter Guidance for Black-Box Diffusion Models
Zeqi Gu, Abe Davis
arXiv 2023. [Paper] [Github]
29 Jun 2023

Face Morphing Attack Detection with Denoising Diffusion Probabilistic Models
Marija Ivanovska, Vitomir Štruc
arXiv 2023. [Paper]
27 Jun 2023

PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment
Jianyuan Wang, Christian Rupprecht, David Novotny
arXiv 2023. [Paper]
27 Jun 2023

Fuzzy-Conditioned Diffusion and Diffusion Projection Attention Applied to Facial Image Correction
Majed El Helou
arXiv 2023. [Paper]
26 Jun 2023

Towards More Realistic Membership Inference Attacks on Large Diffusion Models
Jan Dubiński, Antoni Kowalczuk, Stanisław Pawlak, Przemysław Rokita, Tomasz Trzciński, Paweł Morawiecki
arXiv 2023. [Paper]
22 Jun 2023

DiffWA: Diffusion Models for Watermark Attack
Xinyu Li
arXiv 2023. [Paper]
22 Jun 2023

Improving visual image reconstruction from human brain activity using latent diffusion models via multiple decoded inputs
Yu Takagi, Shinji Nishimoto
arXiv 2023. [Paper]
20 Jun 2023

Diffusion model based data generation for partial differential equations
Rucha Apte, Sheel Nidhan, Rishikesh Ranade, Jay Pathak
arXiv 2023. [Paper]
19 Jun 2023

GenPose: Generative Category-level Object Pose Estimation via Diffusion Models
Jiyao Zhang, Mingdong Wu, Hao Dong
arXiv 2023. [Paper]
18 Jun 2023

Drag-guided diffusion models for vehicle image generation
Nikos Arechiga, Frank Permenter, Binyang Song, Chenyang Yuan
arXiv 2023. [Paper]
16 Jun 2023

R2-Diff: Denoising by diffusion as a refinement of retrieved motion for image-based motion prediction
Takeru Oba, Norimichi Ukita
arXiv 2023. [Paper]
15 Jun 2023

On the Robustness of Latent Diffusion Models
Jianping Zhang, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weibin Wu, Michael R. Lyu
arXiv 2023. [Paper]
14 Jun 2023

VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models
Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho
arXiv 2023. [Paper]
12 Jun 2023

Boosting GUI Prototyping with Diffusion Models
Jialiang Wei, Anne-Lise Courbis, Thomas Lambolais, Binbin Xu, Pierre Louis Bernard, Gérard Dray
arXiv 2023. [Paper]
9 Jun 2023

Extraction and Recovery of Spatio-Temporal Structure in Latent Dynamics Alignment with Diffusion Model
Yule Wang, Zijing Wu, Chengrui Li, Anqi Wu
arXiv 2023. [Paper] [Github]
9 Jun 2023

Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model
Yida Chen, Fernanda Viégas, Martin Wattenberg
arXiv 2023. [Paper]
9 Jun 2023

PriSampler: Mitigating Property Inference of Diffusion Models
Hailong Hu, Jun Pang
arXiv 2023. [Paper]
8 Jun 2023

Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models
George Stein, Jesse C. Cresswell, Rasa Hosseinzadeh, Yi Sui, Brendan Leigh Ross, Valentin Villecroze, Zhaoyan Liu, Anthony L. Caterini, J. Eric T. Taylor, Gabriel Loaiza-Ganem
arXiv 2023. [Paper] [Github]
7 Jun 2023

Phoenix: A Federated Generative Diffusion Model
Fiona Victoria Stanley Jothiraj, Afra Mashhadi
arXiv 2023. [Paper]
7 Jun 2023

High-dimensional and Permutation Invariant Anomaly Detection
Vinicius Mikuni, Benjamin Nachman
arXiv 2023. [Paper]
6 Jun 2023

Change Diffusion: Change Detection Map Generation Based on Difference-Feature Guided DDPM
Yihan Wen, Jialu Sui, Xianping Ma, Wendi Liang, Xiaokang Zhang, Man-On Pun
arXiv 2023. [Paper]
6 Jun 2023

Towards Visual Foundational Models of Physical Scenes
Chethan Parameshwara, Alessandro Achille, Matthew Trager, Xiaolong Li, Jiawei Mo, Ashwin Swaminathan, CJ Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto
arXiv 2023. [Paper]
6 Jun 2023

Protecting the Intellectual Property of Diffusion Models by the Watermark Diffusion Process
Sen Peng, Yufei Chen, Cong Wang, Xiaohua Jia
arXiv 2023. [Paper]
6 Jun 2023

Emergent Correspondence from Image Diffusion
Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, Bharath Hariharan
arXiv 2023. [Paper] [Project]
6 Jun 2023

Enhance Diffusion to Improve Robust Generalization
Jianhui Sun, Sanchit Sinha, Aidong Zhang
arXiv 2023. [Paper]
5 Jun 2023

Training Data Attribution for Diffusion Models
Zheng Dai, David K Gifford
arXiv 2023. [Paper]
3 Jun 2023

Deep Classifier Mimicry without Data Access
Steven Braun, Martin Mundt, Kristian Kersting
arXiv 2023. [Paper]
3 Jun 2023

Quantifying Sample Anonymity in Score-Based Generative Models with Adversarial Fingerprinting
Mischa Dombrowski, Bernhard Kainz
arXiv 2023. [Paper]
2 Jun 2023

Generative Autoencoders as Watermark Attackers: Analyses of Vulnerabilities and Threats
Xuandong Zhao, Kexun Zhang, Yu-Xiang Wang, Lei Li
arXiv 2023. [Paper]
2 Jun 2023

PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models
Jiacheng Chen, Ruizhi Deng, Yasutaka Furukawa
arXiv 2023. [Paper]
2 Jun 2023

Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation
Zhengyue Zhao, Jinhao Duan, Xing Hu, Kaidi Xu, Chenan Wang, Rui Zhang, Zidong Du, Qi Guo, Yunji Chen
arXiv 2023. [Paper]
2 Jun 2023

Robust Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers
Ruotong Wang, Hongrui Chen, Zihao Zhu, Li Liu, Yong Zhang, Yanbo Fan, Baoyuan Wu
arXiv 2023. [Paper]
1 Jun 2023

Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
Yuxin Wen, John Kirchenbauer, Jonas Geiping, Tom Goldstein
arXiv 2023. [Paper] [Github]
31 May 2023

GANDiffFace: Controllable Generation of Synthetic Datasets for Face Recognition with Realistic Variations
Pietro Melzi, Christian Rathgeb, Ruben Tolosana, Ruben Vera-Rodriguez, Dominik Lawatsch, Florian Domin, Maxim Schaubert
arXiv 2023. [Paper]
31 May 2023

Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model
Haisong Ding, Bozhi Luan, Dongnan Gui, Kai Chen, Qiang Huo
arXiv 2023. [Paper]
31 May 2023

Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels
Jian Chen, Ruiyi Zhang, Tong Yu, Rohan Sharma, Zhiqiang Xu, Tong Sun, Changyou Chen
arXiv 2023. [Paper] [Github]
31 May 2023

DiffMatch: Diffusion Model for Dense Matching
Jisu Nam, Gyuseong Lee, Sunwoo Kim, Hyeonsu Kim, Hyoungwon Cho, Seyeon Kim, Seungryong Kim
arXiv 2023. [Paper] [Project]
30 May 2023

Calliffusion: Chinese Calligraphy Generation and Style Transfer with Diffusion Modeling
Qisheng Liao, Gus Xia, Zhinuo Wang
arXiv 2023. [Paper]
30 May 2023

Diffusion-Stego: Training-free Diffusion Generative Steganography via Message Projection
Daegyu Kim, Chaehun Shin, Jooyoung Choi, Dahuin Jung, Sungroh Yoon
arXiv 2023. [Paper]
30 May 2023

On Diffusion Modeling for Anomaly Detection
Victor Livernoche, Vineet Jain, Yashar Hezaveh, Siamak Ravanbakhsh
arXiv 2023. [Paper]
29 May 2023

Aligning Optimization Trajectories with Diffusion Models for Constrained Design Generation
Giorgio Giannone, Akash Srivastava, Ole Winther, Faez Ahmed
arXiv 2023. [Paper]
29 May 2023

Generating Driving Scenes with Diffusion
Ethan Pronovost, Kai Wang, Nick Roy
arXiv 2023. [Paper]
29 May 2023

Generative Diffusion for 3D Turbulent Flows
Marten Lienen, Jan Hansen-Palmus, David Lüdke, Stephan Günnemann
arXiv 2023. [Paper]
29 May 2023

GlyphControl: Glyph Conditional Control for Visual Text Generation
Yukang Yang, Dongnan Gui, Yuhui Yuan, Haisong Ding, Han Hu, Kai Chen
arXiv 2023. [Paper] [Github]
29 May 2023

High-Fidelity Image Compression with Score-based Generative Models
Emiel Hoogeboom, Eirikur Agustsson, Fabian Mentzer, Luca Versari, George Toderici, Lucas Theis
arXiv 2023. [Paper]
26 May 2023

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models
Zhongxi Chen, Ke Sun, Xianming Lin, Rongrong Ji
arXiv 2023. [Paper]
29 May 2023

DiffusionNAG: Task-guided Neural Architecture Generation with Diffusion Models
Sohyun An, Hayeon Lee, Jaehyeong Jo, Seanie Lee, Sung Ju Hwang
arXiv 2023. [Paper]
26 May 2023

CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography
Jiwen Yu, Xuanyu Zhang, Youmin Xu, Jian Zhang
arXiv 2023. [Paper] [Github]
26 May 2023

DiffusionShield: A Watermark for Copyright Protection against Generative Diffusion Models
Yingqian Cui, Jie Ren, Han Xu, Pengfei He, Hui Liu, Lichao Sun, Jiliang Tang
arXiv 2023. [Paper]
25 May 2023

Realistic Noise Synthesis with Diffusion Models
Qi Wu, Mingyan Han, Ting Jiang, Haoqiang Fan, Bing Zeng, Shuaicheng Liu
arXiv 2023. [Paper]
23 May 2023

Anomaly Detection with Conditioned Denoising Diffusion Models
Arian Mousakhan, Thomas Brox, Jawad Tayyub
arXiv 2023. [Paper]
25 May 2023

Anomaly Detection in Satellite Videos using Diffusion Models
Akash Awasthi, Son Ly, Jaer Nizam, Samira Zare, Videet Mehta, Safwan Ahmed, Keshav Shah, Ramakrishna Nemani, Saurabh Prasad, Hien Van Nguyen
arXiv 2023. [Paper]
25 May 2023

Knowledge Diffusion for Distillation
Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Chang Xu
arXiv 2023. [Paper] [Github]
25 May 2023

Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition
Dongnan Gui, Kai Chen, Haisong Ding, Qiang Huo
arXiv 2023. [Paper]
25 May 2023

Unsupervised Semantic Correspondence Using Stable Diffusion
Eric Hedlin, Gopal Sharma, Shweta Mahajan, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, Kwang Moo Yi
arXiv 2023. [Paper]
24 May 2023

A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence
Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, Deqing Sun, Ming-Hsuan Yang
arXiv 2023. [Paper] [Project]
24 May 2023

Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence
Grace Luo, Lisa Dunlap, Dong Huk Park, Aleksander Holynski, Trevor Darrell
arXiv 2023. [Paper] [Project]
23 May 2023

DiffProtect: Generate Adversarial Examples with Diffusion Models for Facial Privacy Protection
Jiang Liu, Chun Pong Lau, Rama Chellappa
arXiv 2023. [Paper]
23 May 2023

GSURE-Based Diffusion Model Training with Corrupted Data
Bahjat Kawar, Noam Elata, Tomer Michaeli, Michael Elad
arXiv 2023. [Paper] [Github]
22 May 2023

Watermarking Diffusion Model
Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang
arXiv 2023. [Paper]
21 May 2023

DiffUCD: Unsupervised Hyperspectral Image Change Detection with Semantic Correlation Diffusion Model
Xiangrong Zhang, Shunli Tian, Guanchun Wang, Huiyu Zhou, Licheng Jiao
arXiv 2023. [Paper]
21 May 2023

Incomplete Multi-view Clustering via Diffusion Completion
Sifan Fang
arXiv 2023. [Paper]
19 May 2023

SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models
Ziyi Wu, Jingyu Hu, Wuyue Lu, Igor Gilitschenski, Animesh Garg
arXiv 2023. [Paper]
18 May 2023

Selective Guidance: Are All the Denoising Steps of Guided Diffusion Important?
Pareesa Ameneh Golnari, Zhewei Yao, Yuxiong He
arXiv 2023. [Paper]
16 May 2023

A Method for Training-free Person Image Picture Generation
Tianyu Chen
ICOAI 2023. [Paper]
16 May 2023

Constructing a personalized AI assistant for shear wall layout using Stable Diffusion
Lufeng Wang, Jiepeng Liu, Guozhong Cheng, En Liu, Wei Chen
arXiv 2023. [Paper]
18 May 2023

DiffUTE: Universal Text Editing Diffusion Model
Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang
arXiv 2023. [Paper] [Github]
18 May 2023

CDDM: Channel Denoising Diffusion Models for Wireless Communications
Tong Wu, Zhiyong Chen, Dazhi He, Liang Qian, Yin Xu, Meixia Tao, Wenjun Zhang
arXiv 2023. [Paper]
16 May 2023

Unlearnable Examples Give a False Sense of Security: Piercing through Unexploitable Data with Learnable Examples
Wan Jiang, Yunfeng Diao, He Wang, Jianxin Sun, Meng Wang, Richang Hong
arXiv 2023. [Paper]
16 May 2023

Diffusion Dataset Generation: Towards Closing the Sim2Real Gap for Pedestrian Detection
Andrew Farley, Mohsen Zand, Michael Greenspan
CRV 2023. [Paper]
16 May 2023

A Reproducible Extraction of Training Images from Diffusion Models
Ryan Webster
arXiv 2023. [Paper] [Github]
15 May 2023

Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models
Antoni Bigata Casademunt, Rodrigo Mira, Nikita Drobyshev, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
arXiv 2023. [Paper]
15 May 2023

Manipulating Visually-aware Federated Recommender Systems and Its Countermeasures
Wei Yuan, Shilong Yuan, Chaoqun Yang, Quoc Viet Hung Nguyen, Hongzhi Yin
arXiv 2023. [Paper]
14 May 2023

Undercover Deepfakes: Detecting Fake Segments in Videos
Sanjay Saha, Rashindrie Perera, Sachith Seneviratne, Tamasha Malepathirana, Sanka Rasnayaka, Deshani Geethika, Terence Sim, Saman Halgamuge
arXiv 2023. [Paper] [Github]
11 May 2023

Comprehensive Dataset of Synthetic and Manipulated Overhead Imagery for Development and Evaluation of Forensic Tools
Brandon B. May, Kirill Trapeznikov, Shengbang Fang, Matthew C. Stamm
arXiv 2023. [Paper]
9 May 2023

DifFIQA: Face Image Quality Assessment Using Denoising Diffusion Probabilistic Models
Žiga Babnik, Peter Peer, Vitomir Štruc
arXiv 2023. [Paper]
9 May 2023

Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shi Pu, Yuejian Fang, Hang Su
arXiv 2023. [Paper]
7 May 2023

Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model
Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue
arXiv 2023. [Paper]
6 May 2023

Towards Prompt-robust Face Privacy Protection via Adversarial Decoupling Augmentation Framework
Ruijia Wu, Yuhang Wang, Huafeng Shi, Zhipeng Yu, Yichao Wu, Ding Liang
arXiv 2023. [Paper]
6 May 2023

Conditional Diffusion Feature Refinement for Continuous Sign Language Recognition
Leming Guo, Wanli Xue, Qing Guo, Yuxi Zhou, Tiantian Yuan, Shengyong Chen
arXiv 2023. [Paper]
5 May 2023

LayoutDM: Transformer-based Diffusion Model for Layout Generation
Shang Chai, Liansheng Zhuang, Fengying Yan
CVPR 2023. [Paper]
4 May 2023

Long-Term Rhythmic Video Soundtracker
Jiashuo Yu, Yaohui Wang, Xinyuan Chen, Xiao Sun, Yu Qiao
ICML 2023. [Paper] [Github]
2 May 2023

Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
Sumith Kulal, Tim Brooks, Alex Aiken, Jiajun Wu, Jimei Yang, Jingwan Lu, Alexei A. Efros, Krishna Kumar Singh
CVPR 2023. [Paper] [Project] [Github]
27 Apr 2023

Single-View Height Estimation with Conditional Diffusion Probabilistic Models
Isaac Corley, Peyman Najafirad
arXiv 2023. [Paper]
26 Apr 2023

Diffusion Probabilistic Model Based Accurate and High-Degree-of-Freedom Metasurface Inverse Design
Zezhou Zhang, Chuanchuan Yang, Yifeng Qin, Hao Feng, Jiqiang Feng, Hongbin Li
arXiv 2023. [Paper]
25 Apr 2023

Improving Synthetically Generated Image Detection in Cross-Concept Settings
Pantelis Dogoulis, Giorgos Kordopatis-Zilos, Ioannis Kompatsiaris, Symeon Papadopoulos
arXiv 2023. [Paper]
24 Apr 2023

Fast Diffusion Probabilistic Model Sampling through the lens of Backward Error Analysis
Yansong Gao, Zhihong Pan, Xin Zhou, Le Kang, Pratik Chaudhari
arXiv 2023. [Paper]
22 Apr 2023

Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations
Yu-Hui Chen, Raman Sarokin, Juhyun Lee, Jiuqiang Tang, Chuo-Ling Chang, Andrei Kulik, Matthias Grundmann
CVPR 2023 Workshop. [Paper]
21 Apr 2023

A data augmentation perspective on diffusion models and retrieval
Max F. Burg, Florian Wenzel, Dominik Zietlow, Max Horn, Osama Makansi, Francesco Locatello, Chris Russell
arXiv 2023. [Paper]
20 Apr 2023

Diffusion models with location-scale noise
Alexia Jolicoeur-Martineau, Kilian Fatras, Ke Li, Tal Kachman
arXiv 2023. [Paper]
12 Apr 2023

Exploring Diffusion Models for Unsupervised Video Anomaly Detection
Anil Osman Tur, Nicola Dall'Asen, Cigdem Beyan, Elisa Ricci
arXiv 2023. [Paper]
12 Apr 2023

CamDiff: Camouflage Image Augmentation via Diffusion Model
Xue-Jing Luo, Shuo Wang, Zongwei Wu, Christos Sakaridis, Yun Cheng, Deng-Ping Fan, Luc Van Gool
arXiv 2023. [Paper] [Github]
11 Apr 2023

DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion
ZiHan Cao, ShiQi Cao, Xiao Wu, JunMing Hou, Ran Ran, Liang-Jian Deng
arXiv 2023. [Paper]
10 Apr 2023

CCLAP: Controllable Chinese Landscape Painting Generation via Latent Diffusion Model
Zhongqi Wang, Jie Zhang, Zhilong Ji, Jinfeng Bai, Shiguang Shan
arXiv 2023. [Paper]
9 Apr 2023

ChiroDiff: Modelling chirographic data with Diffusion Models
Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
ICLR 2023. [Paper] [Project]
7 Apr 2023

RoSteALS: Robust Steganography using Autoencoder Latent Space
Tu Bui, Shruti Agarwal, Ning Yu, John Collomosse
CVPR Workshop 2023. [Paper] [Github]
6 Apr 2023

JPEG Compressed Images Can Bypass Protections Against AI Editing
Pedro Sandoval-Segura, Jonas Geiping, Tom Goldstein
arXiv 2023. [Paper]
5 Apr 2023

Learning to Read Braille: Bridging the Tactile Reality Gap with Diffusion Models
Carolina Higuera, Byron Boots, Mustafa Mukadam
arXiv 2023. [Paper]
3 Apr 2023

Textile Pattern Generation Using Diffusion Models
Halil Faruk Karagoz, Gulcin Baykal, Irem Arikan Eksi, Gozde Unal
ITFC 2023. [Paper]
2 Apr 2023

Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso, Davide Morelli, Marcella Cornia, Lorenzo Baraldi, Alberto Del Bimbo, Rita Cucchiara
ACM 2023. [Paper]
2 Apr 2023

NeuroDAVIS: A neural network model for data visualization
Chayan Maitra, Dibyendu B. Seal, Rajat K. De
arXiv 2023. [Paper]
1 Apr 2023

Diffusion Action Segmentation
Daochang Liu, Qiyue Li, AnhDung Dinh, Tingting Jiang, Mubarak Shah, Chang Xu
arXiv 2023. [Paper]
31 Mar 2023

One-shot Unsupervised Domain Adaptation with Personalized Diffusion Models
Yasser Benigmim, Subhankar Roy, Slim Essid, Vicky Kalogeiton, Stéphane Lathuilière
arXiv 2023. [Paper]
31 Mar 2023

DDP: Diffusion Model for Dense Visual Prediction
Yuanfeng Ji, Zhe Chen, Enze Xie, Lanqing Hong, Xihui Liu, Zhaoqiang Liu, Tong Lu, Zhenguo Li, Ping Luo
arXiv 2023. [Paper]
30 Mar 2023

WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models
Konstantina Nikolaidou, George Retsinas, Vincent Christlein, Mathias Seuret, Giorgos Sfikas, Elisa Barney Smith, Hamam Mokayed, Marcus Liwicki
arXiv 2023. [Paper]
29 Mar 2023

Visual Chain-of-Thought Diffusion Models
William Harvey, Frank Wood
arXiv 2023. [Paper]
28 Mar 2023

DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang
arXiv 2023. [Paper]
27 Mar 2023

The Stable Signature: Rooting Watermarks in Latent Diffusion Models
Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon
arXiv 2023. [Paper] [Project]
27 Mar 2023

Exploring Continual Learning of Diffusion Models
Michał Zając, Kamil Deja, Anna Kuzina, Jakub M. Tomczak, Tomasz Trzciński, Florian Shkurti, Piotr Miłoś
arXiv 2023. [Paper]
27 Mar 2023

Freestyle Layout-to-Image Synthesis
Han Xue, Zhiwu Huang, Qianru Sun, Li Song, Wenjun Zhang
CVPR 2023. [Paper]
25 Mar 2023

Controllable Inversion of Black-Box Face-Recognition Models via Diffusion
Manuel Kansy, Anton Raël, Graziana Mignone, Jacek Naruniec, Christopher Schroers, Markus Gross, Romann M. Weber
arXiv 2023. [Paper]
23 Mar 2023

End-to-End Diffusion Latent Optimization Improves Classifier Guidance
Bram Wallace, Akash Gokul, Stefano Ermon, Nikhil Naik
arXiv 2023. [Paper]
23 Mar 2023

DiffPattern: Layout Pattern Generation via Discrete Diffusion
Zixiao Wang, Yunheng Shen, Wenqian Zhao, Yang Bai, Guojin Chen, Farzan Farnia, Bei Yu
DAC 2023. [Paper]
23 Mar 2023

Diffuse-Denoise-Count: Accurate Crowd-Counting with Diffusion Models
Yasiru Ranasinghe, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel
arXiv 2023. [Paper]
22 Mar 2023

LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models
Junyi Zhang, Jiaqi Guo, Shizhao Sun, Jian-Guang Lou, Dongmei Zhang
arXiv 2023. [Paper]
21 Mar 2023

Positional Diffusion: Ordering Unordered Sets with Diffusion Probabilistic Models
Francesco Giuliari, Gianluca Scarpellini, Stuart James, Yiming Wang, Alessio Del Bue
arXiv 2023. [Paper] [Project]
20 Mar 2023

Leapfrog Diffusion Model for Stochastic Trajectory Prediction
Weibo Mao, Chenxin Xu, Qi Zhu, Siheng Chen, Yanfeng Wang
CVPR 2023. [Paper] [Github]
20 Mar 2023

Pluralistic Aging Diffusion Autoencoder
Peipei Li, Rui Wang, Huaibo Huang, Ran He, Zhaofeng He
arXiv 2023. [Paper]
20 Mar 2023

AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion Models
Yu Cao, Xiangqiao Meng, P.Y. Mok, Xueting Liu, Tong-Yee Lee, Ping Li
arXiv 2023. [Paper]
20 Mar 2023

Diffusion-based Document Layout Generation
Liu He, Yijuan Lu, John Corring, Dinei Florencio, Cha Zhang
arXiv 2023. [Paper]
19 Mar 2023

On the De-duplication of LAION-2B
Ryan Webster, Julien Rabin, Loic Simon, Frederic Jurie
arXiv 2023. [Paper]
17 Mar 2023

A Recipe for Watermarking Diffusion Models
Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung, Min Lin
arXiv 2023. [Paper] [Github]
17 Mar 2023

DIRE for Diffusion-Generated Image Detection
Zhendong Wang, Jianmin Bao, Wengang Zhou, Weilun Wang, Hezhen Hu, Hong Chen, Houqiang Li
arXiv 2023. [Paper] [Github]
16 Mar 2023

DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion
Maham Tanveer, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang
arXiv 2023. [Paper] [Project]
16 Mar 2023

DiffusionAD: Denoising Diffusion for Anomaly Detection
Hui Zhang, Zheng Wang, Zuxuan Wu, Yu-Gang Jiang
arXiv 2023. [Paper]
15 Mar 2023

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
Naoto Inoue, Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota Yamaguchi
CVPR 2023. [Paper] [Project] [Github]
14 Mar 2023

Parallel Vertex Diffusion for Unified Visual Grounding
Zesen Cheng, Kehan Li, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen
arXiv 2023. [Paper]
13 Mar 2023

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
Zixiang Zhao, Haowen Bai, Yuanzhi Zhu, Jiangshe Zhang, Shuang Xu, Yulun Zhang, Kai Zhang, Deyu Meng, Radu Timofte, Luc Van Gool
arXiv 2023. [Paper]
13 Mar 2023

Detecting Images Generated by Diffusers
Davide Alessandro Coccomini, Andrea Esuli, Fabrizio Falchi, Claudio Gennaro, Giuseppe Amato
arXiv 2023. [Paper]
9 Mar 2023

Unifying Layout Generation with a Decoupled Diffusion Model
Mude Hui, Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yuwang Wang, Yan Lu
CVPR 2023. [Paper]
9 Mar 2023

DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer
Elad Levi, Eli Brosh, Mykola Mykhailych, Meir Perez
ICCV 2023. [Paper]
7 Mar 2023

Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition
Cindy M. Nguyen, Eric R. Chan, Alexander W. Bergman, Gordon Wetzstein
arXiv 2023. [Paper] [Project]
7 Mar 2023

Word-As-Image for Semantic Typography
Shir Iluz, Yael Vinker, Amir Hertz, Daniel Berio, Daniel Cohen-Or, Ariel Shamir
SIGGRAPH 2023. [Paper] [Project]
3 Mar 2023

Makeup Extraction of 3D Representation via Illumination-Aware Image Decomposition
Xingchao Yang, Takafumi Taketomi, Yoshihiro Kanamori
Eurographics 2023. [Paper]
26 Feb 2023

Monocular Depth Estimation using Diffusion Models
Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet
arXiv 2023. [Paper] [Github]
28 Feb 2023

Spatial-temporal Transformer-guided Diffusion based Data Augmentation for Efficient Skeleton-based Action Recognition
Yifan Jiang, Han Chen, Hanseok Ko
arXiv 2023. [Paper]
26 Feb 2023

LDFA: Latent Diffusion Face Anonymization for Self-driving Applications
Marvin Klemp, Kevin Rösch, Royden Wagner, Jannik Quehl, Martin Lauer
arXiv 2023. [Paper]
17 Feb 2023

Road Redesign Technique Achieving Enhanced Road Safety by Inpainting with a Diffusion Model
Sumit Mishra, Medhavi Mishra, Taeyoung Kim, Dongsoo Har
arXiv 2023. [Paper]
15 Feb 2023

Effective Data Augmentation With Diffusion Models
Brandon Trabucco, Kyle Doherty, Max Gurinas, Ruslan Salakhutdinov
arXiv 2023. [Paper] [Project]
7 Feb 2023

Learning End-to-End Channel Coding with Diffusion Models
Muah Kim, Rick Fritschek, Rafael F. Schaefer
WSA/SCC 2023. [Paper]
3 Feb 2023

Extracting Training Data from Diffusion Models
Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace
arXiv 2023. [Paper]
2 Feb 2023

Diffusion Models for High-Resolution Solar Forecasts
Yusuke Hatanaka, Yannik Glaser, Geoff Galgon, Giuseppe Torri, Peter Sadowski
arXiv 2023. [Paper]
1 Feb 2023

A Denoising Diffusion Model for Fluid Field Prediction
Gefan Yang, Stefan Sommer
arXiv 2023. [Paper]
27 Jan 2023

Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
Victor Boutin, Thomas Fel, Lakshya Singhal, Rishav Mukherji, Akash Nagaraj, Julien Colin, Thomas Serre
arXiv 2023. [Paper]
27 Jan 2023

PLay: Parametrically Conditioned Layout Generation using Latent Diffusion
Chin-Yi Cheng, Forrest Huang, Gang Li, Yang Li
arXiv 2023. [Paper]
27 Jan 2023

LEGO-Net: Learning Regular Rearrangements of Objects in Rooms
Qiuhong Anna Wei, Sijie Ding, Jeong Joon Park, Rahul Sajnani, Adrien Poulenard, Srinath Sridhar, Leonidas Guibas
CVPR 2023. [Paper] [Project]
23 Jan 2023

Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models
Jun Yue, Leyuan Fang, Shaobo Xia, Yue Deng, Jiayi Ma
arXiv 2023. [Paper]
19 Jan 2023

Neural Image Compression with a Diffusion-Based Decoder
Noor Fathima Goose, Jens Petersen, Auke Wiggers, Tianlin Xu, Guillaume Sautière
arXiv 2023. [Paper]
13 Jan 2023

Diffusion Models For Stronger Face Morphing Attacks
Zander Blasingame, Chen Liu
arXiv 2023. [Paper]
10 Jan 2023

AI Art in Architecture
Joern Ploennigs, Markus Berger
arXiv 2022. [Paper]
19 Dec 2022

Diffusing Surrogate Dreams of Video Scenes to Predict Video Memorability
Lorin Sweeney, Graham Healy, Alan F. Smeaton
MediaEval Workshop 2022. [Paper]
19 Dec 2022

Diff-Font: Diffusion Model for Robust One-Shot Font Generation
Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, Dacheng Tao, Yu Qiao
arXiv 2022. [Paper]
12 Dec 2022

How to Backdoor Diffusion Models?
Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho
CVPR 2023. [Paper]
11 Dec 2022

Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models
Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
CVPR 2023. [Paper] [Github]
7 Dec 2022

ObjectStitch: Generative Object Compositing
Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga
arXiv 2022. [Paper]
2 Dec 2022

Post-training Quantization on Diffusion Models
Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan
arXiv 2022. [Paper] [Github]
28 Nov 2022

Diffusion Probabilistic Model Made Slim
Xingyi Yang, Daquan Zhou, Jiashi Feng, Xinchao Wang
CVPR 2023. [Paper]
27 Nov 2022

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
German Barquero, Sergio Escalera, Cristina Palmero
arXiv 2022. [Paper] [Project] [Github]
25 Nov 2022

JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models
Sepidehsadat Hosseini, Mohammad Amin Shabani, Saghar Irandoust, Yasutaka Furukawa
arXiv 2022. [Paper] [Project]
24 Nov 2022

HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising
Mohammad Amin Shabani, Sepidehsadat Hosseini, Yasutaka Furukawa
arXiv 2022. [Paper] [Project]
23 Nov 2022

Can denoising diffusion probabilistic models generate realistic astrophysical fields?
Nayantara Mudur, Douglas P. Finkbeiner
NeurIPS Workshop 2022. [Paper]
22 Nov 2022

DiffDreamer: Consistent Single-view Perpetual View Generation with Conditional Diffusion Models
Shengqu Cai, Eric Ryan Chan, Songyou Peng, Mohamad Shahbazi, Anton Obukhov, Luc Van Gool, Gordon Wetzstein
arXiv 2022. [Paper] [Project]
22 Nov 2022

CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming
Qihua Zhou, Ruibin Li, Song Guo, Yi Liu, Jingcai Guo, Zhenda Xu
arXiv 2022. [Paper]
15 Nov 2022

Extreme Generative Image Compression by Learning Text Embedding from Diffusion Models
Zhihong Pan, Xin Zhou, Hao Tian
arXiv 2022. [Paper]
14 Nov 2022

Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
Andreas Stöckl
arXiv 2022. [Paper]
3 Nov 2022

On the detection of synthetic images generated by diffusion models
Riccardo Corvi, Davide Cozzolino, Giada Zingarini, Giovanni Poggi, Koki Nagano, Luisa Verdoliva
arXiv 2022. [Paper] [Github]
1 Nov 2022

DOLPH: Diffusion Models for Phase Retrieval
Shirin Shoushtari, Jiaming Liu, Ulugbek S. Kamilov
arXiv 2022. [Paper]
1 Nov 2022

Towards the Detection of Diffusion Model Deepfakes
Jonas Ricker, Simon Damm, Thorsten Holz, Asja Fischer
arXiv 2022. [Paper]
26 Oct 2022

Deep Data Augmentation for Weed Recognition Enhancement: A Diffusion Probabilistic Model and Transfer Learning Based Approach
Dong Chen, Xinda Qi, Yu Zheng, Yuzhen Lu, Zhaojian Li
arXiv 2022. [Paper] [Github]
18 Oct 2022

DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Diffusion Models
Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang
arXiv 2022. [Paper]
13 Oct 2022

Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng, Noriyuki Kojima, Alexander M. Rush
ICLR 2023. [Paper]
11 Oct 2022

What the DAAM: Interpreting Stable Diffusion Using Cross Attention
Raphael Tang, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Jimmy Lin, Ferhan Ture
arXiv 2022. [Paper] [Github]
10 Oct 2022

CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning
Shitong Xu
arXiv 2022. [Paper] [Github]
10 Oct 2022

Diffusion Models Beat GANs on Topology Optimization
François Mazé, Faez Ahmed
AAAI 2023. [Paper] [Project] [Github]
20 Aug 2022

Vector Quantized Diffusion Model with CodeUnet for Text-to-Sign Pose Sequences Generation
Pan Xie, Qipeng Zhang, Zexian Li, Hao Tang, Yao Du, Xiaohui Hu
arXiv 2022. [Paper]
19 Aug 2022

Deep Diffusion Models for Seismic Processing
Ricard Durall, Ammar Ghanim, Mario Fernandez, Norman Ettrich, Janis Keuper
arXiv 2022. [Paper]
21 Jul 2022

Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion
Tianpei Gu, Guangyi Chen, Junlong Li, Chunze Lin, Yongming Rao, Jie Zhou, Jiwen Lu
CVPR 2022. [Paper] [Github]
25 Mar 2022

Audio

Generation

Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang
AAAI 2024. [Paper] [Project]
23 Aug 2023

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Haohe Liu, Qiao Tian, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley
arXiv 2023. [Paper]
10 Aug 2023

DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training
Hyung-Seok Oh, Sang-Hoon Lee, Seong-Whan Lee
arXiv 2023. [Paper]
31 Jul 2023

Progressive distillation diffusion for raw music generation
Svetlana Pavlova
arXiv 2023. [Paper]
20 Jul 2023

UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data
Heeseung Kim, Sungwon Kim, Jiheum Yeom, Sungroh Yoon
arXiv 2023. [Paper]
28 Jun 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter
arXiv 2023. [Paper]
15 Jun 2023

HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models
Ji-Sang Hwang, Sang-Hoon Lee, Seong-Whan Lee
arXiv 2023. [Paper]
12 Jun 2023

Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion
Haogeng Liu, Tao Wang, Jie Cao, Ran He, Jianhua Tao
arXiv 2023. [Paper]
9 Jun 2023

EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Haobin Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
InterSpeech 2023. [Paper]
1 Jun 2023

Efficient Neural Music Generation
Max W. Y. Lam, Qiao Tian, Tang Li, Zongyu Yin, Siyuan Feng, Ming Tu, Yuliang Ji, Rui Xia, Mingbo Ma, Xuchen Song, Jitong Chen, Yuping Wang, Yuxuan Wang
arXiv 2023. [Paper] [Github]
25 May 2023

Generating symbolic music using diffusion models
Lilac Atassi
arXiv 2023. [Paper]
15 Mar 2023

DiffuseRoll: Multi-track multi-category music generation based on diffusion model
Hongfei Wang
arXiv 2023. [Paper]
14 Mar 2023

Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
Giorgio Mariani, Irene Tallini, Emilian Postolache, Michele Mancusi, Luca Cosmo, Emanuele Rodolà
arXiv 2023. [Paper] [Project]
4 Feb 2023

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo
CVPR 2023. [Paper] [Github]
19 Dec 2022

SDMuse: Stochastic Differential Music Editing and Generation via Hybrid Representation
Chen Zhang, Yi Ren, Kejun Zhang, Shuicheng Yan
arXiv 2022. [Paper] [Project]
1 Nov 2022

Full-band General Audio Synthesis with Score-based Diffusion
Santiago Pascual, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, Joan Serrà
arXiv 2022. [Paper]
26 Oct 2022

Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji
arXiv 2022. [Paper]
14 Oct 2022

Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN
Yin-Ping Cho, Yu Tsao, Hsin-Min Wang, Yi-Wen Liu
arXiv 2022. [Paper] [Project]
21 Sep 2022

DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu, Wen-Yi Hsiao, Fu-Rong Yang, Oscar Friedman, Warren Jackson, Scott Bruzenak, Yi-Wen Liu, Yi-Hsuan Yang
ISMIR 2022. [Paper] [Github]
9 Aug 2022

ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren
ACM Multimedia 2022. [Paper] [Project]
13 Jul 2022

CARD: Classification and Regression Diffusion Models
Xizewen Han, Huangjie Zheng, Mingyuan Zhou
NeurIPS 2022. [Paper] [Github]
15 Jun 2022

Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu, Grigorios G Chrysos, Volkan Cevher
ICML workshop 2022. [Paper]
14 Jun 2022

Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel
ISMIR 2022. [Paper]
11 Jun 2022

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis
Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu
NeurIPS 2022. [Paper] [Github]
30 May 2022

FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao
IJCAI 2022. [Paper] [Project] [Github]
21 Apr 2022

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi, Heiga Zen, Kohei Yatabe, Nanxin Chen, Michiel Bacchiani
Interspeech 2022. [Paper]
31 Mar 2022

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu
ICLR 2022. [Paper] [Github]
25 Mar 2022

ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation
Shoule Wu, Ziqiang Shi
CoRR 2022. [Paper] [Project]
29 Jan 2022

Itô-Taylor Sampling Scheme for Denoising Diffusion Probabilistic Models using Ideal Derivatives
Hideyuki Tachibana, Mocho Go, Muneyoshi Inahara, Yotaro Katayama, Yotaro Watanabe
arXiv 2021. [Paper]
26 Dec 2021

Denoising Diffusion Gamma Models
Eliya Nachmani, Robin San Roman, Lior Wolf
arXiv 2021. [Paper]
10 Oct 2021

Variational Diffusion Models
Diederik P. Kingma, Tim Salimans, Ben Poole, Jonathan Ho
NeurIPS 2021. [Paper] [Github]
1 Jul 2021

CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis
Simon Rouard, Gaëtan Hadjeres
ISMIR 2021. [Paper] [Project]
14 Jun 2021

PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior
Sang-gil Lee, Heeseung Kim, Chaehun Shin, Xu Tan, Chang Liu, Qi Meng, Tao Qin, Wei Chen, Sungroh Yoon, Tie-Yan Liu
ICLR 2022. [Paper] [Project]
11 Jun 2021

ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation
Shoule Wu, Ziqiang Shi
arXiv 2022. [Paper] [Project]
17 May 2021

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Peng Liu, Zhou Zhao
AAAI 2022. [Paper] [Project] [Github]
6 May 2021

Symbolic Music Generation with Diffusion Models
Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon
ISMIR 2021. [Paper] [Github]
30 Mar 2021

DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro
ICLR 2021. [Paper] [Github]
21 Sep 2020

WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan
ICLR 2021. [Paper] [Project] [Github]
2 Sep 2020

Conversion

Voice Conversion with Denoising Diffusion Probabilistic GAN Models
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
ADMA 2023. [Paper]
28 Aug 2023

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion
Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee
arXiv 2023. [Paper] [Project]
25 May 2023

Duplex Diffusion Models Improve Speech-to-Speech Translation
Xianchao Wu
ACL 2023. [Paper]
22 May 2023

DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion
Songxiang Liu, Yuewen Cao, Dan Su, Helen Meng
IEEE 2021. [Paper] [Github]
28 May 2021

Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme
Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov, Jiansheng Wei
ICLR 2022. [Paper] [Project]
28 Sep 2021

Enhancement

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement
Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou
arXiv 2023. [Paper] [Project]
3 Sep 2023

Noise-aware Speech Enhancement using Diffusion Probabilistic Model
Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng
arXiv 2023. [Paper]
16 Jul 2023

Self-supervised learning with diffusion-based multichannel speech enhancement for speaker verification under noisy conditions
Sandipana Dowerah, Ajinkya Kulkarni, Romain Serizel, Denis Jouvet
arXiv 2023. [Paper]
5 Jul 2023

Diffusion Posterior Sampling for Informed Single-Channel Dereverberation
Jean-Marie Lemercier, Simon Welker, Timo Gerkmann
arXiv 2023. [Paper]
21 Jun 2023

Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement
Zilu Guo, Jun Du, Chin-Hui Lee, Yu Gao, Wenbin Zhang
arXiv 2023. [Paper]
14 Jun 2023

UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model
Anastasiia Iashchenko, Pavel Andreev, Ivan Shchekotov, Nicholas Babaev, Dmitry Vetrov
Interspeech 2023. [Paper]
1 Jun 2023

SE-Bridge: Speech Enhancement with Consistent Brownian Bridge
Zhibin Qiu, Mengfan Fu, Fuchun Sun, Gulila Altenbek, Hao Huang
arXiv 2023. [Paper]
23 May 2023

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders
Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji
arXiv 2023. [Paper]
18 May 2023

Speech Signal Improvement Using Causal Generative Diffusion Models
Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann
ICASSP 2023. [Paper]
15 Mar 2023

Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement
Bunlong Lay, Simon Welker, Julius Richter, Timo Gerkmann
arXiv 2023. [Paper]
28 Feb 2023

Metric-oriented Speech Enhancement using Diffusion Probabilistic Model
Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng
arXiv 2023. [Paper]
23 Feb 2023

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann
ICASSP 2023. [Paper]
22 Dec 2022

Unsupervised vocal dereverberation with diffusion-based generative models
Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji
ICASSP 2023. [Paper]
8 Nov 2022

DiffPhase: Generative Diffusion-based STFT Phase Retrieval
Tal Peer, Simon Welker, Timo Gerkmann
ICASSP 2023. [Paper]
8 Nov 2022

Cold Diffusion for Speech Enhancement
Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux
ICASSP 2023. [Paper]
4 Nov 2022

Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration
Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann
Interspeech 2022. [Paper] [Project] [Github]
4 Nov 2022

SRTNet: Time Domain Speech Enhancement Via Stochastic Refinement
Zhibin Qiu, Mengfan Fu, Yinfeng Yu, LiLi Yin, Fuchun Sun, Hao Huang
ICASSP 2022. [Paper] [Github]
30 Oct 2022

A Versatile Diffusion-based Generative Refiner for Speech Enhancement
Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
ICASSP 2023. [Paper]
27 Oct 2022

Conditioning and Sampling in Variational Diffusion Models for Speech Super-resolution
Chin-Yun Yu, Sung-Lin Yeh, György Fazekas, Hao Tang
ICASSP 2023. [Paper] [Project] [Github]
27 Oct 2022

Solving Audio Inverse Problems with a Diffusion Model
Eloi Moliner, Jaakko Lehtinen, Vesa Välimäki
ICASSP 2023. [Paper]
27 Oct 2022

Speech Enhancement and Dereverberation with Diffusion-based Generative Models
Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann
arXiv 2022. [Paper] [Project] [Github]
11 Aug 2022

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates
Seungu Han, Junhyeok Lee
Interspeech 2022. [Paper] [Project]
17 Jun 2022

Universal Speech Enhancement with Score-based Diffusion
Joan Serrà, Santiago Pascual, Jordi Pons, R. Oguz Araz, Davide Scaini
arXiv 2022. [Paper]
7 Jun 2022

Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain
Simon Welker, Julius Richter, Timo Gerkmann
Interspeech 2022. [Paper] [Github]
31 Mar 2022

Conditional Diffusion Probabilistic Model for Speech Enhancement
Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao
IEEE 2022. [Paper] [Github]
10 Feb 2022

A Study on Speech Enhancement Based on Diffusion Probabilistic Model
Yen-Ju Lu, Yu Tsao, Shinji Watanabe
APSIPA 2021. [Paper]
25 Jul 2021

Restoring degraded speech via a modified diffusion model
Jianwei Zhang, Suren Jayasuriya, Visar Berisha
Interspeech 2021. [Paper]
22 Apr 2021

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
Junhyeok Lee, Seungu Han
Interspeech 2021. [Paper] [Project] [Github]
6 Apr 2021

Separation

Diffusion-based Signal Refiner for Speech Separation
Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji
arXiv 2023. [Paper]
10 May 2023

Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
Giorgio Mariani, Irene Tallini, Emilian Postolache, Michele Mancusi, Luca Cosmo, Emanuele Rodolà
arXiv 2023. [Paper] [Project]
4 Feb 2023

Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation
Shahar Lutati, Eliya Nachmani, Lior Wolf
arXiv 2023. [Paper]
25 Jan 2023

Diffusion-based Generative Speech Source Separation
Robin Scheibler, Youna Ji, Soo-Whan Chung, Jaeuk Byun, Soyeon Choe, Min-Seok Choi
ICASSP 2023. [Paper]
31 Oct 2022

Instrument Separation of Symbolic Music by Explicitly Guided Diffusion Model
Sangjun Han, Hyeongrae Ihm, DaeHan Ahn, Woohyung Lim
NeurIPS Workshop 2022. [Paper]
5 Sep 2022

Text-to-Speech

DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Lei Xie
TASLP 2023. [Paper]
2 Sep 2023

LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech
Jie Chen, Xingchen Song, Zhendong Peng, Binbin Zhang, Fuping Pan, Zhiyong Wu
ICASSP 2023. [Paper]
31 Aug 2023

Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models
Heyang Xue, Shuai Guo, Pengcheng Zhu, Mengxiao Bi
arXiv 2023. [Paper] [Project]
21 Aug 2023

JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Peike Li, Boyu Chen, Yao Yao, Yikai Wang, Allen Wang, Alex Wang
arXiv 2023. [Paper] [Project]
9 Aug 2023

MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies
Ke Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov
arXiv 2023. [Paper] [Project]
3 Aug 2023

Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS
*Myeongj
