3rd Workshop on Generative Models
for Computer Vision
CVPR 2025
8:45am - 5:00pm, Wednesday, June 11th, 2025, Grand A2, Music City Center, Nashville, Tennessee
Overview
Recent advances in generative modeling leveraging generative adversarial networks, autoregressive models, neural fields and diffusion models have enabled the synthesis of near-photorealistic images, drastically increasing the visibility and popularity of generative modeling across the computer vision research community. However, these impressive advances in generative modeling have not yet found wide adoption in computer vision for visual recognition tasks. In this workshop, we aim to bring together researchers from the fields of image synthesis and computer vision to facilitate discussions and progress at the intersection of those two subfields. We investigate the question: "How can visual recognition benefit from the advances in generative image modeling?" We invite a diverse set of experts to discuss their recent research results and future directions for generative modeling and computer vision, with a particular focus on the intersection between image synthesis and visual recognition. We hope this workshop will lay the foundation for future development of generative models for computer vision tasks.

Invited Speakers
Schedule
11th of June, 2025 | |
---|---|
8:45 | Opening |
9:00 | Invited Talk: Rana Hanocka |
9:40 | Invited Talk: Yingnian Wu |
10:20 | Coffee Break |
10:40 | Invited Talk: Björn Ommer |
11:20 | Invited Talk: Yiyi Liao |
12:00 | Lunch |
13:00 | Posters |
14:00 | Invited Talk: Alan Yuille |
14:40 | Invited Talk: Jiatao Gu |
15:20 | Coffee Break |
15:40 | Invited Talk: Kaiming He |
16:20 | Invited Talk: Zhuowen Tu |
17:00 | Closing |
Covered Topics
We invite submissions of short papers (4-page abstracts). Accepted submissions will be presented as posters at the workshop, and selected papers will be presented as spotlights. Submissions to this workshop are non-archival, allowing for the inclusion of ongoing, unpublished work or dual submission. The short papers will not be included in the proceedings of CVPR. References may be included on pages beyond the 4-page limit.
Potential topics include but are not limited to:
- Advances in generative image models
- Inversion of generative image models
- Training computer vision with realistic synthetic images
- Benchmarking computer vision with generative models
- Analysis-by-synthesis / render-and-compare approaches for visual recognition
- Self-supervised learning with generative models
- Adversarial attacks and defenses with generative models
- Out-of-distribution generalization and detection with generative models
- Ethical considerations in generative modeling, dataset and model biases
Submission site: OpenReview
Author kit: CVPR Author Kit
Important Dates
Event | Date (Anywhere on Earth) |
---|---|
Workshop paper submission deadline | April 25, 2025 |
Decisions | April 30, 2025 |
Accepted Papers
- Diffusion Classifiers Understand Compositionality, but Conditions Apply [Paper]
  Yujin Jeong, Arnas Uselis, Seong Joon Oh, Anna Rohrbach
- Objaverse++: Curated 3D Object Dataset with Quality Annotations [Paper]
  Chendi Lin, Heshan Liu, Qunshu Lin, Zachary Bright, Shitao Tang, Yihui He, Minghao Liu, Ling Zhu, Cindy Le
- DICE: Discrete Inversion Enabling Controllable Editing for Masked Generative Models [Paper]
  Xiaoxiao He, Ligong Han, Quan Dao, Song Wen, Minhao Bai, Di Liu, Han Zhang, Felix Juefei-Xu, Chaowei Tan, Bo Liu, Martin Renqiang Min, Kang Li, Faez Ahmed, Akash Srivastava, Hongdong Li, Junzhou Huang, Dimitris N. Metaxas
- Where Do Erased Concepts Go in Diffusion Models? [Paper]
  Kevin Lu, Nicky Kriplani, Rohit Gandikota, Minh Pham, David Bau, Chinmay Hegde, Niv Cohen
- MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes [Paper]
  Ruijie Lu, Yixin Chen, Junfeng Ni, Baoxiong Jia, Yu Liu, Diwen Wan, Gang Zeng, Siyuan Huang
- M3Face: A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing [Paper]
  Mohammadreza Mofayezi, Reza Alipour, Mohammad Ali Kakavand, Ehsaneddin Asgari
- How Useful is the Density Learned by GANs for Computer Vision? [Paper]
  Roy Friedman, Yair Weiss
- Emergence and Evolution of Interpretable Concepts in Diffusion Models Through the Lens of Sparse Autoencoders [Paper]
  Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi
- EscherNet++: Simultaneous Amodal Completion and Scalable View Synthesis [Paper]
  Xinan Zhang, Muhammad Zubair Irshad, Anthony Yezzi, Yi-Chang Tsai, Zsolt Kira
- An Image-to-Music Generation Framework Powered by An Algorithm-Driven Music Core [Paper]
  Callie C. Liao, Duoduo Liao, Ellie L. Zhang
- Particle-based 6D Object Pose Estimation from Point Clouds using Diffusion Models [Paper]
  Christian Möller, Niklas Funk, Jan Peters
- Learn Your Scales: Towards Scale-Consistent Generative Novel View Synthesis [Paper]
  Fereshteh Forghani, Jason J. Yu, Tristan Aumentado-Armstrong, Konstantinos G. Derpanis, Marcus A. Brubaker
- VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors [Paper]
  Juil Koo, Paul Guerrero, Chun-Hao P. Huang, Duygu Ceylan, Minhyuk Sung
- Generative Modeling of Weights: Generalization or Memorization? [Paper]
  Boya Zeng, Yida Yin, Zhiqiu Xu, Zhuang Liu
- Towards Efficient Vision Transformers for Perceptual Quality Assessment of Diffusion-Generated Images [Paper]
  Shivam Bhardwaj, Tushar Shinde
- GaussianVAE: Adaptive Learning Dynamics of 3D Gaussians for High-Fidelity Super-Resolution [Paper]
  Shuja Khalid, Mohamed Ibrahim, Yang Liu
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction [Paper]
  Chen Bao, Jiarui Xu, Xiaolong Wang, Abhinav Gupta, Homanga Bharadhwaj
- Flow-Optimizer: Revealing an Optimizable Flow Latent Space via One-Step Inversion for Controlled Interpolation and Editing [Paper]
  Yan Zheng, Yi Yang
- DDPM-BP: Bernoulli Priors as Efficient Denoising Guides for Diffusion Models [Paper]
  Magdalena Proszewska, Nikolay Malkin, N. Siddharth
- Guiding Diffusion with Deep Geometric Moments: Balancing Fidelity and Variation [Paper]
  Sangmin Jung, Utkarsh Nath, Yezhou Yang, Giulia Pedrielli, Joydeep Biswas, Amy Zhang, Hassan Ghasemzadeh, Pavan Turaga
- Scaled Momentum Guidance for Flow Models [Paper]
  Wooyeol Baek, Seongdo Kim, Jinseong Kim, Jongyoo Kim
- FreSca: Scaling in Frequency Space Enhances Diffusion Models [Paper]
  Chao Huang, Susan Liang, Yunlong Tang, Jing Bi, Li Ma, Yapeng Tian, Chenliang Xu
- Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models [Paper]
  Ketan Suhaas Saichandran, Xavier Thomas, Prakhar Kaushik, Deepti Ghadiyaram
- Panoptic Diffusion Models: Co-generation of Images and Segmentation Maps [Paper]
  Yinghan Long, Kaushik Roy
- Boosting Adversarial Transferability with a Generative Model Perspective [Paper]
  Jongoh Jeong, Hunmin Yang, Kuk-Jin Yoon
- ConceptMix++: Leveling the Playing Field in Text-to-Image Benchmarking via Iterative Prompt Optimization [Paper]
  Haosheng Gan, Berk Tinaz, Mohammad Shahab Sepehri, Zalan Fabian, Mahdi Soltanolkotabi
- Rectified CFG++ for Flow Based Models [Paper]
  Shreshth Saini, Shashank Gupta, Alan C. Bovik
- Generative Defect Synthesis for Enhancing Industrial Anomaly Detection [Paper]
  Avinash Kumar Sharma, Tushar Shinde
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [Paper]
  Zhenggang Tang, Peiye Zhuang, Chaoyang Wang, Aliaksandr Siarohin, Yash Kant, Alexander Schwing, Sergey Tulyakov, Hsin-Ying Lee
- Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis [Paper]
  Woojung Han, Yeonkyung Lee, Chanyoung Kim, Kwanghyun Park, Seong Jae Hwang
- Visual Acoustic Fields [Paper]
  Yuelei Li, Hyunjin Kim, Fangneng Zhan, Ri-Zhao Qiu, Mazeyu Ji, Xiaojun Shan, Xueyan Zou, Paul Liang, Hanspeter Pfister, Xiaolong Wang
Organizers