ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
Zhang, Yuxin [2,3]; Dong, Weiming [2,3]; Tang, Fan [4]; Huang, Nisha [2,3]; Huang, Haibin [5]; Ma, Chongyang [5]; Lee, Tong-Yee [6]; Deussen, Oliver [1]; Xu, Changsheng [2,3]
Journal: ACM TRANSACTIONS ON GRAPHICS
2023-12-01
Volume: 42; Issue: 6; Pages: 14
Keywords: Image generation; Diffusion models; Attribute-aware editing; Model personalization
ISSN: 0730-0301
DOI: 10.1145/3618342
Corresponding author: Dong, Weiming (weiming.dong@ia.ac.cn)
Abstract: Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low to high frequency information, providing a new perspective on representing, generating, and editing images. We develop the Prompt Spectrum Space P*, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that P* and ProSpect offer better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Our source code is available at https://github.com/zyxElsa/ProSpect.
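Illustrative sketch (not the authors' implementation): the abstract describes each per-stage prompt as covering a group of consecutive denoising steps. The Python below shows one way such a step-to-stage mapping could be written. The equal-sized grouping, the function names `stage_of_step`, `conditioning_schedule`, and `swap_stages`, and the use of plain strings in place of token embeddings are all assumptions made here for illustration; none of them come from the paper or its repository.

```python
from typing import List, Sequence, Set, TypeVar

T = TypeVar("T")


def stage_of_step(step: int, num_steps: int, num_stages: int) -> int:
    """Map a denoising step index (0 = first, noisiest step) to a stage index.

    Assumption: stages are contiguous groups of near-equal size.
    """
    if not 0 <= step < num_steps:
        raise ValueError("step out of range")
    if not 1 <= num_stages <= num_steps:
        raise ValueError("num_stages must be between 1 and num_steps")
    # Integer arithmetic keeps consecutive steps in the same stage.
    return step * num_stages // num_steps


def conditioning_schedule(stage_embeddings: Sequence[T], num_steps: int) -> List[T]:
    """Return the conditioning to use at every denoising step."""
    num_stages = len(stage_embeddings)
    return [
        stage_embeddings[stage_of_step(t, num_steps, num_stages)]
        for t in range(num_steps)
    ]


def swap_stages(content: Sequence[T], reference: Sequence[T],
                stages_from_reference: Set[int]) -> List[T]:
    """Build a mixed per-stage list, taking the chosen stages from `reference`."""
    if len(content) != len(reference):
        raise ValueError("both images need the same number of stages")
    return [
        reference[i] if i in stages_from_reference else content[i]
        for i in range(len(content))
    ]
```

Under these assumptions, `conditioning_schedule(["a", "b"], 4)` assigns the first two steps to stage `"a"` and the last two to stage `"b"`, and `swap_stages` mimics an image-guided edit by replacing only selected stages with those inverted from a second image. Which stages correspond to which attribute (layout, style, material) is not stated in the abstract and would need to be taken from the paper itself.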
Funding projects: National Key R&D Program of China [2020AAA0106200]; National Natural Science Foundation of China [61832016]; National Natural Science Foundation of China [62102162]; National Natural Science Foundation of China [U20B2070]; Beijing Natural Science Foundation [L221013]; National Science and Technology Council [111-2221-E-006-112-MY3]; Deutsche Forschungsgemeinschaft (DFG) [413891298]
WOS research area: Computer Science
Language: English
Publisher: ASSOC COMPUTING MACHINERY
WOS accession number: WOS:001139790400072
Funding organizations: National Key R&D Program of China; National Natural Science Foundation of China; Beijing Natural Science Foundation; National Science and Technology Council; Deutsche Forschungsgemeinschaft (DFG)
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/55394
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Corresponding author: Dong, Weiming
Author affiliations:
1. Univ Konstanz, Constance, Germany
2. Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China
3. UCAS, Sch Artificial Intelligence, Beijing, Peoples R China
4. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
5. Kuaishou Technol, Beijing, Peoples R China
6. Natl Cheng Kung Univ, Tainan, Taiwan
Recommended citation:
GB/T 7714: Zhang, Yuxin, Dong, Weiming, Tang, Fan, et al. ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models[J]. ACM TRANSACTIONS ON GRAPHICS, 2023, 42(6): 14.
APA: Zhang, Yuxin, Dong, Weiming, Tang, Fan, Huang, Nisha, Huang, Haibin, ... & Xu, Changsheng. (2023). ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models. ACM TRANSACTIONS ON GRAPHICS, 42(6), 14.
MLA: Zhang, Yuxin, et al. "ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models." ACM TRANSACTIONS ON GRAPHICS 42.6 (2023): 14.