Title | ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models |
Authors | Zhang, Yuxin [2,3]; Dong, Weiming [2,3]; Tang, Fan [4]; Huang, Nisha [2,3]; Huang, Haibin [5]; Ma, Chongyang [5]; Lee, Tong-Yee [6]; Deussen, Oliver [1]; Xu, Changsheng [2,3] |
Journal | ACM TRANSACTIONS ON GRAPHICS |
Publication date | 2023-12-01 |
Volume | 42; Issue: 6; Pages: 14 |
Keywords | Image generation; Diffusion models; Attribute-aware editing; Model personalization |
ISSN | 0730-0301 |
DOI | 10.1145/3618342 |
Corresponding author | Dong, Weiming (weiming.dong@ia.ac.cn) |
Abstract | Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low to high frequency information, providing a new perspective on representing, generating, and editing images. We develop the Prompt Spectrum Space P*, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that P* and ProSpect offer better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Our source code is available at https://github.com/zyxElsa/ProSpect. |
Funding projects | National Key R&D Program of China [2020AAA0106200]; National Natural Science Foundation of China [61832016]; National Natural Science Foundation of China [62102162]; National Natural Science Foundation of China [U20B2070]; Beijing Natural Science Foundation [L221013]; National Science and Technology Council [111-2221-E-006-112-MY3]; Deutsche Forschungsgemeinschaft (DFG) [413891298] |
WOS research area | Computer Science |
Language | English |
Publisher | ASSOC COMPUTING MACHINERY |
WOS accession number | WOS:001139790400072 |
Funding organizations | National Key R&D Program of China; National Natural Science Foundation of China; Beijing Natural Science Foundation; National Science and Technology Council; Deutsche Forschungsgemeinschaft (DFG) |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/55394] |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Author affiliations | 1. Univ Konstanz, Constance, Germany; 2. Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China; 3. UCAS, Sch Artificial Intelligence, Beijing, Peoples R China; 4. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China; 5. Kuaishou Technol, Beijing, Peoples R China; 6. Natl Cheng Kung Univ, Tainan, Taiwan |
Recommended citation (GB/T 7714) | Zhang, Yuxin, Dong, Weiming, Tang, Fan, et al. ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models[J]. ACM TRANSACTIONS ON GRAPHICS, 2023, 42(6): 14. |
APA | Zhang, Yuxin.,Dong, Weiming.,Tang, Fan.,Huang, Nisha.,Huang, Haibin.,...&Xu, Changsheng.(2023).ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models.ACM TRANSACTIONS ON GRAPHICS,42(6),14. |
MLA | Zhang, Yuxin,et al."ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models".ACM TRANSACTIONS ON GRAPHICS 42.6(2023):14. |
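
The abstract describes ProSpect as assigning one prompt to each generation stage, where a stage is a group of consecutive diffusion steps. A minimal Python sketch of that step-to-stage lookup follows. It assumes equal-sized stages and 0-based step indices, which the abstract does not specify; the function names, the stage count, and the placeholder prompt strings are hypothetical and are not taken from the authors' code.

```python
from typing import List


def stage_index(step: int, num_steps: int, num_stages: int) -> int:
    """Map a 0-based denoising step to the 0-based stage that contains it.

    Assumption: the steps are split into `num_stages` groups of consecutive
    steps that are as equal in size as possible.
    """
    if not 1 <= num_stages <= num_steps:
        raise ValueError("num_stages must be between 1 and num_steps")
    if not 0 <= step < num_steps:
        raise ValueError("step must be in [0, num_steps)")
    return step * num_stages // num_steps


def prompts_per_step(stage_prompts: List[str], num_steps: int) -> List[str]:
    """Expand one prompt per stage into one prompt per denoising step."""
    num_stages = len(stage_prompts)
    return [
        stage_prompts[stage_index(step, num_steps, num_stages)]
        for step in range(num_steps)
    ]


if __name__ == "__main__":
    # Hypothetical placeholders standing in for per-stage prompts.
    schedule = prompts_per_step(["<stage-0>", "<stage-1>"], num_steps=4)
    print(schedule)
```

In an actual pipeline the strings would be replaced by the inverted token embeddings, and the conditioning passed to the denoiser at each step would be looked up from the stage that step belongs to.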