Exploring Variational Auto-encoder Architectures, Configurations, and Datasets for Generative Music Explainable AI
Nick Bryan-Kinns¹
Journal: Machine Intelligence Research
Year: 2024
Volume: 21, Issue: 1, Pages: 29-45
Keywords: Variational auto-encoder, explainable AI (XAI), generative music, musical features, datasets
ISSN: 2731-538X
DOI: 10.1007/s11633-023-1457-1
Abstract: Generative AI models for music and the arts in general are increasingly complex and hard to understand. The field of explainable AI (XAI) seeks to make complex and opaque AI models such as neural networks more understandable to people. One approach to making generative AI models more understandable is to impose a small number of semantically meaningful attributes on generative AI models. This paper contributes a systematic examination of the impact that different combinations of variational auto-encoder models (measureVAE and adversarialVAE), configurations of latent space in the AI model (from 4 to 256 latent dimensions), and training datasets (Irish folk, Turkish folk, classical, and pop) have on music generation performance when 2 or 4 meaningful musical attributes are imposed on the generative model. To date, there have been no systematic comparisons of such models at this level of combinatorial detail. Our findings show that measureVAE has better reconstruction performance than adversarialVAE, which has better musical attribute independence. Results demonstrate that measureVAE was able to generate music across music genres with interpretable musical dimensions of control, and performs best with low-complexity music such as pop and rock. We recommend that a 32 or 64 latent dimensional space is optimal for 4 regularised dimensions when using measureVAE to generate music across genres. Our results are the first detailed comparisons of configurations of state-of-the-art generative AI models for music and can be used to help select and configure AI models, musical features, and datasets for more understandable generation of music.
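The combinatorial scope described in the abstract can be illustrated with a minimal sketch that enumerates candidate configurations. The model names, datasets, and attribute counts are taken from the abstract. The specific latent sizes are an assumption: the abstract gives only the range 4 to 256, and powers of two are used here purely for illustration. The function name `build_grid` and the dictionary keys are hypothetical, and the paper may not have evaluated the full Cartesian product.

```python
from itertools import product

# Values named in the abstract.
MODELS = ["measureVAE", "adversarialVAE"]
DATASETS = ["Irish folk", "Turkish folk", "classical", "pop"]
REGULARISED_ATTRIBUTES = [2, 4]

# Assumption: only the range 4 to 256 is stated; the powers-of-two
# steps below are illustrative, not taken from the paper.
LATENT_DIMS = [4, 8, 16, 32, 64, 128, 256]


def build_grid():
    """Return every (model, latent size, dataset, attribute count) combination."""
    return [
        {"model": m, "latent_dim": d, "dataset": ds, "n_attributes": a}
        for m, d, ds, a in product(
            MODELS, LATENT_DIMS, DATASETS, REGULARISED_ATTRIBUTES
        )
        # A regularised attribute needs its own latent dimension.
        if a <= d
    ]


if __name__ == "__main__":
    grid = build_grid()
    print(len(grid), "candidate configurations")
    print(grid[0])
```

Under these assumed latent sizes the grid has 2 × 7 × 4 × 2 = 112 entries, which gives a sense of why the abstract stresses the lack of earlier comparisons at this level of detail.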
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/54573
Collection: Institute of Automation_Academic Journals_International Journal of Automation and Computing
Author affiliations:
1. School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
2. Computer Science Department, Carleton College, Northfield MN 55057, USA
Recommended citation
GB/T 7714: Nick Bryan-Kinns. Exploring Variational Auto-encoder Architectures, Configurations, and Datasets for Generative Music Explainable AI[J]. Machine Intelligence Research, 2024, 21(1): 29-45.
APA: Nick Bryan-Kinns. (2024). Exploring Variational Auto-encoder Architectures, Configurations, and Datasets for Generative Music Explainable AI. Machine Intelligence Research, 21(1), 29-45.
MLA: Nick Bryan-Kinns. "Exploring Variational Auto-encoder Architectures, Configurations, and Datasets for Generative Music Explainable AI." Machine Intelligence Research 21.1 (2024): 29-45.