Interpretability of Neural Networks Based on Game-theoretic Interactions | |
Huilin Zhou2; Jie Ren2; Huiqi Deng2; Xu Cheng2; Jinpeng Zhang1; Quanshi Zhang2 | |
Journal | Machine Intelligence Research |
2024 | |
Volume | 21; Issue: 4; Pages: 718-739 |
Keywords | Model interpretability and transparency; explainable AI; game theory; interaction; deep learning |
ISSN | 2731-538X |
DOI | 10.1007/s11633-023-1419-7 |
Abstract | This paper introduces the system of game-theoretic interactions, which connects both the explanation of knowledge encoded in a deep neural network (DNN) and the explanation of the representation power of a DNN. In this system, we define two game-theoretic interaction indexes, namely the multi-order interaction and the multivariate interaction. More crucially, we use these interaction indexes to explain feature representations encoded in a DNN from the following four aspects: 1) Quantifying knowledge concepts encoded by a DNN; 2) Exploring how a DNN encodes visual concepts, and extracting prototypical concepts encoded in the DNN; 3) Learning optimal baseline values for the Shapley value, and providing a unified perspective to compare fourteen different attribution methods; 4) Theoretically explaining the representation bottleneck of DNNs. Furthermore, we prove the relationship between the interaction encoded in a DNN and the representation power of a DNN (e.g., generalization power, adversarial transferability, and adversarial robustness). In this way, game-theoretic interactions successfully bridge the gap between “the explanation of knowledge concepts encoded in a DNN” and “the explanation of the representation capacity of a DNN” as a unified explanation. |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/58569] |
Collection | Institute of Automation_Academic Journals_International Journal of Automation and Computing |
Author affiliations | 1. XLAB, The Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China 2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China |
Recommended citation (GB/T 7714) | Huilin Zhou, Jie Ren, Huiqi Deng, et al. Interpretability of Neural Networks Based on Game-theoretic Interactions[J]. Machine Intelligence Research, 2024, 21(4): 718-739. |
APA | Huilin Zhou, Jie Ren, Huiqi Deng, Xu Cheng, Jinpeng Zhang, & Quanshi Zhang. (2024). Interpretability of Neural Networks Based on Game-theoretic Interactions. Machine Intelligence Research, 21(4), 718-739. |
MLA | Huilin Zhou, et al. "Interpretability of Neural Networks Based on Game-theoretic Interactions". Machine Intelligence Research 21.4 (2024): 718-739. |