Interpretability of Neural Networks Based on Game-theoretic Interactions

doi:10.1007/s11633-023-1419-7


Title	Interpretability of Neural Networks Based on Game-theoretic Interactions
Authors	Huilin Zhou 2; Jie Ren 2; Huiqi Deng 2; Xu Cheng 2; Jinpeng Zhang 1; Quanshi Zhang 2
Journal	Machine Intelligence Research
Year	2024
Volume	21; Issue: 4; Pages: 718-739
Keywords	Model interpretability and transparency; explainable AI; game theory; interaction; deep learning
ISSN	2731-538X
DOI	10.1007/s11633-023-1419-7
Abstract	This paper introduces the system of game-theoretic interactions, which connects the explanation of knowledge encoded in a deep neural network (DNN) with the explanation of the representation power of a DNN. In this system, we define two game-theoretic interaction indexes, namely the multi-order interaction and the multivariate interaction. More crucially, we use these interaction indexes to explain feature representations encoded in a DNN from the following four aspects: 1) Quantifying knowledge concepts encoded by a DNN; 2) Exploring how a DNN encodes visual concepts, and extracting prototypical concepts encoded in the DNN; 3) Learning optimal baseline values for the Shapley value, and providing a unified perspective to compare fourteen different attribution methods; 4) Theoretically explaining the representation bottleneck of DNNs. Furthermore, we prove the relationship between the interaction encoded in a DNN and the representation power of a DNN (e.g., generalization power, adversarial transferability, and adversarial robustness). In this way, game-theoretic interactions bridge the gap between "the explanation of knowledge concepts encoded in a DNN" and "the explanation of the representation capacity of a DNN", giving a unified explanation.
Content type	Journal article
Source URL	[http://ir.ia.ac.cn/handle/173211/58569]
Collection	Institute of Automation_Academic Journals_International Journal of Automation and Computing
Author affiliations	1. XLAB, The Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China; 2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Recommended citation (GB/T 7714)	Huilin Zhou, Jie Ren, Huiqi Deng, et al. Interpretability of Neural Networks Based on Game-theoretic Interactions[J]. Machine Intelligence Research, 2024, 21(4): 718-739.
APA	Huilin Zhou, Jie Ren, Huiqi Deng, Xu Cheng, Jinpeng Zhang, & Quanshi Zhang. (2024). Interpretability of Neural Networks Based on Game-theoretic Interactions. Machine Intelligence Research, 21(4), 718-739.
MLA	Huilin Zhou, et al. "Interpretability of Neural Networks Based on Game-theoretic Interactions". Machine Intelligence Research 21.4 (2024): 718-739.