Improving Extreme Low-bit Quantization with Soft Threshold | |
Xu WX(许伟翔)1,2; Wang PS(王培松)1,2; Cheng J(程健)1,2 | |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
2022 | |
Pages | 1549 - 1563 |
Abstract | Deep neural networks that run at low precision at inference time gain acceleration and compression advantages over their high-precision counterparts, but must overcome accuracy degradation as the bit-width decreases. This work focuses on sub-4-bit quantization, where the accuracy degradation is significant. We start with ternarization, a balance between efficiency and accuracy that quantizes both weights and activations into ternary values. We find that the hard threshold ∆ introduced in previous ternary networks for determining quantization intervals, together with the suboptimal solution for ∆, limits the performance of the ternary model. To alleviate this, we present Soft Threshold Ternary Networks (STTN), which enable the model to determine ternarized values automatically instead of depending on a hard threshold. Building on this, we generalize the idea of the soft threshold from ternarization to arbitrary bit-widths, yielding Soft Threshold Quantized Networks (STQN). We observe that previous quantization methods rely on the rounding-to-nearest function, which constrains the quantization solution space and leads to significant accuracy degradation, especially in low-bit (≤ 3-bit) quantization. Instead of relying on the traditional rounding-to-nearest function, STQN determines quantization intervals adaptively by itself. Accuracy experiments on image classification, object detection and instance segmentation, as well as efficiency experiments on a field-programmable gate array (FPGA), demonstrate that the proposed framework achieves a prominent trade-off between accuracy and efficiency. Code is available at: https://github.com/WeixiangXu/STTN. |
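Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/52073] |
Collection | Brain-Inspired Chips and Systems Research |
Author affiliations | 1. University of Chinese Academy of Sciences 2. Institute of Automation, Chinese Academy of Sciences |
Recommended citation GB/T 7714 | Xu WX,Wang PS,Cheng J. Improving Extreme Low-bit Quantization with Soft Threshold[J]. IEEE Transactions on Circuits and Systems for Video Technology,2022:1549 - 1563. |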
APA | Xu WX,Wang PS,&Cheng J.(2022).Improving Extreme Low-bit Quantization with Soft Threshold.IEEE Transactions on Circuits and Systems for Video Technology,1549 - 1563. |
MLA | Xu WX,et al."Improving Extreme Low-bit Quantization with Soft Threshold".IEEE Transactions on Circuits and Systems for Video Technology (2022):1549 - 1563. |
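The abstract contrasts the proposed soft threshold with two earlier rules: a hard threshold ∆ for ternarization and the rounding-to-nearest function for general low-bit quantization. The sketch below illustrates only those two baseline rules, not STTN or STQN themselves, whose formulation is not given in this record. The function names, the 0.7 threshold ratio (a heuristic commonly used in earlier ternary weight networks), and the symmetric uniform grid are assumptions made for illustration.

```python
from typing import List, Tuple


def ternarize_hard(weights: List[float], ratio: float = 0.7) -> Tuple[List[int], float, float]:
    """Hard-threshold ternarization baseline (illustrative, not STTN)."""
    # Assumed heuristic: delta = ratio * mean(|w|). The ratio is hand-picked,
    # which is the kind of fixed threshold the abstract argues against.
    delta = ratio * sum(abs(w) for w in weights) / len(weights)
    # Weights inside [-delta, delta] become 0; the rest keep only their sign.
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    # Scale alpha: mean magnitude of the weights that were not zeroed.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return codes, delta, alpha


def quantize_round_to_nearest(values: List[float], bits: int, x_max: float) -> List[float]:
    """Symmetric uniform quantization with rounding-to-nearest (illustrative)."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    levels = 2 ** (bits - 1) - 1      # bits=2 gives the ternary grid {-1, 0, 1}
    step = x_max / levels
    out = []
    for v in values:
        clipped = max(-x_max, min(x_max, v))
        # Interval boundaries are fixed at the midpoints between grid points.
        # Note: Python's round() resolves exact ties to the even integer.
        out.append(round(clipped / step) * step)
    return out


if __name__ == "__main__":
    w = [0.9, -0.8, 0.05, -0.1, 0.4]
    print(ternarize_hard(w))
    print(quantize_round_to_nearest(w, bits=2, x_max=1.0))
```

In both functions the interval boundaries are set by a fixed rule (a ratio of the mean magnitude, or the midpoints of a uniform grid) rather than learned, so the same weight can land in different bins depending on which rule is chosen. With the sample weights above, 0.4 exceeds the hard threshold but falls below the 0.5 rounding midpoint.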