Improving Extreme Low-bit Quantization with Soft Threshold

	Xu WX(许伟翔)1,2; Wang PS(王培松)1,2; Cheng J(程健)1,2
Journal	IEEE Transactions on Circuits and Systems for Video Technology
	2022
Pages	1549-1563
Abstract	Deep neural networks that run at low precision during inference gain acceleration and compression advantages over their high-precision counterparts, but must overcome the accuracy degradation that grows as the bit-width decreases. This work focuses on sub-4-bit quantization, where the accuracy degradation is significant. We start with ternarization, a balance between efficiency and accuracy that quantizes both weights and activations into ternary values. We find that the hard threshold ∆ used in previous ternary networks to determine quantization intervals, together with the suboptimal solution of ∆, limits the performance of the ternary model. To alleviate this, we present Soft Threshold Ternary Networks (STTN), which enable the model to determine ternarized values automatically instead of depending on a hard threshold. Building on this, we generalize the idea of the soft threshold from ternarization to arbitrary bit-widths, yielding Soft Threshold Quantized Networks (STQN). We observe that previous quantization methods rely on the rounding-to-nearest function, which constrains the quantization solution space and leads to significant accuracy degradation, especially in low-bit (≤ 3-bit) quantization. Instead of relying on the traditional rounding-to-nearest function, STQN determines quantization intervals adaptively by itself. Accuracy experiments on image classification, object detection, and instance segmentation, as well as efficiency experiments on a field-programmable gate array (FPGA), demonstrate that the proposed framework achieves a favorable trade-off between accuracy and efficiency. Code is available at: https://github.com/WeixiangXu/STTN. Index Terms: convolutional neural network, network compression, low-bit quantization, ternary quantization.
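Content type	Journal article
Source URL	http://ir.ia.ac.cn/handle/173211/52073
Collection	Brain-inspired Chips and Systems Research
Author affiliations	1. University of Chinese Academy of Sciences 2. Institute of Automation, Chinese Academy of Sciences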
Recommended citation GB/T 7714	Xu WX, Wang PS, Cheng J. Improving Extreme Low-bit Quantization with Soft Threshold[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022: 1549-1563.
APA	Xu WX, Wang PS, & Cheng J. (2022). Improving Extreme Low-bit Quantization with Soft Threshold. IEEE Transactions on Circuits and Systems for Video Technology, 1549-1563.
MLA	Xu WX, et al. "Improving Extreme Low-bit Quantization with Soft Threshold". IEEE Transactions on Circuits and Systems for Video Technology (2022): 1549-1563.
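
Illustrative note on the abstract: the two baseline operations that the abstract criticizes, hard-threshold ternarization and rounding-to-nearest quantization, can be sketched as below. This is a minimal sketch of the general idea only, not the authors' STTN/STQN code. The function names, the 0.7 × mean(|w|) threshold heuristic (a common choice in earlier ternary-weight work), and the symmetric signed integer range are all assumptions that do not appear in this record.

```python
def ternarize_hard(weights, delta):
    """Map each weight to -1, 0, or +1 using a fixed (hard) threshold delta."""
    out = []
    for w in weights:
        if w > delta:
            out.append(1)
        elif w < -delta:
            out.append(-1)
        else:
            out.append(0)
    return out


def heuristic_delta(weights, factor=0.7):
    """Assumed heuristic: delta = factor * mean(|w|); not taken from the abstract."""
    return factor * sum(abs(w) for w in weights) / len(weights)


def quantize_round_to_nearest(x, step, bits):
    """Uniform signed quantization: the intervals are fixed by rounding."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    q = round(x / step)          # Python rounds exact ties to the even integer
    q = max(qmin, min(qmax, q))  # clamp to the representable integer range
    return q * step              # de-quantized value
```

In both functions the interval boundaries are fixed before training (by delta, or by the rounding grid). As the abstract describes it, the soft-threshold approach instead lets the model determine these intervals itself.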