- ILMPQ: An Intra-Layer Multi-Precision Deep Neural Network Quantization framework for FPGA
- APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large La...
  Large Language Models (LLMs) have greatly advanced the natural language processing paradigm. However, the high computational load and huge model sizes pose a grand challenge for...
- TensorQuant
  TensorQuant toolbox is used to apply fixed point quantization to DNNs. The simulations are focused on popular CNN topologies, such as Inception V1, Inception V3, ResNet 50 and...
- LSQ+: Improving low-bit quantization through learnable offsets and better ini...
  The proposed method, called LSQ+, extends LSQ [7] by adding a simple yet effective learnable offset parameter for activation quantization to recover the lost accuracy on...
- IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Sho...
  Learning to synthesize data has emerged as a promising direction in zero-shot quantization (ZSQ), which represents neural networks by low-bit integer without accessing any of...