
Quantization

Link: Quantization: PTQ and QAT on CNN using Keras

Quantization is a model compression technique that converts model weights (and often activations) from a high-precision floating-point representation to a lower-precision floating-point (FP) or integer (INT) representation, such as 16-bit or 8-bit, reducing model size and inference cost.
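The core of the conversion is mapping float values onto a small integer grid with a scale factor. The sketch below illustrates symmetric per-tensor int8 quantization in plain NumPy (a simplified stand-in for what Keras/TFLite does internally; the function names are illustrative, not a library API):

```python
import numpy as np

np.random.seed(0)

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: round(x / scale), clipped to [-127, 127]."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to float32 for comparison with the original tensor."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)   # pretend these are trained weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Each weight is off by at most half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Real toolchains additionally support asymmetric quantization (with a zero point) and per-channel scales, but the round/clip/rescale structure is the same.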


Post-Training Quantization (PTQ)

Post-training quantization (PTQ) converts a fully trained model to lower precision without any retraining; only the already-learned weights (and, optionally, calibrated activation ranges) are mapped to the low-precision representation.
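A minimal NumPy sketch of the idea: a "trained" dense layer has its weights quantized to int8 after training, and the quantized model's output stays close to the float model's output with no retraining (the layer and tolerances here are illustrative assumptions, not the Keras implementation):

```python
import numpy as np

np.random.seed(0)

# A "trained" dense layer: y = x @ W + b (float32 weights already learned).
W = np.random.randn(16, 8).astype(np.float32)
b = np.random.randn(8).astype(np.float32)

# PTQ step: pick a symmetric int8 scale from the weight range and
# replace W with its int8 version. No further training happens.
scale = np.abs(W).max() / 127.0
W_q = np.round(W / scale).astype(np.int8)

x = np.random.randn(4, 16).astype(np.float32)
y_float = x @ W + b
# At inference, integer weights are rescaled back to float (simulated here).
y_quant = x @ (W_q.astype(np.float32) * scale) + b

rel_err = np.abs(y_quant - y_float).max() / np.abs(y_float).max()
assert rel_err < 0.1  # output changes only slightly despite int8 weights
```

In the Keras/TFLite workflow this step is performed by the TFLite converter with a quantization optimization flag rather than by hand.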

Quantization-Aware Training (QAT)

Quantization-aware training (QAT) fine-tunes the model with quantization in mind: the quantization operations (scaling, clipping, and rounding) are simulated inside the training loop, so the model learns weights that retain their accuracy even after quantization.
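The mechanism can be sketched on a one-weight toy problem: the forward pass uses a "fake-quantized" weight, while the gradient is passed straight through the rounding step (the straight-through estimator). The fixed quantization step and learning rate below are assumptions for the toy example; real QAT (e.g. `tensorflow_model_optimization`) also learns the quantization ranges:

```python
import numpy as np

np.random.seed(0)

def fake_quant(w, scale):
    """Simulate int8 quantization in the forward pass: round, clip, rescale."""
    return np.clip(np.round(w / scale), -127, 127) * scale

# Toy task: learn y = 2.5 * x with a single weight, training *through* quantization.
x = np.random.randn(256).astype(np.float32)
y = 2.5 * x
w = 0.0
scale = 0.05   # fixed quantization step (assumed; real QAT calibrates/learns this)
lr = 0.1
for _ in range(100):
    w_q = fake_quant(w, scale)              # quantized weight in the forward pass
    grad = np.mean(2 * (w_q * x - y) * x)   # gradient of the squared error
    w -= lr * grad                          # straight-through estimator: d(w_q)/dw ~ 1

# The *quantized* weight lands on the target, so accuracy survives quantization.
assert abs(fake_quant(w, scale) - 2.5) <= scale
```

Because the loss is computed on the quantized weight, the optimizer settles in a spot where rounding costs essentially nothing, which is exactly what QAT achieves at scale.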


References:

  1. QAT PyTorch
  2. QAT Details
