# Understanding TensorFlow quantization API
###### tags: small_tpu
[TOC]
## GitHub repo
[WeiCheng14159/Small_TPU](https://github.com/WeiCheng14159/Small_TPU)
## TensorFlow Quantization Flow
```graphviz
digraph G {
graph [rankdir="TB"];
node [color=red fontsize=10 fontname="Verdana"];
"Build NN model \n e.g. tf.keras.models.Sequential()" ->
"Clone NN layers conditionally \n e.g. tf.keras.models.clone_model(model, clone_function=xx)" ->
"clone_function selects which layers (e.g. Dense layers) are annotated for quantization" ->
"quantize_apply applies the quantization config to the annotated model \n e.g. q_model = quantize_apply(annotated_model)" ->
"The quantized model must be recompiled \n e.g. q_model.compile(...)" ->
"A quantization config class overrides 6 member functions \n see tfmot.quantization.keras.QuantizeConfig" ->
"Each member function depends on quantizers \n see tfmot.quantization.keras.quantizers.Quantizer" -> "Each Quantizer should specify the following functions: \n 1. __init__ 2. build 3. __call__";
}
```
### TF Quantization Python APIs
#### tfmot.quantization.keras.QuantizeConfig
[Documentation](https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/quantization/keras/QuantizeConfig)
[Source Code](https://github.com/tensorflow/model-optimization/blob/da9cca770e6a1abb55f6e38f9a9d47cc731dd6a9/tensorflow_model_optimization/python/core/quantization/keras/quantize_config.py#L24-L202)
The six member functions of a quantize config are as follows:
* `get_weights_and_quantizers(self, layer)`
* `get_activations_and_quantizers(self, layer)`
* `set_quantize_weights(self, layer, quantize_weights)`
* `set_quantize_activations(self, layer, quantize_activations)`
* `get_output_quantizers(self, layer)`
* `get_config(self)`
All of these member functions depend on quantizers ([Doc](https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/quantization/keras/quantizers)), e.g. [AllValuesQuantizer](https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/quantization/keras/quantizers/AllValuesQuantizer), [FixedQuantizer](https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/quantization/keras/quantizers/FixedQuantizer), [LastValueQuantizer](https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/quantization/keras/quantizers/LastValueQuantizer), and [MovingAverageQuantizer](https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/quantization/keras/quantizers/MovingAverageQuantizer).
These quantizers in turn rely on TensorFlow fake-quantization ops such as [tf.quantization.fake_quant_with_min_max_args](https://www.tensorflow.org/api_docs/python/tf/quantization/fake_quant_with_min_max_args).
#### fake_quant_with_min_max_args
[Docs](https://www.tensorflow.org/api_docs/python/tf/quantization/fake_quant_with_min_max_args)
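As a rough illustration of what this op computes, the sketch below emulates it in NumPy: clamp the input to an adjusted `[min, max]` range, round onto the integer grid, and map back to floats. The function name `fake_quant_reference` is hypothetical, and the code works in float64 for readability, so it may differ from the real float32 op in the last few bits.

```python
import numpy as np


def fake_quant_reference(x, min_val=-6.0, max_val=6.0, num_bits=8,
                         narrow_range=False):
    """Hypothetical NumPy reference for fake quantization with fixed min/max."""
    quant_min = 1 if narrow_range else 0
    quant_max = (1 << num_bits) - 1
    scale = (max_val - min_val) / (quant_max - quant_min)

    # Nudge the zero point to an integer so that 0.0 is exactly representable.
    zero_point_from_min = quant_min - min_val / scale
    if zero_point_from_min < quant_min:
        nudged_zero_point = quant_min
    elif zero_point_from_min > quant_max:
        nudged_zero_point = quant_max
    else:
        nudged_zero_point = int(np.floor(zero_point_from_min + 0.5))
    nudged_min = (quant_min - nudged_zero_point) * scale
    nudged_max = (quant_max - nudged_zero_point) * scale

    x = np.asarray(x, dtype=np.float64)
    clamped = np.clip(x, nudged_min, nudged_max)
    # Round half up onto the integer grid, then dequantize back to floats.
    steps = np.floor((clamped - nudged_min) / scale + 0.5)
    return steps * scale + nudged_min


# With the range [0, 255] and 8 bits the grid step is exactly 1.0,
# so the inputs snap to the values 0, 0, 1 and 255.
print(fake_quant_reference([-5.0, 0.4, 0.6, 300.0], min_val=0.0, max_val=255.0))
```

The output stays in floating point, which is why the op is called "fake" quantization: the values are restricted to at most `2 ** num_bits` levels, but no integer tensor is produced.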
### TF Quantization C++ APIs
[Docs (Python wrapper for the same op)](https://www.tensorflow.org/api_docs/python/tf/quantization/fake_quant_with_min_max_args)
[Source Code](https://github.com/tensorflow/tensorflow/blob/ac74e1746a28b364230072d4dac5a45077326dc2/tensorflow/core/kernels/fake_quant_ops.cc#L63-L98)