Lecture 7: Quantization
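In ML, quantization usually means converting FP32 values to INT8.
Usually, once training is finished, the trained parameters fall into a simple, shallow distribution, which is why a small integer range can cover them.
Two parameters define the mapping: the scale S and the zero-point Z.
How do I calculate the scale? Divide the width of the real-valued range by the largest integer code: S = (r_max - r_min) / (2^b - 1).
Example: for b = 8, 2^b - 1 = 255 (256 levels in total), so a range of width 0.6 gives S = 0.6/255, with Z = 207.
Linear refers to the mapping function; symmetric vs. asymmetric depends on the starting point, i.e. the zero-point (Z = 0 is symmetric, otherwise asymmetric).
Dequantization cannot repr..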
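A minimal sketch of the scale and zero-point calculation from the notes above, in plain Python. The range [-0.487, 0.113] is an assumption, chosen only because it has width 0.6 and reproduces S = 0.6/255 and Z = 207. The function names and the unsigned 0..255 target are also illustrative assumptions, not something fixed by the lecture.

```python
def quant_params(r_min, r_max, bits=8):
    """Scale S and zero-point Z for unsigned, asymmetric linear quantization."""
    q_max = 2 ** bits - 1                 # largest integer code, 255 for 8 bits
    scale = (r_max - r_min) / q_max       # real-valued width of one integer step
    zero_point = round(-r_min / scale)    # integer code that represents real 0.0
    return scale, zero_point


def quantize(r, scale, zero_point, bits=8):
    """Map a real value to an integer code, clamped to [0, 2^bits - 1]."""
    q = round(r / scale) + zero_point
    return max(0, min(2 ** bits - 1, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate real value."""
    return scale * (q - zero_point)


# Assumed example range (not from the lecture): width 0.6, mostly negative.
S, Z = quant_params(-0.487, 0.113)
print(S, Z)

q = quantize(0.05, S, Z)
print(q, dequantize(q, S, Z))  # close to 0.05, but generally not equal
```

With this setup the round trip is lossy: inside the range the error stays within half a step (S/2), and values outside the range are clamped to 0 or 255.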