Module quantization

Tensor quantization module.

Structs§

Calibration: Calibration method used to compute the quantization range mapping.
QuantAccPrecision: Quantization accumulator precision. This is the precision used when accumulating values while executing algorithms such as matmul.
QuantInputType: Data type used to represent quantized values.
QuantLevel: Level or granularity of quantization.
QuantMode: Strategy used to quantize values.
QuantPropagation: Specifies whether the output of an operation is quantized using the scheme of the input or returned unquantized.
QuantizationStrategy: Quantization strategy.

Traits§

QTensorPrimitive: Quantized tensor primitive.
Quantization: Quantization scheme to convert elements of a higher precision data type E to a lower precision data type Q and vice versa.
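
As an illustration of converting between a higher precision type E and a lower precision type Q, the following is a minimal standalone sketch of symmetric per-tensor quantization with E = f32 and Q = i8. It does not use this module's types or traits; the function names `symmetric_scale`, `quantize`, and `dequantize` are hypothetical, and the choice of the symmetric range [-127, 127] is an assumption made for the example.

```rust
/// Hypothetical helper: compute a scale that maps the largest
/// absolute input value onto 127 (symmetric range, an assumption).
fn symmetric_scale(values: &[f32]) -> f32 {
    let max_abs = values.iter().fold(0.0_f32, |acc, v| acc.max(v.abs()));
    // Fall back to 1.0 so an all-zero input does not divide by zero.
    if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 }
}

/// Hypothetical helper: map f32 values to i8 by dividing by the
/// scale, rounding to nearest, and clamping to [-127, 127].
fn quantize(values: &[f32], scale: f32) -> Vec<i8> {
    values
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect()
}

/// Hypothetical helper: map i8 values back to f32 by multiplying
/// by the same scale. The result approximates the original input.
fn dequantize(values: &[i8], scale: f32) -> Vec<f32> {
    values.iter().map(|&q| q as f32 * scale).collect()
}
```

In this sketch, dequantizing a quantized value recovers the original only up to a rounding error of at most half the scale, which is why the scale is usually derived from the observed value range.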

Functions§

pack_i8s_to_u32s: Pack signed 8-bit integer values into a sequence of unsigned 32-bit integers.
unpack_u32s_to_i8s: Unpack 32-bit unsigned integer values into a sequence of signed 8-bit integers.
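
To make the packing idea concrete, here is a minimal standalone sketch of storing four signed 8-bit values per 32-bit word and reading them back. It is not the implementation of the functions listed above: the names `pack_sketch` and `unpack_sketch`, their signatures, the big-endian byte order (first value in the most significant byte), and the zero-padding of a short final group are all assumptions made for the example.

```rust
/// Hypothetical sketch: pack groups of four i8 values into one u32.
/// Assumes the first value of each group goes in the most significant byte.
fn pack_sketch(values: &[i8]) -> Vec<u32> {
    values
        .chunks(4)
        .map(|chunk| {
            // A final chunk with fewer than four values is zero-padded.
            let mut bytes = [0u8; 4];
            for (slot, &v) in bytes.iter_mut().zip(chunk) {
                // Reinterpret the two's-complement bits as an unsigned byte.
                *slot = v as u8;
            }
            u32::from_be_bytes(bytes)
        })
        .collect()
}

/// Hypothetical sketch: unpack u32 words into i8 values.
/// `len` is the number of original values, used to drop padding bytes.
fn unpack_sketch(packed: &[u32], len: usize) -> Vec<i8> {
    packed
        .iter()
        .flat_map(|word| word.to_be_bytes())
        .map(|b| b as i8)
        .take(len)
        .collect()
}
```

Because padding bytes cannot be distinguished from real zero values, the sketch passes the original element count to the unpacking step so the round trip returns exactly the input.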