Module quantization

Tensor quantization module.

Structs§

CalibrationRange
The observed input calibration range.
QParams
The quantization tensor data parameters.
QuantScheme
Describes a quantization scheme/configuration.
QuantizationParametersPrimitive
The quantization parameters primitive.
QuantizedBytes
Quantized data bytes representation.
SymmetricQuantization
Symmetric quantization scheme.
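
To make the relationship between a calibration range and a symmetric scheme concrete, the following is a minimal, self-contained sketch of symmetric 8-bit quantization. It does not use this module's types: `symmetric_scale`, `quantize_sketch`, and `dequantize_sketch` are hypothetical names, and the choice of the `[-127, 127]` range with round-to-nearest is an assumption that may differ from what `SymmetricQuantization` actually does.

```rust
/// Hypothetical helper (not part of this module): derives a scale from an
/// observed (min, max) range so that the largest magnitude maps to 127.
fn symmetric_scale(min: f32, max: f32) -> f32 {
    let abs_max = min.abs().max(max.abs());
    // Fall back to 1.0 for an all-zero range to avoid dividing by zero later.
    if abs_max == 0.0 { 1.0 } else { abs_max / 127.0 }
}

/// Hypothetical helper: maps each float to the nearest i8 step and clamps
/// to the assumed symmetric range [-127, 127].
fn quantize_sketch(values: &[f32], scale: f32) -> Vec<i8> {
    values
        .iter()
        .map(|&x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect()
}

/// Hypothetical helper: recovers approximate floats by multiplying by the scale.
fn dequantize_sketch(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

With an observed range of `(-2.0, 1.0)`, the scale is `2.0 / 127.0`; values outside the range are clamped, and values inside it round-trip with an error of at most half a step.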

Enums§

Calibration
Calibration method used to compute the quantization range mapping.
QuantAccPrecision
Quantization accumulator precision. This is the precision used when accumulating values while executing algorithms such as matmul.
QuantInputType
Data type used to represent quantized values.
QuantLevel
Level or granularity of quantization.
QuantMode
Strategy used to quantize values.
QuantPropagation
Specifies whether the output of an operation is quantized using the scheme of the input or returned unquantized.
QuantizationStrategy
Quantization strategy.

Traits§

QTensorPrimitive
Quantized tensor primitive.
Quantization
Quantization scheme to convert elements of a higher precision data type E to a lower precision data type Q and vice-versa.

Functions§

pack_i8s_to_u32s
Pack signed 8-bit integer values into a sequence of unsigned 32-bit integers.
unpack_u32s_to_i8s
Unpack 32-bit unsigned integer values into a sequence of signed 8-bit integers.
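
As an illustration of what packing signed 8-bit values into 32-bit words involves, here is a minimal sketch. `pack_sketch` and `unpack_sketch` are hypothetical stand-ins, not the functions listed above; the byte order (first value in the lowest byte), the zero padding of the last word, and the explicit `len` argument are all assumptions, and the real signatures may differ.

```rust
/// Hypothetical stand-in: packs four i8 values into each u32.
/// Assumes little-endian placement and zero padding for a partial last word.
fn pack_sketch(values: &[i8]) -> Vec<u32> {
    values
        .chunks(4)
        .map(|chunk| {
            let mut bytes = [0u8; 4];
            for (slot, &v) in bytes.iter_mut().zip(chunk) {
                // Reinterpret the two's-complement bits; no numeric conversion.
                *slot = v as u8;
            }
            u32::from_le_bytes(bytes)
        })
        .collect()
}

/// Hypothetical stand-in: reverses `pack_sketch`. `len` is the number of
/// original values, needed to drop the padding bytes of the last word.
fn unpack_sketch(packed: &[u32], len: usize) -> Vec<i8> {
    packed
        .iter()
        .flat_map(|word| word.to_le_bytes())
        .map(|b| b as i8)
        .take(len)
        .collect()
}
```

Under these assumptions, five values occupy two words, and unpacking with `len = 5` returns the original sequence.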

Type Aliases§

QuantizationParameters
The tensor quantization parameters.