Issue with INT4 quantization #22

@NICESTM

Hello,

I am trying to quantize all the weights of a MobileNet model to INT4.
My attempt crashes at this line:
quark/onnx/quantization/quant_utils.py", line 254, in tensor_type
raise ValueError(f"Unexpected value qtype={self!r}.")
ValueError: Unexpected value qtype=<ExtendedQuantType.QInt4: 5>.
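
For readers unfamiliar with this kind of failure, here is a minimal sketch of how an error with this message can arise. It assumes (this is not confirmed from the Quark source) that tensor_type looks up each quantization type in a table of supported tensor types and raises when a member has no entry. DemoQuantType, _TENSOR_TYPES and tensor_type below are hypothetical stand-ins, not the library's actual code.

```python
from enum import Enum


class DemoQuantType(Enum):
    """Hypothetical stand-in for an extended quantization type enum."""
    QInt8 = 0
    QUInt8 = 1
    QInt4 = 5  # declared as a member, but missing from the table below


# Assumed lookup table: only some members have a tensor type mapping.
_TENSOR_TYPES = {
    DemoQuantType.QInt8: "INT8",
    DemoQuantType.QUInt8: "UINT8",
}


def tensor_type(qtype: DemoQuantType) -> str:
    """Return the mapped tensor type name, or raise for unmapped members."""
    try:
        return _TENSOR_TYPES[qtype]
    except KeyError:
        raise ValueError(f"Unexpected value qtype={qtype!r}.") from None


if __name__ == "__main__":
    print(tensor_type(DemoQuantType.QInt8))
    try:
        tensor_type(DemoQuantType.QInt4)
    except ValueError as exc:
        print(exc)
```

If the real code follows a similar pattern, the enum member existing is not enough: the lookup path would also need an entry for the 4-bit type.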

To do this, I had to define my own class Int4Spec(QTensorConfig) and my own INT4 type:
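class Int4(BaseInt4):
    # Assigned with "=" so the class attribute is actually set;
    # a bare "name: value" annotation would not create it.
    onnx_proto_dtype = TensorProto.INT4
    map_onnx_format = ExtendedQuantType.QInt4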

Then I used
config = QConfig(global_config=QLayerConfig(activation=Int8Spec(), weight=Int4Spec()))

and then I run the quantizer with something like this:
qk_quantizer = ModelQuantizer(config)
dr = ImageDataReader(quantization_samples=data, model_path=model_path)
print(f"[INFO] : Running ONNX quantization on {model_path}")
qk_quantizer.quantize_model(model_path, qk_quantized_model_path, dr)

Could you help?
BR
