Cannot save compressed binary or ternary weights, saved as float32 parameters #806

I am trying to save a quantized ternary model to a `.tflite` file, but larq does not seem to store the weights in a reduced-precision datatype, so the file is not compressed.
After converting the model and writing it to disk, the file size is about the same as the float32 size reported by `larq.models.summary`.

Even if I try to do the same thing with a simple `QuantDense` layer, the weights are saved in float32.

I am using this kind of code:

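```python
import larq
import larq_compute_engine as lce
import tensorflow as tf
from tensorflow import keras

# Single dense layer with a binarized kernel: 500 inputs -> 1000 units, no bias
quantDense = larq.layers.QuantDense(1000, kernel_quantizer="ste_sign", use_bias=False)
quantDense(tf.ones((1, 500)))  # call once so the kernel is built

with larq.context.quantized_scope(True):
    # Wrap the layer in a Keras model
    inp_quant = keras.Input((1, 500))
    out_quant = quantDense(inp_quant)
    quantModelTest = keras.Model(inputs=inp_quant, outputs=out_quant)
    print("Keras test model")
    larq.models.summary(quantModelTest)

    # Convert with the Larq Compute Engine converter and write to disk
    print("converting keras test model to tflite")
    converted = lce.convert_keras_model(quantModelTest)
    with open("test.tflite", "wb") as f:
        print("writing tflite model to disk")
        f.write(converted)
```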
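For reference, here is a rough sketch of the sizes I would expect for this layer. It assumes 1 bit per binary weight and 2 bits per ternary weight, and it ignores any padding or file-format overhead, so the real numbers may differ somewhat. The helper `kernel_bytes` is only for illustration and is not part of larq.

```python
import os


def kernel_bytes(n_in, n_out, bits_per_weight):
    """Bytes needed for a bias-free dense kernel at the given bit width.

    Assumption: weights are packed back to back with no padding.
    """
    n_bits = n_in * n_out * bits_per_weight
    return (n_bits + 7) // 8  # round up to whole bytes


float32_size = kernel_bytes(500, 1000, 32)
binary_size = kernel_bytes(500, 1000, 1)
ternary_size = kernel_bytes(500, 1000, 2)

print(f"float32 kernel: {float32_size} bytes")
print(f"binary kernel:  {binary_size} bytes")
print(f"ternary kernel: {ternary_size} bytes")

# Compare against the file written above, if it exists.
if os.path.exists("test.tflite"):
    print(f"test.tflite:    {os.path.getsize('test.tflite')} bytes")
```

The file I get is close to the float32 figure rather than the packed ones.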

Am I doing something wrong?
