
[quantization] Support Vision Encoder wrapper for Gemma4 #796

Open
Torrero wants to merge 1 commit into Samsung:main from Torrero:gemma4_support_wrapper_visionencoder


@Torrero Torrero commented Jun 24, 2026

Contributor

Summary

Introduce a PTQ wrapper (QuantGemma4VisionEncoder) for the Hugging Face Gemma4VisionEncoder module, enabling post-training quantization of the Gemma4 vision tower with both dynamic evaluation and static torch.export paths.

Changes

tico/quantization/wrapq/wrappers/gemma4/quant_vision_encoder.py

  • Registers QuantGemma4VisionEncoder against Gemma4VisionEncoder via @try_register.
  • Dynamic forward path (forward): gathers RoPE position embeddings from precomputed lookup tables (cos_table/sin_table) and builds bidirectional attention masks from pixel_position_ids. Used during calibration and accuracy evaluation.
  • Static export path (forward_export): reads precomputed position_embeddings and attention_mask from registered buffers, so no shape-dependent computation runs at trace time, which keeps it safe for torch.export tracing.
  • Precomputes RoPE sin/cos lookup tables for all position IDs in __init__, replacing the dynamic matmul + cos/sin with a simple gather.
  • Attaches observers to the input activations, attention mask, position embeddings (cos/sin), and encoder output.
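
The lookup-table and mask ideas above can be illustrated with a minimal, framework-free sketch. It is not the wrapper's code: the function names, the 1-D position IDs, the base of 10000, the -1 padding marker, and the rule that both query and key must be valid are all assumptions made for illustration. In a torch module the gather would be a tensor index into a registered buffer.

```python
import math

PAD_ID = -1  # assumption: padded patches carry this position ID


def build_rope_tables(max_positions, head_dim, base=10000.0):
    """Precompute cos/sin rows for every position ID once (e.g. at __init__)."""
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    cos_table = [[math.cos(p * f) for f in inv_freq] for p in range(max_positions)]
    sin_table = [[math.sin(p * f) for f in inv_freq] for p in range(max_positions)]
    return inv_freq, cos_table, sin_table


def rope_dynamic(position_ids, inv_freq):
    """Reference path: outer product of positions and frequencies, then cos/sin."""
    angles = [[p * f for f in inv_freq] for p in position_ids]
    cos = [[math.cos(a) for a in row] for row in angles]
    sin = [[math.sin(a) for a in row] for row in angles]
    return cos, sin


def rope_gather(position_ids, cos_table, sin_table):
    """Lookup path: a plain row gather, no trigonometry at run time.

    Padding IDs would need clamping before the gather; omitted here.
    """
    cos = [cos_table[p] for p in position_ids]
    sin = [sin_table[p] for p in position_ids]
    return cos, sin


def make_bidirectional_mask(position_ids, fill_value=-1e9):
    """Additive mask: 0.0 where query and key are both real patches."""
    valid = [p != PAD_ID for p in position_ids]
    return [[0.0 if (q and k) else fill_value for k in valid] for q in valid]


inv_freq, cos_table, sin_table = build_rope_tables(max_positions=8, head_dim=4)
ids = [3, 0, 5]
# Both paths evaluate the same expressions, so the results are identical.
assert rope_gather(ids, cos_table, sin_table) == rope_dynamic(ids, inv_freq)
mask = make_bidirectional_mask([0, 1, PAD_ID])
```

Because the table rows are computed with the same arithmetic as the dynamic path, the gather reproduces it exactly for in-range IDs, and both the gathered embeddings and the mask can be stored as fixed buffers for a static export.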

Smoke Tests

Command:

python -m tico.quantization.examples.inspect \
--config tico/quantization/examples/configs/wrapper_smoke.yaml \
--mode wrapper-smoke \
--case gemma4_vision_encoder \
--export circle \
--output-dir ./out/wrapper_smoke

Output:

[QuantCheck] WARNING: 45 nodes without qparam detected (see logs).
┌───────────── Wrapper Smoke Summary ─────────────
│ Case             : gemma4_vision_encoder
│ Status           : PASS
│ Mean |diff|      : 0.079886
│ Max |diff|       : 0.503346
│ PEIR             : 0.056661
│ Shape match      : True
│ Quant finite     : True
└─────────────────────────────────────────────────
Artifacts:
  - circle: out/wrapper_smoke/gemma4_vision_encoder.q.circle
    ┌────────────────────────────────────────────┐
 5.2┤                                            │
    │                                      •••   │
 3.5┤                                  •••       │
    │                               •••••        │
 1.8┤                          ••••••            │
    │                       ••••••               │
 0.1┤                   ••••••                   │
    │                ••••••                      │
-1.5┤            •••••                           │
    │         •••• •                             │
-3.2┤      ••••                                  │
    │  • ••                                      │
-4.9┤                                            │
    └┬──────────┬──────────┬─────────┬──────────┬┘
   -4.9       -2.4        0.1       2.7       5.2 
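
For readers unfamiliar with the summary fields, the sketch below shows one way such numbers can be computed from a floating-point reference output and a quantized output. It assumes PEIR stands for peak-error-to-interval ratio, defined as the maximum absolute difference divided by the value range of the reference; the function name and the flat-list inputs are illustrative and not taken from the smoke-test tool.

```python
def diff_metrics(reference, quantized):
    """Return mean |diff|, max |diff| and PEIR for two equal-length sequences."""
    if not reference or len(reference) != len(quantized):
        raise ValueError("inputs must be non-empty and of equal length")
    abs_diff = [abs(r - q) for r, q in zip(reference, quantized)]
    max_abs = max(abs_diff)
    # Interval = spread of the reference values (assumed PEIR denominator).
    interval = max(reference) - min(reference)
    if interval > 0:
        peir = max_abs / interval
    else:
        peir = 0.0 if max_abs == 0 else float("inf")
    return {
        "mean_abs_diff": sum(abs_diff) / len(abs_diff),
        "max_abs_diff": max_abs,
        "peir": peir,
    }


metrics = diff_metrics([-4.0, 0.0, 4.0], [-3.5, 0.0, 4.0])
print(metrics["max_abs_diff"], metrics["peir"])  # 0.5 0.0625
```

Under that definition, the reported Max |diff| of 0.503346 and PEIR of 0.056661 would correspond to a reference range of roughly 8.9.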

Unit Tests

Command:

python -m pytest test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py -x -v

27 tests passed. Output:

============================================================================================================= test session starts =============================================================================================================
platform linux -- Python 3.10.12, pytest-9.1.1, pluggy-1.6.0 -- /home/emaltsev/SAMSUNG/llm_quantization/.gemma_venv/bin/python
cachedir: .pytest_cache
rootdir: /home/emaltsev/SAMSUNG/llm_quantization/TICO_my/TICO
configfile: pyproject.toml
plugins: anyio-4.13.0
collected 27 items                                                                                                                                                                                                                            

test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_as_export_module_precomputes_buffers_on_wrapper PASSED                                                                         [  3%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_as_export_module_rejects_mismatched_pixel_position_ids PASSED                                                                  [  7%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_as_export_module_requires_quant_mode PASSED                                                                                    [ 11%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_as_export_module_returns_adapter PASSED                                                                                        [ 14%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_as_export_module_with_padding_produces_valid_output PASSED                                                                     [ 18%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_as_export_module_without_pixel_position_ids_uses_templates PASSED                                                              [ 22%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_attention_mask_is_fake_quantized_in_quant_mode PASSED                                                                          [ 25%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_attention_mask_is_observed_in_calib_mode PASSED                                                                                [ 29%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_dtype_override PASSED                                                                                                          [ 33%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_dynamic_forward_with_padding_produces_valid_output PASSED                                                                      [ 37%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_forward_export_matches_forward PASSED                                                                                          [ 40%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_forward_export_requires_precomputed_buffers PASSED                                                                             [ 44%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_gather_position_embeddings_matches_hf_rotary PASSED                                                                            [ 48%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_gather_position_embeddings_with_padding PASSED                                                                                 [ 51%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_input_is_observed_in_calib_mode PASSED                                                                                         [ 55%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_make_bidirectional_mask_fill_value PASSED                                                                                      [ 59%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_make_bidirectional_mask_no_padding PASSED                                                                                      [ 62%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_make_bidirectional_mask_with_padding PASSED                                                                                    [ 66%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_mode_transitions PASSED                                                                                                        [ 70%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_no_quant_forward_matches_fp PASSED                                                                                             [ 74%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_no_quant_output_shape PASSED                                                                                                   [ 77%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_observers_are_collected PASSED                                                                                                 [ 81%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_output_is_fake_quantized_in_quant_mode PASSED                                                                                  [ 85%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_position_cos_sin_are_observed_in_calib_mode PASSED                                                                             [ 88%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_position_embeddings_are_fake_quantized_in_quant_mode PASSED                                                                    [ 92%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_quant_mode_output_is_finite PASSED                                                                                             [ 96%]
test/quantization/wrapq/wrappers/gemma4/test_quant_vision_encoder.py::TestQuantGemma4VisionEncoder::test_unsupported_export_mode_raises PASSED                                                                                          [100%]

============================================================================================================= 27 passed in 5.53s ==============================================================================================================

TICO-DCO-1.0-Signed-off-by: Evgenii Maltsev <e.maltsev@samsung.com>

@Torrero Torrero force-pushed the gemma4_support_wrapper_visionencoder branch 3 times, most recently from f770fb2 to 5a4ec51 Compare June 24, 2026 17:50
@Torrero Torrero requested review from dvsav and mhs4670go June 24, 2026 17:56
@Torrero Torrero marked this pull request as draft June 25, 2026 15:33
@Torrero Torrero force-pushed the gemma4_support_wrapper_visionencoder branch 2 times, most recently from 12e6510 to c9dcea8 Compare June 25, 2026 17:58
@Torrero Torrero marked this pull request as ready for review June 25, 2026 17:59
This commit supports Vision Encoder wrapper for Gemma4

Co-authored-by: Cline

TICO-DCO-1.0-Signed-off-by:  Evgenii Maltsev <e.maltsev@samsung.com>
@Torrero Torrero force-pushed the gemma4_support_wrapper_visionencoder branch from c9dcea8 to ab4b0f7 Compare June 26, 2026 09:02

dvsav commented Jun 26, 2026

Contributor

Could you please clarify what causes the warning "[QuantCheck] WARNING: 45 nodes without qparam detected (see logs)." in the smoke test?
The exact culprit can be tracked down by adding debug prints to tico/quantization/wrapq/utils/check_missing_qparam.py (print node.stack_trace for each node lacking a qparam).
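
A debug helper along these lines might look like the sketch below. It is only an illustration: the "qparam" meta key and the function name are hypothetical, and the actual check in check_missing_qparam.py may store and detect quantization parameters differently. The helper relies only on the op, name, meta, and stack_trace attributes, which torch.fx.Node provides; stand-in objects are used here so the snippet runs without torch.

```python
from types import SimpleNamespace

QPARAM_KEY = "qparam"  # hypothetical meta key, adjust to the real one


def report_nodes_without_qparam(nodes, qparam_key=QPARAM_KEY):
    """Print the origin of every node whose meta lacks quantization parameters."""
    missing = [node for node in nodes if qparam_key not in node.meta]
    for node in missing:
        trace = node.stack_trace or "<no stack trace recorded>"
        print(f"[no qparam] {node.op} {node.name}\n{trace}\n")
    return missing


# Stand-in nodes that mimic the attributes of torch.fx.Node.
nodes = [
    SimpleNamespace(op="call_function", name="linear", meta={"qparam": object()},
                    stack_trace="File quant_vision_encoder.py, in forward"),
    SimpleNamespace(op="call_function", name="softmax", meta={},
                    stack_trace="File quant_vision_encoder.py, in forward"),
]
missing = report_nodes_without_qparam(nodes)
```

With a real exported program, the same function could presumably be fed the nodes of its graph, and grouping the printed traces by source line would show which wrapper code produces the unquantized nodes.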


dvsav commented Jun 26, 2026

Contributor

Would you consider adding test/quantization/wrapq/wrappers/gemma4/test_quantize_vision_encoder.py?
