Feature Request: Gemma Multimodal Support


**Is your feature request related to a problem? Please describe.**
Users can run Gemma in text mode, but cannot use Gemma multimodal models (image + text) through a first-class Gemma path in llama-cpp-python. Today, `gemma` is only a text chat format, server multimodal branches cover other formats (llava/moondream/nanollava/llama-3-vision-alpha/minicpm/qwen), and docs do not show Gemma multimodal setup. This creates confusion and extra integration work.

**Describe the solution you'd like**
Add official Gemma multimodal support in both Python API and server API:
- Add a Gemma multimodal chat handler (MTMD/image_url flow).
- Add server routing for a Gemma multimodal `chat_format`.
- Document required model/projector files and provide working examples.
- Add tests for registration, config validation, and a multimodal smoke path.

**Describe alternatives you've considered**
- Continue using existing multimodal formats (for example llava or qwen2.5-vl) instead of Gemma.
- Build a custom local handler outside the project.
- Use HF tokenizer-template fallback with manual prompt engineering and custom preprocessing.

These workarounds are possible but reduce portability and are harder to maintain.

**Additional context**
Expected outcome:
- Users can start the server with a Gemma multimodal format and send OpenAI-style `image_url` chat requests.
- Errors should be explicit when required projector/mtmd assets are missing or when model vision support is unavailable.

Potential acceptance checks:
- New chat format loads successfully.
- Mixed text + image request returns a valid response.
- README/docs include Gemma multimodal setup instructions.
 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature Request: Gemma Multimodal Support #2224

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Feature Request: Gemma Multimodal Support #2224

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions