Skip to content

User Image

github-actions[bot] edited this page May 22, 2026 · 3 revisions

Image Mode

BLXCode's agent panel can generate images directly from a prompt. Toggle Image mode in the chat header, type what you want, and the agent produces an image instead of a chat reply.

This is not the same as attaching images to agent context for vision or terminal handoff — see Agent Providers — Agent context.

Image mode is available in the Tauri desktop app. It is not available in trunk serve mode because API keys and file writes are handled by the Tauri backend.

Agent chat with Image mode active and generated image preview in the timeline

Requirements

You need:

  • An API key for the chosen image provider (OpenAI or OpenRouter).
  • Network access to the provider.
  • A selected workspace if you want generated images saved to disk. Without one the image lives in chat memory only.

Image keys are set under Settings → API Keys (OpenAI, OpenRouter, fal.ai).

Settings (Settings → BLXCode Agent → Image)

Setting Default
Provider OpenAI
Model gpt-image-1
Quality Medium

Pick provider from the dropdown; choose model via AgentModelPicker (catalog + custom id). Changes auto-save. API-key status points to API Keys.

OpenRouter uses chat-completions with modalities: ["image"]. OpenAI uses /v1/images/generations (text-only) or /v1/images/edits when one or more reference images are attached.

Generating

  1. Click the image icon in the agent chat header. The icon turns blue and a hint appears.
  2. Optionally drop images into the panel to use as references (img2img).
  3. Type a prompt and hit Enter. A non-empty prompt is required even when reference images are attached.
  4. The result appears in the timeline as an inline image with a Download button.

When a workspace is set, the file is saved under:

<workspace>/.blxcode/generated/<unix-ms>-<slug>.<ext>

Filenames collide-protect with a numeric suffix. The relative path is shown under the image so you can find it in your file manager.

Voice + Image

If you submit an image-mode turn from voice (PTT or hotkey) and TTS is enabled in BLXCode Agent voice settings, BLXCode plays a short confirmation phrase after the image arrives. The image content itself is not narrated.

Limits

  • Up to 4 reference images per turn, 8 MiB each.
  • Supported reference MIME types: PNG, JPEG, GIF, WebP.
  • Generated previews larger than 20 MiB are not rehydrated after a workspace reload — the original file on disk is still valid.

Persistence

The chat timeline persists generated-image entries by their saved path, not by their base64 bytes, to keep sessions.json small. On reload, BLXCode lazily reads the file from disk for preview. If the file has been moved or deleted, the row remains in chat but the preview area stays empty.

Troubleshooting

  • "No API key set for the image provider." — Open Settings → Agent Provider and store a key for OpenAI or OpenRouter; image mode reuses those keys.
  • OpenRouter returns no image. — Pick a model whose output modality includes image (Settings → BLXCode Agent → Image → refresh, then choose one with "image" in the id).
  • Image saved but not visible after reload. — Confirm the file at the path shown under the image still exists.

See also

Clone this wiki locally