feat: add configurable max tokens support #65

Open
antispam2002 wants to merge 2 commits into HartsyAI:master from antispam2002:feature/configurable_max_tokens
antispam2002 commented Jun 6, 2026

Running the extension against newer "thinking" models, such as Gemma4, produces truncated responses with the stop reason "length". This happens because thinking models generate "thinking" tokens that count toward the token limit of the response. To work around this, the max_tokens parameter can now be configured via the settings dialog.
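
For illustration, the truncation described above can be detected on an OpenAI-compatible response by checking the finish reason. This is a minimal sketch and not code from this PR: `isTruncated` and the sample response object are hypothetical, and the `choices[0].finish_reason` field is assumed to follow the OpenAI chat completions response shape.

```javascript
// Hypothetical helper: reports whether an OpenAI-compatible chat response
// was cut off because it reached the max_tokens limit.
function isTruncated(response) {
  const choice = response?.choices?.[0];
  return choice?.finish_reason === "length";
}

// Made-up sample response, for illustration only.
const sample = {
  choices: [
    {
      message: { role: "assistant", content: "A partial answ" },
      finish_reason: "length",
    },
  ],
};

console.log(isTruncated(sample));
```

A check like this only reports the symptom; raising max_tokens, as this PR allows, is what gives a thinking model room to finish.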

Summary by CodeRabbit

  • New Features
    • Introduced configurable "Max Tokens" setting for Chat LLM requests with a default limit of 1024 tokens
    • Added "Max Tokens" numeric input field to Chat LLM settings panel for easy customization
    • Token limit settings now persist across sessions and apply to all supported AI backends


coderabbitai Bot commented Jun 6, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: cf9d7d11-e7f1-4b70-af9d-b05340ca3471

📥 Commits

Reviewing files that changed from the base of the PR and between 633b4e5 and 0290529.

📒 Files selected for processing (1)
  • WebAPI/LLMAPICalls.cs
📝 Walkthrough

This pull request adds a user-configurable maximum token limit setting throughout the application. Previously, token limits were hardcoded in backend request builders (1000 for OpenAI, 1024 for Anthropic). The changes introduce a centralized DefaultMaxTokens constant and allow users to override this via a new UI field, with the configured value propagated through settings and API layers.

Changes

Configurable Token Limits

| Layer | File(s) | Summary |
|---|---|---|
| Backend schema token configuration | BackendSchema.cs | Introduces the DefaultMaxTokens constant (1024) and updates the GetSchemaType, OpenAICompatibleRequestBody, and AnthropicRequestBody method signatures to accept an optional maxTokens parameter, replacing hardcoded token limits with a null-coalescing fallback to the default. |
| Settings registration and API integration | WebAPI/SessionSettings.cs, WebAPI/LLMAPICalls.cs | SessionSettings imports BackendSchema and registers max_tokens in the default configuration. LLMAPICalls reads the configured max_tokens value and passes it into schema request building. |
| Frontend UI and settings state management | Assets/settings.js, Tabs/Text2Image/MagicPrompt.html | The settings JavaScript adds a constant, load/save logic, and modal initialization for max tokens. MagicPrompt.html adds a numeric "Max Tokens" input field and enables flex-wrap on the Chat LLM settings container. |
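
The optional-parameter fallback summarized above can be sketched as follows. This is an illustrative JavaScript version of the pattern, not the PR's code: the real builders are C# methods in BackendSchema.cs, and `buildRequestBody` and its parameters are hypothetical names. Only the default of 1024 comes from the walkthrough.

```javascript
// Default taken from the walkthrough (DefaultMaxTokens = 1024).
const DEFAULT_MAX_TOKENS = 1024;

// Hypothetical builder for an OpenAI-compatible chat request body.
function buildRequestBody(model, userMessage, maxTokens) {
  return {
    model,
    messages: [{ role: "user", content: userMessage }],
    // Fall back to the default only when no value was supplied
    // (null or undefined); an explicit number is passed through.
    max_tokens: maxTokens ?? DEFAULT_MAX_TOKENS,
  };
}

// Usage: an omitted limit uses the default, an explicit one overrides it.
console.log(buildRequestBody("some-model", "Hello").max_tokens);
console.log(buildRequestBody("some-model", "Hello", 4096).max_tokens);
```

Note that `??` treats only null and undefined as missing, so a value such as 0 would be passed through unchanged; range checks would need to be handled separately.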

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Token limits, once fixed in stone,
Now let the user claim their own!
From backend deep to UI bright,
This setting tweak gets tuned just right!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add configurable max tokens support' clearly and accurately describes the main change: adding user-configurable max tokens settings across the UI, backend schema, and API layers. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@WebAPI/LLMAPICalls.cs`:
- Around line 547-548: The direct call to settings["max_tokens"]?.Value<int?>()
can throw on malformed or oversized persisted values; replace it with guarded
parsing that reads the token value as a string/JSON token, uses int.TryParse (or
long.TryParse then clamp) to safely convert, validates range (reject/limit
values outside acceptable bounds or overflow), and falls back to
BackendSchema.DefaultMaxTokens on failure, then pass the safe maxTokens into
GetSchemaType (same call site using messageContent, modelId, messageType, seed,
maxTokens).
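
The guarded parsing requested in the comment above could look roughly like this. It is a JavaScript sketch of the idea only; the actual fix would be written in C# in WebAPI/LLMAPICalls.cs. The name `safeMaxTokens` and the bounds `MIN_MAX_TOKENS` and `MAX_MAX_TOKENS` are assumptions made for illustration, since the review does not state what range is acceptable.

```javascript
const FALLBACK_MAX_TOKENS = 1024; // mirrors BackendSchema.DefaultMaxTokens
const MIN_MAX_TOKENS = 1;         // assumed lower bound
const MAX_MAX_TOKENS = 32768;     // assumed upper bound

// Hypothetical helper: converts a persisted setting into a safe token limit.
function safeMaxTokens(raw) {
  // Accept numbers or numeric strings; every other type falls back.
  let text = "";
  if (typeof raw === "number") text = String(raw);
  else if (typeof raw === "string") text = raw.trim();

  // Plain digits only: rejects negatives, decimals, exponents, and garbage.
  if (!/^\d+$/.test(text)) return FALLBACK_MAX_TOKENS;

  const value = Number(text);
  // Oversized values that lose integer precision, or values below the
  // minimum, fall back to the default.
  if (!Number.isSafeInteger(value) || value < MIN_MAX_TOKENS) {
    return FALLBACK_MAX_TOKENS;
  }
  // In-range values pass through; large ones are clamped to the upper bound.
  return Math.min(value, MAX_MAX_TOKENS);
}
```

Whether an out-of-range value should be clamped or replaced by the default is a design choice; this sketch clamps large values and falls back for anything malformed.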

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5f356f47-9590-4c9b-b881-7800429ae846

📥 Commits

Reviewing files that changed from the base of the PR and between e2e5006 and 633b4e5.

📒 Files selected for processing (5)
  • Assets/settings.js
  • BackendSchema.cs
  • Tabs/Text2Image/MagicPrompt.html
  • WebAPI/LLMAPICalls.cs
  • WebAPI/SessionSettings.cs

Comment thread WebAPI/LLMAPICalls.cs Outdated