Commit ac953dc

authored
added general guidance on token usage in AI Assistant (#295)
added section on token usage
1 parent e73fbb7 commit ac953dc

1 file changed

Lines changed: 20 additions & 0 deletions

File tree

content/features/ai-assistant.md

@@ -67,6 +67,9 @@ Select **Anthropic** as the provider and enter your API key. The default model i
 
 ![AI Assistant Anthropic Configuration](~/content/assets/images/ai-assistant/ai-assistant-anthropic-config.png)
 
+> [!IMPORTANT]
+> Anthropic enforces input tokens per minute (ITPM) rate limits based on your account tier. A new API key starts at Tier 1 with 30,000 ITPM for Claude Sonnet 4.x. A single request against a large model can exceed this limit. Purchase $40 or more in API credits to reach Tier 2 (450,000 ITPM). See the [Anthropic rate limits documentation](https://docs.anthropic.com/en/api/rate-limits) for full tier details.
+
 ### Azure OpenAI
 
 Select **Azure OpenAI** as the provider. Enter your API key and the service endpoint URL for your Azure OpenAI resource. Set the model name to match your deployment name.
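The Tier 1 rate-limit warning above can be made concrete with a quick back-of-the-envelope check. This is only a sketch: the ~4 characters per token ratio is a rough heuristic for English text, not an Anthropic guarantee, and real counts come from the provider's tokenizer.

```python
# Rough check of whether a single request's input fits an ITPM budget.
# Assumption: ~4 characters per token (a common heuristic, not exact).

TIER1_ITPM = 30_000   # input tokens per minute at Tier 1 (Claude Sonnet 4.x)
TIER2_ITPM = 450_000  # after purchasing $40+ in API credits (Tier 2)

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_itpm_budget(prompt_chars: int, itpm: int) -> bool:
    """True if a single request's estimated input tokens fit the ITPM budget."""
    return prompt_chars // 4 <= itpm

# A 200 KB metadata dump (~50,000 estimated tokens) blows the Tier 1
# budget in a single request, but fits comfortably at Tier 2.
metadata_chars = 200_000
print(fits_itpm_budget(metadata_chars, TIER1_ITPM))  # False
print(fits_itpm_budget(metadata_chars, TIER2_ITPM))  # True
```

This is why a single question against a large model can fail at Tier 1 even though the per-request payload is well-formed: the whole input is counted against the per-minute token budget at once.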
@@ -282,6 +285,23 @@ Configure AI Assistant display and behavior options under **Tools > Preferences
 
 ![AI Assistant Preferences](~/content/assets/images/ai-assistant/ai-assistant-preferences.png)
 
+## Token Usage
+
+Each message to the AI Assistant consumes input tokens. The token cost of a single message depends on what context is included:
+
+- **System prompt and custom instructions**: Sent with every message. Typically 5,000 to 15,000 tokens depending on which custom instructions are active.
+- **Model metadata**: When the assistant needs to understand your model, it retrieves metadata through tool calls. A compact summary includes table names, column names, measure names, relationships and descriptions. A full metadata retrieval includes the complete model definition. For large models this can consume tens of thousands of tokens.
+
+### Reducing Token Usage
+
+Select specific objects in the **TOM Explorer** before asking your question. When objects are selected, the assistant scopes its context to those objects instead of retrieving metadata for the entire model. This is the most effective way to reduce both token usage and API cost.
+
+Other ways to reduce token usage:
+
+- Ask focused questions about specific tables, measures or columns rather than broad questions about the entire model
+- Start new conversations when switching topics to avoid accumulating long conversation histories
+- Use a smaller or less expensive model for exploratory questions
+
 ## Limitations
 
 - Requires a user-provided API key. No built-in API key is included
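The scoping advice in the added section can be illustrated with a toy sketch. The dictionary shape and the table, column, and measure names below are invented for illustration only; Tabular Editor's actual metadata retrieval through the TOM is far richer.

```python
import json

# Hypothetical, simplified model metadata (the real TOM model is far richer).
model = {
    "Sales":    {"columns": ["OrderID", "Amount", "Date"], "measures": ["Total Sales"]},
    "Customer": {"columns": ["CustomerID", "Name", "Region"], "measures": []},
    "Product":  {"columns": ["ProductID", "Category", "Price"], "measures": ["Avg Price"]},
}

def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token."""
    return max(1, len(text) // 4)

def scoped_metadata(model: dict, selected: set) -> dict:
    """Keep only the tables the user selected, mirroring how scoping the
    assistant's context to selected objects shrinks the metadata payload."""
    return {name: meta for name, meta in model.items() if name in selected}

full_payload = json.dumps(model)
scoped_payload = json.dumps(scoped_metadata(model, {"Sales"}))

# Scoping to one table sends a fraction of the metadata tokens.
print(estimate_tokens(full_payload), estimate_tokens(scoped_payload))
```

The same proportional saving applies to a real model: selecting a handful of objects in the TOM Explorer replaces a whole-model metadata dump with a slice covering only those objects.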
