Best practices for using TOON with Large Language Models to maximize token efficiency and response quality.
Traditional JSON wastes tokens on structural characters:
- Braces & brackets: `{}`, `[]`
- Repeated quotes: every key quoted in JSON
- Commas everywhere: between all elements
TOON eliminates this redundancy, achieving 30-60% token reduction while maintaining readability.
JSON (45 tokens with GPT-5):
{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}
TOON (20 tokens with GPT-5, 56% reduction):
users[2,]{id,name}:
1,Alice
2,Bob
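To make the tabular form concrete, here is an illustrative pure-Python sketch (not the `toon_format` library) that emits the header-plus-rows shape shown above for a list of uniform dicts:

```python
# Illustrative sketch only, not the toon_format library: build the
# tabular TOON form for a list of dicts that all share the same keys.
def to_tabular_toon(key, rows):
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)},]{{{','.join(fields)}}}:"
    lines = [",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)

print(to_tabular_toon("users", [{"id": 1, "name": "Alice"},
                                {"id": 2, "name": "Bob"}]))
# users[2,]{id,name}:
# 1,Alice
# 2,Bob
```

The savings come from emitting each field name once in the header instead of once per row.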
Explicit format instruction:
Respond using TOON format (Token-Oriented Object Notation):
- Use `key: value` for objects
- Use indentation for nesting
- Use `[N]` to indicate array lengths
- Use tabular format `[N,]{fields}:` for uniform arrays
Example:
users[2,]{id,name}:
1,Alice
2,Bob
Always wrap TOON in code blocks for clarity:
```toon
users[3,]{id,name,age}:
1,Alice,30
2,Bob,25
3,Charlie,35
```
This helps the model distinguish TOON from natural language.
Use `lengthMarker="#"` for explicit validation hints:
```python
from toon_format import encode

data = {"items": ["a", "b", "c"]}
toon = encode(data, {"lengthMarker": "#"})
# items[#3]: a,b,c
```
Tell the model:
"Array lengths are prefixed with #. Ensure your response matches these counts exactly."
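A minimal sketch of the validation this marker enables: a stdlib-only regex check, assuming the simple inline-list form shown above, that the `#`-prefixed count matches the actual number of items:

```python
import re

# Sketch: verify a '#'-prefixed length marker against the item count.
# Assumes the inline-list form, e.g. "items[#3]: a,b,c".
def length_matches(line):
    m = re.match(r"^\s*\w+\[#(\d+)\]:\s*(.*)$", line)
    if not m:
        return False
    expected = int(m.group(1))
    items = [x for x in m.group(2).split(",") if x.strip()]
    return len(items) == expected

print(length_matches("items[#3]: a,b,c"))  # True
print(length_matches("items[#3]: a,b"))    # False
```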
Before integrating TOON with your LLM application, measure actual savings for your data:
```python
from toon_format import estimate_savings

# Your actual data structure
user_data = {
    "users": [
        {"id": 1, "name": "Alice", "email": "alice@example.com", "active": True},
        {"id": 2, "name": "Bob", "email": "bob@example.com", "active": True},
        {"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": False}
    ]
}

# Compare formats
result = estimate_savings(user_data)
print(f"JSON: {result['json_tokens']} tokens")
print(f"TOON: {result['toon_tokens']} tokens")
print(f"Savings: {result['savings_percent']:.1f}%")
# JSON: 112 tokens
# TOON: 68 tokens
# Savings: 39.3%
```
Calculate actual dollar savings based on your API usage:
```python
from toon_format import estimate_savings

# Your typical prompt data
prompt_data = {
    "context": [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Analyze this data"}
    ],
    "data": [
        {"id": i, "value": f"Item {i}", "score": i * 10}
        for i in range(1, 101)  # 100 items
    ]
}

result = estimate_savings(prompt_data["data"])

# GPT-5 pricing (example: $0.01 per 1K tokens)
cost_per_1k = 0.01
json_cost = (result['json_tokens'] / 1000) * cost_per_1k
toon_cost = (result['toon_tokens'] / 1000) * cost_per_1k

print(f"JSON cost per request: ${json_cost:.4f}")
print(f"TOON cost per request: ${toon_cost:.4f}")
print(f"Savings per request: ${json_cost - toon_cost:.4f}")
print(f"Savings per 10,000 requests: ${(json_cost - toon_cost) * 10000:.2f}")
```
Get a formatted report for documentation or analysis:
```python
from toon_format import compare_formats

api_response = {
    "status": "success",
    "results": [
        {"id": 1, "score": 0.95, "category": "A"},
        {"id": 2, "score": 0.87, "category": "B"},
        {"id": 3, "score": 0.92, "category": "A"}
    ],
    "total": 3
}

print(compare_formats(api_response))
# Format Comparison
# ────────────────────────────────────────────────
# Format    Tokens    Size (chars)
# JSON      78        189
# TOON      48        112
# ────────────────────────────────────────────────
# Savings: 30 tokens (38.5%)
```
Use token counting in production to monitor savings:
```python
import json
from toon_format import encode, count_tokens

def send_to_llm(data, use_toon=True):
    """Send data to the LLM with optional TOON encoding."""
    if use_toon:
        formatted = encode(data)
        format_type = "TOON"
    else:
        formatted = json.dumps(data, indent=2)
        format_type = "JSON"
    tokens = count_tokens(formatted)
    print(f"[{format_type}] Sending {tokens} tokens")
    # Your LLM API call here
    # response = openai.ChatCompletion.create(...)
    return formatted, tokens

# Example usage
data = {"items": [{"id": 1}, {"id": 2}]}
formatted, token_count = send_to_llm(data, use_toon=True)
```
Prompt:
Extract user information from the text below. Respond in TOON format.
Text: "Alice (age 30) works at ACME. Bob (age 25) works at XYZ."
Format:
users[N,]{name,age,company}:
...
Model Response:
users[2,]{name,age,company}:
Alice,30,ACME
Bob,25,XYZ
Processing:
```python
from toon_format import decode

response = """users[2,]{name,age,company}:
Alice,30,ACME
Bob,25,XYZ"""

data = decode(response)
# {'users': [
#     {'name': 'Alice', 'age': 30, 'company': 'ACME'},
#     {'name': 'Bob', 'age': 25, 'company': 'XYZ'}
# ]}
```
Prompt:
Generate a server configuration in TOON format with:
- app: "myapp"
- port: 8080
- database settings (host, port, name)
- enabled features: ["auth", "logging", "cache"]
Model Response:
app: myapp
port: 8080
database:
host: localhost
port: 5432
name: myapp_db
features[3]: auth,logging,cache
Processing:
```python
config = decode(response)
# Use the config dict directly in your application
```
Prompt:
Convert this data to TOON format for efficient transmission:
Products:
1. Widget A ($9.99, stock: 50)
2. Widget B ($14.50, stock: 30)
3. Widget C ($19.99, stock: 0)
Model Response:
products[3,]{id,name,price,stock}:
1,"Widget A",9.99,50
2,"Widget B",14.50,30
3,"Widget C",19.99,0
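Note the quoted cells. If you post-process rows like these yourself, the stdlib `csv` module honors quotes where a naive `str.split(",")` would not:

```python
import csv
import io

# Quoted cells such as "Widget A" survive a csv-based split,
# unlike a naive comma split.
row = next(csv.reader(io.StringIO('1,"Widget A",9.99,50')))
print(row)
# ['1', 'Widget A', '9.99', '50']
```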
Provide examples in your prompt:
Convert the following to TOON format. Examples:
Input: {"name": "Alice", "age": 30}
Output:
name: Alice
age: 30
Input: [{"id": 1, "item": "A"}, {"id": 2, "item": "B"}]
Output:
[2,]{id,item}:
1,A
2,B
Now convert this: <your data>
Add explicit validation rules:
Respond in TOON format. Rules:
1. Array lengths MUST match actual count: [3] means exactly 3 items
2. Tabular arrays require uniform keys across all objects
3. Use quotes for: empty strings, keywords (null/true/false), numeric strings
4. Indentation: 2 spaces per level
If you cannot provide valid TOON, respond with an error message.
Choose delimiters based on your data:
```python
# For data with commas (addresses, descriptions)
encode(data, {"delimiter": "\t"})  # Use tab

# For data with tabs (code snippets)
encode(data, {"delimiter": "|"})   # Use pipe

# For general use
encode(data, {"delimiter": ","})   # Use comma (default)
```
Tell the model which delimiter to use:
"Use tab-separated values in tabular arrays due to commas in descriptions."
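One way to automate that choice is to scan the cell values first. This is a hypothetical helper, not part of `toon_format`, that picks the first delimiter absent from the data:

```python
# Sketch: pick a delimiter that does not collide with any cell value.
def pick_delimiter(values):
    for delim in (",", "\t", "|"):
        if not any(delim in str(v) for v in values):
            return delim
    raise ValueError("no safe delimiter found")

print(repr(pick_delimiter(["1 Main St, Springfield", "2 Oak Ave"])))
# '\t'  (comma rejected because it appears in the address)
```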
Always wrap TOON decoding in error handling:
```python
from toon_format import decode, ToonDecodeError

def safe_decode(toon_str):
    try:
        return decode(toon_str)
    except ToonDecodeError as e:
        print(f"TOON decode error: {e}")
        # Fall back to asking the model to regenerate
        return None
```
If decoding fails, ask the model to fix it:
The TOON you provided has an error: "Expected 3 items, but got 2"
Please regenerate with correct array lengths. Original:
items[3]: a,b
Should be either:
items[2]: a,b (fix length)
OR
items[3]: a,b,c (add missing item)
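This repair loop can be automated. Below is a sketch of the pattern with the LLM call and the parser injected as callables, so it is testable without an API key; in practice `generate` would call your LLM, `parse` would be `toon_format.decode`, and the exception would be `ToonDecodeError` rather than `ValueError`:

```python
# Generic retry-with-feedback sketch. 'generate' calls the LLM and
# 'parse' decodes its output, raising ValueError (standing in for
# ToonDecodeError) when the TOON is invalid.
def decode_with_retry(generate, parse, prompt, max_attempts=3):
    message = prompt
    for _ in range(max_attempts):
        raw = generate(message)
        try:
            return parse(raw)
        except ValueError as e:
            message = f"{prompt}\n\nYour TOON had an error: {e}. Please regenerate."
    return None
```

Feeding the decode error back into the prompt gives the model the same correction hint shown above.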
Less efficient (list format):
users[3]:
  - id: 1
    name: Alice
  - id: 2
    name: Bob
  - id: 3
    name: Charlie
More efficient (tabular format):
users[3,]{id,name}:
1,Alice
2,Bob
3,Charlie
Less efficient:
data:
  metadata:
    items:
      list[2]: a,b
More efficient:
items[2]: a,b
Less efficient:
user_identification_number: 123
user_full_name: Alice
More efficient:
id: 123
name: Alice
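When you do not control the upstream schema, you can rename keys before encoding and map them back after decoding. A small sketch with a hypothetical `KEY_MAP`:

```python
# Hypothetical mapping: shorten verbose keys before encoding to TOON;
# invert it after decoding if the original names are needed.
KEY_MAP = {"user_identification_number": "id", "user_full_name": "name"}

def shorten_keys(obj):
    return {KEY_MAP.get(k, k): v for k, v in obj.items()}

print(shorten_keys({"user_identification_number": 123,
                    "user_full_name": "Alice"}))
# {'id': 123, 'name': 'Alice'}
```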
```python
# BAD: No validation
response = llm.generate(prompt)
data = decode(response)  # May raise ToonDecodeError

# GOOD: Validate and handle errors
response = llm.generate(prompt)
try:
    data = decode(response, {"strict": True})
except ToonDecodeError:
    pass  # Retry or fall back to JSON
```
First response: JSON
Second response: TOON
Be consistent - stick to TOON throughout the conversation.
Model might produce:
code: 123 # Wrong! Numeric string needs quotes
Should be:
code: "123" # Correct
Solution: Explicitly mention quoting in prompts.
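The quoting rule itself is easy to check client-side. A stdlib-only sketch of when a string value needs quotes under the rules listed earlier:

```python
# Sketch of the quoting rule: empty strings, keyword-like strings, and
# numeric-looking strings must be quoted or they change type on decode.
def needs_quotes(value):
    if value == "" or value in ("null", "true", "false"):
        return True
    try:
        float(value)  # catches numeric strings like "123" or "1.5"
        return True
    except ValueError:
        return False

print(needs_quotes("123"))    # True
print(needs_quotes("Alice"))  # False
```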
```python
import openai
from toon_format import decode

def ask_for_toon_data(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-5",
        messages=[
            {"role": "system", "content": "Respond using TOON format"},
            {"role": "user", "content": prompt}
        ]
    )
    toon_str = response.choices[0].message.content
    # Extract TOON from code blocks if wrapped
    if "```toon" in toon_str:
        toon_str = toon_str.split("```toon")[1].split("```")[0].strip()
    elif "```" in toon_str:
        toon_str = toon_str.split("```")[1].split("```")[0].strip()
    return decode(toon_str)
```

```python
import anthropic
from toon_format import decode

def claude_toon(prompt):
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,  # required by the Messages API
        messages=[{
            "role": "user",
            "content": f"{prompt}\n\nRespond in TOON format (Token-Oriented Object Notation)."
        }]
    )
    toon_str = message.content[0].text
    # Remove code blocks if present
    if "```" in toon_str:
        toon_str = toon_str.split("```")[1].strip()
        if toon_str.startswith("toon\n"):
            toon_str = toon_str[5:]
    return decode(toon_str)
```
Based on testing with GPT-5 and Claude:
| Data Type | JSON Tokens | TOON Tokens | Reduction |
|---|---|---|---|
| Simple config (10 keys) | 45 | 28 | 38% |
| User list (50 users) | 892 | 312 | 65% |
| Nested structure | 234 | 142 | 39% |
| Mixed arrays | 178 | 95 | 47% |
Average reduction: 30-60% depending on data structure and tokenizer.
Note: Comprehensive benchmarks across GPT-5, GPT-5-mini, and other models are coming soon. See the roadmap for details.
Always log the raw TOON before decoding:
```python
print("Raw TOON from model:")
print(repr(toon_str))

try:
    data = decode(toon_str)
except ToonDecodeError as e:
    print(f"Decode error: {e}")
```
Enable strict validation during development:
```python
decode(toon_str, {"strict": True})   # Strict validation
```
Disable it in production if lenient parsing is acceptable:
```python
decode(toon_str, {"strict": False})  # Lenient
```
After decoding, validate the Python structure:
```python
data = decode(toon_str)

# Validate structure
assert "users" in data
assert isinstance(data["users"], list)
assert all("id" in user for user in data["users"])
```
- Format Specification - Complete TOON syntax reference
- API Reference - Function documentation
- Official Spec - Normative specification
- Benchmarks - Token efficiency analysis
Key Takeaways:
- Explicit prompting - Tell the model to use TOON format clearly
- Validation - Always validate model output with error handling
- Examples - Provide few-shot examples in prompts
- Consistency - Use TOON throughout the conversation
- Tabular format - Prefer tabular arrays for maximum efficiency
- Error recovery - Handle decode errors gracefully
TOON can reduce LLM costs by 30-60% while maintaining readability and structure. Start with simple use cases and expand as you become familiar with the format.