OpenAI API Testing (gpt-4o-mini, 5 scenarios):
| Variant | Compliance | Avg Token Reduction | Use Case |
|---|---|---|---|
| STRICT | 80% | 88.8% | Production systems |
| BALANCED | 40% | 95.4% | General use |
| MINIMAL | 40% | 95.7% | Testing/learning |
- Reduction range (STRICT): 75.0%-94.2% across scenarios
- Compliance: percentage of tests that produced valid Vector-Native output (STRICT is the most reliable)
- Measurement: completion tokens only; system-prompt overhead is counted separately (STRICT uses a larger prompt but gives superior compliance)
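As a sketch of how the table's metrics are derived, the compliance and average-reduction figures could be computed from per-scenario results like this (the `results` schema and the numbers in it are invented for illustration, not the repo's actual test format):

```python
# Each dict is one test scenario: did the model emit valid Vector-Native,
# and how many completion tokens did each format use? (Illustrative data.)
results = [
    {"valid": True, "english_tokens": 400, "vn_tokens": 45},
    {"valid": True, "english_tokens": 380, "vn_tokens": 22},
    {"valid": False, "english_tokens": 350, "vn_tokens": 40},
]

# Compliance: fraction of scenarios with valid Vector-Native output.
compliance = sum(r["valid"] for r in results) / len(results) * 100

# Avg token reduction: mean per-scenario savings vs the English baseline.
reduction = sum(1 - r["vn_tokens"] / r["english_tokens"] for r in results) / len(results) * 100

print(f"compliance {compliance:.0f}%, avg reduction {reduction:.1f}%")
# → compliance 67%, avg reduction 90.5%
```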
At Scale (1M Output Tokens, gpt-4o-mini $0.60/1M):
- English: ~$343
- Vector-Native (avg across variants): ~$22
- Savings: 93.6% ($321)
Why: an 88-95% reduction in completion tokens compounds across agent-to-agent (A2A) messages and system prompts. STRICT is reliable enough for production; BALANCED and MINIMAL maximize savings for testing.
- Completions only: the reduction figures cover output tokens and ignore the system prompt. Total API savings vary: STRICT (larger prompt) nets ~70%; MINIMAL ~95%.
- Test Setup: 5 diverse scenarios (analysis, task creation, multi-op). Raw data: tests/test_results/
- Scalability: every internal message and prompt benefits, with no user-facing impact.
English wastes tokens on filler in paths no human reads. Vector-Native skips the natural-language layer entirely:
System Prompt Example:
English (~20 tokens): "You are helpful. Provide details. Focus on needs."
Vector-Native (~8 tokens): ●assistant|mode:helpful|detail:high|attention:needs
Savings: 60% per request → 93.6% at scale.
Ideal for agent-to-agent, tools, prompts—where humans don't read.
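A minimal sketch of what that looks like in practice: building an agent-to-agent request around the Vector-Native system prompt shown above. The payload shape follows the OpenAI Chat Completions API; the helper name and task string are illustrative, not part of this project:

```python
# Assemble a Chat Completions request whose system prompt is Vector-Native
# (~8 tokens) instead of the ~20-token English equivalent shown above.
def build_request(task: str) -> dict:
    return {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system",
             "content": "●assistant|mode:helpful|detail:high|attention:needs"},
            {"role": "user", "content": task},
        ],
    }

req = build_request("Summarize deployment status.")
print(req["messages"][0]["content"])
```

Since no human reads the system prompt in an A2A path, the compressed form loses nothing user-visible while cutting per-request overhead on every call.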