Cut LLM token costs by 60–90%. A Go library for prompt compression, output filtering, token estimation, and secrets scanning — built for AI coding agents.
tok is a library, not a CLI. It exposes token-efficiency primitives as a clean Go API:
- Prompt compression: `tok.PromptCompress` / `tok.Compress` shrink verbose prompts 20–70% (six modes).
- Output filtering: a 31-layer pipeline (entropy pruning, perplexity filtering, AST-aware compression, H2O heavy-hitter, …) that strips noise from command output before it re-enters an LLM context.
- Token estimation: `tok.EstimateTokens*` and `tok.EstimateCost` give model-aware token counts and pricing.
- Secrets scanning: `tok.SecretDetector` and `tok.IsSensitiveFilename` catch credentials before they leak into prompts.
- Rate limiting & tracking: persistent SQLite-backed gain tracking (`tok.NewTracker`).
It is consumed directly as a Go module, and it powers the tok commands inside Hawk (hawk tok ...), which imports it as a library.
    go get github.com/GrayCodeAI/tok

    import "github.com/GrayCodeAI/tok"

tok ships no standalone `tok` CLI binary. Its CLI surface is exposed through Hawk, which embeds this library: `hawk tok compress`, `hawk tok estimate`, and `hawk tok scan`.
Compress a prompt:

    out := tok.PromptCompress("Please implement authentication", tok.IntensityUltra)
    // → "Implement auth."

Filter command output:

    out, _ := tok.Compress(verboseOutput, tok.Aggressive)
    // 200 lines → a few lines: pass/fail + failures

Estimate tokens and cost:

    n := tok.EstimateTokensForModel(text, "gpt-4o")
    cost := tok.EstimateCost(text, "gpt-4o")

Scan for and redact secrets:

    d := tok.NewSecretDetector()
    findings := d.DetectSecrets(text)
    redacted := d.RedactSecrets(text)

- `tok.PromptCompress(text, intensity)`: prompt compression (Lite / Full / Ultra). ~150 phrase substitutions, drop-lists for articles / filler / pleasantries, and auto-clarity (security/destructive segments pass through verbatim). `intensity` is monotonic: `len(ultra) <= len(full) <= len(lite)`.
- `tok.Compress(text, opts...)`: the full output pipeline with options such as `WithCustomFilters`, `WithCodeAware(lang)`, `WithPerplexityGuided(scorer, ratio)`, profile options, and more.
- `tok.IsSensitiveFilename(path)`: 3-layer filename detection (exact basename, sensitive directory, name token). Companion to the content-based `SecretDetector`. Catches `.env`, `id_rsa`, `~/.ssh/...`, `test_credentials.json`, etc.
- `tok.SmartTruncate(content, maxLines, lang)`: code truncation that preserves function signatures and always reports the exact drop count (`kept + dropped == total`).
- `tok.ExtractJSON` / `ExtractJSONArray` / `ExtractAllJSON`: brace-balanced JSON extraction from LLM output with surrounding prose, markdown fences, and unterminated objects.
- `tok.NewTracker(ctx)`: persistent gain tracker (SQLite + WAL, 90-day retention, pure-Go via `modernc.org/sqlite`). `Aggregate`, `Recent`, `Prune` queries.
- `tok.EstimateTokensFast` / `WithEncoding` / `ForModel`: model-aware token estimation.
- `filter.CompressWithRetry`: a validate-fix-retry loop. The caller supplies a `Validator` and an `AdjustFunc`; the loop escalates mode/intensity and retries up to N times.
- `filter.NewTOMLFilter` / `LoadTOMLFilterFile`: full 8-stage TOML pipeline as a pluggable `Filter`.
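To make "brace-balanced JSON extraction" concrete, here is a minimal standalone sketch of the idea. It does not use tok and is not tok's implementation: `extractFirstJSONObject` and `matchBrace` are hypothetical helpers written for this illustration, and unlike the description above the sketch simply gives up on unterminated objects rather than handling them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractFirstJSONObject returns the first brace-balanced {...} span in s
// that is valid JSON. Hypothetical helper; not part of tok's API.
func extractFirstJSONObject(s string) (string, bool) {
	for start := 0; start < len(s); start++ {
		if s[start] != '{' {
			continue
		}
		if end, ok := matchBrace(s, start); ok {
			candidate := s[start : end+1]
			if json.Valid([]byte(candidate)) {
				return candidate, true
			}
		}
	}
	return "", false
}

// matchBrace scans from an opening brace and returns the index of its
// matching close, ignoring braces that appear inside string literals.
func matchBrace(s string, start int) (int, bool) {
	depth, inString, escaped := 0, false, false
	for i := start; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			escaped = false // character after a backslash is taken literally
		case inString && c == '\\':
			escaped = true
		case c == '"':
			inString = !inString
		case inString:
			// ordinary character inside a string literal
		case c == '{':
			depth++
		case c == '}':
			depth--
			if depth == 0 {
				return i, true
			}
		}
	}
	return 0, false // unterminated object
}

func main() {
	reply := "Sure, here is the config: {\"name\": \"tok\", \"note\": \"a } in a string\"} Hope that helps."
	if obj, ok := extractFirstJSONObject(reply); ok {
		fmt.Println(obj)
	}
}
```

Tracking string state matters: a naive brace counter would stop at the `}` inside the `note` value.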
Full reference: pkg.go.dev/github.com/GrayCodeAI/tok.
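The validate-fix-retry loop described for `filter.CompressWithRetry` can be pictured with a small generic sketch. Everything here is an assumption made for illustration: the `Validator` and `AdjustFunc` signatures, the integer intensity, and the toy `keepWords` compressor are invented and may not match the real package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Validator reports whether a compressed result is acceptable (assumed shape).
type Validator func(out string) error

// AdjustFunc picks the next intensity after a failed attempt (assumed shape).
type AdjustFunc func(intensity int, err error) int

// compressWithRetry compresses, validates, and on failure asks adjust for a
// new intensity, giving up after maxAttempts tries.
func compressWithRetry(in string, intensity, maxAttempts int,
	compress func(string, int) string, validate Validator, adjust AdjustFunc) (string, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		out := compress(in, intensity)
		if lastErr = validate(out); lastErr == nil {
			return out, nil
		}
		intensity = adjust(intensity, lastErr)
	}
	return "", fmt.Errorf("no valid result after %d attempts: %w", maxAttempts, lastErr)
}

// keepWords is a toy compressor: higher intensity keeps fewer leading words.
func keepWords(s string, intensity int) string {
	words := strings.Fields(s)
	return strings.Join(words[:len(words)/(intensity+1)], " ")
}

// withinThreeWords is a toy validator enforcing a three-word budget.
func withinThreeWords(out string) error {
	if len(strings.Fields(out)) > 3 {
		return errors.New("over budget")
	}
	return nil
}

// escalate raises the intensity by one step after each failure.
func escalate(intensity int, _ error) int { return intensity + 1 }

func main() {
	in := "one two three four five six seven eight"
	out, err := compressWithRetry(in, 0, 5, keepWords, withinThreeWords, escalate)
	fmt.Printf("%q %v\n", out, err)
}
```

With this input the first two attempts keep eight and four words and fail validation; the third keeps two and succeeds.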
| Mode | Style | Savings |
|---|---|---|
| `lite` | Drop filler, keep grammar | ~20% |
| `full` | Drop articles, fragments OK | ~40% (default) |
| `ultra` | Telegraphic, abbreviations | ~60% |
| `wenyan-lite` | Classical Chinese light | ~30% |
| `wenyan` | Classical Chinese standard | ~50% |
| `wenyan-ultra` | Classical Chinese max | ~70% |
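One way to see why the `lite`, `full`, and `ultra` modes can guarantee `len(ultra) <= len(full) <= len(lite)` is to make each level apply everything the previous level does plus one more shrinking step. The sketch below is a toy with invented word lists, not tok's compressor; the `Intensity` type and `compress` function are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

type Intensity int

const (
	Lite Intensity = iota
	Full
	Ultra
)

// Toy word lists invented for this sketch.
var (
	filler   = map[string]bool{"please": true, "kindly": true, "just": true, "really": true}
	articles = map[string]bool{"a": true, "an": true, "the": true}
	abbrev   = map[string]string{"authentication": "auth", "configuration": "config"}
)

// compress applies cumulative passes: every level drops filler, Full and
// Ultra also drop articles, and Ultra also abbreviates long words. Each pass
// only removes or shortens text, so output length never grows with intensity.
func compress(prompt string, level Intensity) string {
	var kept []string
	for _, w := range strings.Fields(prompt) {
		key := strings.ToLower(strings.Trim(w, ".,!?"))
		if filler[key] {
			continue
		}
		if level >= Full && articles[key] {
			continue
		}
		if short, ok := abbrev[key]; level >= Ultra && ok {
			w = strings.Replace(strings.ToLower(w), key, short, 1)
		}
		kept = append(kept, w)
	}
	return strings.Join(kept, " ")
}

func main() {
	p := "Please implement the authentication flow for a new user"
	for _, level := range []Intensity{Lite, Full, Ultra} {
		out := compress(p, level)
		fmt.Printf("%d bytes: %s\n", len(out), out)
	}
}
```

The design point is that the passes are nested rather than independent, which makes the length ordering hold by construction instead of by testing.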
Research-backed algorithms: entropy pruning, perplexity filtering, AST-aware compression, H2O heavy-hitter, attention sink preservation, semantic chunking, and 25+ more.
Define regex find/replace rules in a TOML file and plug them into the pipeline. Opt-in — no rules, no change.
    # filters.toml
    [[rule]]
    name = "collapse-uuids"
    pattern = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
    replacement = "<uuid>"
    priority = 10

Load the rules and pass them to the pipeline:

    rules, _ := tok.LoadFilterRules("filters.toml")
    out, _ := tok.Compress(text, tok.WithCustomFilters(rules))

Bundle mode + tier + budget into a named, versioned TOML profile teams can share. Built-ins: `default`, `aggressive`, `code-safe`.
    p, _ := tok.LoadProfile(".tok/profiles/code-safe.toml")
    out, _ := tok.Compress(text, p.Options()...)

`tok.WithCodeAware(lang)` marks input as source code and guards function/type/export signatures so compression can never strip them. It defaults to a dependency-light regex symbol provider; swap in an LSP-backed one via `WithSymbolProvider`.
`tok.WithPerplexityGuided(scorer, ratio)` (LLMLingua-style) drops the lowest-importance tokens first. The default `HeuristicPerplexityScorer` is zero-dependency; plug in your own `PerplexityScorer`, or tok's experimental `OllamaScorer` when built with `-tags experimental_ollama`. Opt-in.
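The "drop the lowest-importance tokens first" idea can be sketched without any model at all. The code below is illustrative only: the `Scorer` interface, the stopword list, and the reading of the ratio as a fraction of tokens to keep are all assumptions, and none of it is taken from tok's `PerplexityScorer`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// Scorer assigns an importance score to a token; higher means keep.
// Hypothetical interface for this sketch.
type Scorer interface {
	Score(token string) float64
}

type heuristicScorer struct{}

var stopwords = map[string]bool{
	"the": true, "a": true, "an": true, "of": true, "to": true,
	"is": true, "was": true, "and": true, "in": true,
}

// Score gives stopwords zero, and otherwise rewards length plus a bonus for
// tokens containing digits or upper-case letters.
func (heuristicScorer) Score(token string) float64 {
	if stopwords[strings.ToLower(token)] {
		return 0
	}
	score := float64(len(token))
	for _, r := range token {
		if unicode.IsDigit(r) || unicode.IsUpper(r) {
			score += 5
			break
		}
	}
	return score
}

// pruneTokens keeps roughly keepRatio of the whitespace-separated tokens,
// removing the lowest-scoring ones first and preserving original order.
func pruneTokens(text string, s Scorer, keepRatio float64) string {
	tokens := strings.Fields(text)
	keep := int(float64(len(tokens))*keepRatio + 0.5)
	if keep < 0 {
		keep = 0
	}
	if keep >= len(tokens) {
		return strings.Join(tokens, " ")
	}
	idx := make([]int, len(tokens))
	for i := range idx {
		idx[i] = i
	}
	// Rank token positions by descending score; ties keep earlier tokens.
	sort.SliceStable(idx, func(a, b int) bool {
		return s.Score(tokens[idx[a]]) > s.Score(tokens[idx[b]])
	})
	keepSet := make(map[int]bool, keep)
	for _, i := range idx[:keep] {
		keepSet[i] = true
	}
	var out []string
	for i, t := range tokens {
		if keepSet[i] {
			out = append(out, t)
		}
	}
	return strings.Join(out, " ")
}

func main() {
	text := "the build of module 42 was a failure in stage Deploy"
	fmt.Println(pruneTokens(text, heuristicScorer{}, 0.5))
}
```

Swapping in a model-backed scorer would only require another type that satisfies the same interface, which is presumably the motivation for making the scorer pluggable.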
Measured on this repo via benchmarks/run.sh (raw vs tok.Compress(..., tok.Aggressive)):
| fixture | raw bytes | raw tokens | tok bytes | tok tokens | saved |
|---|---|---|---|---|---|
| git log | 2,873 | 718 | 298 | 74 | 89 % |
| git diff | 385,051 | 96,262 | 1,117 | 279 | 99 % |
| ls -la | 66,341 | 16,585 | 148 | 37 | 99 % |
| find .go | 19,145 | 4,786 | 147 | 36 | 99 % |
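The table's columns can be cross-checked with a few lines of arithmetic. Every token figure shown happens to equal the byte count divided by four (truncated), and every "saved" figure equals the truncated percentage of tokens removed. That is an observation about these numbers, not a claim about how tok estimates tokens; `estimateTokens` and `savedPercent` are hypothetical helpers.

```go
package main

import "fmt"

// estimateTokens approximates a token count as bytes/4 using integer
// division. This ratio is read off the table, not taken from tok.
func estimateTokens(bytes int) int {
	return bytes / 4
}

// savedPercent returns the whole-number percentage of tokens removed,
// truncated toward zero.
func savedPercent(rawTokens, tokTokens int) int {
	if rawTokens <= 0 {
		return 0
	}
	return (rawTokens - tokTokens) * 100 / rawTokens
}

func main() {
	fixtures := []struct {
		name     string
		rawBytes int
		tokBytes int
	}{
		{"git log", 2873, 298},
		{"git diff", 385051, 1117},
		{"ls -la", 66341, 148},
		{"find .go", 19145, 147},
	}
	for _, f := range fixtures {
		raw, tok := estimateTokens(f.rawBytes), estimateTokens(f.tokBytes)
		fmt.Printf("%-8s raw=%d tok=%d saved=%d%%\n", f.name, raw, tok, savedPercent(raw, tok))
	}
}
```

For `git log`, for example, 2,873 bytes gives 718 tokens, 298 bytes gives 74, and (718 − 74) × 100 / 718 truncates to 89.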
Profile the hot paths with ./scripts/profile.sh [compress|tokens|filter|secrets|all].
    tok
    ├── tok.go, *.go          Public library API (top-level package)
    ├── internal/
    │   ├── compress/         Input compression engine (6 modes)
    │   ├── filter/           Output pipeline (31 layers)
    │   ├── secrets/          Secret detection + redaction
    │   ├── tracking/         SQLite token-usage database
    │   ├── fastops/          Hot-path primitives (entropy, etc.)
    │   └── core/             Token estimation & cost
    ├── benchmarks/           Token-savings benchmarks
    └── evals/                Eval harness
Pure-Go, zero CGO, no runtime dependencies.
    git clone https://github.com/GrayCodeAI/tok.git && cd tok
    make test && make lint
    ./scripts/build.sh   # verifies the library compiles (go build ./... + go vet)

See CONTRIBUTING.md.