tok

Cut LLM token costs by 60–90%. A Go library for prompt compression, output filtering, token estimation, and secrets scanning — built for AI coding agents.

What tok Is

tok is a library, not a CLI. It exposes token-efficiency primitives as a clean Go API:

Prompt compression — tok.PromptCompress / tok.Compress shrink verbose prompts 20–70% (six modes).
Output filtering — a 31-layer pipeline (entropy pruning, perplexity filtering, AST-aware compression, H2O heavy-hitter, …) that strips noise from command output before it re-enters an LLM context.
Token estimation — tok.EstimateTokens* and tok.EstimateCost give model-aware token counts and pricing.
Secrets scanning — tok.SecretDetector and tok.IsSensitiveFilename catch credentials before they leak into prompts.
Rate limiting & tracking — persistent SQLite-backed gain tracking (tok.NewTracker).

It is consumed directly as a Go module, and it powers the tok commands inside Hawk (hawk tok ...), which imports it as a library.

Install

go get github.com/GrayCodeAI/tok

import "github.com/GrayCodeAI/tok"

tok ships no standalone tok CLI binary. Its CLI surface is exposed through Hawk, which embeds this library: hawk tok compress, hawk tok estimate, and hawk tok scan.

Quick Start

Compress a prompt

out := tok.PromptCompress("Please implement authentication", tok.IntensityUltra)
// → "Implement auth."

Filter command output

out, _ := tok.Compress(verboseOutput, tok.Aggressive)
// 200 lines → a few lines: pass/fail + failures

Estimate tokens & cost

n := tok.EstimateTokensForModel(text, "gpt-4o")
cost := tok.EstimateCost(text, "gpt-4o")

Scan for secrets

d := tok.NewSecretDetector()
findings := d.DetectSecrets(text)
redacted := d.RedactSecrets(text)

Library API Highlights

tok.PromptCompress(text, intensity) — prompt compression (Lite / Full / Ultra). ~150 phrase substitutions, drop-lists for articles / filler / pleasantries, and auto-clarity (security/destructive segments pass through verbatim). intensity is monotonic: len(ultra) <= len(full) <= len(lite).
tok.Compress(text, opts...) — the full output pipeline with options: WithCustomFilters, WithCodeAware(lang), WithPerplexityGuided(scorer, ratio), profile options, and more.
tok.IsSensitiveFilename(path) — 3-layer filename detection (exact basename, sensitive directory, name token). Companion to the content-based SecretDetector. Catches .env, id_rsa, ~/.ssh/..., test_credentials.json, etc.
tok.SmartTruncate(content, maxLines, lang) — code truncation that preserves function signatures and always reports the exact drop count (kept + dropped == total).
tok.ExtractJSON / ExtractJSONArray / ExtractAllJSON — brace-balanced JSON extraction from LLM output with surrounding prose, markdown fences, and unterminated objects.
tok.NewTracker(ctx) — persistent gain tracker (SQLite + WAL, 90-day retention, pure-Go via modernc.org/sqlite). Aggregate, Recent, Prune queries.
tok.EstimateTokensFast / WithEncoding / ForModel — model-aware token estimation.
filter.CompressWithRetry — validate-fix-retry loop: caller supplies a Validator and AdjustFunc; the loop escalates mode/intensity and retries up to N times.
filter.NewTOMLFilter / LoadTOMLFilterFile — full 8-stage TOML pipeline as a pluggable Filter.
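
The three layers listed for tok.IsSensitiveFilename (exact basename, sensitive directory, name token) can be pictured with a small standalone sketch. This is not tok's implementation: the function name looksSensitive and all three rule sets are hypothetical, chosen only so the README's own examples (.env, ~/.ssh/..., test_credentials.json) trigger one layer each.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Hypothetical rule sets for illustration only; tok's real lists are not shown here.
var (
	exactNames    = map[string]bool{".env": true, "id_rsa": true}
	sensitiveDirs = map[string]bool{".ssh": true, ".aws": true}
	nameTokens    = []string{"credential", "secret", "password"}
)

// looksSensitive applies three checks in order: exact basename,
// sensitive parent directory, then a token inside the basename.
func looksSensitive(p string) bool {
	clean := strings.ToLower(filepath.ToSlash(filepath.Clean(p)))
	parts := strings.Split(clean, "/")
	base := parts[len(parts)-1]

	// Layer 1: the basename is on the exact-match list.
	if exactNames[base] {
		return true
	}
	// Layer 2: any parent directory is on the sensitive-directory list.
	for _, dir := range parts[:len(parts)-1] {
		if sensitiveDirs[dir] {
			return true
		}
	}
	// Layer 3: the basename contains a sensitive token.
	for _, t := range nameTokens {
		if strings.Contains(base, t) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{".env", "~/.ssh/config", "test_credentials.json", "src/main.go"} {
		fmt.Printf("%-24s %v\n", p, looksSensitive(p))
	}
}
```

Layer 3 is a plain substring match in this sketch, so a name such as secretary.txt would also be flagged; a real detector would likely need tighter token boundaries.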

Full reference: pkg.go.dev/github.com/GrayCodeAI/tok.
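
The validate-fix-retry loop described for filter.CompressWithRetry can be sketched in isolation. The type names Validator and AdjustFunc come from the description above, but their signatures here are assumptions, as are compressFunc, dropShort, and atMostWords, which are toy stand-ins and not part of tok.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed signatures: the README names these types but does not define them.
type Validator func(out string) error
type AdjustFunc func(level int) int
type compressFunc func(text string, level int) string

// compressWithRetry compresses, validates, and on failure escalates the
// level and tries again, up to maxRetries additional attempts.
func compressWithRetry(text string, level, maxRetries int, compress compressFunc, validate Validator, adjust AdjustFunc) (string, int, error) {
	if maxRetries < 0 {
		maxRetries = 0
	}
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		out := compress(text, level)
		if lastErr = validate(out); lastErr == nil {
			return out, level, nil
		}
		level = adjust(level) // escalate before the next attempt
	}
	return "", level, fmt.Errorf("gave up after %d retries: %w", maxRetries, lastErr)
}

// dropShort is a toy compressor: it drops words of length <= level.
func dropShort(text string, level int) string {
	var kept []string
	for _, w := range strings.Fields(text) {
		if len(w) > level {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

// atMostWords returns a Validator that rejects output longer than n words.
func atMostWords(n int) Validator {
	return func(out string) error {
		if got := len(strings.Fields(out)); got > n {
			return fmt.Errorf("%d words, want at most %d", got, n)
		}
		return nil
	}
}

func main() {
	text := "please do run the full test suite now"
	escalate := func(l int) int { return l + 1 }
	out, level, err := compressWithRetry(text, 1, 3, dropShort, atMostWords(5), escalate)
	fmt.Println(out, level, err) // prints: please full test suite 3 <nil>
}
```

Keeping validation and escalation as caller-supplied functions means the loop itself stays policy-free: the same loop serves a word budget, a token budget, or a "must still parse" check.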

Compression Modes

| Mode | Style | Savings |
| --- | --- | --- |
| `lite` | Drop filler, keep grammar | ~20% |
| `full` | Drop articles, fragments OK | ~40% (default) |
| `ultra` | Telegraphic, abbreviations | ~60% |
| `wenyan-lite` | Classical Chinese light | ~30% |
| `wenyan` | Classical Chinese standard | ~50% |
| `wenyan-ultra` | Classical Chinese max | ~70% |

Output Filtering (31-Layer Pipeline)

Research-backed algorithms: entropy pruning, perplexity filtering, AST-aware compression, H2O heavy-hitter, attention sink preservation, semantic chunking, and 25+ more.

Custom Filter DSL

Define regex find/replace rules in a TOML file and plug them into the pipeline. Opt-in — no rules, no change.

# filters.toml
[[rule]]
name        = "collapse-uuids"
pattern     = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
replacement = "<uuid>"
priority    = 10

rules, _ := tok.LoadFilterRules("filters.toml")
out, _ := tok.Compress(text, tok.WithCustomFilters(rules))
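
To make the rule semantics concrete, here is a minimal standalone sketch of regex find/replace rules applied in priority order, using only the standard library. The rule struct, applyRules, and exampleRules are hypothetical; in particular, running lower priority values first is an assumption, since the ordering convention is not stated above. The second rule (collapse-spaces) is invented for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// rule mirrors the fields of the [[rule]] table; the struct itself is hypothetical.
type rule struct {
	Name        string
	Pattern     *regexp.Regexp
	Replacement string
	Priority    int
}

// applyRules runs every rule's find/replace over text.
// Assumption: lower Priority values run first.
func applyRules(text string, rules []rule) string {
	ordered := append([]rule(nil), rules...) // copy so the caller's slice is not reordered
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority < ordered[j].Priority
	})
	for _, r := range ordered {
		text = r.Pattern.ReplaceAllString(text, r.Replacement)
	}
	return text
}

// exampleRules returns the collapse-uuids rule from filters.toml plus one invented rule.
func exampleRules() []rule {
	return []rule{
		{
			Name:        "collapse-spaces",
			Pattern:     regexp.MustCompile(` {2,}`),
			Replacement: " ",
			Priority:    20,
		},
		{
			Name:        "collapse-uuids",
			Pattern:     regexp.MustCompile(`[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`),
			Replacement: "<uuid>",
			Priority:    10,
		},
	}
}

func main() {
	in := "req  123e4567-e89b-12d3-a456-426614174000 done"
	fmt.Println(applyRules(in, exampleRules())) // prints: req <uuid> done
	fmt.Println(applyRules(in, nil) == in)      // prints: true (no rules, no change)
}
```

Note that regexp.ReplaceAllString expands $1-style references in the replacement, so a literal dollar sign in a replacement would need to be written as $$.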

Team / Shared Compression Profiles

Bundle mode + tier + budget into a named, versioned TOML profile teams can share. Built-ins: default, aggressive, code-safe.

p, _ := tok.LoadProfile(".tok/profiles/code-safe.toml")
out, _ := tok.Compress(text, p.Options()...)

Code-Aware (Symbol-Preserving) Compression

tok.WithCodeAware(lang) marks input as source code and guards function/type/export signatures so compression can never strip them. Defaults to a dependency-light regex symbol provider; swap in an LSP-backed one via WithSymbolProvider.
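
The idea of guarding signatures with a regex can be shown with a small standalone sketch. It is not tok's symbol provider: signatureRE is a deliberately narrow, hypothetical pattern for Go declarations, and compressCode is a toy line filter that drops comments, blank lines, and any body lines beyond a budget, while never dropping a line the pattern matches.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// signatureRE is a small, hypothetical pattern for Go func/type declarations.
var signatureRE = regexp.MustCompile(`^\s*(func|type)\s+\S`)

// compressCode keeps every signature line unconditionally, drops blank and
// comment lines, and keeps other lines only while bodyBudget lasts.
func compressCode(src string, bodyBudget int) string {
	var out []string
	for _, line := range strings.Split(src, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case signatureRE.MatchString(line):
			out = append(out, line) // guarded: never dropped
		case trimmed == "" || strings.HasPrefix(trimmed, "//"):
			// noise: dropped
		case bodyBudget > 0:
			out = append(out, line)
			bodyBudget--
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	src := `// Add sums two ints.
func Add(a, b int) int {
	return a + b
}

type Pair struct {
	A, B int
}`
	// With a zero body budget only the two declarations survive.
	fmt.Println(compressCode(src, 0))
}
```

A regex provider like this is cheap but only sees one line at a time, which is presumably why an LSP-backed provider is offered as a swap-in for multi-line or language-specific declarations.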

Perplexity-Guided Token Dropping

tok.WithPerplexityGuided(scorer, ratio) (LLMLingua-style) drops the lowest-importance tokens first. Default HeuristicPerplexityScorer is zero-dependency; plug in your own PerplexityScorer, or tok's experimental OllamaScorer when built with -tags experimental_ollama. Opt-in.
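
A rough standalone sketch of "drop the lowest-importance tokens first" follows. It does not reproduce tok's scorers: scoreFunc, heuristicScore, and dropLowest are hypothetical, the stop-word list is invented, and treating ratio as the fraction of tokens to drop (rather than keep) is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scoreFunc assigns an importance score to a token; higher means keep longer.
type scoreFunc func(token string) float64

// stopWords is an invented list used by the toy heuristic below.
var stopWords = map[string]bool{"the": true, "in": true, "a": true, "an": true, "of": true, "to": true}

// heuristicScore is a toy stand-in for a perplexity model:
// stop words score 0, everything else scores by length.
func heuristicScore(token string) float64 {
	if stopWords[strings.ToLower(token)] {
		return 0
	}
	return float64(len(token))
}

// dropLowest removes the floor(len(tokens)*ratio) lowest-scoring tokens
// and returns the survivors in their original order.
func dropLowest(text string, score scoreFunc, ratio float64) string {
	tokens := strings.Fields(text)
	n := int(float64(len(tokens)) * ratio)
	if n <= 0 {
		return strings.Join(tokens, " ")
	}
	if n > len(tokens) {
		n = len(tokens)
	}
	// Rank token positions by ascending score; the stable sort breaks ties by position.
	idx := make([]int, len(tokens))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return score(tokens[idx[a]]) < score(tokens[idx[b]])
	})
	drop := make(map[int]bool, n)
	for _, i := range idx[:n] {
		drop[i] = true
	}
	kept := make([]string, 0, len(tokens)-n)
	for i, t := range tokens {
		if !drop[i] {
			kept = append(kept, t)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	in := "please run the unit tests in the auth package"
	fmt.Println(dropLowest(in, heuristicScore, 0.34)) // prints: please run unit tests auth package
}
```

Because the scorer is just a function value, swapping the heuristic for a model-backed scorer changes only which tokens rank lowest, not the dropping logic.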

Benchmarks

Measured on this repo via benchmarks/run.sh (raw vs tok.Compress(..., tok.Aggressive)):

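| fixture | raw bytes | raw tokens | tok bytes | tok tokens | saved |
| --- | --- | --- | --- | --- | --- |
| git log | 2,873 | 718 | 298 | 74 | 89% |
| git diff | 385,051 | 96,262 | 1,117 | 279 | 99% |
| ls -la | 66,341 | 16,585 | 148 | 37 | 99% |
| find .go | 19,145 | 4,786 | 147 | 36 | 99% |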
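
The figures in the table can be cross-checked by hand. Every token count equals the byte count divided by 4 and rounded down, and every saved value equals the token reduction rounded down to a whole percent. The sketch below reproduces that arithmetic; the bytes/4 rule is inferred from the numbers in the table and is not confirmed here as tok's estimator, and both function names are hypothetical.

```go
package main

import "fmt"

// estimateTokens applies the bytes/4 rule that the table's token columns are
// consistent with (an inference from the data, not tok's documented method).
func estimateTokens(byteCount int) int {
	return byteCount / 4
}

// savedPercent returns the share of tokens removed, rounded down to a whole percent.
func savedPercent(rawTokens, tokTokens int) int {
	if rawTokens == 0 {
		return 0
	}
	return (rawTokens - tokTokens) * 100 / rawTokens
}

func main() {
	rows := []struct {
		fixture            string
		rawBytes, tokBytes int
	}{
		{"git log", 2873, 298},
		{"git diff", 385051, 1117},
		{"ls -la", 66341, 148},
		{"find .go", 19145, 147},
	}
	for _, r := range rows {
		raw, out := estimateTokens(r.rawBytes), estimateTokens(r.tokBytes)
		fmt.Printf("%-9s raw=%d tok=%d saved=%d%%\n", r.fixture, raw, out, savedPercent(raw, out))
	}
}
```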

Profile the hot paths with ./scripts/profile.sh [compress|tokens|filter|secrets|all].

Architecture

tok
├── tok.go, *.go         Public library API (top-level package)
├── internal/
│   ├── compress/        Input compression engine (6 modes)
│   ├── filter/          Output pipeline (31 layers)
│   ├── secrets/         Secret detection + redaction
│   ├── tracking/        SQLite token-usage database
│   ├── fastops/         Hot-path primitives (entropy, etc.)
│   └── core/            Token estimation & cost
├── benchmarks/          Token-savings benchmarks
└── evals/               Eval harness

Pure-Go, zero CGO, no runtime dependencies.

Contributing

git clone https://github.com/GrayCodeAI/tok.git && cd tok
make test && make lint
./scripts/build.sh        # verifies the library compiles (go build ./... + go vet)

See CONTRIBUTING.md.

License

MIT
