tok


Cut LLM token costs by 60–90%. A Go library for prompt compression, output filtering, token estimation, and secrets scanning — built for AI coding agents.


What tok Is

tok is a library, not a CLI. It exposes token-efficiency primitives as a clean Go API:

  • Prompt compression: tok.PromptCompress / tok.Compress shrink verbose prompts 20–70% (six modes).
  • Output filtering: a 31-layer pipeline (entropy pruning, perplexity filtering, AST-aware compression, H2O heavy-hitter, …) that strips noise from command output before it re-enters an LLM context.
  • Token estimation: tok.EstimateTokens* and tok.EstimateCost give model-aware token counts and pricing.
  • Secrets scanning: tok.SecretDetector and tok.IsSensitiveFilename catch credentials before they leak into prompts.
  • Rate limiting & tracking: persistent SQLite-backed gain tracking (tok.NewTracker).

It is consumed directly as a Go module, and it powers the tok commands inside Hawk (hawk tok ...), which imports it as a library.


Install

go get github.com/GrayCodeAI/tok
import "github.com/GrayCodeAI/tok"

tok ships no standalone CLI binary. Its command-line surface is exposed through Hawk, which embeds this library: hawk tok compress, hawk tok estimate, and hawk tok scan.


Quick Start

Compress a prompt

out := tok.PromptCompress("Please implement authentication", tok.IntensityUltra)
// → "Implement auth."

Filter command output

out, _ := tok.Compress(verboseOutput, tok.Aggressive)
// 200 lines → a few lines: pass/fail + failures

Estimate tokens & cost

n := tok.EstimateTokensForModel(text, "gpt-4o")
cost := tok.EstimateCost(text, "gpt-4o")

Scan for secrets

d := tok.NewSecretDetector()
findings := d.DetectSecrets(text)
redacted := d.RedactSecrets(text)
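
To make the detect-then-redact idea concrete, here is a minimal stdlib-only sketch of content-based scanning. It is not tok's implementation: the pattern list, findSecrets, and redact are hypothetical names, and a real detector would carry far more rules than the two shown.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern pairs a rule name with the regex that detects it.
// Both rules are illustrative; this is not tok's rule set.
type pattern struct {
	name string
	re   *regexp.Regexp
}

var patterns = []pattern{
	{"aws-access-key-id", regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`)},
	{"private-key-header", regexp.MustCompile(`-----BEGIN [A-Z ]*PRIVATE KEY-----`)},
}

// findSecrets returns the name of every rule that matches text.
func findSecrets(text string) []string {
	var hits []string
	for _, p := range patterns {
		if p.re.MatchString(text) {
			hits = append(hits, p.name)
		}
	}
	return hits
}

// redact replaces each match with a placeholder naming the rule that fired.
func redact(text string) string {
	for _, p := range patterns {
		text = p.re.ReplaceAllString(text, "[REDACTED:"+p.name+"]")
	}
	return text
}

func main() {
	in := "export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE"
	fmt.Println(findSecrets(in))
	fmt.Println(redact(in))
}
```

Keeping the rule name in the placeholder lets a reader of the redacted prompt see what kind of value was removed without seeing the value itself.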

Library API Highlights

  • tok.PromptCompress(text, intensity) — prompt compression (Lite / Full / Ultra). ~150 phrase substitutions, drop-lists for articles / filler / pleasantries, and auto-clarity (security/destructive segments pass through verbatim). intensity is monotonic: len(ultra) <= len(full) <= len(lite).
  • tok.Compress(text, opts...) — the full output pipeline with options: WithCustomFilters, WithCodeAware(lang), WithPerplexityGuided(scorer, ratio), profile options, and more.
  • tok.IsSensitiveFilename(path) — 3-layer filename detection (exact basename, sensitive directory, name token). Companion to the content-based SecretDetector. Catches .env, id_rsa, ~/.ssh/..., test_credentials.json, etc.
  • tok.SmartTruncate(content, maxLines, lang) — code truncation that preserves function signatures and always reports the exact drop count (kept + dropped == total).
  • tok.ExtractJSON / ExtractJSONArray / ExtractAllJSON — brace-balanced JSON extraction from LLM output with surrounding prose, markdown fences, and unterminated objects.
  • tok.NewTracker(ctx) — persistent gain tracker (SQLite + WAL, 90-day retention, pure-Go via modernc.org/sqlite). Aggregate, Recent, Prune queries.
  • tok.EstimateTokensFast / WithEncoding / ForModel — model-aware token estimation.
  • filter.CompressWithRetry — validate-fix-retry loop: caller supplies a Validator and AdjustFunc; the loop escalates mode/intensity and retries up to N times.
  • filter.NewTOMLFilter / LoadTOMLFilterFile — full 8-stage TOML pipeline as a pluggable Filter.
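
As an illustration of the brace-balanced extraction mentioned for tok.ExtractJSON, the sketch below scans for the first balanced {...} span that also parses as JSON, ignoring braces inside string literals. The function names are hypothetical and the behavior is an assumption: in particular, this version simply rejects unterminated objects, which may differ from what tok does with them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// matchBrace returns the index of the '}' that closes the '{' at s[open],
// or -1 if the object is unterminated. Braces inside strings are skipped.
func matchBrace(s string, open int) int {
	depth, inString, escaped := 0, false, false
	for i := open; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			escaped = false // character after a backslash is taken literally
		case inString && c == '\\':
			escaped = true
		case c == '"':
			inString = !inString
		case inString:
			// ordinary character inside a string literal
		case c == '{':
			depth++
		case c == '}':
			depth--
			if depth == 0 {
				return i
			}
		}
	}
	return -1
}

// extractFirstObject returns the first balanced span that is valid JSON.
func extractFirstObject(s string) (string, bool) {
	start := strings.IndexByte(s, '{')
	for start >= 0 {
		if end := matchBrace(s, start); end >= 0 {
			candidate := s[start : end+1]
			if json.Valid([]byte(candidate)) {
				return candidate, true
			}
		}
		// Try the next opening brace, if any.
		next := strings.IndexByte(s[start+1:], '{')
		if next < 0 {
			break
		}
		start += next + 1
	}
	return "", false
}

func main() {
	reply := "Sure, here it is:\n{\"name\": \"a}b\", \"n\": {\"k\": 1}}\nAnything else?"
	obj, ok := extractFirstObject(reply)
	fmt.Println(obj, ok)
}
```

Validating each candidate with json.Valid means a stray pair of braces in the surrounding prose is skipped rather than returned.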

Full reference: pkg.go.dev/github.com/GrayCodeAI/tok.
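
The three filename layers listed for tok.IsSensitiveFilename (exact basename, sensitive directory, name token) can be sketched with the standard library alone. The lookup tables and the isSensitive function are hypothetical and deliberately tiny; they only cover the examples given above and are not tok's actual lists.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"unicode"
)

// Illustrative lookup tables, one per layer.
var (
	exactBasenames = map[string]bool{".env": true, "id_rsa": true}
	sensitiveDirs  = map[string]bool{".ssh": true}
	nameTokens     = map[string]bool{"credentials": true, "secret": true, "password": true}
)

// isSensitive applies the three layers in order and stops at the first hit.
func isSensitive(path string) bool {
	parts := strings.Split(filepath.ToSlash(filepath.Clean(path)), "/")
	base := strings.ToLower(parts[len(parts)-1])

	// Layer 1: exact basename.
	if exactBasenames[base] {
		return true
	}
	// Layer 2: any parent directory is sensitive.
	for _, dir := range parts[:len(parts)-1] {
		if sensitiveDirs[strings.ToLower(dir)] {
			return true
		}
	}
	// Layer 3: a whole token of the basename is sensitive.
	tokens := strings.FieldsFunc(base, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, t := range tokens {
		if nameTokens[t] {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{".env", "~/.ssh/config", "fixtures/test_credentials.json", "src/main.go"} {
		fmt.Printf("%-32s %v\n", p, isSensitive(p))
	}
}
```

Matching whole tokens rather than substrings is a design choice in this sketch: it flags test_credentials.json without flagging unrelated names that merely contain a sensitive word as a fragment.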


Compression Modes

| Mode         | Style                       | Savings        |
|--------------|-----------------------------|----------------|
| lite         | Drop filler, keep grammar   | ~20%           |
| full         | Drop articles, fragments OK | ~40% (default) |
| ultra        | Telegraphic, abbreviations  | ~60%           |
| wenyan-lite  | Classical Chinese light     | ~30%           |
| wenyan       | Classical Chinese standard  | ~50%           |
| wenyan-ultra | Classical Chinese max       | ~70%           |
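
One way to see why the lite, full, and ultra outputs can only get shorter as intensity rises is to have each level apply everything the level below it applies. The following is a toy sketch under that assumption; the word lists, the Intensity type, and compress are hypothetical and much smaller than the ~150 substitutions the library describes.

```go
package main

import (
	"fmt"
	"strings"
)

// Intensity is a hypothetical stand-in for the three English modes.
type Intensity int

const (
	Lite Intensity = iota
	Full
	Ultra
)

// Tiny illustrative word lists.
var (
	filler        = map[string]bool{"please": true, "just": true, "really": true}
	articles      = map[string]bool{"a": true, "an": true, "the": true}
	abbreviations = map[string]string{"authentication": "auth", "configuration": "config"}
)

// compress applies each level's rules plus every rule of the levels below it,
// so raising the level can never lengthen the output.
func compress(text string, level Intensity) string {
	var out []string
	for _, word := range strings.Fields(text) {
		key := strings.ToLower(strings.Trim(word, ".,!?"))
		if filler[key] {
			continue // dropped at every level
		}
		if level >= Full && articles[key] {
			continue // dropped at Full and Ultra
		}
		if short, ok := abbreviations[key]; ok && level >= Ultra {
			word = strings.Replace(strings.ToLower(word), key, short, 1)
		}
		out = append(out, word)
	}
	return strings.Join(out, " ")
}

func main() {
	in := "Please just implement the authentication"
	for _, level := range []Intensity{Lite, Full, Ultra} {
		fmt.Printf("%d: %q\n", level, compress(in, level))
	}
}
```

Because every abbreviation is shorter than the word it replaces and the drop lists only grow, len(ultra) <= len(full) <= len(lite) holds for any input in this sketch.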

Output Filtering (31-Layer Pipeline)

Research-backed algorithms: entropy pruning, perplexity filtering, AST-aware compression, H2O heavy-hitter, attention sink preservation, semantic chunking, and 25+ more.

Custom Filter DSL

Define regex find/replace rules in a TOML file and plug them into the pipeline. Opt-in — no rules, no change.

# filters.toml
[[rule]]
name        = "collapse-uuids"
pattern     = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
replacement = "<uuid>"
priority    = 10

rules, _ := tok.LoadFilterRules("filters.toml")
out, _ := tok.Compress(text, tok.WithCustomFilters(rules))
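
For readers who want to see what a find/replace rule does to the text, here is a stdlib-only sketch that applies the collapse-uuids rule from the example. The Rule struct and applyRules are hypothetical, and running rules in ascending priority order is an assumption; the library may order them differently.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// Rule mirrors the fields of a [[rule]] table; the struct itself is hypothetical.
type Rule struct {
	Name        string
	Pattern     *regexp.Regexp
	Replacement string
	Priority    int
}

// collapseUUIDs is the rule from the example filters.toml.
var collapseUUIDs = Rule{
	Name:        "collapse-uuids",
	Pattern:     regexp.MustCompile(`[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`),
	Replacement: "<uuid>",
	Priority:    10,
}

// applyRules runs every rule over text in ascending Priority order
// (an assumed ordering). With no rules the text is returned unchanged.
func applyRules(text string, rules []Rule) string {
	sorted := append([]Rule(nil), rules...) // copy so the caller's slice is untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority < sorted[j].Priority
	})
	for _, r := range sorted {
		text = r.Pattern.ReplaceAllString(text, r.Replacement)
	}
	return text
}

func main() {
	line := "job 123e4567-e89b-12d3-a456-426614174000 done"
	fmt.Println(applyRules(line, []Rule{collapseUUIDs})) // job <uuid> done
	fmt.Println(applyRules(line, nil))                   // unchanged: no rules, no change
}
```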

Team / Shared Compression Profiles

Bundle mode + tier + budget into a named, versioned TOML profile teams can share. Built-ins: default, aggressive, code-safe.

p, _ := tok.LoadProfile(".tok/profiles/code-safe.toml")
out, _ := tok.Compress(text, p.Options()...)

Code-Aware (Symbol-Preserving) Compression

tok.WithCodeAware(lang) marks input as source code and guards function/type/export signatures so compression can never strip them. Defaults to a dependency-light regex symbol provider; swap in an LSP-backed one via WithSymbolProvider.

Perplexity-Guided Token Dropping

tok.WithPerplexityGuided(scorer, ratio) (LLMLingua-style) drops the lowest-importance tokens first. Default HeuristicPerplexityScorer is zero-dependency; plug in your own PerplexityScorer, or tok's experimental OllamaScorer when built with -tags experimental_ollama. Opt-in.
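
A rough picture of score-guided dropping: score every token, remove the lowest-scoring fraction, and keep the survivors in their original order. The Scorer interface, heuristicScorer, and dropLowest below are hypothetical, and treating ratio as the fraction of tokens to drop is an assumption about the option's meaning.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Scorer assigns an importance score to each token; higher means keep.
// This interface is a hypothetical stand-in for a perplexity scorer.
type Scorer interface {
	Score(tokens []string) []float64
}

// heuristicScorer scores stopwords 0 and other tokens by their length.
type heuristicScorer struct{}

var stopwords = map[string]bool{"the": true, "in": true, "a": true, "of": true, "to": true, "is": true}

func (heuristicScorer) Score(tokens []string) []float64 {
	scores := make([]float64, len(tokens))
	for i, t := range tokens {
		if !stopwords[strings.ToLower(t)] {
			scores[i] = float64(len(t))
		}
	}
	return scores
}

// dropLowest removes the lowest-scoring fraction of tokens (ratio is assumed
// to mean "fraction to drop") and preserves the order of the rest.
func dropLowest(text string, s Scorer, ratio float64) string {
	tokens := strings.Fields(text)
	scores := s.Score(tokens)

	n := int(float64(len(tokens)) * ratio)
	if n < 0 {
		n = 0
	}
	if n > len(tokens) {
		n = len(tokens)
	}

	// Sort token indices by ascending score; ties keep document order.
	idx := make([]int, len(tokens))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool { return scores[idx[a]] < scores[idx[b]] })

	drop := make(map[int]bool, n)
	for _, i := range idx[:n] {
		drop[i] = true
	}
	var kept []string
	for i, t := range tokens {
		if !drop[i] {
			kept = append(kept, t)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	fmt.Println(dropLowest("the build failed in the parser module", heuristicScorer{}, 0.5))
	// build failed parser module
}
```

Swapping in a model-backed scorer only requires another type that satisfies the same interface, which is the appeal of keeping the scorer pluggable.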


Benchmarks

Measured on this repo via benchmarks/run.sh (raw vs tok.Compress(..., tok.Aggressive)):

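| fixture  | raw bytes | raw tokens | tok bytes | tok tokens | saved |
|----------|-----------|------------|-----------|------------|-------|
| git log  | 2,873     | 718        | 298       | 74         | 89%   |
| git diff | 385,051   | 96,262     | 1,117     | 279        | 99%   |
| ls -la   | 66,341    | 16,585     | 148       | 37         | 99%   |
| find .go | 19,145    | 4,786      | 147       | 36         | 99%   |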
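
The table's arithmetic can be reproduced in a few lines. Every token count above equals the byte count divided by four and rounded down, which suggests (but does not confirm) that the benchmark uses a bytes/4 estimate; the saved column matches the token reduction as a percentage, also rounded down. Both helper functions are hypothetical.

```go
package main

import "fmt"

// estimateTokens applies a 4-bytes-per-token rule of thumb, rounding down.
// That the benchmark uses this rule is an inference from the table.
func estimateTokens(bytes int) int { return bytes / 4 }

// savedPercent returns the whole-number percentage of tokens removed,
// rounded down.
func savedPercent(raw, compressed int) int {
	if raw == 0 {
		return 0
	}
	return (raw - compressed) * 100 / raw
}

func main() {
	fixtures := []struct {
		name               string
		rawBytes, tokBytes int
	}{
		{"git log", 2873, 298},
		{"git diff", 385051, 1117},
		{"ls -la", 66341, 148},
		{"find .go", 19145, 147},
	}
	for _, f := range fixtures {
		raw, tok := estimateTokens(f.rawBytes), estimateTokens(f.tokBytes)
		fmt.Printf("%-8s %6d -> %4d tokens, saved %d%%\n", f.name, raw, tok, savedPercent(raw, tok))
	}
}
```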

Profile the hot paths with ./scripts/profile.sh [compress|tokens|filter|secrets|all].


Architecture

tok
├── tok.go, *.go         Public library API (top-level package)
├── internal/
│   ├── compress/        Input compression engine (6 modes)
│   ├── filter/          Output pipeline (31 layers)
│   ├── secrets/         Secret detection + redaction
│   ├── tracking/        SQLite token-usage database
│   ├── fastops/         Hot-path primitives (entropy, etc.)
│   └── core/            Token estimation & cost
├── benchmarks/          Token-savings benchmarks
└── evals/               Eval harness

Pure-Go, zero CGO, no runtime dependencies.


Contributing

git clone https://github.com/GrayCodeAI/tok.git && cd tok
make test && make lint
./scripts/build.sh        # verifies the library compiles (go build ./... + go vet)

See CONTRIBUTING.md.


License

MIT