All notable changes to pot-cli will be documented in this file.
- `plan-enrich-first-party` CLI command to enrich first-party JSONL traces with gold/reference metadata
- `plan-enrich-source-pages` CLI command to enrich first-party browse evidence with fetched source-page metadata (`<title>`, `<h1>`, narrow acronym-aware page-text fallback)
- `plan-build-source-claim-map` CLI command to derive source-claim support from first-party traces and gold/reference data
- `plan-sweep-first-party` CLI command to compare the same traces across multiple reference profiles
- Per-profile `sourceClaimMap` support in plan sweeps, with global fallback still supported
- Per-profile `deriveSourceClaim: true` support to auto-build source-claim evidence directly from enriched first-party traces
- `--enrich-source-pages` support in `plan-build-source-claim-map` and `plan-sweep-first-party`
- Compact sweep `summary` output with baseline counts, source-claim counts, and verdict transitions
- `--format text` mode for human-readable sweep reports
- Stable hard-v2 threshold fixtures and regression test covering coarse/medium/fine plus fine+source-claim behavior
- `accepted_answers` support in first-party gold maps for narrow alias-based correctness handling
- First-party enrichment logic is now shared instead of duplicated across commands
- Source-page enrichment logic is reusable across standalone enrichment and source-claim workflows
- Sweep evaluation reuses merged support across baseline and source-claim passes instead of recomputing the whole alignment stack twice
- Workflow docs now cover the full plan-level CLI path and clarify the narrow correctness method used in first-party enrichment
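
The per-profile `sourceClaimMap` with global fallback can be pictured roughly as follows. This is only a sketch: the `SourceClaimMap` shape, the `SweepProfile` interface, and `resolveSourceClaimMap` are hypothetical names chosen for illustration, not pot-cli's actual types.

```typescript
// Hypothetical shape: claim id -> ids of sources that support it.
// The real pot-cli structure is not shown in this changelog.
type SourceClaimMap = Record<string, string[]>;

// Hypothetical sweep profile with an optional per-profile override.
interface SweepProfile {
  name: string;
  sourceClaimMap?: SourceClaimMap;
}

// Prefer the profile's own map; otherwise fall back to the global one.
function resolveSourceClaimMap(
  profile: SweepProfile,
  globalMap?: SourceClaimMap,
): SourceClaimMap | undefined {
  return profile.sourceClaimMap ?? globalMap;
}

const globalMap: SourceClaimMap = { "claim-1": ["source-a"] };
const profiles: SweepProfile[] = [
  { name: "coarse" }, // no override: uses the global map
  { name: "fine", sourceClaimMap: { "claim-1": ["source-b"] } },
];

const resolved = profiles.map((profile) => ({
  profile: profile.name,
  map: resolveSourceClaimMap(profile, globalMap),
}));
console.log(JSON.stringify(resolved));
```

Resolving the map once per profile keeps the fallback rule in a single place, so a sweep that mixes overridden and non-overridden profiles behaves consistently.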
Breaking Changes: None (backward compatible!)
- Flexible Generator Configuration: Support for any OpenAI-compatible LLM provider
- New `.potrc.json` format with an explicit `generators` array plus `critic` and `synthesizer` entries
- Each generator can specify:
  - `name`: Provider label (used in output)
  - `model`: Model identifier
  - `baseUrl`: Custom OpenAI-compatible endpoint
  - `apiKey`: Per-generator API key
  - `provider: "anthropic"`: Flag for Anthropic Messages API (non-OpenAI-compatible)
- Auto-detection of base URLs for known providers (xai, moonshot, deepseek, openai)
- `createProvidersFromConfig()` helper function in `config.ts`
- Updated `pot config` command to display the new format
- types.ts: Added `GeneratorConfig` interface, made old fields optional
- config.ts: Added migration logic from old format → new format (automatic)
- providers/openai.ts: Constructor accepts dynamic `baseUrl`, `apiKey`, `providerName`
- providers/anthropic.ts: Constructor accepts `apiKey` and `providerName`
- All commands (ask, audit, deep, debug, review): Simplified to use `createProvidersFromConfig()`
- Removed hardcoded provider instantiation and `getProviderForModel()` string-matching
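
To make the generator fields and base-URL auto-detection concrete, here is a minimal sketch. The field names follow this changelog, but which fields are optional, the `KNOWN_BASE_URLS` table, the `resolveBaseUrl` helper, and the idea of keying detection on the lowercased `name` are all assumptions. The xAI and Moonshot URLs come from the example config in this entry; the DeepSeek and OpenAI URLs are the providers' public chat-completions endpoints and may differ from what pot-cli uses.

```typescript
// Field names follow the changelog; optionality is an assumption.
interface GeneratorConfig {
  name: string;
  model: string;
  baseUrl?: string;
  apiKey?: string;
  provider?: "anthropic";
}

// Hypothetical lookup table keyed by a lowercase provider id.
const KNOWN_BASE_URLS: Record<string, string> = {
  xai: "https://api.x.ai/v1/chat/completions",
  moonshot: "https://api.moonshot.ai/v1/chat/completions",
  deepseek: "https://api.deepseek.com/chat/completions",
  openai: "https://api.openai.com/v1/chat/completions",
};

// Hypothetical helper: an explicit baseUrl wins, then the known-provider table.
function resolveBaseUrl(gen: GeneratorConfig): string | undefined {
  // Anthropic uses the Messages API, so no chat/completions URL applies.
  if (gen.provider === "anthropic") return undefined;
  if (gen.baseUrl) return gen.baseUrl;
  return KNOWN_BASE_URLS[gen.name.toLowerCase()];
}

const example: GeneratorConfig = { name: "xAI", model: "grok-4-1-fast" };
console.log(resolveBaseUrl(example));
```

Letting an explicit `baseUrl` take precedence is what allows any OpenAI-compatible endpoint, including a local one, to be used without touching the table.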
Old configs are automatically migrated at runtime. No manual changes needed!
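
A runtime migration of this kind might look roughly like the sketch below. It is not the code in `config.ts`: the `OldConfig` shape covers only the keys visible in this entry, and the model-prefix table, the `apiKeys` ids other than `anthropic`, and `migrateGenerators` are hypothetical. Only the generator entries are handled here.

```typescript
// Only the keys shown in this changelog; everything elided is left out.
interface OldConfig {
  models: Record<string, string>; // e.g. generator1, generator2, ...
  apiKeys: Record<string, string>; // keyed by provider id, e.g. "anthropic"
}

interface GeneratorConfig {
  name: string;
  model: string;
  apiKey?: string;
  provider?: "anthropic";
}

// Hypothetical table for guessing a provider from a model-name prefix.
const MODEL_PREFIXES: Array<{ prefix: string; id: string; name: string }> = [
  { prefix: "grok", id: "xai", name: "xAI" },
  { prefix: "kimi", id: "moonshot", name: "Moonshot" },
  { prefix: "claude", id: "anthropic", name: "Anthropic" },
];

// Convert old "generatorN" model entries into new-style generator objects.
function migrateGenerators(old: OldConfig): GeneratorConfig[] {
  return Object.entries(old.models)
    .filter(([key]) => key.startsWith("generator"))
    .map(([, model]) => {
      const hit = MODEL_PREFIXES.find((p) => model.startsWith(p.prefix));
      // Fall back to the model string as the label when no prefix matches.
      const gen: GeneratorConfig = { name: hit ? hit.name : model, model };
      const apiKey = hit ? old.apiKeys[hit.id] : undefined;
      if (apiKey !== undefined) gen.apiKey = apiKey;
      if (hit?.id === "anthropic") gen.provider = "anthropic";
      return gen;
    });
}

const migratedExample = migrateGenerators({
  models: { generator1: "grok-4-1-fast", generator2: "kimi-k2-turbo-preview" },
  apiKeys: { anthropic: "sk-ant-..." },
});
console.log(JSON.stringify(migratedExample, null, 2));
```

Because the conversion is a pure function of the old object, it can run on every load without rewriting the file on disk.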
Old format:

```json
{
  "models": {
    "generator1": "grok-4-1-fast",
    "generator2": "kimi-k2-turbo-preview",
    ...
  },
  "apiKeys": {
    "anthropic": "sk-ant-...",
    ...
  }
}
```

New format:

```json
{
  "generators": [
    {"name": "xAI", "model": "grok-4-1-fast", "baseUrl": "https://api.x.ai/v1/chat/completions", "apiKey": "xai-..."},
    {"name": "Moonshot", "model": "kimi-k2-turbo-preview", "baseUrl": "https://api.moonshot.ai/v1/chat/completions", "apiKey": "sk-..."},
    {"name": "Anthropic", "model": "claude-sonnet-4-5-20250929", "provider": "anthropic", "apiKey": "sk-ant-..."}
  ],
  "critic": {"name": "Anthropic", "model": "claude-opus-4-6", "provider": "anthropic", "apiKey": "sk-ant-..."},
  "synthesizer": {"name": "Anthropic", "model": "claude-opus-4-6", "provider": "anthropic", "apiKey": "sk-ant-..."}
}
```

- Minimum 3 generators required (enforced via `createProvidersFromConfig`)
- Model diversity check now based on `name` field (not model string matching)
- Anthropic provider uses `provider: "anthropic"` flag → Messages API
- All other providers assume OpenAI-compatible chat/completions endpoint
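
The rules above can be sketched as a small validation pass. The names `validateGenerators`, `countDistinctNames`, and `apiKindFor` are hypothetical, and how pot-cli actually reports a failed check (error, warning, or exit code) is not stated in this changelog.

```typescript
interface GeneratorConfig {
  name: string;
  model: string;
  provider?: "anthropic";
}

type ApiKind = "anthropic-messages" | "openai-chat-completions";

// The provider flag selects the Messages API; everything else is treated
// as an OpenAI-compatible chat/completions endpoint.
function apiKindFor(gen: GeneratorConfig): ApiKind {
  return gen.provider === "anthropic" ? "anthropic-messages" : "openai-chat-completions";
}

// Diversity is judged on the name field, not on the model string.
function countDistinctNames(generators: GeneratorConfig[]): number {
  return new Set(generators.map((g) => g.name)).size;
}

// Returns a list of problems; an empty list means the config passes.
function validateGenerators(generators: GeneratorConfig[]): string[] {
  const problems: string[] = [];
  if (generators.length < 3) {
    problems.push(`need at least 3 generators, got ${generators.length}`);
  }
  return problems;
}

const generators: GeneratorConfig[] = [
  { name: "xAI", model: "grok-4-1-fast" },
  { name: "Moonshot", model: "kimi-k2-turbo-preview" },
  { name: "Anthropic", model: "claude-sonnet-4-5-20250929", provider: "anthropic" },
];

console.log(validateGenerators(generators), countDistinctNames(generators));
console.log(generators.map(apiKindFor));
```

Keying diversity on `name` means two different models served by the same provider count as one source, which is presumably the point of moving away from model-string matching.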
- Multi-model PoT pipeline (Generators → Critic → Synthesizer)
- Commands: `ask`, `list`, `show`, `config`
- Block storage as JSON
- Support for Anthropic, xAI, Moonshot, DeepSeek
- German/English language support
- Dry-run mode
- Model Diversity Index (MDI) calculation