Merged
5 changes: 4 additions & 1 deletion .github/workflows/ci.yml
@@ -70,7 +70,10 @@ jobs:
# local dev setup. hooks.test.js / pre-edit-guide.test.js were added
# in v0.31.2 after the matcher-regression bug; future additions
# should be vetted locally first, not blind-globbed.
run: node --test claude-plugin/scripts/lifecycle.test.js claude-plugin/scripts/lifecycle.e2e.test.js claude-plugin/scripts/auto-update.test.js claude-plugin/scripts/session-init.test.js claude-plugin/scripts/hooks.test.js claude-plugin/scripts/pre-edit-guide.test.js scripts/release-smoke.test.js
# pre-grep-guide.test.js + cg-answer.test.js added in v0.47.0
# (deny-with-answer); vetted locally — node-only, stub binary via
# _CG_ANSWER_BINARY, tmpdir fixtures, no cargo build required.
run: node --test claude-plugin/scripts/lifecycle.test.js claude-plugin/scripts/lifecycle.e2e.test.js claude-plugin/scripts/auto-update.test.js claude-plugin/scripts/session-init.test.js claude-plugin/scripts/hooks.test.js claude-plugin/scripts/pre-edit-guide.test.js claude-plugin/scripts/pre-grep-guide.test.js claude-plugin/scripts/cg-answer.test.js scripts/release-smoke.test.js

# Supply-chain CVE scan over Cargo.lock (376 transitive deps including the
# Candle ML stack). cargo-audit reads RustSec advisories and exits non-zero
25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,30 @@
# Changelog

## v0.47.0 — feat: answer in the deny — denied greps now return the actual results

**What changes for users**: when the PreToolUse hook denies a symbol-shaped raw grep, the
deny message now CONTAINS the results of the AST-aware equivalent (`code-graph-mcp grep
"<pattern>" [path]`, run synchronously inside the hook, ~20ms warm / 2s timeout) instead of
only suggesting the command. Rationale: measured recommend→use transfer of suggestion-style
interventions is ~0% — the model rarely initiates a new tool call because a message told it
to, but it will use results already in front of it.
**Opt-out / revert**: `CODE_GRAPH_NO_ANSWER_IN_DENY=1` restores the v0.46 static deny;
`CODE_GRAPH_NO_BLOCK_GREP=1` still downgrades the whole block tier to hint.

- **Three deny outcomes** (new `cg-answer.js`, all failure modes degrade, never break the
tool call): ≥1 hit → deny with embedded results (truncated at line boundary to ≤4KB);
CLI missing/error/timeout → v0.46 static deny; **0 hits → the raw grep is ALLOWED** with a
one-line FYI (regex-dialect differences — BRE `\|` vs ripgrep — mean 0 hits is not proof
of absence, so a hard deny could mislead).
- **Funnel semantics**: deny records gain `answered: true|false`; no-hit fallthroughs record
`{action:"hint", fallthrough:"no-hits"}`. Rust readers ignore the extra fields (verified:
CLI `grep` does not write `usage.jsonl`, so hook-initiated runs cannot inflate deny→use).
**Reading note**: an answered deny satisfies the need in-place, so `Deny→use` will read
LOW even when this feature works — segment by `answered` when reading Piece 3.
- **Hook stdin hardening**: hooks now read fd 0 directly instead of `/dev/stdin` (the path
form fails silently when stdin is a socketpair, e.g. under `spawnSync({input})` test
harnesses; real Claude Code pipes were unaffected).
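
The stdin hardening described above can be illustrated with a minimal sketch. `readHookInput` is a hypothetical helper name, not taken from this PR, and the sketch assumes the hook payload is a single JSON document. `fs.readFileSync` accepts a numeric file descriptor, so passing `0` reads the inherited stdin handle directly instead of reopening it through the `/dev/stdin` path.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper: the name and the null-on-failure contract are
// assumptions for illustration, not the PR's actual code.
// Reads the whole payload from a file descriptor (default 0 = stdin)
// and parses it as JSON. Any read or parse failure returns null so the
// hook can degrade instead of throwing.
function readHookInput(fd = 0) {
  try {
    const raw = fs.readFileSync(fd, 'utf8');
    return raw.trim() ? JSON.parse(raw) : null;
  } catch {
    return null;
  }
}
```

Taking the descriptor as a parameter keeps the helper testable: a test can open a fixture file and pass its descriptor instead of wiring up a real pipe.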

## v0.46.0 — feat: measure whether the DENY stick converts + honest conversion metric

The recommend→use conversion metric (v0.39.0) was producing **zero usable data** in this repo
107 changes: 107 additions & 0 deletions claude-plugin/scripts/cg-answer.js
@@ -0,0 +1,107 @@
#!/usr/bin/env node
'use strict';
// Synchronous "answer in the deny" runner (v0.47.0).
//
// When pre-grep-guide denies a symbol-shaped raw grep, the measured
// recommend→use transfer rate of a bare suggestion is ~0% — the model rarely
// initiates a NEW tool call just because a deny message told it to. This module
// closes that gap by running the AST-aware equivalent (`code-graph-mcp grep
// "<pattern>" [path]`) inside the hook and handing the deny path the actual
// results, so the model never has to choose.
//
// Posture mirrors recommendation-log.js: bounded and best-effort. Any failure
// (no binary, nonzero exit, timeout, oversized pattern) degrades to
// `unavailable` and the caller falls back to the static deny — answering is an
// enhancement, never a new failure mode for the tool call.
//
// Verified non-polluting: the CLI `grep` subcommand does not write
// usage.jsonl (only the MCP server's SessionMetrics does), so hook-initiated
// runs cannot inflate the deny→use conversion funnel.

const { spawnSync } = require('child_process');

const DEFAULT_TIMEOUT_MS = 2000;
// ~1000 tokens. A deny reason carrying more than this stops being an answer
// and starts being a context tax.
const DEFAULT_MAX_BYTES = 4000;
const MAX_PATTERN_LEN = 200;
// CLI empty-result contract (text mode): stable prefix owned by this repo.
const NO_MATCH_PREFIX = '[code-graph] No matches';

/**
* Truncate text to maxBytes, cutting at the last complete line that fits.
* Falls back to a hard byte cut when even the first line is oversized.
* @returns {{text: string, truncated: boolean}}
*/
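function truncateAtLine(text, maxBytes) {
if (Buffer.byteLength(text, 'utf8') <= maxBytes) {
return { text, truncated: false };
}
const buf = Buffer.from(text, 'utf8');
const head = buf.subarray(0, maxBytes).toString('utf8');
// Drop the possibly half-cut trailing line; any U+FFFD replacement char
// produced by a mid-codepoint cut is dropped along with it.
const lastNl = head.lastIndexOf('\n');
if (lastNl > 0) {
return { text: head.slice(0, lastNl), truncated: true };
}
// Hard cut: back up past UTF-8 continuation bytes (10xxxxxx) so the cut
// lands on a code point boundary. Decoding as latin1 here would turn
// non-ASCII input into mojibake that re-encodes to more than maxBytes.
let end = maxBytes;
while (end > 0 && (buf[end] & 0xc0) === 0x80) end--;
return { text: buf.subarray(0, end).toString('utf8'), truncated: true };
}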

/**
* Run `code-graph-mcp grep <pattern> [searchPath]` synchronously.
*
* @param {object} opts
* @param {string} opts.cwd project root (hook process.cwd())
* @param {string} opts.pattern the symbol-shaped pattern that triggered the deny
* @param {string} [opts.searchPath] optional path scope extracted from the denied command
* @param {string|null} [opts.binary] binary path; tests inject a stub. Defaults to
* `_CG_ANSWER_BINARY` env override, then findBinary().
* @param {number} [opts.timeoutMs]
* @param {number} [opts.maxBytes]
* @returns {{status: 'hits', text: string, truncated: boolean}
* | {status: 'no-hits'}
* | {status: 'unavailable'}}
*/
function runGrepAnswer(opts = {}) {
const {
cwd,
pattern,
searchPath,
timeoutMs = DEFAULT_TIMEOUT_MS,
maxBytes = DEFAULT_MAX_BYTES,
} = opts;
try {
if (!pattern || typeof pattern !== 'string' || pattern.length > MAX_PATTERN_LEN) {
return { status: 'unavailable' };
}
let binary = opts.binary;
if (binary === undefined) {
binary = process.env._CG_ANSWER_BINARY || require('./find-binary').findBinary();
}
if (!binary) return { status: 'unavailable' };

const args = ['grep', pattern];
if (searchPath) args.push(searchPath);
const res = spawnSync(binary, args, {
cwd,
timeout: timeoutMs,
encoding: 'utf8',
maxBuffer: 4 * 1024 * 1024,
stdio: ['ignore', 'pipe', 'ignore'],
});
if (res.error || res.signal || res.status !== 0) {
return { status: 'unavailable' };
}
const out = (res.stdout || '').trim();
if (!out || out.startsWith(NO_MATCH_PREFIX)) {
return { status: 'no-hits' };
}
const { text, truncated } = truncateAtLine(out, maxBytes);
return { status: 'hits', text, truncated };
} catch {
return { status: 'unavailable' };
}
}

module.exports = { runGrepAnswer, truncateAtLine };
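
The three statuses returned by `runGrepAnswer` line up with the three deny outcomes listed in the changelog. The caller, `pre-grep-guide.js`, is not shown in this diff, so the following is only a sketch of how such a mapping might look. The function name `decideFromAnswer`, the `reason` wording, and the `[output truncated]` suffix are assumptions. Only the `action`, `answered`, and `fallthrough` fields come from the changelog's description of the funnel records.

```javascript
'use strict';

// Hypothetical caller-side mapping; not the PR's actual hook code.
// `answer` is the object returned by runGrepAnswer();
// `staticDenyMessage` stands in for the v0.46 suggestion-only deny text.
function decideFromAnswer(answer, staticDenyMessage) {
  switch (answer.status) {
    case 'hits': {
      // Embed the results so the model does not need a follow-up call.
      const suffix = answer.truncated ? '\n[output truncated]' : '';
      return {
        action: 'deny',
        answered: true,
        reason: 'AST-aware results for this search:\n' + answer.text + suffix,
      };
    }
    case 'no-hits':
      // Zero hits is not proof of absence, so let the raw grep through.
      return { action: 'hint', fallthrough: 'no-hits' };
    default:
      // 'unavailable' (missing binary, error, timeout): static deny.
      return { action: 'deny', answered: false, reason: staticDenyMessage };
  }
}
```

Keeping the mapping a pure function of the runner's result means each outcome can be unit-tested without spawning a process.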
135 changes: 135 additions & 0 deletions claude-plugin/scripts/cg-answer.test.js
@@ -0,0 +1,135 @@
'use strict';
const test = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const os = require('os');
const path = require('path');
const { runGrepAnswer, truncateAtLine } = require('./cg-answer');

// Stub "binary": a node script that branches on the pattern argument
// (argv[3], right after the `grep` subcommand) so one stub covers the
// hits / no-hits / error / timeout cases.
let stubDir;
let stubPath;

test.before(() => {
stubDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-answer-test-'));
stubPath = path.join(stubDir, 'cg-stub.js');
fs.writeFileSync(stubPath, `#!/usr/bin/env node
'use strict';
const pattern = process.argv[3] || '';
if (pattern === 'HangForever') { setTimeout(() => {}, 60000); }
else if (pattern === 'ExplodePlease') { process.exit(3); }
else if (pattern === 'NothingHere') {
process.stdout.write('[code-graph] No matches for: NothingHere\\n');
} else {
process.stdout.write(
'src/storage/db.rs:42 fn ' + pattern + '() {\\n' +
' -> fn ' + pattern + ' (lines 42-60)\\n' +
'args=' + JSON.stringify(process.argv.slice(2)) + '\\n');
}
`);
});

test.after(() => {
fs.rmSync(stubDir, { recursive: true, force: true });
});

// runGrepAnswer controls argv, so the stub cannot be run as `node <script>`
// (there is no way to prepend the script path). Instead, make the stub
// itself executable and rely on its node shebang so spawnSync can exec it
// directly, with no shell involved.
function stubBinary() {
fs.chmodSync(stubPath, 0o755);
return stubPath;
}

test('runGrepAnswer: hits → status hits with stdout text', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: 'fts5_search', binary: stubBinary() });
assert.equal(r.status, 'hits');
assert.match(r.text, /fn fts5_search/);
});

test('runGrepAnswer: passes grep subcommand, pattern and path as argv', () => {
const r = runGrepAnswer({
cwd: stubDir, pattern: 'fts5_search', searchPath: 'src/storage/', binary: stubBinary(),
});
assert.equal(r.status, 'hits');
assert.match(r.text, /args=\["grep","fts5_search","src\/storage\/"\]/);
});

test('runGrepAnswer: omits path argv when no searchPath', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: 'fts5_search', binary: stubBinary() });
assert.match(r.text, /args=\["grep","fts5_search"\]/);
});

test('runGrepAnswer: CLI "[code-graph] No matches" → status no-hits', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: 'NothingHere', binary: stubBinary() });
assert.equal(r.status, 'no-hits');
});

test('runGrepAnswer: nonzero exit → unavailable', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: 'ExplodePlease', binary: stubBinary() });
assert.equal(r.status, 'unavailable');
});

test('runGrepAnswer: missing binary → unavailable', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: 'fts5_search', binary: null });
assert.equal(r.status, 'unavailable');
});

test('runGrepAnswer: nonexistent binary path → unavailable', () => {
const r = runGrepAnswer({
cwd: stubDir, pattern: 'fts5_search', binary: path.join(stubDir, 'nope-bin'),
});
assert.equal(r.status, 'unavailable');
});

test('runGrepAnswer: timeout → unavailable', () => {
const r = runGrepAnswer({
cwd: stubDir, pattern: 'HangForever', binary: stubBinary(), timeoutMs: 300,
});
assert.equal(r.status, 'unavailable');
});

test('runGrepAnswer: empty pattern → unavailable (never spawns)', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: '', binary: stubBinary() });
assert.equal(r.status, 'unavailable');
});

test('runGrepAnswer: oversized pattern (>200ch) → unavailable (never spawns)', () => {
const r = runGrepAnswer({ cwd: stubDir, pattern: 'A'.repeat(201), binary: stubBinary() });
assert.equal(r.status, 'unavailable');
});

test('runGrepAnswer: long output is truncated and flagged', () => {
// The stub's first output line exceeds 30 bytes, so a tiny maxBytes forces truncation.
const r = runGrepAnswer({
cwd: stubDir, pattern: 'fts5_search', binary: stubBinary(), maxBytes: 30,
});
assert.equal(r.status, 'hits');
assert.equal(r.truncated, true);
assert.ok(Buffer.byteLength(r.text, 'utf8') <= 30);
});

// ── truncateAtLine (pure) ───────────────────────────────────────────

test('truncateAtLine: under limit → unchanged, not truncated', () => {
const { text, truncated } = truncateAtLine('a\nb\nc', 100);
assert.equal(text, 'a\nb\nc');
assert.equal(truncated, false);
});

test('truncateAtLine: cuts at a line boundary', () => {
const input = 'line-one\nline-two\nline-three\n';
const { text, truncated } = truncateAtLine(input, 20);
assert.equal(truncated, true);
// 20-byte budget fits 'line-one\nline-two' (17B); the half-cut 'li' is dropped
assert.equal(text, 'line-one\nline-two');
});

test('truncateAtLine: single oversized line → hard cut', () => {
const { text, truncated } = truncateAtLine('x'.repeat(50), 10);
assert.equal(truncated, true);
assert.equal(Buffer.byteLength(text, 'utf8'), 10);
});