feat(metrics): deny→use conversion measurement + honest dark-metric visibility#21
Conversation
…isibility fix: make the recommend→use conversion metric honest - stats/health-check surface DARK state instead of silently skipping (absent|empty|live) - stop plugin_e2e metrics leaking into the real usage.jsonl (spawn_server .git marker) - purge of 8 historical test-leak entries done in gitignored data (not in this commit) feat: measure whether the DENY stick actually converts - per-session deny→use funnel via window-join (SessionMetrics.started_at + recs field) - doctor.js stale-path detection (present-but-old-version hook → metric silently dark) - spec: docs/ADOPTION-DENY-MEASUREMENT-PLAN.md (gitignored, local) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR adds hook coverage stale-entry detection to the Claude plugin doctor diagnostic, and implements recommendation event tracking with funnel analytics. Session metrics now capture timestamps and correlate recommendation events to session windows, while the CLI aggregates funnel counters to compute deny→use and hint→use conversion percentages, surfaced in both health-check and stats commands. ChangesHook Coverage Diagnostics
Recommendation Metrics & Funnel Tracking
Sequence DiagramsequenceDiagram
participant SessionMetrics
participant recommendations.jsonl
participant UsageSummary
participant CliCmd as cmd_stats/cmd_health_check
SessionMetrics->>SessionMetrics: initialize started_at
SessionMetrics->>recommendations.jsonl: read at flush time
recommendations.jsonl-->>SessionMetrics: deny/hint events with ts
SessionMetrics->>SessionMetrics: filter events by time window
SessionMetrics->>SessionMetrics: emit recs (deny, hint counts)
UsageSummary->>UsageSummary: aggregate sessions from usage.jsonl
UsageSummary->>UsageSummary: check recs.deny/recs.hint presence
UsageSummary->>UsageSummary: check tools for code-graph allowlist
UsageSummary->>UsageSummary: increment funnel counters
CliCmd->>recommendations.jsonl: check state (absent/empty/live)
CliCmd->>UsageSummary: read funnel counters
CliCmd->>CliCmd: compute deny→use, hint→use percentages
CliCmd->>CliCmd: format JSON and human output
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cli.rs (1)
1074-1077:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
cg_tool_callsis counting every tool call, not just CG query tools.
summary.total_tool_calls()includes housekeeping and other non-CG tools, so this JSON field is mislabeled and the proxy ratio is inflated. The same issue also affects the human proxy line at Line 1143. Recompute both fromCG_QUERY_TOOLSor rename the field back tototal_tool_calls.💡 Suggested fix
+ let cg_tool_calls: u64 = summary + .tools + .iter() + .filter(|(name, _)| CG_QUERY_TOOLS.contains(&name.as_str())) + .map(|(_, agg)| agg.n) + .sum(); println!("{}", serde_json::json!({ @@ - "cg_tool_calls": summary.total_tool_calls(), + "cg_tool_calls": cg_tool_calls, "conversion_ratio": if recs.total > 0 { - (summary.total_tool_calls() as f64 / recs.total as f64 * 100.0).round() / 100.0 + (cg_tool_calls as f64 / recs.total as f64 * 100.0).round() / 100.0 } else { 0.0 },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cli.rs` around lines 1074 - 1077, The JSON field "cg_tool_calls" and its derived "conversion_ratio" are using summary.total_tool_calls(), which counts all tools; instead either rename the field to "total_tool_calls" or (preferred) recompute both values by summing only tools in the CG_QUERY_TOOLS set: iterate over summary's per-tool counts (use the map/structure that holds tool counts in summary) and sum counts for names in CG_QUERY_TOOLS to produce cg_tool_calls, then compute conversion_ratio as (cg_tool_calls as f64 / recs.total as f64 * 100.0) rounded as before; apply the same change to the analogous human-proxy calculation referenced by the human proxy code path so both use CG_QUERY_TOOLS-filtered counts instead of summary.total_tool_calls().
🧹 Nitpick comments (1)
tests/cli_e2e.rs (1)
779-799: ⚡ Quick winConsider adding test coverage for hint→use funnel.
The test verifies deny→use funnel attribution correctly. However, according to the stack context,
UsageSummarytracks both deny→use and hint→use funnels (viasessions_with_hintandsessions_with_hint_and_cgfields), and the stats output should include both conversions. Currently, no test verifies the hint→use funnel path—both sessions in this test havehint:0.🧪 Suggested test case for hint→use funnel
Add a test similar to this one but with sessions containing hint recommendations:
#[test] fn test_cli_stats_hint_to_use_funnel() { let project = setup_indexed_project(); let cg = project.path().join(code_graph_mcp::domain::CODE_GRAPH_DIR); // Two hint-sessions, one of which called a cg query tool → 1/2 = 50%. let s1 = "{\"ts\":\"2026-06-10T10:00:00Z\",\"v\":\"0.45.4\",\"tools\":{\"get_call_graph\":{\"n\":1,\"ms\":5,\"err\":0,\"max_ms\":5}},\"recs\":{\"deny\":0,\"hint\":1}}"; let s2 = "{\"ts\":\"2026-06-10T11:00:00Z\",\"v\":\"0.45.4\",\"tools\":{},\"recs\":{\"deny\":0,\"hint\":1}}"; std::fs::write(cg.join("usage.jsonl"), format!("{s1}\n{s2}\n")).unwrap(); let (stdout, _, code) = run_cli(&project, &["stats"]); assert_eq!(code, 0); assert!(stdout.contains("Hint→use:") && stdout.contains("50%"), "stats must print the hint→use funnel; got: {stdout:?}"); let (jstdout, _, jcode) = run_cli(&project, &["stats", "--json"]); assert_eq!(jcode, 0); let v: serde_json::Value = serde_json::from_str(jstdout.trim()).unwrap(); let funnel = &v["recommendations"]["funnel"]; assert_eq!(funnel["hint_sessions"], 2); assert_eq!(funnel["hint_then_cg"], 1); assert_eq!(funnel["hint_conversion"], 0.5); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/cli_e2e.rs` around lines 779 - 799, Add a new test mirroring test_cli_stats_deny_to_use_funnel to cover the hint→use funnel: create a fn test_cli_stats_hint_to_use_funnel that writes two usage.jsonl sessions (both with "recs":{"deny":0,"hint":1}, one including a get_call_graph tool) to the code_graph_mcp::domain::CODE_GRAPH_DIR, call run_cli(&project, &["stats"]) and assert stdout contains "Hint→use" and "50%", then call run_cli(&project, &["stats","--json"]) and assert the JSON at v["recommendations"]["funnel"] has hint_sessions == 2, hint_then_cg == 1, and hint_conversion == 0.5; use the same helpers and patterns from test_cli_stats_deny_to_use_funnel (setup_indexed_project, run_cli) to locate and implement the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@claude-plugin/scripts/doctor.js`:
- Around line 257-277: The loop currently marks entries as present when
isOurHookEntry(entry) is true even if entry.hooks[0].command is missing, which
lets empty/commandless registrations slip past the stale check; update the stale
calculation so a present key with no command is treated as stale by changing the
predicate in the stale computation to treat falsy presentCmd[k] as stale (e.g.
in the stale = expected.filter(...) expression, use present.has(k) &&
desiredCmd[k] && (!presentCmd[k] || presentCmd[k] !== desiredCmd[k])); this
keeps the isOurHookEntry / present set logic intact but ensures commandless
entries are reported as stale/missing.
In `@claude-plugin/scripts/doctor.test.js`:
- Around line 66-75: The test mutates a hook command using a hardcoded
'/scripts/' separator which fails on Windows; update the mutation in the test
(the line that replaces bash.hooks[0].command) to be separator-agnostic by
matching either POSIX or Windows separators (e.g. use a regex like
[\\/]+scripts[\\/] or build the replacement using path.sep) so the stale-path
branch exercised by surveyHookCoverage and settingsWithCurrentHooks still
triggers on all platforms.
In `@src/cli.rs`:
- Around line 1015-1022: The computed rec_state (based on recs,
rec_path.exists(), and aggregate_recommendations_jsonl) is never surfaced
because early returns (in functions producing stats) occur before any
recommendations payload is emitted; ensure the recommendations.state is always
included in the emitted stats payload even on cold-start/no-usage paths by
adding recommendations/state (using rec_state) to the stats or recommendations
object before any return path that currently omits it (references: rec_state,
recs, rec_path, aggregate_recommendations_jsonl, and the function producing
stats/recommendations); update the early-return branches to attach a minimal
recommendations object with state "absent"/"empty"/"live" so callers always
receive the field.
- Line 698: The JSON health-check returned by the no-index/onboarding path is
missing the conversion_metric field because the early-return branch (the block
that returns before the post-open path) doesn't include it; update that
early-return JSON construction in src/cli.rs to call
recommendation_metric_state(project_root) and include its result under the
"conversion_metric" key (the same value currently added in the post-open branch)
so both code paths produce identical fields for "conversion_metric".
In `@src/mcp/metrics.rs`:
- Around line 244-249: The current join only filters recommendations by
started_at; update the join to also bound by this session's end by passing the
session end timestamp into count_recs_in_window and enforcing started_at <= ts
<= ended_at. Concretely, change the call site that reads recommendations.jsonl
(the block using usage_path.parent() and calling count_recs_in_window(&content,
&self.started_at)) to pass self.ended_at (or a reference to it) as a second
argument, and update the count_recs_in_window function signature/logic to accept
the end timestamp and filter recommendation records so their ts is between
started_at and ended_at (inclusive) before counting deny/hint.
---
Outside diff comments:
In `@src/cli.rs`:
- Around line 1074-1077: The JSON field "cg_tool_calls" and its derived
"conversion_ratio" are using summary.total_tool_calls(), which counts all tools;
instead either rename the field to "total_tool_calls" or (preferred) recompute
both values by summing only tools in the CG_QUERY_TOOLS set: iterate over
summary's per-tool counts (use the map/structure that holds tool counts in
summary) and sum counts for names in CG_QUERY_TOOLS to produce cg_tool_calls,
then compute conversion_ratio as (cg_tool_calls as f64 / recs.total as f64 *
100.0) rounded as before; apply the same change to the analogous human-proxy
calculation referenced by the human proxy code path so both use
CG_QUERY_TOOLS-filtered counts instead of summary.total_tool_calls().
---
Nitpick comments:
In `@tests/cli_e2e.rs`:
- Around line 779-799: Add a new test mirroring
test_cli_stats_deny_to_use_funnel to cover the hint→use funnel: create a fn
test_cli_stats_hint_to_use_funnel that writes two usage.jsonl sessions (both
with "recs":{"deny":0,"hint":1}, one including a get_call_graph tool) to the
code_graph_mcp::domain::CODE_GRAPH_DIR, call run_cli(&project, &["stats"]) and
assert stdout contains "Hint→use" and "50%", then call run_cli(&project,
&["stats","--json"]) and assert the JSON at v["recommendations"]["funnel"] has
hint_sessions == 2, hint_then_cg == 1, and hint_conversion == 0.5; use the same
helpers and patterns from test_cli_stats_deny_to_use_funnel
(setup_indexed_project, run_cli) to locate and implement the test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1c933d38-d5a4-414f-b787-679b3eefa1ad
📒 Files selected for processing (6)
claude-plugin/scripts/doctor.jsclaude-plugin/scripts/doctor.test.jssrc/cli.rssrc/mcp/metrics.rstests/cli_e2e.rstests/plugin_e2e.rs
| if (isOurHookEntry(entry)) { | ||
| present.add(`${event}:${entry.matcher || '*'}`); | ||
| const key = `${event}:${entry.matcher || '*'}`; | ||
| present.add(key); | ||
| if (entry.hooks && entry.hooks[0] && entry.hooks[0].command) { | ||
| presentCmd[key] = entry.hooks[0].command; | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| const missing = expected.filter(k => !present.has(k)); | ||
| return { expected, present: [...present], missing }; | ||
| // Stale = present but the registered command no longer matches what we'd write | ||
| // now (points at an old plugin-cache version dir / moved path). A stale path can | ||
| // run pre-recordRecommendation hook code, so the hook fires but the conversion | ||
| // metric stays dark — invisible to a present/absent check. This is the | ||
| // 0.45.1-registered-while-0.45.4-active case the RCA surfaced. | ||
| const stale = expected.filter(k => | ||
| present.has(k) && desiredCmd[k] && presentCmd[k] && presentCmd[k] !== desiredCmd[k] | ||
| ); | ||
| return { expected, present: [...present], missing, stale }; |
There was a problem hiding this comment.
Don't count commandless hook entries as covered.
isOurHookEntry() accepts entries by description alone, so an entry with the right (event, matcher) but an empty hooks array still reaches Line 259. From there it is marked present, but Line 275 drops it from stale because presentCmd[k] is falsy. That makes Lines 209-214 report coverage as OK even though Claude has nothing executable for that hook. Treat missing/empty commands as stale, or keep those entries out of present, so doctor doesn't false-negative broken registrations.
Suggested fix
for (const entry of entries) {
if (isOurHookEntry(entry)) {
const key = `${event}:${entry.matcher || '*'}`;
present.add(key);
- if (entry.hooks && entry.hooks[0] && entry.hooks[0].command) {
- presentCmd[key] = entry.hooks[0].command;
- }
+ presentCmd[key] = entry.hooks && entry.hooks[0]
+ ? entry.hooks[0].command || null
+ : null;
}
}
}
}
@@
const stale = expected.filter(k =>
- present.has(k) && desiredCmd[k] && presentCmd[k] && presentCmd[k] !== desiredCmd[k]
+ present.has(k) && desiredCmd[k] && presentCmd[k] !== desiredCmd[k]
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (isOurHookEntry(entry)) { | |
| present.add(`${event}:${entry.matcher || '*'}`); | |
| const key = `${event}:${entry.matcher || '*'}`; | |
| present.add(key); | |
| if (entry.hooks && entry.hooks[0] && entry.hooks[0].command) { | |
| presentCmd[key] = entry.hooks[0].command; | |
| } | |
| } | |
| } | |
| } | |
| } | |
| const missing = expected.filter(k => !present.has(k)); | |
| return { expected, present: [...present], missing }; | |
| // Stale = present but the registered command no longer matches what we'd write | |
| // now (points at an old plugin-cache version dir / moved path). A stale path can | |
| // run pre-recordRecommendation hook code, so the hook fires but the conversion | |
| // metric stays dark — invisible to a present/absent check. This is the | |
| // 0.45.1-registered-while-0.45.4-active case the RCA surfaced. | |
| const stale = expected.filter(k => | |
| present.has(k) && desiredCmd[k] && presentCmd[k] && presentCmd[k] !== desiredCmd[k] | |
| ); | |
| return { expected, present: [...present], missing, stale }; | |
| if (isOurHookEntry(entry)) { | |
| const key = `${event}:${entry.matcher || '*'}`; | |
| present.add(key); | |
| presentCmd[key] = entry.hooks && entry.hooks[0] | |
| ? entry.hooks[0].command || null | |
| : null; | |
| } | |
| } | |
| } | |
| } | |
| const missing = expected.filter(k => !present.has(k)); | |
| // Stale = present but the registered command no longer matches what we'd write | |
| // now (points at an old plugin-cache version dir / moved path). A stale path can | |
| // run pre-recordRecommendation hook code, so the hook fires but the conversion | |
| // metric stays dark — invisible to a present/absent check. This is the | |
| // 0.45.1-registered-while-0.45.4-active case the RCA surfaced. | |
| const stale = expected.filter(k => | |
| present.has(k) && desiredCmd[k] && presentCmd[k] !== desiredCmd[k] | |
| ); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@claude-plugin/scripts/doctor.js` around lines 257 - 277, The loop currently
marks entries as present when isOurHookEntry(entry) is true even if
entry.hooks[0].command is missing, which lets empty/commandless registrations
slip past the stale check; update the stale calculation so a present key with no
command is treated as stale by changing the predicate in the stale computation
to treat falsy presentCmd[k] as stale (e.g. in the stale = expected.filter(...)
expression, use present.has(k) && desiredCmd[k] && (!presentCmd[k] ||
presentCmd[k] !== desiredCmd[k])); this keeps the isOurHookEntry / present set
logic intact but ensures commandless entries are reported as stale/missing.
| test('surveyHookCoverage flags a present-but-stale hook path', () => { | ||
| const settings = settingsWithCurrentHooks(); | ||
| // Repoint one PreToolUse entry at an old plugin-cache version dir — present, | ||
| // recognized as ours (description unchanged), but command no longer current. | ||
| const bash = settings.hooks.PreToolUse.find(e => e.matcher === 'Bash'); | ||
| bash.hooks[0].command = bash.hooks[0].command.replace('/scripts/', '/0.0.1-old/scripts/'); | ||
| const cov = surveyHookCoverage(settings); | ||
| assert.equal(cov.missing.length, 0, 'entry is present, not missing'); | ||
| assert.ok(cov.stale.includes('PreToolUse:Bash'), | ||
| `stale Bash path should be flagged; got stale=${JSON.stringify(cov.stale)}`); |
There was a problem hiding this comment.
Make the stale-path rewrite separator-agnostic.
Line 71 hardcodes '/scripts/', but buildSettingsHookEntries() builds these commands with path.join(...). On Windows that leaves \scripts\ untouched, so this test stops creating a stale command and no longer covers the branch it claims to validate. Use path.sep or a [\\/]scripts[\\/] regex here.
Suggested fix
- bash.hooks[0].command = bash.hooks[0].command.replace('/scripts/', '/0.0.1-old/scripts/');
+ bash.hooks[0].command = bash.hooks[0].command.replace(
+ /([\\/])scripts\1/,
+ (_, sep) => `${sep}0.0.1-old${sep}scripts${sep}`,
+ );📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| test('surveyHookCoverage flags a present-but-stale hook path', () => { | |
| const settings = settingsWithCurrentHooks(); | |
| // Repoint one PreToolUse entry at an old plugin-cache version dir — present, | |
| // recognized as ours (description unchanged), but command no longer current. | |
| const bash = settings.hooks.PreToolUse.find(e => e.matcher === 'Bash'); | |
| bash.hooks[0].command = bash.hooks[0].command.replace('/scripts/', '/0.0.1-old/scripts/'); | |
| const cov = surveyHookCoverage(settings); | |
| assert.equal(cov.missing.length, 0, 'entry is present, not missing'); | |
| assert.ok(cov.stale.includes('PreToolUse:Bash'), | |
| `stale Bash path should be flagged; got stale=${JSON.stringify(cov.stale)}`); | |
| test('surveyHookCoverage flags a present-but-stale hook path', () => { | |
| const settings = settingsWithCurrentHooks(); | |
| // Repoint one PreToolUse entry at an old plugin-cache version dir — present, | |
| // recognized as ours (description unchanged), but command no longer current. | |
| const bash = settings.hooks.PreToolUse.find(e => e.matcher === 'Bash'); | |
| bash.hooks[0].command = bash.hooks[0].command.replace( | |
| /([\\/])scripts\1/, | |
| (_, sep) => `${sep}0.0.1-old${sep}scripts${sep}`, | |
| ); | |
| const cov = surveyHookCoverage(settings); | |
| assert.equal(cov.missing.length, 0, 'entry is present, not missing'); | |
| assert.ok(cov.stale.includes('PreToolUse:Bash'), | |
| `stale Bash path should be flagged; got stale=${JSON.stringify(cov.stale)}`); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@claude-plugin/scripts/doctor.test.js` around lines 66 - 75, The test mutates
a hook command using a hardcoded '/scripts/' separator which fails on Windows;
update the mutation in the test (the line that replaces bash.hooks[0].command)
to be separator-agnostic by matching either POSIX or Windows separators (e.g.
use a regex like [\\/]+scripts[\\/] or build the replacement using path.sep) so
the stale-path branch exercised by surveyHookCoverage and
settingsWithCurrentHooks still triggers on all platforms.
| "embedding_status": embedding_status, | ||
| "model_available": model_available, | ||
| "snapshot": snapshot_block, | ||
| "conversion_metric": recommendation_metric_state(project_root), |
There was a problem hiding this comment.
The no-index JSON health-check still omits conversion_metric.
This field is only added in the post-open branch. When index.db is missing, Line 576 through Line 596 return earlier, so JSON callers lose the new state exactly on the onboarding/no-index path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/cli.rs` at line 698, The JSON health-check returned by the
no-index/onboarding path is missing the conversion_metric field because the
early-return branch (the block that returns before the post-open path) doesn't
include it; update that early-return JSON construction in src/cli.rs to call
recommendation_metric_state(project_root) and include its result under the
"conversion_metric" key (the same value currently added in the post-open branch)
so both code paths produce identical fields for "conversion_metric".
| let rec_exists = rec_path.exists(); | ||
| let recs = std::fs::read_to_string(&rec_path).ok() | ||
| .map(|c| aggregate_recommendations_jsonl(&c)) | ||
| .unwrap_or_default(); | ||
| // Recording-side state of the conversion metric, made explicit so a dark | ||
| // metric (file absent → PreToolUse hooks not recording here) is never | ||
| // silently indistinguishable from "feature absent" or "no data yet". | ||
| let rec_state = if recs.total > 0 { "live" } else if rec_exists { "empty" } else { "absent" }; |
There was a problem hiding this comment.
stats still hides recommendations.state on the no-usage paths.
rec_state is computed here, but Line 994 through Line 1005 and Line 1024 through Line 1031 both return before any recommendations payload is emitted. That makes absent vs empty vs live invisible in the cold-start cases this diagnostic is supposed to explain.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/cli.rs` around lines 1015 - 1022, The computed rec_state (based on recs,
rec_path.exists(), and aggregate_recommendations_jsonl) is never surfaced
because early returns (in functions producing stats) occur before any
recommendations payload is emitted; ensure the recommendations.state is always
included in the emitted stats payload even on cold-start/no-usage paths by
adding recommendations/state (using rec_state) to the stats or recommendations
object before any return path that currently omits it (references: rec_state,
recs, rec_path, aggregate_recommendations_jsonl, and the function producing
stats/recommendations); update the early-return branches to attach a minimal
recommendations object with state "absent"/"empty"/"live" so callers always
receive the field.
| if let Some(dir) = usage_path.parent() { | ||
| if let Ok(content) = std::fs::read_to_string(dir.join("recommendations.jsonl")) { | ||
| let (deny, hint) = count_recs_in_window(&content, &self.started_at); | ||
| if deny > 0 || hint > 0 { | ||
| record["recs"] = serde_json::json!({ "deny": deny, "hint": hint }); | ||
| } |
There was a problem hiding this comment.
Bound the recommendation join by session end.
This only constrains ts >= started_at. If two MCP sessions share a repo, recommendations emitted by a later overlapping session can be attributed to whichever earlier session flushes afterward, which writes inflated recs into usage.jsonl and skews the deny/hint funnel. Pass this session’s end timestamp into the helper and require started_at <= ts <= ended_at.
💡 Suggested fix
- let (deny, hint) = count_recs_in_window(&content, &self.started_at);
+ let ended_at = record
+ .get("ts")
+ .and_then(|v| v.as_str())
+ .unwrap_or(&self.started_at)
+ .to_owned();
+ let (deny, hint) =
+ count_recs_in_window(&content, &self.started_at, &ended_at);
if deny > 0 || hint > 0 {
record["recs"] = serde_json::json!({ "deny": deny, "hint": hint });
}
}
}
@@
-pub(crate) fn count_recs_in_window(rec_content: &str, started_at: &str) -> (u64, u64) {
+pub(crate) fn count_recs_in_window(
+ rec_content: &str,
+ started_at: &str,
+ ended_at: &str,
+) -> (u64, u64) {
@@
- if ts < started_at {
+ if ts < started_at || ts > ended_at {
continue;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if let Some(dir) = usage_path.parent() { | |
| if let Ok(content) = std::fs::read_to_string(dir.join("recommendations.jsonl")) { | |
| let (deny, hint) = count_recs_in_window(&content, &self.started_at); | |
| if deny > 0 || hint > 0 { | |
| record["recs"] = serde_json::json!({ "deny": deny, "hint": hint }); | |
| } | |
| if let Some(dir) = usage_path.parent() { | |
| if let Ok(content) = std::fs::read_to_string(dir.join("recommendations.jsonl")) { | |
| let ended_at = record | |
| .get("ts") | |
| .and_then(|v| v.as_str()) | |
| .unwrap_or(&self.started_at) | |
| .to_owned(); | |
| let (deny, hint) = | |
| count_recs_in_window(&content, &self.started_at, &ended_at); | |
| if deny > 0 || hint > 0 { | |
| record["recs"] = serde_json::json!({ "deny": deny, "hint": hint }); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/mcp/metrics.rs` around lines 244 - 249, The current join only filters
recommendations by started_at; update the join to also bound by this session's
end by passing the session end timestamp into count_recs_in_window and enforcing
started_at <= ts <= ended_at. Concretely, change the call site that reads
recommendations.jsonl (the block using usage_path.parent() and calling
count_recs_in_window(&content, &self.started_at)) to pass self.ended_at (or a
reference to it) as a second argument, and update the count_recs_in_window
function signature/logic to accept the end timestamp and filter recommendation
records so their ts is between started_at and ended_at (inclusive) before
counting deny/hint.
Two related batches on the recommend→use conversion metric. No published tool-schema (
src/mcp/tools.rs) changes; all surfaces are diagnostic (stats,health-check,doctor) or internal (metrics, tests).fix: make the conversion metric honest
The v0.39.0 recommend→use metric was producing zero usable data in this repo and was silently invisible.
stats(text + JSONrecommendations.state) andhealth-checknow surfaceabsent | empty | liveinstead of silently skipping the block. You can no longer mistake "hooks not recording" for "feature absent".resolve_project_root_fromwalks up to the nearest ancestor.git;plugin_e2e::spawn_serverwrote only aCargo.toml, so a fixture under the repo flushed test metrics (nonexistent_tool,dur_s:0) into the realusage.jsonl, corrupting the conversion denominator. Fix = test isolation (spawn_serverdrops a.gitmarker); prod resolve logic unchanged (subdir-CLI behavior is intended). Regression testtest_stdio_metrics_isolated_from_ancestor_repo.usage.jsonlbyte-for-byte unchanged.feat: measure whether the DENY stick actually converts
HINT-tier nudges convert ≈0; DENY (
pre-grep-guide.jspermissionDecision:'deny') is the only lever that mechanically changes tool selection — but it was never cleanly measured. This makes it measurable, without adding/broadening any intervention.SessionMetricsstampsstarted_at; flush counts in-window deny/hint fromrecommendations.jsonlinto the session's usage record ("recs":{deny,hint}, additive — no schema bump).statsprintsDeny→use: M/N = X%(+ hint); JSON addsrecommendations.funnel. "Used cg" = the 14 understanding tools (excludes housekeeping).recordRecommendationcode, keeping the metric dark. Addedstale[]tosurveyHookCoverage→ routes to the existing re-register fix.Follow-up (not in this PR)
docs/ADOPTION-DENY-MEASUREMENT-PLAN.md(gitignored, local).Validation
cargo testall green (lib 476, cli_e2e 143, etc.);cargo +1.95.0 clippy --all-targetsclean;node --test doctor.test.js lifecycle.test.js33/33; full suite leavesusage.jsonlunchanged (no leak).🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Improvements
Tests