Skip to content

Commit 4ac6296

Browse files
authored
feat(session-log): /log-session command + 6-field format spec (#114)
## Context Roadmap **P0.2** from the [2026-04-19 overnight research implementation roadmap](https://github.com/mojwang/mojwang.tech/blob/main/../../../../../ai/workspace/claude/resources/agentic-research/2026-04-19-implementation-roadmap.md). Unblocks **P0.3** (\`/grade-session\` retroactive grading) and **P5.1** (meta-agent) — both need a real log with real rows. PR #112 shipped documentation for a 5-field cost-log format but never shipped a writer. This PR ships the writer and extends the format to 6 fields: adds \`agent_sha\` so later analysis can correlate outcomes with specific agent-prompt versions (P5.1 requirement). ## Format (pipe-separated) \`\`\` YYYY-MM-DDTHH:MM:SS | topic | dispatches | models | outcome | agent_sha \`\`\` Outcome enum: \`shipped | partial | reverted | blocked | plan-only | (empty)\`. Empty = "grade later," filled retroactively by \`/grade-session\` (forthcoming in P0.3). ## Design decisions - **Slash command + shell script, NOT a Stop hook.** Claude Code's \`Stop\` hook fires every turn, not session end, so a hook-driven autolog would misfire on partial turns. One explicit \`/log-session\` per session is cleaner and gives the user control over topic/dispatches/models (fields no hook could auto-capture). - **Timestamp instead of just date** — multiple sessions per day are common. - **Auto-captured fields:** timestamp (\`date\`), agent_sha (\`git log\` on \`.claude/agents/\`). - **\`--outcome\` sentinel:** \`--outcome \"\"\` explicitly marks ungraded, skipping the prompt; missing \`--outcome\` prompts interactively. - **Pipe-char sanitization:** user fields replace \`|\` with \`;\` so malformed input doesn't break column parsing. - **Log file gitignored** — personal state, not a shared artifact. ## Files - \`scripts/log-session.sh\` — new, executable, \`shellcheck\` clean, follows \`#!/usr/bin/env bash\` + \`set -euo pipefail\` convention - \`.claude/commands/log-session.md\` — new slash-command wrapper - \`docs/CLAUDE_AGENTS.md\` — updated format doc (6 fields, broader outcome enum, usage example) - \`.gitignore\` — adds \`scripts/.session-cost.log\` ## Usage \`\`\` /log-session --topic \"P0.2 session-logger infra\" --dispatches \"—\" --models \"sonnet\" --outcome \"shipped\" \`\`\` Missing flags prompt interactively. ## Verification - [x] \`shellcheck scripts/log-session.sh\` — clean - [x] Happy path: \`--topic … --outcome shipped\` writes a row - [x] Empty outcome: \`--outcome \"\"\` writes a row with blank 5th field - [x] Invalid outcome: \`--outcome bogus\` exits 2 with clear error - [x] Pipe sanitization: topic containing \`|\` is rewritten as \`;\` - [x] Gitignored: log file doesn't appear in \`git status\` ## Out of scope (deferred to P0.3) - \`/grade-session\` retroactive grading (walks ungraded rows) - \`weekly-review\` command update to surface ungraded sessions - SessionStart hook nudge (\"N ungraded sessions\") - Auto-capture of dispatches + models via transcript parsing (Claude Code doesn't expose transcripts to hooks)
1 parent 0c5ac08 commit 4ac6296

4 files changed

Lines changed: 187 additions & 9 deletions

File tree

.claude/commands/log-session.md

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
---
2+
description: Append a row to scripts/.session-cost.log capturing this session's directional cost signal (topic, dispatches, models, outcome). Auto-captures timestamp + agent_sha. Outcome can be left empty to grade retroactively via /grade-session.
3+
allowed-tools: Bash
4+
---
5+
6+
# Log Session
7+
8+
Run the session logger script, forwarding any user-supplied args:
9+
10+
```
11+
./scripts/log-session.sh $ARGUMENTS
12+
```
13+
14+
## Usage
15+
16+
All flags are optional — the script prompts for missing required fields:
17+
18+
```
19+
/log-session --topic "<one-liner>" --dispatches "<agent x tier or —>" --models "<comma-sep>" --outcome "<enum or empty>"
20+
```
21+
22+
Example:
23+
24+
```
25+
/log-session --topic "P0.2 session-logger infra" --dispatches "—" --models "sonnet" --outcome "shipped"
26+
```
27+
28+
`--outcome ""` is valid — a blank outcome means "grade later." Use `/grade-session` (forthcoming in P0.3) to fill in retroactively.
29+
30+
## Fields
31+
32+
- **topic**: one-line session summary
33+
- **dispatches**: agent count × tier (e.g. `Explore x3, Plan x1` or `` for direct work)
34+
- **models**: comma-separated as provided (e.g. `haiku,sonnet`) — logged as-is, no dedup
35+
- **outcome**: one of `shipped | partial | reverted | blocked | plan-only` — or empty to grade later
36+
37+
Captured automatically:
38+
- **timestamp**: ISO 8601 local time
39+
- **agent_sha**: short SHA of the most recent commit that touched `.claude/agents/` as of logging time (correlates outcomes with prompt versions; reflects end-of-session state if agents were edited mid-session)

.gitignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -90,3 +90,6 @@ product-brief.md
9090

9191
# Claude Code worktree metadata
9292
.claude/worktrees/
93+
94+
# Session cost log (populated by /log-session at end of each session)
95+
scripts/.session-cost.log

docs/CLAUDE_AGENTS.md

Lines changed: 17 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -176,20 +176,28 @@ Model routing is enforced via agent frontmatter. Each agent has a `model:` field
176176

177177
### Session cost log
178178

179-
At session end, append a single line to `scripts/.session-cost.log` (append-only, gitignored, reviewed at weekly review):
179+
At session end, append a single line to `scripts/.session-cost.log` (append-only, gitignored, reviewed at weekly review). Populated via the `/log-session` slash command, which runs `scripts/log-session.sh`:
180180

181181
```
182-
YYYY-MM-DD | <session_topic> | <dispatches> | <models_used> | <outcome>
182+
YYYY-MM-DDTHH:MM:SS | <session_topic> | <dispatches> | <models_used> | <outcome> | <agent_sha>
183183
```
184184

185-
Fields:
186-
- **date**: ISO date
187-
- **session_topic**: one-line summary ("/mind aspects Phase 6 B-C-D")
188-
- **dispatches**: agent count × tier, e.g. `Explore×3, Plan×1, impl×5` or `` for direct work
189-
- **models_used**: comma-separated, deduped — `haiku,sonnet` / `sonnet,opus`
190-
- **outcome**: one-word rollup — `shipped`, `blocked`, `partial`, `plan-only`
185+
Fields (pipe-separated, six total):
186+
- **timestamp**: ISO 8601 local time (auto-captured). Use timestamps, not just date — multiple sessions per day are common.
187+
- **session_topic**: one-line summary ("/mind aspects Phase 6 B-C-D"). User-supplied.
188+
- **dispatches**: agent count × tier, e.g. `Explore×3, Plan×1, impl×5` or `` for direct work. User-supplied.
189+
- **models_used**: comma-separated, user-supplied as-is (e.g. `haiku,sonnet` / `sonnet,opus`). No dedup — the order and count the user types is what gets logged.
190+
- **outcome**: one of `shipped | partial | reverted | blocked | plan-only`. User-supplied at session end; empty is valid and means "grade later" (fill in retroactively via `/grade-session`).
191+
- **agent_sha**: short SHA of the most recent commit that touched `.claude/agents/` as of logging time (auto-captured via `git log -1`). Ties a row to a specific agent-prompt version so later analysis can correlate outcomes with prompt changes. Caveat: if agents are edited mid-session, this SHA reflects end-of-session state, not start.
191192

192-
Not exact token counting — Claude Code doesn't expose that. Gives directional cost awareness over time. Patterns emerge at weekly review: "research-heavy sessions cost more than implementation sessions," "sessions with >10 dispatches usually needed Graphite instead of manual stacks."
193+
Usage:
194+
```
195+
/log-session --topic "session summary" --dispatches "Explore×3" --models "haiku,sonnet" --outcome "shipped"
196+
```
197+
198+
Missing flags are prompted interactively. `--outcome ""` explicitly marks the row ungraded.
199+
200+
Not exact token counting — Claude Code doesn't expose that. Gives directional cost awareness over time. Patterns emerge at weekly review: "research-heavy sessions cost more than implementation sessions," "sessions with >10 dispatches usually needed Graphite instead of manual stacks," "the planner has a higher shipped-rate on SHA X than SHA Y."
193201

194202
See [`STACKED_PRS.md`](./STACKED_PRS.md) for when manual stacking indicates missing tooling.
195203

scripts/log-session.sh

Lines changed: 128 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,128 @@
1+
#!/usr/bin/env bash
2+
# Append a single row to scripts/.session-cost.log capturing this
3+
# session's directional cost signal (dispatches, models, outcome) plus
4+
# an agent_sha that ties the row to a specific `.claude/agents/` version.
5+
#
6+
# Invoked by the /log-session slash command at end of a session, or
7+
# directly from the shell. Fields are documented in
8+
# docs/CLAUDE_AGENTS.md.
9+
#
10+
# Roadmap reference: P0.2 from the 2026-04-19 overnight research.
11+
12+
set -euo pipefail
13+
14+
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
15+
REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
16+
LOG_PATH="$REPO_ROOT/scripts/.session-cost.log"
17+
18+
TOPIC=""
19+
DISPATCHES=""
20+
MODELS=""
21+
OUTCOME=""
22+
# Only outcome has a legitimate empty value (grade later via /grade-session).
23+
# A sentinel lets `--outcome ""` bypass the prompt explicitly; missing
24+
# --outcome still prompts interactively.
25+
OUTCOME_SET=0
26+
27+
usage() {
28+
cat <<EOF
29+
Usage: $(basename "$0") [--topic "..."] [--dispatches "..."] [--models "..."] [--outcome "..."]
30+
31+
Appends one row to scripts/.session-cost.log.
32+
33+
Missing required fields are prompted interactively. Outcome is validated
34+
against the enum: shipped | partial | reverted | blocked | plan-only
35+
(empty allowed — grade later via /grade-session).
36+
37+
Fields:
38+
--topic One-line session summary (required)
39+
--dispatches Agent count x tier, e.g. "Explore x3, Plan x1" or "—"
40+
--models Comma-separated models used, e.g. "haiku,sonnet"
41+
--outcome One of: shipped | partial | reverted | blocked | plan-only | (empty)
42+
EOF
43+
}
44+
45+
# Defense against `--flag` passed with no value (e.g. last arg of the
46+
# line) or with another flag as its value. Without these guards, `set -u`
47+
# aborts on `$2` with "unbound variable" — opaque for the user. The
48+
# short-circuit `$# -lt 2` keeps `$2` from being evaluated when there's
49+
# only one arg left.
50+
_require_value() {
51+
local flag="$1"; shift
52+
if [[ $# -lt 1 || "$1" == -* ]]; then
53+
echo "Error: $flag requires a value" >&2
54+
usage >&2
55+
exit 2
56+
fi
57+
}
58+
59+
while [[ $# -gt 0 ]]; do
60+
case "$1" in
61+
--topic) _require_value "$1" "${@:2:1}"; TOPIC="$2"; shift 2 ;;
62+
--dispatches) _require_value "$1" "${@:2:1}"; DISPATCHES="$2"; shift 2 ;;
63+
--models) _require_value "$1" "${@:2:1}"; MODELS="$2"; shift 2 ;;
64+
--outcome) _require_value "$1" "${@:2:1}"; OUTCOME="$2"; OUTCOME_SET=1; shift 2 ;;
65+
-h|--help) usage; exit 0 ;;
66+
*) echo "Unknown arg: $1" >&2; usage >&2; exit 2 ;;
67+
esac
68+
done
69+
70+
# Prompt for missing required fields. Outcome is optional (grade later).
71+
if [[ -z "$TOPIC" ]]; then
72+
read -r -p "Session topic (one line): " TOPIC
73+
fi
74+
if [[ -z "$DISPATCHES" ]]; then
75+
read -r -p "Dispatches (e.g. 'Explore x3, Plan x1' or '—'): " DISPATCHES
76+
fi
77+
if [[ -z "$MODELS" ]]; then
78+
read -r -p "Models used (comma-separated, e.g. 'haiku,sonnet'): " MODELS
79+
fi
80+
if [[ $OUTCOME_SET -eq 0 ]]; then
81+
read -r -p "Outcome (shipped|partial|reverted|blocked|plan-only, empty = grade later): " OUTCOME
82+
fi
83+
84+
# Validate outcome enum when non-empty.
85+
case "$OUTCOME" in
86+
""|shipped|partial|reverted|blocked|plan-only) ;;
87+
*)
88+
echo "Invalid outcome: '$OUTCOME'" >&2
89+
echo "Must be one of: shipped | partial | reverted | blocked | plan-only | (empty)" >&2
90+
exit 2
91+
;;
92+
esac
93+
94+
# Guard against pipe characters in fields — they'd break the pipe-
95+
# delimited format. Replace with ' ; ' (with spaces) as a deliberately
96+
# visible sentinel so sanitized rows stand out in any weekly-review
97+
# listing and prompt a manual cleanup.
98+
sanitize() {
99+
local value="${1//|/ ; }"
100+
value="${value//$'\n'/ }"
101+
printf '%s' "$value"
102+
}
103+
TOPIC=$(sanitize "$TOPIC")
104+
DISPATCHES=$(sanitize "$DISPATCHES")
105+
MODELS=$(sanitize "$MODELS")
106+
107+
TIMESTAMP=$(date '+%Y-%m-%dT%H:%M:%S')
108+
109+
# Short SHA of the most recent commit that touched .claude/agents/.
110+
# Ties a row to a specific agent-prompt version so we can correlate
111+
# outcomes with prompt changes later (P5.1 meta-agent).
112+
AGENT_SHA="unknown"
113+
if command -v git >/dev/null 2>&1 && git -C "$REPO_ROOT" rev-parse --git-dir >/dev/null 2>&1; then
114+
sha=$(git -C "$REPO_ROOT" log -1 --format=%h -- .claude/agents/ 2>/dev/null || true)
115+
if [[ -n "$sha" ]]; then
116+
AGENT_SHA="$sha"
117+
fi
118+
fi
119+
120+
# Ensure the log file exists (gitignored; first run creates it).
121+
mkdir -p "$(dirname "$LOG_PATH")"
122+
touch "$LOG_PATH"
123+
124+
ROW="$TIMESTAMP | $TOPIC | $DISPATCHES | $MODELS | $OUTCOME | $AGENT_SHA"
125+
printf '%s\n' "$ROW" >> "$LOG_PATH"
126+
127+
echo "Logged: $ROW"
128+
echo "$LOG_PATH"

0 commit comments

Comments
 (0)