feat: deep Safety Policy foundation — ports, vocabulary, central enforcement#6
stevehansen wants to merge 1 commit into
…rcement

Builds the owning abstraction for "is this command safe?" (RFC-safety-policy.md, ARCHITECTURE_DEEPENING.md Candidate 1). Behavior-neutral; migration steps 1+2 of 4.

- Add IRepoProbe/IWorkspace domain ports + cached GitRepoProbe and boundary-preserving FileSystemWorkspace adapters; grow Ports to four members (handler signature unchanged, so issue #2's migration is unaffected).
- Deep Policy: ten fluent rule builders, Allow/Block/Rewrite verdicts, SafetyContext, PolicyDecision, and Flag.Base flag normalization (closes the --force=true bypass class at the rule level).
- CommandDefinition.Policy field (default Policy.Default) + central CommandDispatcher enforcement: a blocked command renders a uniform envelope (incl. --json) and structurally cannot spawn the tool.
- Migrate bun as the proof-of-concept (behavior identical); simplify Run.Bun.
- Tests: 92 passing. Coverage: full rule/fold/flag/workspace vocabulary, dispatch enforcement, the real FileSystemWorkspace /proj-vs-/projEvil boundary, and AllowSubcommands token-boundary matching. New FakeRepoProbe/FakeWorkspace.

The other ten handlers' inline validation is untouched (steps 3+4); their latent defects (dead proxy AllowedFlags, --json fork, --force=true in prod paths) remain pending that migration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code Review
This pull request implements a centralized, declarative safety policy engine for SafeCommands, introducing a single enforcement seam (CommandDispatcher) and domain-shaped ports (IRepoProbe, IWorkspace) along with comprehensive unit and integration tests. The review feedback identifies a critical security vulnerability where case-insensitive path checks on Linux could lead to a sandbox bypass, missing support for the standard -- option delimiter in flag and subcommand rules, and a potential out-of-bounds exception if a negative argument index is configured.
public bool IsWithinProject(string canonicalPath)
{
    var rootWithSep = ProjectRoot.EndsWith(Path.DirectorySeparatorChar)
        ? ProjectRoot
        : ProjectRoot + Path.DirectorySeparatorChar;

    return canonicalPath.Equals(rootWithSep.TrimEnd(Path.DirectorySeparatorChar), StringComparison.OrdinalIgnoreCase)
        || canonicalPath.StartsWith(rootWithSep, StringComparison.OrdinalIgnoreCase);
}
On case-sensitive filesystems (such as Linux), using StringComparison.OrdinalIgnoreCase for path containment checks can lead to a sandbox bypass. For example, if the project root is /home/user/Proj and the user attempts to access /home/user/proj/evil_file, the check will incorrectly succeed because /home/user/proj/evil_file starts with /home/user/Proj/ under case-insensitive comparison, even though they are completely different directories on Linux.
To fix this, use System.OperatingSystem.IsWindows() to dynamically select the appropriate string comparison (case-insensitive for Windows, case-sensitive for Unix-like systems).
public bool IsWithinProject(string canonicalPath)
{
    var comparison = System.OperatingSystem.IsWindows()
        ? StringComparison.OrdinalIgnoreCase
        : StringComparison.Ordinal;
    var rootWithSep = ProjectRoot.EndsWith(Path.DirectorySeparatorChar)
        ? ProjectRoot
        : ProjectRoot + Path.DirectorySeparatorChar;
    return canonicalPath.Equals(rootWithSep.TrimEnd(Path.DirectorySeparatorChar), comparison)
        || canonicalPath.StartsWith(rootWithSep, comparison);
}

public override PolicyResult Evaluate(string[] args, in SafetyContext ctx)
{
    var result = new List<string>();
    for (int i = 0; i < args.Length; i++)
    {
        var arg = args[i];
        if (arg.StartsWith('-'))
        {
            var flagBase = arg.Contains('=') ? arg[..arg.IndexOf('=')] : arg;
            if (_allowed.Contains(flagBase) || _allowed.Contains(arg))
            {
                result.Add(arg);
                if (!arg.Contains('=') && _valueFlags.Contains(flagBase) && i + 1 < args.Length)
                    result.Add(args[++i]);
            }
            // Skip unknown flags silently.
        }
        else if (KeepPositionals)
        {
            result.Add(arg);
        }
    }
    return new PolicyResult.Rewrite(result.ToArray());
}
The AllowOnlyFlagsRule does not handle the standard -- option delimiter. If a user passes -- to separate flags from positional arguments, -- will be treated as an unknown flag and silently stripped. Furthermore, any subsequent positional arguments that happen to start with - (e.g., pathspecs or negative numbers) will also be treated as flags and stripped, corrupting the command.
To fix this, detect -- and immediately stop filtering, appending all remaining arguments as positionals if KeepPositionals is true.
public override PolicyResult Evaluate(string[] args, in SafetyContext ctx)
{
    var result = new List<string>();
    for (int i = 0; i < args.Length; i++)
    {
        var arg = args[i];
        if (arg == "--")
        {
            result.Add(arg);
            if (KeepPositionals)
            {
                for (int j = i + 1; j < args.Length; j++)
                {
                    result.Add(args[j]);
                }
            }
            break;
        }
        if (arg.StartsWith('-'))
        {
            var flagBase = arg.Contains('=') ? arg[..arg.IndexOf('=')] : arg;
            if (_allowed.Contains(flagBase) || _allowed.Contains(arg))
            {
                result.Add(arg);
                if (!arg.Contains('=') && _valueFlags.Contains(flagBase) && i + 1 < args.Length)
                    result.Add(args[++i]);
            }
            // Skip unknown flags silently.
        }
        else if (KeepPositionals)
        {
            result.Add(arg);
        }
    }
    return new PolicyResult.Rewrite(result.ToArray());
}

for (int i = prefixTokens; i < args.Length; i++)
{
    if (args[i].StartsWith('-') && !allowedFlags.Contains(Flag.Base(args[i])))
        return new PolicyResult.Block(
            $"Flag '{args[i]}' is not allowed for this subcommand",
            $"Allowed flags: {string.Join(", ", sub.AllowedFlags)}");
}
The AllowSubcommandsRule does not handle the standard -- option delimiter. If a user passes -- followed by positional arguments that start with - (e.g., safe proxy run pr list -- -state), the rule will incorrectly treat those positional arguments as flags and block the command.
To fix this, stop checking for disallowed flags once -- is encountered.
for (int i = prefixTokens; i < args.Length; i++)
{
    if (args[i] == "--") break;
    if (args[i].StartsWith('-') && !allowedFlags.Contains(Flag.Base(args[i])))
        return new PolicyResult.Block(
            $"Flag '{args[i]}' is not allowed for this subcommand",
            $"Allowed flags: {string.Join(", ", sub.AllowedFlags)}");
}

{
    public override PolicyResult Evaluate(string[] args, in SafetyContext ctx)
    {
        if (args.Length <= ArgIndex) return new PolicyResult.Allow();
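
The review summary flags a possible out-of-bounds access when a negative argument index is configured: a negative ArgIndex passes the `args.Length <= ArgIndex` check above, so a later `args[ArgIndex]` read would throw. One way to close that gap is to reject the value when the rule is constructed. This is a minimal sketch only; the rest of the rule is not visible in this hunk, so `FirstArgGuard` and `TryGetArgument` are hypothetical names, and only the length check is taken from the PR.

```csharp
using System;

// Hypothetical stand-in for the rule in the hunk above. Only the
// `args.Length <= ArgIndex` check comes from the PR; everything else
// is an assumption made for illustration.
public sealed class FirstArgGuard
{
    public int ArgIndex { get; }

    public FirstArgGuard(int argIndex)
    {
        // Reject a negative index at construction time so a misconfigured
        // rule fails loudly instead of throwing during evaluation.
        if (argIndex < 0)
            throw new ArgumentOutOfRangeException(nameof(argIndex), "ArgIndex must be non-negative.");
        ArgIndex = argIndex;
    }

    // Returns true and the argument at ArgIndex, or false when args is too
    // short (the case the original check maps to Allow).
    public bool TryGetArgument(string[] args, out string value)
    {
        if (args.Length <= ArgIndex)
        {
            value = string.Empty;
            return false;
        }
        value = args[ArgIndex];
        return true;
    }
}
```

Validating in the constructor keeps Evaluate free of extra branches. Whether the real rule should throw, clamp, or block on a negative index is a design decision for the PR author.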
What & why

SafeCommands' core value (validate before running) had no owning abstraction. Safety/Policy.cs held one rule (AllowOnlyScriptsRule) used by one group (bun), while every other safety decision was hand-rolled inline across ~11 handlers as ≥5 structurally distinct idioms. This PR builds the deep Policy module that owns "is this command safe?" and enforces it once, centrally.

Design-of-record: specs/RFC-safety-policy.md (Candidate 1 of specs/ARCHITECTURE_DEEPENING.md), both included in this PR.

This is steps 1+2 of the RFC's 4-step migration and is behavior-neutral: only bun carries a non-default policy, so user-facing behavior is unchanged.

What landed

- … SafeCommands.Infrastructure.Ports):
  - IRepoProbe: IsGitRepo / IsCleanTree / IsHeadPushed; production GitRepoProbe caches each answer (one git spawn per question per run vs. today's ~25× re-spawn).
  - IWorkspace: ProjectRoot / Resolve / IsWithinProject; production FileSystemWorkspace is the only place Path.GetFullPath / Directory.GetCurrentDirectory are read, and it preserves the /proj-vs-/projEvil boundary trick exactly.
  - Ports grows from (IExecutor, IRenderer) to four members; the handler signature Func<Ports,string[],int> is unchanged, so the migration in issue #2 ("RFC: Introduce ports-and-adapters seam for handlers (IExecutor, IRenderer, IGitRepo) with a thin Run.* sugar layer") is unaffected.
- Policy (SafeCommands.Safety): a fluent vocabulary of 10 rule builders (BlockFlags, BlockSubstrings, AllowOnlyFirstArg, AllowOnlyFlags, AllowSubcommands, RequirePathWithinProject, RequireGitRepo, RequireCleanTree, RequireHeadNotPushed, Custom), three verdicts (Allow/Block/Rewrite), SafetyContext, PolicyDecision, and Flag.Base flag normalization.
- CommandDefinition.Policy field (default Policy.Default) + CommandDispatcher.Execute evaluates the policy before the handler. A blocked command renders a uniform envelope (including under --json) and structurally cannot spawn the tool.
- bun migrated as the proof-of-concept (AllowOnlyScripts → AllowOnlyFirstArg(…, "Script")), behavior identical; Run.Bun simplified.

Defect groundwork

Flag.Base normalization closes the --force=true bypass class at the rule level, and AllowSubcommands finally enforces per-subcommand flag allowlists (the formerly-dead proxy AllowedFlags). These fixes don't take effect in production until the relevant handlers are migrated (steps 3+4); this PR makes the mechanism exist and proves it under test.

Tests

92 passing, 0 skipped. Boundary suite over the full rule/fold/flag vocabulary + dispatch enforcement ("blocked never spawns" generalized beyond bun), the real FileSystemWorkspace /proj-vs-/projEvil boundary, and AllowSubcommands token-boundary matching. New FakeRepoProbe / FakeWorkspace siblings of the existing fakes.

Review notes (resolved in this PR)

A code-review + spec-conformance pass surfaced three should-fix items, all in not-yet-wired rules; all addressed:

- AllowSubcommandsRule matched its prefix by string StartsWith (so "status" would accept ["status-quo"]) → switched to token-boundary matching + 3 regression tests.
- AllowOnlyFlags is deliberately case-sensitive (mirrors git's real flag casing / legacy FilterFlags) while BlockFlags / AllowSubcommands lowercase → guarded with a comment so it isn't "fixed" into a behavior change.
- RequirePathWithinProject can throw on a malformed path; it fails closed via the dispatcher's global catch → documented.

Out of scope (follow-up)

- … AllowedFlags, --json blocked-envelope fork, --force=true, get fixed in production paths).
- … FilterFlags / 2× path-containment duplications; update STRIDE.md (E1, proxy gap).

🤖 Generated with Claude Code
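
Two mechanisms named in this description, flag normalization and token-boundary subcommand matching, can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `PolicySketch`, `FlagBase`, and `MatchesPrefix` are hypothetical names, and both the lowercasing and the case-insensitive token comparison are assumptions drawn from the note that BlockFlags / AllowSubcommands lowercase.

```csharp
using System;

// Hypothetical sketches; the PR's real Flag.Base and AllowSubcommandsRule
// may differ in naming, casing rules, and edge-case handling.
public static class PolicySketch
{
    // Reduce "--force=true" to "--force" so a rule that compares against
    // "--force" cannot be sidestepped by attaching a value.
    // Lowercasing is an assumption, not confirmed by the PR text.
    public static string FlagBase(string arg)
    {
        int eq = arg.IndexOf('=');
        string name = eq >= 0 ? arg.Substring(0, eq) : arg;
        return name.ToLowerInvariant();
    }

    // Token-boundary prefix match: every prefix token must equal the whole
    // argument at the same position. A plain string StartsWith would let
    // the prefix "status" accept the argument "status-quo".
    public static bool MatchesPrefix(string[] args, string[] prefixTokens)
    {
        if (args.Length < prefixTokens.Length) return false;
        for (int i = 0; i < prefixTokens.Length; i++)
        {
            if (!string.Equals(args[i], prefixTokens[i], StringComparison.OrdinalIgnoreCase))
                return false;
        }
        return true;
    }
}
```

With these helpers, `FlagBase("--force=true")` and `FlagBase("--force")` produce the same key, and `MatchesPrefix(new[] { "status-quo" }, new[] { "status" })` is false because whole tokens are compared.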