Skip to content

feat: deep Safety Policy foundation — ports, vocabulary, central enforcement#6

Open
stevehansen wants to merge 1 commit into
mainfrom
feat/safety-policy-foundation
Open

feat: deep Safety Policy foundation — ports, vocabulary, central enforcement#6
stevehansen wants to merge 1 commit into
mainfrom
feat/safety-policy-foundation

Conversation

@stevehansen
Copy link
Copy Markdown
Member

What & why

SafeCommands' core value — validate before running — had no owning abstraction. Safety/Policy.cs held one rule (AllowOnlyScriptsRule) used by one group (bun), while every other safety decision was hand-rolled inline across ~11 handlers as ≥5 structurally distinct idioms. This PR builds the deep Policy module that owns "is this command safe?" and enforces it once, centrally.

Design-of-record: specs/RFC-safety-policy.md (Candidate 1 of specs/ARCHITECTURE_DEEPENING.md), both included in this PR.

This is steps 1+2 of the RFC's 4-step migration and is behavior-neutral: only bun carries a non-default policy, so user-facing behavior is unchanged.

What landed

  • Two domain ports (SafeCommands.Infrastructure.Ports):
    • IRepoProbeIsGitRepo/IsCleanTree/IsHeadPushed; production GitRepoProbe caches each answer (one git spawn per question per run vs. today's ~25× re-spawn).
    • IWorkspaceProjectRoot/Resolve/IsWithinProject; production FileSystemWorkspace is the only place Path.GetFullPath/Directory.GetCurrentDirectory are read, and it preserves the /proj-vs-/projEvil boundary trick exactly.
    • Ports grows from (IExecutor, IRenderer) to four members; the handler signature Func<Ports,string[],int> is unchanged (so issue RFC: Introduce ports-and-adapters seam for handlers (IExecutor, IRenderer, IGitRepo) with a thin Run.* sugar layer #2's migration is unaffected).
  • Deep Policy (SafeCommands.Safety): a fluent vocabulary of 10 rule builders (BlockFlags, BlockSubstrings, AllowOnlyFirstArg, AllowOnlyFlags, AllowSubcommands, RequirePathWithinProject, RequireGitRepo, RequireCleanTree, RequireHeadNotPushed, Custom), three verdicts (Allow/Block/Rewrite), SafetyContext, PolicyDecision, and Flag.Base flag normalization.
  • Central enforcement: CommandDefinition.Policy field (default Policy.Default) + CommandDispatcher.Execute evaluates the policy before the handler. A blocked command renders a uniform envelope (including under --json) and structurally cannot spawn the tool.
  • bun migrated as the proof-of-concept (AllowOnlyScriptsAllowOnlyFirstArg(…, "Script")), behavior identical; Run.Bun simplified.

Defect groundwork

Flag.Base normalization closes the --force=true bypass class at the rule level, and AllowSubcommands finally enforces per-subcommand flag allowlists (the formerly-dead proxy AllowedFlags). These fixes don't take effect in production until the relevant handlers are migrated (steps 3+4) — this PR makes the mechanism exist and proves it under test.

Tests

92 passing, 0 skipped. Boundary suite over the full rule/fold/flag vocabulary + dispatch enforcement ("blocked never spawns" generalized beyond bun), the real FileSystemWorkspace /proj-vs-/projEvil boundary, and AllowSubcommands token-boundary matching. New FakeRepoProbe/FakeWorkspace siblings of the existing fakes.

Review notes (resolved in this PR)

A code-review + spec-conformance pass surfaced three should-fix items, all in not-yet-wired rules; all addressed:

  1. AllowSubcommandsRule matched its prefix by string StartsWith (so "status" would accept ["status-quo"]) → switched to token-boundary matching + 3 regression tests.
  2. AllowOnlyFlags is deliberately case-sensitive (mirrors git's real flag casing / legacy FilterFlags) while BlockFlags/AllowSubcommands lowercase → guarded with a comment so it isn't "fixed" into a behavior change.
  3. RequirePathWithinProject can throw on a malformed path; it fails closed via the dispatcher's global catch → documented.

Out of scope (follow-up)

  • Step 3 — migrate the other 10 handlers' inline validation onto declared policies (where the latent defects — dead proxy AllowedFlags, --json blocked-envelope fork, --force=true — get fixed in production paths).
  • Step 4 — collapse the 3× script allowlist / 2× FilterFlags / 2× path-containment duplications; update STRIDE.md (E1, proxy gap).

🤖 Generated with Claude Code

…rcement

Builds the owning abstraction for "is this command safe?" (RFC-safety-policy.md,
ARCHITECTURE_DEEPENING.md Candidate 1). Behavior-neutral; migration steps 1+2 of 4.

- Add IRepoProbe/IWorkspace domain ports + cached GitRepoProbe and
  boundary-preserving FileSystemWorkspace adapters; grow Ports to four members
  (handler signature unchanged, so issue #2's migration is unaffected).
- Deep Policy: ten fluent rule builders, Allow/Block/Rewrite verdicts,
  SafetyContext, PolicyDecision, and Flag.Base flag normalization (closes the
  --force=true bypass class at the rule level).
- CommandDefinition.Policy field (default Policy.Default) + central
  CommandDispatcher enforcement: a blocked command renders a uniform envelope
  (incl. --json) and structurally cannot spawn the tool.
- Migrate bun as the proof-of-concept (behavior identical); simplify Run.Bun.
- Tests: 92 passing — full rule/fold/flag/workspace vocabulary, dispatch
  enforcement, the real FileSystemWorkspace /proj-vs-/projEvil boundary, and
  AllowSubcommands token-boundary matching. New FakeRepoProbe/FakeWorkspace.

The other ten handlers' inline validation is untouched (steps 3+4); their latent
defects (dead proxy AllowedFlags, --json fork, --force=true in prod paths) remain
pending that migration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a centralized, declarative safety policy engine for SafeCommands, introducing a single enforcement seam (CommandDispatcher) and domain-shaped ports (IRepoProbe, IWorkspace) along with comprehensive unit and integration tests. The review feedback identifies a critical security vulnerability where case-insensitive path checks on Linux could lead to a sandbox bypass, missing support for the standard -- option delimiter in flag and subcommand rules, and a potential out-of-bounds exception if a negative argument index is configured.

Comment on lines +16 to +24
public bool IsWithinProject(string canonicalPath)
{
var rootWithSep = ProjectRoot.EndsWith(Path.DirectorySeparatorChar)
? ProjectRoot
: ProjectRoot + Path.DirectorySeparatorChar;

return canonicalPath.Equals(rootWithSep.TrimEnd(Path.DirectorySeparatorChar), StringComparison.OrdinalIgnoreCase)
|| canonicalPath.StartsWith(rootWithSep, StringComparison.OrdinalIgnoreCase);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

On case-sensitive filesystems (such as Linux), using StringComparison.OrdinalIgnoreCase for path containment checks can lead to a sandbox bypass. For example, if the project root is /home/user/Proj and the user attempts to access /home/user/proj/evil_file, the check will incorrectly succeed because /home/user/proj/evil_file starts with /home/user/Proj/ under case-insensitive comparison, even though they are completely different directories on Linux.

To fix this, use System.OperatingSystem.IsWindows() to dynamically select the appropriate string comparison (case-insensitive for Windows, case-sensitive for Unix-like systems).

    public bool IsWithinProject(string canonicalPath)
    {
        var comparison = System.OperatingSystem.IsWindows()
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        var rootWithSep = ProjectRoot.EndsWith(Path.DirectorySeparatorChar)
            ? ProjectRoot
            : ProjectRoot + Path.DirectorySeparatorChar;

        return canonicalPath.Equals(rootWithSep.TrimEnd(Path.DirectorySeparatorChar), comparison)
            || canonicalPath.StartsWith(rootWithSep, comparison);
    }

Comment on lines +64 to +87
public override PolicyResult Evaluate(string[] args, in SafetyContext ctx)
{
var result = new List<string>();
for (int i = 0; i < args.Length; i++)
{
var arg = args[i];
if (arg.StartsWith('-'))
{
var flagBase = arg.Contains('=') ? arg[..arg.IndexOf('=')] : arg;
if (_allowed.Contains(flagBase) || _allowed.Contains(arg))
{
result.Add(arg);
if (!arg.Contains('=') && _valueFlags.Contains(flagBase) && i + 1 < args.Length)
result.Add(args[++i]);
}
// Skip unknown flags silently.
}
else if (KeepPositionals)
{
result.Add(arg);
}
}
return new PolicyResult.Rewrite(result.ToArray());
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The AllowOnlyFlagsRule does not handle the standard -- option delimiter. If a user passes -- to separate flags from positional arguments, -- will be treated as an unknown flag and silently stripped. Furthermore, any subsequent positional arguments that happen to start with - (e.g., pathspecs or negative numbers) will also be treated as flags and stripped, corrupting the command.

To fix this, detect -- and immediately stop filtering, appending all remaining arguments as positionals if KeepPositionals is true.

    public override PolicyResult Evaluate(string[] args, in SafetyContext ctx)
    {
        var result = new List<string>();
        for (int i = 0; i < args.Length; i++)
        {
            var arg = args[i];
            if (arg == "--")
            {
                result.Add(arg);
                if (KeepPositionals)
                {
                    for (int j = i + 1; j < args.Length; j++)
                    {
                        result.Add(args[j]);
                    }
                }
                break;
            }
            if (arg.StartsWith('-'))
            {
                var flagBase = arg.Contains('=') ? arg[..arg.IndexOf('=')] : arg;
                if (_allowed.Contains(flagBase) || _allowed.Contains(arg))
                {
                    result.Add(arg);
                    if (!arg.Contains('=') && _valueFlags.Contains(flagBase) && i + 1 < args.Length)
                        result.Add(args[++i]);
                }
                // Skip unknown flags silently.
            }
            else if (KeepPositionals)
            {
                result.Add(arg);
            }
        }
        return new PolicyResult.Rewrite(result.ToArray());
    }

Comment on lines +103 to +109
for (int i = prefixTokens; i < args.Length; i++)
{
if (args[i].StartsWith('-') && !allowedFlags.Contains(Flag.Base(args[i])))
return new PolicyResult.Block(
$"Flag '{args[i]}' is not allowed for this subcommand",
$"Allowed flags: {string.Join(", ", sub.AllowedFlags)}");
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The AllowSubcommandsRule does not handle the standard -- option delimiter. If a user passes -- followed by positional arguments that start with - (e.g., safe proxy run pr list -- -state), the rule will incorrectly treat those positional arguments as flags and block the command.

To fix this, stop checking for disallowed flags once -- is encountered.

            for (int i = prefixTokens; i < args.Length; i++)
            {
                if (args[i] == "--") break;
                if (args[i].StartsWith('-') && !allowedFlags.Contains(Flag.Base(args[i])))
                    return new PolicyResult.Block(
                        $"Flag '{args[i]}' is not allowed for this subcommand",
                        $"Allowed flags: {string.Join(", ", sub.AllowedFlags)}");
            }

{
public override PolicyResult Evaluate(string[] args, in SafetyContext ctx)
{
if (args.Length <= ArgIndex) return new PolicyResult.Allow();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If ArgIndex is configured to be negative, args.Length <= ArgIndex will evaluate to false, and args[ArgIndex] will throw an IndexOutOfRangeException. Guard against negative indices to ensure robustness.

        if (ArgIndex < 0 || args.Length <= ArgIndex) return new PolicyResult.Allow();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant