Skip to content

fix(sec): resolve CodeQL CWE-22 path-injection (#90,#86) and weak-hash (#84)#57

Merged
cdeust merged 3 commits into
mainfrom
fix/codeql-pathinj-weakhash
Jun 9, 2026
Merged

fix(sec): resolve CodeQL CWE-22 path-injection (#90,#86) and weak-hash (#84)#57
cdeust merged 3 commits into
mainfrom
fix/codeql-pathinj-weakhash

Conversation

@cdeust

@cdeust cdeust commented Jun 9, 2026

Copy link
Copy Markdown
Owner

Resolves three open CodeQL alerts on main.

#90, #86 — Uncontrolled data in path expression (CWE-22, High)

mcp_server/server/http_file_diff.py :108 / :179

The prior inline pathlib sanitiser (.resolve() + is_relative_to, from #53) did not clear CodeQL: the .resolve() sink fired on raw tainted input and the is_relative_to barrier didn't propagate across the _contained_resolved return boundary. Replaced with the canonical, CodeQL-recognised barrier — os.path.realpath + os.path.commonpath(...) == root — placed so it locally dominates every filesystem sink:

  • _within(real, root) — segment-aware commonpath containment (also fixes the naive-startswith sibling-prefix bug: /home/user-evil/home/user).
  • _contained_resolvedrealpath then _within; returns the validated Path (contract preserved for the http_standalone_trace caller).
  • _first_existing_dir_within — ancestor walk that re-asserts _within before every is_dir(), so the barrier dominates even as the walk ascends; the probe can't climb out to / or /etc.
  • Removed dead _under_allowed_root (docstring-only references).

Defenses unchanged in spirit (loopback-only server, .. rejection, containment to $HOME/cwd/temp) — just expressed in the form CodeQL's dataflow recognises.

#84 — Weak cryptographic hash (CWE-327/328, High)

mcp_server/core/workflow_graph_schema.py :126 (+ workflow_graph_source.py :47)

SHA-1 → SHA-256 for the deterministic node-id factories. A genuine fix, not usedforsecurity= suppression. These IDs feed only workflow_graph_layout, a position cache keyed by (node_id, topology_fingerprint, layout_version) — when the id scheme changes the fingerprint changes and the layout recomputes, so there's no cross-build stability to preserve.

Tests

New tests_py/server/test_http_file_diff_path_containment.py — 14 CWE-22 regression tests (/etc blocked, .. rejected, null-byte rejected, sibling-prefix rejected, happy-path repo resolution preserved). 49/49 server tests pass; 39/39 workflow-graph node-id tests pass; ruff check + format clean.

🤖 Generated with Claude Code

…h (#84)

Path injection — mcp_server/server/http_file_diff.py (#90 :108, #86 :179):
the prior inline pathlib sanitiser (`.resolve()` + `is_relative_to`) was
not recognised by CodeQL's dataflow — the `.resolve()` sink fired on raw
tainted input and the `is_relative_to` barrier did not propagate across
the `_contained_resolved` return. Replace with the canonical, CodeQL-
recognised barrier: `os.path.realpath` + `os.path.commonpath(...) == root`.

  * `_within(real, root)` — segment-aware commonpath containment (fixes the
    naive-startswith sibling-prefix bug: /home/user-evil ⊄ /home/user).
  * `_contained_resolved` — realpath then `_within`; returns the validated
    Path (contract preserved for http_standalone_trace caller).
  * `_first_existing_dir_within` — ancestor walk that re-asserts `_within`
    before every `is_dir()` sink, so the barrier dominates locally even as
    the walk ascends; the probe can never climb out to / or /etc.
  * Remove dead `_under_allowed_root` (referenced only in docstrings).

Weak hashing — workflow_graph_schema.py:126 (#84) + workflow_graph_source.py:47:
SHA-1 → SHA-256 for the deterministic node-id factories. These IDs feed only
`workflow_graph_layout`, a position cache keyed by
(node_id, topology_fingerprint, layout_version) — when the id scheme changes
the fingerprint changes and the layout recomputes, so there is no cross-build
stability requirement to preserve. SHA-256 is a genuine fix (non-broken algo),
not `usedforsecurity=` suppression.

Add 14 CWE-22 containment regression tests (tests_py/server/
test_http_file_diff_path_containment.py): /etc blocked, .. rejected,
null-byte rejected, sibling-prefix rejected, happy-path repo resolution
preserved. All 49 server tests pass; 39 workflow-graph node-id tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread mcp_server/server/http_file_diff.py Fixed
The first refactor moved the two path-injection sinks to a single
`contained.is_dir()` in `_first_existing_dir_within`, but CodeQL still
flagged it (new alert, http_file_diff.py:156): the `commonpath`
containment check was behind the `_within` helper + an `any(...)`
generator, so CodeQL's barrier-guard dataflow could not connect the
sanitiser to the sink.

Inline the guard into the exact shape CodeQL recognises (and that the
query's own Recommendation documents): the `is_dir()` filesystem access
now sits directly inside `if os.path.commonpath([root, real]) == root:`,
so the barrier dominates the sink on `real`. Behaviour and the 14 CWE-22
containment tests are unchanged; 49/49 server tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread mcp_server/server/http_file_diff.py Fixed
The inline-commonpath guard (6f10991) still left CodeQL alert #92 at the
is_dir() sink: the ancestor up-walk re-derived the probed path from
realpath(user_input) on every iteration, so the loop-carried value stayed
tainted and the per-root commonpath guard was not accepted as a sanitiser.

Redesign: replace the up-walk with a DOWNWARD descent from a constant
allowed probe root via os.scandir (_descend_trusted). User path components
select among trusted enumerated entries by `entry.name == name` equality
only — they never construct a probed path. The value reaching is_dir() /
`git rev-parse --cwd` is composed entirely from the constant root plus
scandir output, breaking the taint flow exactly as
git_diff._match_in_whitelist does for the file whitelist.

_contained_resolved / _within / _allowed_probe_roots unchanged (no sinks;
external caller http_standalone_trace.py contract preserved).

Tests: 14/14 CWE-22 containment, 49/49 server. ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cdeust cdeust merged commit 9c1d2a3 into main Jun 9, 2026
11 checks passed
@cdeust cdeust deleted the fix/codeql-pathinj-weakhash branch June 9, 2026 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants