Skip to content

feat(mcp): queue local stdio artifact execution#2280

Open
daryllimyt wants to merge 5 commits intoeng-1321-execute-read-remote-mcp-artifacts-through-tracecat-mcpfrom
eng-1322-execute-read-local-stdio-mcp-artifacts-through-tracecat-sandbox-queue
Open

feat(mcp): queue local stdio artifact execution#2280
daryllimyt wants to merge 5 commits intoeng-1321-execute-read-remote-mcp-artifacts-through-tracecat-mcpfrom
eng-1322-execute-read-local-stdio-mcp-artifacts-through-tracecat-sandbox-queue

Conversation

@daryllimyt
Copy link
Copy Markdown
Contributor

@daryllimyt daryllimyt commented Mar 6, 2026

Checklist

  • Read CONTRIBUTING.md.
  • PR title is short and non-generic (see previously merged PRs for examples).
  • PR only implements a single feature or fixes a single bug.
  • Tests passing (uv run pytest tests)?
  • Lint / pre-commits passing (pre-commit run --all-files)?

Description

Executes persisted local stdio MCP artifacts through the dedicated MCP Temporal queue and worker path instead of rejecting them at the wrapper layer.

This PR adds three pieces:

  • queue-backed local MCP artifact workflows and dual-queue agent worker registration
  • ephemeral stdio execution with shared per-org cache mounts and nsjail wrapping when available
  • per-integration outbound policy wiring for local stdio MCPs via an egress guard preload, with explicit degraded-mode warnings when nsjail is unavailable

Wrapper behavior now covers local tool execution, resource reads with truncation metadata, and prompt retrieval while preserving the persisted catalog + MCP policy authorization path.

Related Issues

  • ENG-1322

Screenshots / Recordings

  • None

Steps to QA

  1. Run uv run ruff check tracecat/config.py tracecat/integrations/service.py tracecat/agent/preset/service.py tracecat/agent/mcp/sandbox/workflow.py tracecat/agent/mcp/sandbox/types.py tracecat/mcp/catalog/types.py tracecat/mcp/catalog/artifact_service.py tracecat/agent/worker.py tests/unit/test_local_mcp_sandbox_workflow.py tests/unit/test_mcp_artifact_service.py
  2. Run uv run basedpyright tracecat/config.py tracecat/integrations/service.py tracecat/agent/preset/service.py tracecat/agent/mcp/sandbox/workflow.py tracecat/agent/mcp/sandbox/types.py tracecat/mcp/catalog/types.py tracecat/mcp/catalog/artifact_service.py tracecat/agent/worker.py tests/unit/test_local_mcp_sandbox_workflow.py tests/unit/test_mcp_artifact_service.py
  3. Run TRACECAT__SERVICE_KEY=test-service-key uv run pytest tests/unit/test_local_mcp_sandbox_workflow.py tests/unit/test_mcp_artifact_service.py tests/unit/test_user_mcp_client.py tests/unit/test_mcp_catalog_service.py tests/unit/test_mcp_policy_service.py
  4. Optionally compile the egress guard locally with cc -shared -fPIC -O2 -Wall -Wextra -o /tmp/libtracecat_mcp_egress_guard.so tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c -ldl -pthread

Summary by cubic

Queues local stdio MCP artifact execution on a dedicated Temporal MCP queue and wraps processes with optional nsjail plus enforced egress policy. Implements ENG-1322 by routing stdio tool/resource/prompt calls through a sandboxed workflow; HTTP artifacts remain unchanged.

  • New Features

    • Added RunLocalMCPArtifactWorkflow and a separate MCP worker queue (TRACECAT__MCP_QUEUE); HTTP artifacts still execute directly.
    • Optional nsjail sandbox with per‑org caches (npm/pip/uv/XDG via TRACECAT__MCP_SANDBOX_CACHE_DIR); enforces egress allow/deny via an LD_PRELOAD guard library built and shipped in the Docker image.
    • Worker runs a second MCP queue with separate limits; new config: TRACECAT__MCP_MAX_CONCURRENT_ACTIVITIES, TRACECAT__MCP_THREADPOOL_MAX_WORKERS, TRACECAT__MCP_MAX_CONCURRENT_LOCAL_SANDBOXES, TRACECAT__MCP_SANDBOX_CACHE_DIR.
    • Centralized stdio_env decryption and template resolution in IntegrationService; artifact service dispatches stdio tools/resources/prompts via the workflow and returns truncation metadata (truncated, max_content_chars, total_content_chars).
    • Added tests for sandbox config, egress policy, and artifact dispatch.
  • Bug Fixes

    • Hardened nsjail config: validate paths and reject unsafe stdio args to prevent injection; only jail commands within supported rootfs paths, otherwise fall back to direct mode with warnings (including egress policy degradation).
    • Strengthened egress guard: stricter host:port parsing, DNS resolution checks (IPv4/IPv6), allowlist caching, and fail‑closed behavior on allowlist cache overflow.

Written for commit d680094. Summary will update on new commits.

Copy link
Copy Markdown
Contributor Author

daryllimyt commented Mar 6, 2026

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 12 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tracecat/agent/mcp/sandbox/workflow.py">

<violation number="1" location="tracecat/agent/mcp/sandbox/workflow.py:183">
P0: Security: nsjail config injection via unvalidated `command_args`. Each argument from `target.stdio_args` is interpolated into the nsjail protobuf config without sanitization. An arg containing `"` or `}` could break out of the string literal and inject arbitrary nsjail directives (e.g., mount host paths, change isolation settings). Validate each arg the same way `command_path` is validated, using the existing `_validate_path`-style dangerous character check or a dedicated arg validator.</violation>
</file>

<file name="tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c">

<violation number="1" location="tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c:103">
P1: Missing NULL check on `dlsym` results for `real_connect` and `real_getaddrinfo`. If `dlsym(RTLD_NEXT, ...)` returns NULL, calling through the pointer is undefined behavior (segfault). A security guard should fail closed — either abort or return an error — rather than crash.</violation>
</file>

<file name="tests/unit/test_mcp_artifact_service.py">

<violation number="1" location="tests/unit/test_mcp_artifact_service.py:528">
P2: The queue assertion is too weak: it only checks truthiness, so the test can pass even when stdio artifacts are dispatched to the wrong queue.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread tracecat/agent/mcp/sandbox/workflow.py Outdated
Comment thread tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c
Comment thread tests/unit/test_mcp_artifact_service.py Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8243ff35e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread Dockerfile
Comment thread tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tracecat/agent/mcp/sandbox/workflow.py">

<violation number="1" location="tracecat/agent/mcp/sandbox/workflow.py:112">
P2: The new stdio arg filter is over-restrictive and can reject valid integrations (e.g., args containing `'` or `{}`), causing hard workflow failures in nsjail mode.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread tracecat/agent/mcp/sandbox/workflow.py
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tracecat/agent/mcp/sandbox/workflow.py">

<violation number="1" location="tracecat/agent/mcp/sandbox/workflow.py:112">
P2: Runtime stdio arg validation is stricter than persistence validation, so previously accepted integrations can fail at execution in nsjail mode.</violation>
</file>

<file name="tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c">

<violation number="1" location="tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c:212">
P2: Silent cache overflow causes getaddrinfo/connect inconsistency: when `resolved_allow_count` reaches `MAX_RESOLVED_ENDPOINTS`, new resolved IPs are silently dropped from the cache but `getaddrinfo` still returns success. Subsequent `connect()` calls to those uncached IPs will be denied with `EACCES` because the allow-list check in `connect` cannot match hostnames against raw IPs and the cache lookup misses.

Consider at minimum logging a warning when the cache is full, or failing the `getaddrinfo` call when caching fails so the caller doesn't receive addresses it cannot connect to.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread tracecat/agent/mcp/sandbox/workflow.py
Comment thread tracecat/agent/mcp/sandbox/tracecat_mcp_egress_guard.c Outdated
@daryllimyt daryllimyt force-pushed the eng-1322-execute-read-local-stdio-mcp-artifacts-through-tracecat-sandbox-queue branch from 2d8c371 to fbda439 Compare March 6, 2026 04:02
@daryllimyt daryllimyt force-pushed the eng-1321-execute-read-remote-mcp-artifacts-through-tracecat-mcp branch from b09a16a to 45b72cd Compare March 6, 2026 04:02
@daryllimyt daryllimyt force-pushed the eng-1322-execute-read-local-stdio-mcp-artifacts-through-tracecat-sandbox-queue branch from 1b5c1f8 to d680094 Compare March 6, 2026 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant