Fix MCP hang #466

Merged
Nomikfk1215 merged 1 commit into
1024XEngineer:main from
voicepeak:MCP-fix
May 13, 2026

Conversation

@voicepeak

Collaborator

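Background

After MCP is configured, the first conversation synchronizes the MCP tool list before the turn starts. If an MCP server launched through npx hangs during startup, the original logic causes the following:

  • The MCP startup timeout is not applied according to the per-server configuration.
  • On Windows, only cmd.exe is killed; the child node.exe may survive and keep holding the stdout pipe.
  • The stdout read loop blocks forever.
  • An MCP discovery timeout is treated as a failure of the whole turn, so ordinary conversation cannot continue either.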

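Changes

  • MCP startup/call timeouts now actually bound the current MCP operation, and take effect even when the parent context already carries a longer deadline.
  • MCP stdio reads are now context-controlled, so they no longer wait indefinitely for stdout EOF.
  • On Windows, cancelling an MCP child process now terminates the whole process tree, so node.exe is not left behind.
  • An MCP discovery startup timeout only degrades the affected MCP server; it no longer blocks ordinary conversation.
  • syncExtensionTools treats timeouts inside an extension as best-effort; the conversation is interrupted only when the turn itself is cancelled or times out.
  • Added regression tests covering startup timeout, stdout held open by a child process, and MCP timeouts not blocking basic conversation.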
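
For readers without the diff at hand, the following is a minimal sketch of the general pattern behind the timeout, context-controlled read, and best-effort items above. It is not the code from this PR: `readLineContext`, `classify`, `simulate`, and the demo helpers are hypothetical names, and an `io.Pipe` that is never written stands in for a hung MCP server's stdout.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"time"
)

// readLineContext (hypothetical) reads one line from r but returns as soon as
// ctx is done. If ctx fires first, the reader goroutine stays blocked until r
// yields data or is closed, so r must not be reused by another reader.
func readLineContext(ctx context.Context, r *bufio.Reader) (string, error) {
	type result struct {
		line string
		err  error
	}
	ch := make(chan result, 1) // buffered: the goroutine can always send and exit
	go func() {
		line, err := r.ReadString('\n')
		ch <- result{line, err}
	}()
	select {
	case res := <-ch:
		return res.line, res.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// classify (hypothetical) separates "the turn itself was cancelled or timed
// out" from "only this extension operation failed".
func classify(turnCtx context.Context, err error) string {
	switch {
	case err == nil:
		return "ok"
	case turnCtx.Err() != nil:
		return "abort-turn"
	default:
		return "degrade-extension"
	}
}

// simulate runs one read against a silent stdout with a per-operation timeout
// derived from the turn context.
func simulate(turnCtx context.Context, timeout time.Duration) string {
	pr, pw := io.Pipe() // never written: stdout is open but produces no data
	defer pw.Close()    // closing the writer unblocks the reader goroutine
	opCtx, cancel := context.WithTimeout(turnCtx, timeout)
	defer cancel()
	_, err := readLineContext(opCtx, bufio.NewReader(pr))
	return classify(turnCtx, err)
}

// demoHungServer: the turn is healthy, only the MCP operation times out.
func demoHungServer() string {
	return simulate(context.Background(), 50*time.Millisecond)
}

// demoCancelledTurn: the turn context is already cancelled before the read.
func demoCancelledTurn() string {
	turnCtx, cancel := context.WithCancel(context.Background())
	cancel()
	return simulate(turnCtx, time.Second)
}

func main() {
	fmt.Println(demoHungServer())
	fmt.Println(demoCancelledTurn())
}
```

Returning on `ctx.Done()` frees the caller but not the blocked reader goroutine. That goroutine exits only when the pipe closes, which is why the sketch closes the writer and why terminating every process that holds stdout matters.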

Verification

go test ./internal/agent ./internal/extensions/mcp ./internal/extensionsruntime
go test ./...

@codecov

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 78.04878% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/extensions/mcp/client.go | 80.00% | 3 Missing and 2 partials ⚠️ |
| internal/extensions/mcp/process_tree.go | 66.66% | 1 Missing and 1 partial ⚠️ |
| internal/extensions/mcp/process_tree_other.go | 50.00% | 1 Missing and 1 partial ⚠️ |

@fennoai fennoai Bot left a comment

One performance/resource-management issue found in the new timeout/cancellation path; no additional code quality, security, or documentation findings were noteworthy.

"os/exec"
)

func terminateCommand(cmd *exec.Cmd) error {

Medium: on non-Windows we still only kill the direct child process here. If an MCP launcher spawns a helper that inherits stdout (the same shape of failure this PR is addressing), readRPCResponseContext will return on ctx.Done(), but the blocked reader goroutine and the descendant process will both stay alive until that helper exits on its own. Repeated timeouts can therefore accumulate leaked goroutines/processes on Unix-like hosts. Consider launching the command in its own process group and terminating the whole group here, not just cmd.Process.
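
A minimal sketch of the process-group approach suggested in this comment, for Unix-like hosts only. It is not code from this PR: `startInOwnGroup`, `terminateGroup`, and `demoGroupKill` are hypothetical names, and the `sh -c "sleep 30 & wait"` command is just a stand-in for a launcher whose helper inherits stdout.

```go
//go:build unix

package main

import (
	"errors"
	"fmt"
	"io"
	"os/exec"
	"syscall"
	"time"
)

// startInOwnGroup (hypothetical) starts cmd as the leader of a new process
// group, so its descendants can later be signalled together.
func startInOwnGroup(cmd *exec.Cmd) error {
	if cmd.SysProcAttr == nil {
		cmd.SysProcAttr = &syscall.SysProcAttr{}
	}
	cmd.SysProcAttr.Setpgid = true // Pgid is left at 0, so the group ID equals the child's PID
	return cmd.Start()
}

// terminateGroup (hypothetical) sends SIGKILL to every process in cmd's group.
func terminateGroup(cmd *exec.Cmd) error {
	if cmd.Process == nil {
		return nil // never started
	}
	// A negative PID addresses the whole process group.
	err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	if errors.Is(err, syscall.ESRCH) {
		return nil // the group is already gone
	}
	return err
}

// demoGroupKill reports whether stdout reaches EOF promptly after the group
// is killed, even though a background helper inherited the pipe.
func demoGroupKill() bool {
	cmd := exec.Command("sh", "-c", "sleep 30 & wait")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return false
	}
	if err := startInOwnGroup(cmd); err != nil {
		return false
	}
	time.Sleep(200 * time.Millisecond) // give the shell time to spawn the helper
	if err := terminateGroup(cmd); err != nil {
		return false
	}
	done := make(chan struct{})
	go func() {
		// EOF arrives only once every holder of the pipe's write end has exited.
		io.Copy(io.Discard, stdout)
		close(done)
	}()
	select {
	case <-done:
		cmd.Wait() // reap the shell; its error (killed by signal) is expected
		return true
	case <-time.After(5 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("stdout closed after group kill:", demoGroupKill())
}
```

A helper that moves itself into a different process group or session would escape this signal, so the sketch covers the common launcher case rather than every descendant.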

@Nomikfk1215 Nomikfk1215 merged commit a48daff into 1024XEngineer:main May 13, 2026
7 checks passed