opencode harness captures zero completions — config doesn't include gateway baseURL

<img width="1589" height="78" alt="Image" src="https://github.com/user-attachments/assets/64dbb019-7b61-44e4-964c-b507889fa576" />


Title: opencode harness captures zero completions — config doesn't include gateway baseURL                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                  
  Labels: bug, agent-harness, opencode                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                  
  Summary                                                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                                                                
  Running the opencode harness on SWE-bench Verified resolved 0 of 10 tasks. Diagnosis shows that the opencode CLI in the agent container exits after ~4 seconds without sending a single chat completion request: the polar gateway log has no POST /v1/chat/completions for the session id, only the polar node's GET /sessions/... heartbeat polls.
                                                                                                                                                                                                                                                                                  
  Reproduction                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                
  Assuming polar is already running:
  uv run python examples/swebench_verified/submit_swebench_tasks.py \
    --harness opencode --max-tasks 5 --num-samples 1 \
    --runtime-backend docker --model-name Qwen3.6-27B                                                                                                                                                                                                                             
   
  Observed state                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                
  - rollout_results/task_*/ses_*.json: error: "no completions", record_count: 0                                                                                                                                                                                                   
  - trajectory.metadata.api_type: None
  - run_ms: ~4000 (the session exits after only 4 seconds)
  - gateway log: no POST /v1/chat/completions for the session, only a few GET /sessions/... heartbeats
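
  To confirm how many sessions hit this state, a small script along these lines could scan the rollout directory. It is a minimal sketch: it assumes each ses_*.json has top-level "record_count", "error", and "run_ms" keys, which is inferred from the fields quoted above and not checked against polar's actual schema. find_empty_sessions is a hypothetical helper name.

```python
import json
from pathlib import Path


def find_empty_sessions(root):
    """Return a summary row for every session file that recorded no completions."""
    empty = []
    for path in sorted(Path(root).glob("task_*/ses_*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Assumed layout: "record_count", "error", "run_ms" at the top level.
        if data.get("record_count", 0) == 0:
            empty.append(
                {
                    "file": str(path),
                    "error": data.get("error"),
                    "run_ms": data.get("run_ms"),
                }
            )
    return empty


if __name__ == "__main__":
    for row in find_empty_sessions("rollout_results"):
        print(row)
```

  If every row shows error "no completions" and a run_ms of a few seconds, the failure is uniform across tasks, not task-specific.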
                                                                                                                                                                                                                                                                                  
  Root cause
                                                                                                                                                                                                                                                                                  
  In setup() at src/polar/agent/presets/opencode.py:31-44, the config written to $HOME/.config/opencode/opencode.json lists the provider's models but contains no baseURL or apiKey:
                                                                                                                                                                                                                                                                                
  config: dict = {                                                                                                                                                                                                                                                                
      "provider": {provider: {"models": {model_id: {}}}},  # ← no options.baseURL
      "permission": {...},                                                                                                                                                                                                                                                        
  }
                                                                                                                                                                                                                                                                                  
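  By default the opencode CLI uses its own built-in OpenAI base URL (api.openai.com) and does not read the OPENAI_BASE_URL env var (unlike presets such as claude_code / openclaw / hermes / openhands_sdk, which go through placeholder substitution). As a result:
  - agent starts → opencode has no gateway endpoint to send requests to → startup fails or it exits silently
  - the session ends after 4 seconds with no completion records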
                                                                                                                                                                                                                                                                                
  Compare codex.py:80-82 (base_url="$OPENAI_BASE_URL") and openclaw.py:66-68 (placeholder + sed substitution): opencode is the only preset in this mechanism that does not configure a baseURL.
                                                                                                                                                                                                                                                                                  
  Suggested fix                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                  
  Add options.baseURL and options.apiKey to the config dict in opencode.py, using the placeholder pattern (sed substitution at runtime):

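```
  # Placeholders written into opencode.json and replaced inside the container.
  _BASE_URL_PLACEHOLDER = "__POLAR_GATEWAY_BASE_URL__"
  _API_KEY_PLACEHOLDER = "__POLAR_GATEWAY_API_KEY__"

  config["provider"] = {
      provider: {
          "models": {model_id: {}},
          "options": {
              "baseURL": _BASE_URL_PLACEHOLDER,
              "apiKey": _API_KEY_PLACEHOLDER,
          },
      }
  }

  # At the end of setup(), replace the placeholders with sed.
  # The sed script is double-quoted so that the container shell expands
  # $OPENAI_BASE_URL and $OPENAI_API_KEY; inside single quotes they would
  # be written to the file literally.
  await runtime.exec(
      f'sed -i "s|{_BASE_URL_PLACEHOLDER}|$OPENAI_BASE_URL|g; '
      f's|{_API_KEY_PLACEHOLDER}|$OPENAI_API_KEY|g" '
      f"{self._config_dir}/opencode.json"
  )
```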
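
  After the fix it may be worth checking the rendered file before the agent starts. The sketch below is one way to do that, under the assumption that the placeholder strings match the ones suggested above and that the config keeps the provider → options layout. config_problems is a hypothetical helper, not an existing polar function.

```python
import json

# Placeholder tokens from the suggested fix (assumed, not yet in polar).
PLACEHOLDERS = ("__POLAR_GATEWAY_BASE_URL__", "__POLAR_GATEWAY_API_KEY__")


def config_problems(config_text, provider):
    """Return a list of problems found in a rendered opencode.json."""
    problems = []
    # A leftover token means the sed substitution did not run or did not match.
    for token in PLACEHOLDERS:
        if token in config_text:
            problems.append(f"unreplaced placeholder: {token}")
    config = json.loads(config_text)
    options = config.get("provider", {}).get(provider, {}).get("options", {})
    for key in ("baseURL", "apiKey"):
        if not options.get(key):
            problems.append(f"missing options.{key}")
    return problems
```

  An empty list means the config carries a concrete baseURL and apiKey; anything else can be logged or raised before the session begins, which would turn the silent 4-second exit into an explicit setup error.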
                                                                                                                                                                                                                                 
  Environment                                                                                                                                                                                                                                                                   

  - polar commit: 8bc67cc3 (stable)                                                                                                                                                                                                                                               
  - opencode-ai: 1.4.6
  - vLLM: Qwen3.6-27B on http://100.8.68.123:55900                                                                                                                                                                                                                                 
  - Control: under the same settings, qwen_code resolved 5/10 and claude_code 5/10





---
  Issue 2 — Investigation: codex exits after 4 turns with empty patch

  Title: codex harness exits early on Qwen3.6-27B — agent stops after a text-only response

  Labels: bug, agent-harness, codex, investigation

  Summary

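  The codex harness exits after 4 turns on Qwen3.6-27B with an empty patch (empty_generation: true). For comparison, with the same model and tasks: qwen_code 5/10, claude_code 5/10, codex 0/10. This may not be a polar bug, but the gateway's Responses protocol implementation needs to be checked.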

  Reproduction

  uv run python examples/swebench_verified/submit_swebench_tasks.py \
    --harness codex --max-tasks 5 --num-samples 1 \
    --runtime-backend docker --model-name Qwen3.6-27B

  Observed state (a different failure mode from opencode)

  rollout_results/task_*/ses_*.json:
  - status: COMPLETED, error: None
  - record_count: 4 ← 4 completions captured successfully
  - api_type: "openai_responses" ✓
  - trace_count: 1, 8 messages (3 tool calls + 1 text-only final message)
  - run_ms: ~19000 (19 seconds)
  - trajectory.metadata.evaluation.report.empty_generation: true (the patch is empty)

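  Key observation: the polar gateway recorded all 4 completions correctly, and the openai_responses protocol conversion works. The failure is that the 4th response is text-only ("Now I can see the issue. Let me trace through the logic..."); on receiving it, the codex CLI marks the whole task as finished and exits.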
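
  To spot this pattern across many sessions, the captured responses could be checked for the first turn that carries no function_call item. This is a sketch only: it assumes each captured response is available as a list of output items with a "type" field ("reasoning", "message", "function_call"), and both function names are hypothetical.

```python
from typing import List, Optional


def is_text_only(output_items: List[dict]) -> bool:
    """True when a response's output contains no function_call item."""
    return not any(item.get("type") == "function_call" for item in output_items)


def first_text_only_turn(responses: List[List[dict]]) -> Optional[int]:
    """Index of the first response without a tool call, or None if every turn has one."""
    for index, items in enumerate(responses):
        if is_text_only(items):
            return index
    return None
```

  For the session described above (3 tool-call turns followed by a text-only turn), this would point at index 3, the turn after which codex exits.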

  Three possible root causes (to be investigated)

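  1. Model behavior: Qwen3.6-27B may be undertrained on agentic RL data and emits text where it should emit a tool_call. However, qwen_code and claude_code both keep going with the same model, which suggests the model itself is usable.
  2. codex CLI 0.121.0 behavior: under the Responses protocol, codex treats a turn that contains output_text instead of a function_call as the end of the run. In testing, claude_code also ends up on the OpenAI Chat protocol (with polar translating the anthropic format) and qwen_code uses OpenAI Chat; neither has the "text-only means finished" ambiguity.
  3. polar gateway Responses protocol bug: transform_response() at src/polar/gateway/transform/openai_responses.py:482-554 converts the chat completion response returned by vLLM into Responses output and supports only three output item types: reasoning, message, and function_call. There is no handling path for refusal / incomplete / implicit stop signals. If vLLM returns a finish_reason that is neither "stop" nor "tool_calls" (for example "length"), the transformer may drop that information, so codex receives a Responses output that looks like a normal completion.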
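
  For hypothesis 3, one possible shape for preserving the stop signal is sketched below. It is not polar's code: responses_status is a hypothetical helper, and the "status" / "incomplete_details.reason" field names and values reflect my understanding of the OpenAI Responses object and should be checked against the API reference before use.

```python
def responses_status(finish_reason):
    """Map a Chat Completions finish_reason to Responses-style status fields."""
    if finish_reason in ("stop", "tool_calls"):
        return {"status": "completed", "incomplete_details": None}
    if finish_reason == "length":
        # The model ran out of output budget; do not report a clean completion.
        return {
            "status": "incomplete",
            "incomplete_details": {"reason": "max_output_tokens"},
        }
    if finish_reason == "content_filter":
        return {
            "status": "incomplete",
            "incomplete_details": {"reason": "content_filter"},
        }
    # Unknown or missing values are surfaced instead of being silently dropped.
    raise ValueError(f"unhandled finish_reason: {finish_reason!r}")
```

  Raising on unknown values is a deliberate choice for the investigation phase: it makes an unexpected finish_reason visible in the gateway log instead of letting it pass as a normal completion.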

  Suggested investigation

  Step 1: Check which finish_reason appears in the chunks vLLM actually returns. Temporarily add a log line to the polar gateway:
```
  # near openai_responses.py:572
  if choices and choices[0].get("finish_reason"):
      # Temporary debug log; remove once the finish_reason is known.
      logger.warning("codex session=%s finish_reason=%s", session_id, choices[0]["finish_reason"])
      events.extend(state.finalize())
```

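  Step 2: Check whether the Qwen3.6-27B model served by vLLM has compatibility problems with codex-style system prompts. codex's instructions field (openai_responses.py:394-396) is converted into a system message. The Qwen3.6-27B chat template may not produce tool_call responses in the format codex expects.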

  Step 3: Temporarily change agent_env to force streaming and see whether codex continues when it receives streaming chunks. Concretely: append &stream=true to gateway_url (if codex supports this query param).

  Priority

  Low. codex has few users, and this may be a model behavior problem. The opencode issue, however, is a blocker: the opencode 0/10 result is entirely polar's fault.

  Environment

  - polar commit: 8bc67cc3
  - @openai/codex: 0.121.0
  - vLLM: Qwen3.6-27B, max_model_len=262144, --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser qwen3_coder
  - 4 calls in ~19 s total, far shorter than the 5-15 min that other harnesses normally spend on a SWE-bench task

opencode harness captures zero completions — config doesn't include gateway baseURL #39
