Community fixes 20260601 #15

Open
GO1984 wants to merge 2 commits into outsourc-e:main from GO1984:community-fixes-20260601

GO1984 commented Jun 1, 2026

Summary

This PR fixes several runtime issues found while using BenchLoop with local OpenAI-compatible servers and newer NVIDIA hardware.

Changes

  • Detect OpenAI-compatible endpoints by common ports and hosts.
  • Use /v1/chat/completions for OpenAI-compatible preflight checks.
  • Skip Ollama version checks for OpenAI-compatible endpoints.
  • Add endpoint-specific API key support via BENCHLOOP_OPENAI_KEYS.
  • Forward endpoint-specific auth headers through model listing, chat, and streaming calls.
  • Tolerate [N/A] values from nvidia-smi.
  • Omit internal asyncio task objects from active run API responses.
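For reviewers, the last two items could look roughly like the sketch below. It is not taken from the diff: the helper names `parse_gpu_field` and `serializable_run` are hypothetical, and the run record is assumed to be a plain dict that may hold an `asyncio.Task` among its values.

```python
import asyncio
from typing import Any, Optional


def parse_gpu_field(raw: str) -> Optional[float]:
    """Convert one nvidia-smi CSV field to a float, or None if unavailable.

    Hypothetical helper; assumes fields were queried without units.
    """
    value = raw.strip()
    # nvidia-smi reports placeholders such as "[N/A]" for fields a GPU or
    # driver does not support; treat them as missing rather than raising.
    if not value or value.strip("[]").upper() in {"N/A", "NOT SUPPORTED"}:
        return None
    try:
        return float(value)
    except ValueError:
        # Anything else that is not numeric is also treated as missing.
        return None


def serializable_run(run: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a run record without asyncio task/future objects.

    asyncio.Task subclasses asyncio.Future, so one isinstance check covers both.
    """
    return {k: v for k, v in run.items() if not isinstance(v, asyncio.Future)}
```

Returning `None` for unavailable readings lets callers decide how to display a missing value, and filtering by type rather than by key name keeps the response JSON-serializable even if the task is stored under a different field.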

Why

Some OpenAI-compatible servers, such as llama.cpp, expose:

  • /v1/models
  • /v1/chat/completions

but do not expose Ollama routes like:

  • /api/tags
  • /api/chat

Before this change, BenchLoop could treat those endpoints as Ollama servers, call the Ollama routes they do not expose, and fail with errors like:

Health check failed (404): {"error":{"message":"File Not Found","type":"not_found_error","code":404}}
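A minimal sketch of the detection idea, to show how the preflight route could be chosen from the base URL. The port and host sets, the function names, and the use of `/api/tags` as the Ollama health route are assumptions for illustration, not the lists or logic in this PR.

```python
from urllib.parse import urlparse

# Assumed defaults for illustration only; the PR's actual lists may differ.
OLLAMA_PORTS = {11434}                    # Ollama's default port
OPENAI_COMPAT_PORTS = {8000, 8080, 1234}  # vLLM, llama.cpp server, LM Studio defaults
OPENAI_COMPAT_HOSTS = {"api.openai.com"}


def is_openai_compatible(base_url: str) -> bool:
    """Guess whether a base URL points at an OpenAI-compatible server."""
    parsed = urlparse(base_url)
    host = (parsed.hostname or "").lower()
    if host in OPENAI_COMPAT_HOSTS:
        return True
    if parsed.port in OLLAMA_PORTS:
        return False
    # A known port or a trailing /v1 path both suggest an OpenAI-style API.
    return parsed.port in OPENAI_COMPAT_PORTS or parsed.path.rstrip("/").endswith("/v1")


def preflight_url(base_url: str) -> str:
    """Pick the route to probe before a benchmark run."""
    root = base_url.rstrip("/")
    if is_openai_compatible(base_url):
        # Avoid producing /v1/v1/... when the base URL already ends in /v1.
        if root.endswith("/v1"):
            root = root[: -len("/v1")]
        return root + "/v1/chat/completions"
    # Assumed Ollama health route; /api/tags is one of the routes named above.
    return root + "/api/tags"
```

With these assumptions, `preflight_url("http://localhost:8080")` resolves to the `/v1/chat/completions` route, while a URL on port 11434 keeps using the Ollama route. Port-based guessing can misfire on non-default setups, so an explicit per-endpoint override would be a reasonable complement.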
