Summary
I am using DeepSeek BYOK models through Factory/Droid and need separate entries for the same upstream model with different thinking settings, for example High, Max, and None.
There are two separate issues:
- Main pain point: custom droids cannot target a specific DeepSeek BYOK thinking variant when multiple entries share the same upstream
model value. DeepSeek requires the API model name to remain exactly deepseek-v4-flash or deepseek-v4-pro, so I cannot encode variants like High/Max/None in the model field. Droid frontmatter can only say model: custom:deepseek-v4-flash or model: custom:deepseek-v4-pro, which is ambiguous when several BYOK entries share that value.
- Secondary issue: DeepSeek thinking-mode tool calls fail through
generic-chat-completion-api. Thinking-enabled DeepSeek Pro and Flash can emit tool calls and receive tool results, but the next request can fail because reasoning_content is not passed back in the format DeepSeek expects. A workaround exists for direct DeepSeek API keys: use provider: "anthropic" with baseUrl: "https://api.deepseek.com/anthropic". That workaround fixes tool calling for both Pro and Flash thinking modes, but it does not solve the custom droid model-variant targeting problem.
The /models UI/command can select repeated DeepSeek BYOK entries correctly for interactive sessions. The limitation is specifically custom droid frontmatter, which targets only the duplicated upstream model value instead of a unique Factory-local BYOK entry.
Environment
- Factory CLI version observed in logs:
0.123.0
- OS: Linux Mint 22.3 (
ID_LIKE="ubuntu debian", UBUNTU_CODENAME=noble)
- Kernel:
linux 6.17.0-19-generic
- Failing BYOK provider/base URL:
generic-chat-completion-api with https://api.deepseek.com
- Working workaround provider/base URL:
anthropic with https://api.deepseek.com/anthropic
- Tooling involved:
Execute, TodoWrite, and custom droids invoked through the Task tool
Redacted BYOK configuration metadata
No API keys or secrets are included below. These are the relevant non-secret fields from ~/.factory/settings.json.
{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Pro (High)","extraArgs":{"reasoning_effort":"high","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-pro","provider":"generic-chat-completion-api"}
{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Pro (Max)","extraArgs":{"reasoning_effort":"max","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-pro","provider":"generic-chat-completion-api"}
{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Pro (None)","extraArgs":{"thinking":{"type":"disabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-pro","provider":"generic-chat-completion-api"}
{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Flash (High)","extraArgs":{"reasoning_effort":"high","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-flash","provider":"generic-chat-completion-api"}
{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Flash (Max)","extraArgs":{"reasoning_effort":"max","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-flash","provider":"generic-chat-completion-api"}
{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Flash (None)","extraArgs":{"thinking":{"type":"disabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-flash","provider":"generic-chat-completion-api"}
Issue 1: custom droids cannot target repeated DeepSeek BYOK thinking variants
DeepSeek's API only accepts these upstream model names:
deepseek-v4-pro
deepseek-v4-flash
If I change the BYOK model field to a unique value such as deepseek-v4-pro(high) or deepseek-v4-flash(none) , DeepSeek rejects the request:
The supported API model names are deepseek-v4-pro or deepseek-v4-flash, but you passed deepseek-v4-pro(high).
That means every DeepSeek Pro variant must share model: deepseek-v4-pro, and every DeepSeek Flash variant must share model: deepseek-v4-flash. The only distinguishing fields are displayName and extraArgs.
This works fine when selecting models interactively with /models; I can choose repeated entries such as:
- DeepSeek V4 Pro (High)
- DeepSeek V4 Pro (Max)
- DeepSeek V4 Pro (None)
- DeepSeek V4 Flash (High)
- DeepSeek V4 Flash (Max)
- DeepSeek V4 Flash (None)
However, custom droid frontmatter only supports:
model: custom:deepseek-v4-flash
or:
model: custom:deepseek-v4-pro
That does not let me specify which BYOK entry to use. For reliable custom droid routing today, I need to keep only one entry per upstream DeepSeek model in settings.json, which defeats the purpose of having High/Max/None variants available.
Issue 2: thinking-mode tool calls fail through generic-chat-completion-api
Reproduction
Select a thinking-enabled DeepSeek BYOK entry through the generic provider:
{
"provider": "generic-chat-completion-api",
"baseUrl": "https://api.deepseek.com",
"model": "deepseek-v4-pro",
"extraArgs": {"reasoning_effort": "high", "thinking": {"type": "enabled"}}
}
or:
{
"provider": "generic-chat-completion-api",
"baseUrl": "https://api.deepseek.com",
"model": "deepseek-v4-flash",
"extraArgs": {"reasoning_effort": "high", "thinking": {"type": "enabled"}}
}
Then run any agent or subagent workflow that uses tools across more than one turn.
The model emits tool calls and tool results are returned successfully. The next model request fails with:
BYOK Error: 400 The `reasoning_content` in the thinking mode must be passed back to the API.
Upstream error: The `reasoning_content` in the thinking mode must be passed back to the API.
The parent task receives no final assistant output, only a failed subagent process, for example:
Task subagent process exited with code 1
No output received from task subagent.
Tools executed: TodoWrite, Execute
This is not a subagent-specific issue. It can happen in any Droid agent tool loop. Subagents only make the model-selection problem more visible because their model is fixed through droid frontmatter.
Model/provider behavior observed
| Model |
Provider / endpoint |
Thinking mode |
Tool-loop result |
| DeepSeek V4 Pro |
generic-chat-completion-api / https://api.deepseek.com |
High / Max |
Fails after tool results with reasoning_content error |
| DeepSeek V4 Flash |
generic-chat-completion-api / https://api.deepseek.com |
High / Max |
Fails after tool results with reasoning_content error |
| DeepSeek V4 Pro |
generic-chat-completion-api / https://api.deepseek.com |
Disabled / None |
Works |
| DeepSeek V4 Flash |
generic-chat-completion-api / https://api.deepseek.com |
Disabled / None |
Works |
| DeepSeek V4 Pro |
anthropic / https://api.deepseek.com/anthropic |
High / Max |
Works |
| DeepSeek V4 Flash |
anthropic / https://api.deepseek.com/anthropic |
High / Max |
Works |
Direct DeepSeek access through the Anthropic-compatible endpoint works around the tool-loop failure for both thinking-capable DeepSeek models. It still feels patchy because it bypasses the generic provider rather than making that provider preserve DeepSeek reasoning_content.
Expected behavior
Ideally, Factory/Droid would support both:
-
Stable BYOK entry targeting for custom droids
Allow custom droids to target a Factory-local BYOK entry by display name or internal ID while still sending the canonical upstream model value to the provider.
-
Correct DeepSeek thinking-mode tool loops through the generic provider, or clear docs
Preserve and replay reasoning_content across tool-result follow-up calls in whatever format DeepSeek Pro/Flash require, or document that DeepSeek thinking-mode tool loops must use provider: "anthropic" with baseUrl: "https://api.deepseek.com/anthropic".
For example, one of these patterns would solve the routing problem:
model: custom-entry:"DeepSeek V4 Flash (None)"
or:
model: custom:deepseek-v4-flash
modelEntry: "DeepSeek V4 Flash (None)"
or a settings-level split between a Factory-local identifier and upstream API model name:
{
"id": "deepseek-v4-flash-none",
"displayName": "DeepSeek V4 Flash (None)",
"model": "deepseek-v4-flash",
"extraArgs": {"thinking": {"type": "disabled"}}
}
Then droid frontmatter could target:
model: custom:deepseek-v4-flash-none
while the upstream payload still sends:
{"model": "deepseek-v4-flash"}
Current workarounds
Workaround A: use DeepSeek's Anthropic-compatible endpoint for tool calling
Per #1091 (comment), direct DeepSeek API keys can use:
{
"provider": "anthropic",
"baseUrl": "https://api.deepseek.com/anthropic",
"model": "deepseek-v4-flash"
}
This makes tool calling work, including the cases that fail through generic-chat-completion-api. This workaround is useful, but it feels patchy because it relies on DeepSeek's Anthropic-compatible endpoint rather than fixing or documenting how Droid's generic provider should preserve DeepSeek reasoning_content.
Workaround B: keep only one BYOK entry per upstream DeepSeek model for custom droids
For custom droids, I can remove/disable duplicate entries and leave only one deepseek-v4-pro and one deepseek-v4-flash entry available.
With that setup, this droid works:
---
name: git-committer-flash
description: Performs bounded local git commit preparation using DeepSeek V4 Flash.
model: custom:deepseek-v4-flash
tools: ["Read", "LS", "Grep", "Glob", "Execute"]
---
Questions
- Is there a supported way for custom droids to target a specific BYOK entry when multiple entries share the same upstream
model value?
- If the generic provider should support it, should Factory preserve
reasoning_content in a provider-specific way after tool results?
- Is my DeepSeek BYOK
extraArgs format correct for thinking-enabled entries?
- Is DeepSeek Pro/Flash thinking-mode tool calling intended to be supported through
generic-chat-completion-api, or should users use provider: "anthropic" with baseUrl: "https://api.deepseek.com/anthropic" for DeepSeek thinking/tool-call workflows?
- If not, can Factory add a Factory-local BYOK entry identifier separate from the upstream API
model field?
- Until then, is the recommended workaround to keep only one BYOK entry per upstream DeepSeek model when using custom droids, or to use the Anthropic-compatible endpoint for all DeepSeek tool-calling droids?
Summary
I am using DeepSeek BYOK models through Factory/Droid and need separate entries for the same upstream model with different thinking settings, for example
High,Max, andNone.There are two separate issues:
modelvalue. DeepSeek requires the API model name to remain exactlydeepseek-v4-flashordeepseek-v4-pro, so I cannot encode variants like High/Max/None in themodelfield. Droid frontmatter can only saymodel: custom:deepseek-v4-flashormodel: custom:deepseek-v4-pro, which is ambiguous when several BYOK entries share that value.generic-chat-completion-api. Thinking-enabled DeepSeek Pro and Flash can emit tool calls and receive tool results, but the next request can fail becausereasoning_contentis not passed back in the format DeepSeek expects. A workaround exists for direct DeepSeek API keys: useprovider: "anthropic"withbaseUrl: "https://api.deepseek.com/anthropic". That workaround fixes tool calling for both Pro and Flash thinking modes, but it does not solve the custom droid model-variant targeting problem.The
/modelsUI/command can select repeated DeepSeek BYOK entries correctly for interactive sessions. The limitation is specifically custom droid frontmatter, which targets only the duplicated upstreammodelvalue instead of a unique Factory-local BYOK entry.Environment
0.123.0ID_LIKE="ubuntu debian",UBUNTU_CODENAME=noble)linux 6.17.0-19-genericgeneric-chat-completion-apiwithhttps://api.deepseek.comanthropicwithhttps://api.deepseek.com/anthropicExecute,TodoWrite, and custom droids invoked through theTasktoolRedacted BYOK configuration metadata
No API keys or secrets are included below. These are the relevant non-secret fields from
~/.factory/settings.json.{"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Pro (High)","extraArgs":{"reasoning_effort":"high","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-pro","provider":"generic-chat-completion-api"} {"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Pro (Max)","extraArgs":{"reasoning_effort":"max","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-pro","provider":"generic-chat-completion-api"} {"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Pro (None)","extraArgs":{"thinking":{"type":"disabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-pro","provider":"generic-chat-completion-api"} {"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Flash (High)","extraArgs":{"reasoning_effort":"high","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-flash","provider":"generic-chat-completion-api"} {"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Flash (Max)","extraArgs":{"reasoning_effort":"max","thinking":{"type":"enabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-flash","provider":"generic-chat-completion-api"} {"baseUrl":"https://api.deepseek.com","displayName":"DeepSeek V4 Flash (None)","extraArgs":{"thinking":{"type":"disabled"}},"maxOutputTokens":16384,"model":"deepseek-v4-flash","provider":"generic-chat-completion-api"}Issue 1: custom droids cannot target repeated DeepSeek BYOK thinking variants
DeepSeek's API only accepts these upstream model names:
If I change the BYOK
modelfield to a unique value such asdeepseek-v4-pro(high)ordeepseek-v4-flash(none), DeepSeek rejects the request:That means every DeepSeek Pro variant must share
model: deepseek-v4-pro, and every DeepSeek Flash variant must sharemodel: deepseek-v4-flash. The only distinguishing fields aredisplayNameandextraArgs.This works fine when selecting models interactively with
/models; I can choose repeated entries such as:However, custom droid frontmatter only supports:
or:
That does not let me specify which BYOK entry to use. For reliable custom droid routing today, I need to keep only one entry per upstream DeepSeek model in
settings.json, which defeats the purpose of having High/Max/None variants available.Issue 2: thinking-mode tool calls fail through
generic-chat-completion-apiReproduction
Select a thinking-enabled DeepSeek BYOK entry through the generic provider:
{ "provider": "generic-chat-completion-api", "baseUrl": "https://api.deepseek.com", "model": "deepseek-v4-pro", "extraArgs": {"reasoning_effort": "high", "thinking": {"type": "enabled"}} }or:
{ "provider": "generic-chat-completion-api", "baseUrl": "https://api.deepseek.com", "model": "deepseek-v4-flash", "extraArgs": {"reasoning_effort": "high", "thinking": {"type": "enabled"}} }Then run any agent or subagent workflow that uses tools across more than one turn.
The model emits tool calls and tool results are returned successfully. The next model request fails with:
The parent task receives no final assistant output, only a failed subagent process, for example:
This is not a subagent-specific issue. It can happen in any Droid agent tool loop. Subagents only make the model-selection problem more visible because their model is fixed through droid frontmatter.
Model/provider behavior observed
generic-chat-completion-api/https://api.deepseek.comreasoning_contenterrorgeneric-chat-completion-api/https://api.deepseek.comreasoning_contenterrorgeneric-chat-completion-api/https://api.deepseek.comgeneric-chat-completion-api/https://api.deepseek.comanthropic/https://api.deepseek.com/anthropicanthropic/https://api.deepseek.com/anthropicDirect DeepSeek access through the Anthropic-compatible endpoint works around the tool-loop failure for both thinking-capable DeepSeek models. It still feels patchy because it bypasses the generic provider rather than making that provider preserve DeepSeek
reasoning_content.Expected behavior
Ideally, Factory/Droid would support both:
Stable BYOK entry targeting for custom droids
Allow custom droids to target a Factory-local BYOK entry by display name or internal ID while still sending the canonical upstream
modelvalue to the provider.Correct DeepSeek thinking-mode tool loops through the generic provider, or clear docs
Preserve and replay
reasoning_contentacross tool-result follow-up calls in whatever format DeepSeek Pro/Flash require, or document that DeepSeek thinking-mode tool loops must useprovider: "anthropic"withbaseUrl: "https://api.deepseek.com/anthropic".For example, one of these patterns would solve the routing problem:
or:
or a settings-level split between a Factory-local identifier and upstream API model name:
{ "id": "deepseek-v4-flash-none", "displayName": "DeepSeek V4 Flash (None)", "model": "deepseek-v4-flash", "extraArgs": {"thinking": {"type": "disabled"}} }Then droid frontmatter could target:
while the upstream payload still sends:
{"model": "deepseek-v4-flash"}Current workarounds
Workaround A: use DeepSeek's Anthropic-compatible endpoint for tool calling
Per #1091 (comment), direct DeepSeek API keys can use:
{ "provider": "anthropic", "baseUrl": "https://api.deepseek.com/anthropic", "model": "deepseek-v4-flash" }This makes tool calling work, including the cases that fail through
generic-chat-completion-api. This workaround is useful, but it feels patchy because it relies on DeepSeek's Anthropic-compatible endpoint rather than fixing or documenting how Droid's generic provider should preserve DeepSeekreasoning_content.Workaround B: keep only one BYOK entry per upstream DeepSeek model for custom droids
For custom droids, I can remove/disable duplicate entries and leave only one
deepseek-v4-proand onedeepseek-v4-flashentry available.With that setup, this droid works:
Questions
modelvalue?reasoning_contentin a provider-specific way after tool results?extraArgsformat correct for thinking-enabled entries?generic-chat-completion-api, or should users useprovider: "anthropic"withbaseUrl: "https://api.deepseek.com/anthropic"for DeepSeek thinking/tool-call workflows?modelfield?