Honor AI_AGENT and pass raw values through #1454
Open
renaudhartert-db wants to merge 1 commit into
Mirror databricks-sdk-go PR #1683 in the User-Agent agent detection.

Add AI_AGENT (the Vercel @vercel/detect-agent convention) as a secondary fallback env var, consulted only when AGENT (the agents.md standard) is unset or empty. AGENT takes precedence when both are non-empty. Explicit product-specific env vars (CLAUDECODE, CURSOR_AGENT, etc.) still win over both.

Change the fallback behavior so an unrecognized value is passed through rather than coerced to "unknown". The raw value is sanitized to the User-Agent allowlist (disallowed characters become "-") and capped at 64 characters. This applies to both AGENT and AI_AGENT.
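The precedence order described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `pick_agent_value`, the `_EXPLICIT_MATCHERS` table, and the product names mapped from `CLAUDECODE` and `CURSOR_AGENT` are hypothetical, and treating only the empty string as "empty" is an assumption. Sanitization of the returned value is left out here.

```python
from typing import Mapping, Optional

# Hypothetical subset of the explicit product matchers; the real table
# lives in the SDK and may differ in both names and matching rules.
_EXPLICIT_MATCHERS = {
    "CLAUDECODE": "claude-code",
    "CURSOR_AGENT": "cursor",
}


def pick_agent_value(env: Mapping[str, str]) -> Optional[str]:
    """Return the raw agent value chosen by the precedence rules, or None."""
    # 1. Explicit product-specific env vars win over both fallbacks.
    for var, product in _EXPLICIT_MATCHERS.items():
        if env.get(var):
            return product
    # 2. AGENT first, then AI_AGENT; an empty value counts as unset.
    for var in ("AGENT", "AI_AGENT"):
        value = env.get(var, "")
        if value:
            return value
    # 3. No agent detected.
    return None
```

Taking the environment as a mapping argument (rather than reading `os.environ` directly) is a choice made for this sketch so that each precedence case can be exercised with a plain dict.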
Why
The Python SDK detects AI coding agents and surfaces them as `agent/<name>` in the User-Agent. Today the generic fallback (when no proprietary env var fires) only honors the agents.md `AGENT=<name>` standard. Vercel's `@vercel/detect-agent` library uses a parallel `AI_AGENT=<name>` convention that tools in the Vercel ecosystem set instead; we currently miss those.

Separately, the existing fallback coerces any unrecognized value to the literal string `"unknown"`. That buries useful signal: a tool setting `AI_AGENT=claude-code_2-1-141_agent` ends up as `agent/unknown`, discarding the very signal (tool name plus version variant) we want to see. Bucketing arbitrary names is an ETL concern, not the SDK's.

This mirrors the Go SDK change in databricks/databricks-sdk-go#1683.
Changes
Two behavior changes in `databricks/sdk/useragent.py`:

1. `AI_AGENT` fallback. Add `AI_AGENT=<name>` as a secondary fallback after `AGENT=<name>`. `AGENT` wins when both are set to non-empty values; empty is treated as unset for both. Explicit product matchers (e.g. `CLAUDECODE`) still always win over both.
2. Raw passthrough instead of `"unknown"`. Drop the known-product lookup in the fallback. The value is sanitized (disallowed chars become `-`, satisfying the User-Agent allowlist `[0-9A-Za-z_.+-]+`) and capped at 64 chars to keep the header bounded. Known products like `cursor` or `claude-code` pass through unchanged because they already satisfy the allowlist.

The same change is landing in `databricks-sdk-java` as a sibling PR.

Test plan

- `pytest tests/test_user_agent.py` passes (54 tests)
- `ruff format` / `ruff check` clean
- `AI_AGENT=<known product>` returns the product name
- `AI_AGENT=<unrecognized>` returns the raw sanitized value (no longer `"unknown"`)
- `AGENT` wins over `AI_AGENT` when both are non-empty
- Empty `AGENT` falls through to `AI_AGENT`
- Disallowed characters in `AGENT` / `AI_AGENT` are sanitized to `-`
- Explicit matcher (e.g. `CLAUDECODE`) still wins over both fallbacks
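For reviewers, the sanitization rule from the Changes section might look something like the sketch below. It is not the code in `useragent.py`: the name `sanitize_agent_value` and the constants are hypothetical, and replacing each disallowed character individually (rather than collapsing runs into one `-`) is an assumption.

```python
import re

# Anything outside the User-Agent allowlist [0-9A-Za-z_.+-] is disallowed.
_DISALLOWED = re.compile(r"[^0-9A-Za-z_.+-]")
_MAX_LEN = 64  # cap that keeps the header bounded


def sanitize_agent_value(raw: str) -> str:
    """Replace each disallowed character with '-' and cap the length."""
    # The replacement is one character for one character, so the order of
    # substitution and truncation does not change the result.
    return _DISALLOWED.sub("-", raw)[:_MAX_LEN]


# Values that already satisfy the allowlist pass through unchanged.
print(sanitize_agent_value("claude-code_2-1-141_agent"))  # claude-code_2-1-141_agent
print(sanitize_agent_value("my agent/1.0"))               # my-agent-1.0
print(len(sanitize_agent_value("x" * 100)))               # 64
```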