NVIDIA · AbhiramDwivedi · Jun 17, 2026 · Jun 21, 2026
diff --git a/README.md b/README.md
@@ -188,6 +188,8 @@ inference gateways.
 | `anthropic_proxy` | `ANTHROPIC_PROXY_API_KEY` + `ANTHROPIC_PROXY_ENDPOINT_URL` | Any Vertex-style raw-predict proxy | `claude-sonnet-4-6` |
 | `bedrock` | `AWS_PROFILE` (optional) + `AWS_REGION` — SigV4 via boto3 | AWS Bedrock Runtime | `us.anthropic.claude-sonnet-4-6-20250915-v1:0` |
 | `nv_build` | `NVIDIA_INFERENCE_KEY` | build.nvidia.com | `deepseek-ai/deepseek-v4-flash` |
+| `claude_cli` | _(none — uses local CLI auth)_ | local `claude` binary | `claude-sonnet-4-6` |
+| `codex_cli` | _(none — uses local CLI auth)_ | local `codex` binary | `o4-mini` |
 
 ```bash
 # Stock OpenAI
@@ -224,6 +226,16 @@ export SKILLSPECTOR_PROVIDER=nv_build
 export NVIDIA_INFERENCE_KEY=nvapi-...
 skillspector scan ./my-skill/
 
+# Local Claude CLI — no API key; uses your existing `claude auth login` session
+# Requires: claude CLI installed and authenticated (claude auth login)
+export SKILLSPECTOR_PROVIDER=claude_cli
+skillspector scan ./my-skill/
+
+# Local Codex CLI — no API key; uses your existing `codex login` session
+# Requires: codex CLI installed and authenticated
+export SKILLSPECTOR_PROVIDER=codex_cli
+skillspector scan ./my-skill/
+
 # Local Ollama or any OpenAI-compatible endpoint
 export SKILLSPECTOR_PROVIDER=openai
 export OPENAI_API_KEY=ollama
@@ -514,7 +526,7 @@ Issues (2)
 
 | Variable | Description | Required |
 |----------|-------------|----------|
-| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai`, `anthropic`, `anthropic_proxy`, `bedrock`, or `nv_build`. Each provider has its own bundled `model_registry.yaml` and default model (see the LLM Analysis table above). Defaults to `nv_build`. | Optional |
+| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai`, `anthropic`, `anthropic_proxy`, `bedrock`, `nv_build`, `claude_cli`, `codex_cli`, or `gemini_cli`. Each provider has its own bundled `model_registry.yaml` and default model (see the LLM Analysis table above). Defaults to `nv_build`. | Optional |
 | `NVIDIA_INFERENCE_KEY` | Credential for the `nv_build` provider (build.nvidia.com). | Required for LLM analysis when `SKILLSPECTOR_PROVIDER=nv_build` |
 | `OPENAI_API_KEY` | Credential for the OpenAI provider (`SKILLSPECTOR_PROVIDER=openai`). Also serves as the tier-2 fallback in the credential waterfall when the active provider returns no credentials. | Required for LLM analysis when `SKILLSPECTOR_PROVIDER=openai` |
 | `OPENAI_BASE_URL` | Override the OpenAI endpoint (e.g. point at Ollama). | Optional |
@@ -528,6 +540,8 @@ Issues (2)
 | `SKILLSPECTOR_MODEL_REGISTRY` | Override the bundled per-provider YAML registry (`src/skillspector/providers/<provider>/model_registry.yaml`) with a custom path. | Optional |
 | `SKILLSPECTOR_LOG_LEVEL` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR` (default: `WARNING`). | Optional |
 
+> **CLI providers** (`claude_cli`, `codex_cli`): No API key is needed. Authentication is managed entirely by the agent CLI's own login session (`claude auth login` / `codex login`). SkillSpector never reads or forwards API keys when these providers are active. The subprocess runs with hardened settings: tools disabled, no MCP, Codex's read-only sandbox mode, and untrusted skill content delivered only via stdin.
+
 ### CLI Options
 
 ```bash

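Review note on the README's CLI-providers paragraph: the guarantees it lists (no API keys forwarded, untrusted content delivered only via stdin) could be enforced along the lines of the sketch below. This is not the code in `providers/_agent_cli.py`, which this diff does not show. `run_agent_cli`, `scrubbed_env`, and the `_SECRET_MARKERS` denylist are hypothetical names, and the CLI-specific flags that disable tools or enable read-only mode are deliberately left out because they are not visible here.

```python
import os
import subprocess
import sys

# Hypothetical denylist; the real helper's scrubbing rules are not shown in the diff.
_SECRET_MARKERS = ("API_KEY", "TOKEN", "SECRET", "INFERENCE_KEY")


def scrubbed_env() -> dict[str, str]:
    """Copy the environment, dropping variables that look like credentials."""
    return {
        name: value
        for name, value in os.environ.items()
        if not any(marker in name.upper() for marker in _SECRET_MARKERS)
    }


def run_agent_cli(argv: list[str], prompt: str, timeout_s: float = 120.0) -> str:
    """Run a CLI with the prompt on stdin; raise RuntimeError on any failure."""
    try:
        result = subprocess.run(
            argv,                 # argument list, so no shell ever parses it
            input=prompt,         # untrusted content travels via stdin only
            capture_output=True,
            text=True,
            env=scrubbed_env(),   # credentials are not forwarded to the child
            timeout=timeout_s,
            shell=False,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Fail closed: a missing binary or a hang is an error, never an empty answer.
        raise RuntimeError(f"agent CLI failed to run: {exc}") from exc
    if result.returncode != 0:
        raise RuntimeError(f"agent CLI exited with status {result.returncode}")
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in for a real agent CLI: a child Python that upper-cases stdin.
    echo = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(run_agent_cli(echo, "hello"))
```

Keeping the prompt out of `argv` matters because skill content is attacker-controlled: text passed on stdin cannot be misread as a command-line flag.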
diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md
@@ -265,12 +265,14 @@ Copy [.env.example](../.env.example) to `.env` in the project root and set value
 
 | Variable | Description | Example |
 |----------|-------------|---------|
-| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai` \| `anthropic` \| `nv_build`. Defaults to `nv_build`. | `openai` |
+| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai` \| `anthropic` \| `nv_build` \| `claude_cli` \| `codex_cli`. Defaults to `nv_build`. | `claude_cli` |
 | `NVIDIA_INFERENCE_KEY` | Credential for `nv_build`. | `nvapi-...` |
 | `OPENAI_API_KEY` | Credential for `SKILLSPECTOR_PROVIDER=openai`. Also tier-2 fallback for non-OpenAI providers. | `sk-...` |
 | `OPENAI_BASE_URL` | Override the OpenAI endpoint (e.g. point at Ollama). | `http://localhost:11434/v1` |
 | `ANTHROPIC_API_KEY` | Credential for `SKILLSPECTOR_PROVIDER=anthropic`. | `sk-ant-...` |
-| `SKILLSPECTOR_MODEL` | Override the active provider's bundled default model (see [README.md](../README.md) for per-provider defaults). | `gpt-5.2` |
+| `SKILLSPECTOR_MODEL` | Override the active provider's bundled default model (see [README.md](../README.md) for per-provider defaults). For `claude_cli`, this is passed as `--model` to the `claude` binary. | `gpt-5.2` |
+
+> **CLI providers** (`claude_cli`, `codex_cli`): no credential env var is needed. Authentication is managed by the agent CLI's own session (`claude auth login` / `codex login`). The subprocess is heavily sandboxed — see [providers/_agent_cli.py](../src/skillspector/providers/_agent_cli.py).
 
 ### Live provider tests
 
@@ -291,8 +293,18 @@ Base URL env vars are not needed for live provider tests; the tests intentionall
   - **`get_max_input_tokens(model)`** — input budget per LLM request (75% of resolved context window).
   - **`get_max_output_tokens(model)`** — output budget per LLM request (min of 25% context, registry's `max_output_tokens` cap if set).
   - Batch budget overhead is computed per-prompt via `estimate_tokens(base_prompt)` rather than a fixed constant.
-- **Providers** ([providers/](../src/skillspector/providers/)): pluggable credential + token-budget resolvers. Each provider is a subpackage with its own `provider.py` and bundled `model_registry.yaml`; [registry.py](../src/skillspector/providers/registry.py) exposes `lookup_context_length` / `lookup_max_output_tokens` utilities the providers call directly. The active provider is chosen by `SKILLSPECTOR_PROVIDER` (default: `nv_build`) — see [providers/`__init__`.py](../src/skillspector/providers/__init__.py): `nv_build/` (build.nvidia.com), `openai/`, or `anthropic/`.
-- **LLM calls** ([llm_utils.py](../src/skillspector/llm_utils.py)): **`get_chat_model()`** and **`chat_completion()`** resolve credentials in two tiers — active NVIDIA provider (`NVIDIA_INFERENCE_KEY` → endpoint) → standard `OPENAI_API_KEY` / `OPENAI_BASE_URL` — against any OpenAI-compatible endpoint. `max_tokens` is auto-bound to `get_max_output_tokens(model)` from `model_info`.
+- **Providers** ([providers/](../src/skillspector/providers/)): pluggable credential + token-budget resolvers. Each provider is a subpackage with its own `provider.py` and bundled `model_registry.yaml`; [registry.py](../src/skillspector/providers/registry.py) exposes `lookup_context_length` / `lookup_max_output_tokens` utilities the providers call directly. The active provider is chosen by `SKILLSPECTOR_PROVIDER` (default: `nv_build`):
+  - `nv_build/` — build.nvidia.com (HTTP, `NVIDIA_INFERENCE_KEY`)
+  - `openai/` — api.openai.com or any OpenAI-compatible URL (`OPENAI_API_KEY`)
+  - `anthropic/` — api.anthropic.com (`ANTHROPIC_API_KEY`)
+  - `claude_cli/` — **local `claude` binary; no API key**. Uses the CLI's own auth session (`claude auth login`). Set `SKILLSPECTOR_PROVIDER=claude_cli`.
+  - `codex_cli/` — **local `codex` binary; no API key**. Uses the CLI's own auth session (`codex login`). Set `SKILLSPECTOR_PROVIDER=codex_cli`.
+
+  CLI providers (`claude_cli`, `codex_cli`) implement the optional `AgentCLICapable` interface (`is_available()` + `complete()`) defined in [providers/base.py](../src/skillspector/providers/base.py). `has_cli_capability(provider)` detects this at runtime. All subprocess calls go through the hardened helper [providers/_agent_cli.py](../src/skillspector/providers/_agent_cli.py), which enforces no shell (`shell=False`), untrusted content via stdin only, capability stripping (tools disabled / sandboxed), environment scrubbing (no API keys forwarded), a per-call timeout, and fail-closed error handling.
+
+- **LLM calls** ([llm_utils.py](../src/skillspector/llm_utils.py)): **`get_chat_model()`** and **`chat_completion()`** dispatch based on the active provider:
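+  - **HTTP providers**: resolve credentials in two tiers: first the active provider's own credential (`NVIDIA_INFERENCE_KEY` / `ANTHROPIC_API_KEY` / `OPENAI_API_KEY`, plus its endpoint), then the standard `OPENAI_API_KEY` / `OPENAI_BASE_URL` fallback when that credential is unset. The fallback can target any OpenAI-compatible endpoint. `max_tokens` is auto-bound to `get_max_output_tokens(model)` from `model_info`.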
+  - **CLI providers** (`claude_cli`, `codex_cli`): `get_chat_model()` returns an `AgentCLIChatModel` adapter backed by `provider.complete()`, so the analyzers' `.invoke()` / `.with_structured_output(schema).invoke()` calls work with no API key (structured output is produced by prompting for JSON, then Pydantic-validating). `chat_completion()` routes through `get_chat_model()` as well. `is_llm_available()` calls `provider.is_available()` instead of credential resolution.
 - **LLM analyzer base** ([llm_analyzer_base.py](../src/skillspector/nodes/llm_analyzer_base.py)): `LLMAnalyzerBase` provides per-file/per-chunk batching, token-budget-aware chunking, and a run loop for all LLM-based analyzers. `LLMMetaAnalyzer` extends it for filter/enrich (meta_analyzer node). Future semantic analyzers extend `LLMAnalyzerBase` for discovery mode.
 
 ---
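
Review note on the `AgentCLICapable` paragraph in DEVELOPMENT.md: one way to express an optional capability with runtime detection is a `typing.Protocol` marked `runtime_checkable`. Whether `providers/base.py` actually uses a Protocol, an ABC, or plain `hasattr` checks is not visible in this diff, so treat the sketch below as an assumption. The method signatures are inferred from the call sites in `llm_utils.py`, and `HttpOnlyProvider` and `EchoCLIProvider` are hypothetical.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class AgentCLICapable(Protocol):
    """Optional capability: a provider that can drive a local agent CLI."""

    def is_available(self) -> tuple[bool, str | None]:
        """Return (available, error_message)."""
        ...

    def complete(self, prompt: str, *, model: str, max_output_tokens: int) -> str:
        """Return the CLI's text response for the prompt."""
        ...


def has_cli_capability(provider: object) -> bool:
    # isinstance() on a runtime_checkable Protocol only checks that the
    # methods exist; it does not verify their signatures.
    return isinstance(provider, AgentCLICapable)


class HttpOnlyProvider:
    """Hypothetical provider that resolves credentials but has no CLI methods."""

    def resolve_credentials(self) -> tuple[str, str | None]:
        return "sk-demo", None


class EchoCLIProvider:
    """Hypothetical CLI provider that echoes the prompt back."""

    def is_available(self) -> tuple[bool, str | None]:
        return True, None

    def complete(self, prompt: str, *, model: str, max_output_tokens: int) -> str:
        return prompt


if __name__ == "__main__":
    print(has_cli_capability(EchoCLIProvider()))   # True
    print(has_cli_capability(HttpOnlyProvider()))  # False
```

Structural detection lets HTTP providers stay untouched: they never need to inherit from or even import the CLI interface.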

diff --git a/src/skillspector/llm_utils.py b/src/skillspector/llm_utils.py
@@ -13,13 +13,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-"""Shared LLM utilities.
+"""Shared LLM utilities (OpenAI-compatible chat models + agent CLI transports).
 
 Credentials are resolved in this order:
-    1. The active SkillSpector provider (see :mod:`skillspector.providers`) —
-       reads its own credential env var and supplies the matching client.
+    1. The active provider (see :mod:`skillspector.providers`):
+       - CLI providers (``claude_cli``, ``codex_cli``, ``gemini_cli``): use
+         ``is_available()`` and ``complete()`` — no API key needed.
+       - HTTP providers (``anthropic``, ``openai``, ``nv_build``): read their
+         respective credential env vars and supply a base URL.
     2. ``OPENAI_API_KEY`` / ``OPENAI_BASE_URL`` (the langchain-openai
-       defaults).
+       defaults) — only consulted for HTTP providers when the provider's
+       own credential env var is unset.
 
 There is no SkillSpector-specific credential env var: setting
 ``NVIDIA_INFERENCE_KEY`` configures whichever NVIDIA endpoint the
@@ -30,13 +34,18 @@
 
 from __future__ import annotations
 
+import asyncio
+import json
+from typing import NoReturn
+
 from langchain_core.language_models.chat_models import BaseChatModel
-from langchain_core.messages import BaseMessage
 
 from skillspector.model_info import get_max_input_tokens, get_max_output_tokens
 from skillspector.providers import (
     create_chat_model,
+    get_active_provider,
     get_metadata_provider,
+    has_cli_capability,
     raise_no_llm_api_key_configured,
     resolve_chat_model_credentials,
     resolve_provider_credentials,
@@ -47,8 +56,9 @@
 def _resolve_llm_credentials() -> tuple[str, str | None]:
     """Return ``(api_key, base_url)`` resolved from the environment.
 
-    Tries the active NVIDIA provider first; falls back to ``OPENAI_API_KEY``
-    / ``OPENAI_BASE_URL`` when the provider is not configured.
+    Tries the active SkillSpector provider first; falls back to
+    ``OPENAI_API_KEY`` / ``OPENAI_BASE_URL`` when the provider is not
+    configured.
 
     Raises:
         ValueError: when no API key can be resolved from any source.
@@ -72,7 +82,15 @@ def _resolve_default_chat_model() -> str:
 
 
 def is_llm_available() -> tuple[bool, str | None]:
-    """Return ``(available, error_message)`` describing LLM credential status."""
+    """Return ``(available, error_message)`` describing LLM availability.
+
+    For CLI providers (``claude_cli``, ``codex_cli``, ``gemini_cli``) the check
+    delegates to the provider's ``is_available()`` method (binary on PATH +
+    auth).  For HTTP providers, it falls back to credential resolution.
+    """
+    provider = get_active_provider()
+    if has_cli_capability(provider):
+        return provider.is_available()  # type: ignore[attr-defined]
     try:
         _resolve_llm_credentials()
     except ValueError as exc:
@@ -85,12 +103,157 @@ def fetch_model_token_limits(model_label: str) -> tuple[int, int]:
     return get_max_input_tokens(model_label), get_max_output_tokens(model_label)
 
 
-def get_chat_model(model: str | None = None) -> BaseChatModel:
-    """Return the active provider's native LangChain chat model.
+# ---------------------------------------------------------------------------
+# Agent CLI chat-model adapter
+# ---------------------------------------------------------------------------
+#
+# The LLM analyzers (meta_analyzer, semantic_*) obtain a model from
+# ``get_chat_model()`` and call ``.invoke()`` / ``.with_structured_output(
+# schema).invoke()`` on it (see ``llm_analyzer_base``) — they never go through
+# ``chat_completion``. To support CLI providers there, ``get_chat_model``
+# returns this minimal adapter, which mimics the slice of the ``ChatOpenAI``
+# interface the analyzers rely on, backed by the provider's ``complete()``
+# subprocess transport.
+
+
+class _AgentCLIMessage:
+    """Minimal stand-in for a LangChain message: exposes ``.content``."""
+
+    def __init__(self, content: str) -> None:
+        self.content = content
+
+
+def _extract_json_object(raw: str) -> dict:
+    """Extract a single JSON object from a CLI model's text response.
+
+    Tolerates markdown code fences and surrounding prose. Raises ``ValueError``
+    (fail-closed) when no JSON object can be parsed.
+    """
+    text = raw.strip()
+    if text.startswith("```"):
+        # Drop the opening fence line (``` or ```json) and any closing fence.
+        text = text.split("\n", 1)[1] if "\n" in text else ""
+        fence = text.rfind("```")
+        if fence != -1:
+            text = text[:fence]
+        text = text.strip()
+    try:
+        obj = json.loads(text)
+        if isinstance(obj, dict):
+            return obj
+    except json.JSONDecodeError:
+        pass
+    start, end = text.find("{"), text.rfind("}")
+    if start != -1 and end > start:
+        try:
+            obj = json.loads(text[start : end + 1])
+            if isinstance(obj, dict):
+                return obj
+        except json.JSONDecodeError:
+            pass
+    raise ValueError(f"could not extract a JSON object from CLI response: {raw[:200]!r}")
+
+
+class _StructuredAgentCLIModel:
+    """Mimics ``ChatOpenAI.with_structured_output(schema)`` for a CLI provider.
+
+    ``invoke`` augments the prompt with the schema, calls the provider's
+    ``complete()``, then parses and validates the response into *schema*.
+    """
+
+    def __init__(self, provider: object, model: str, max_output_tokens: int, schema: type) -> None:
+        self._provider = provider
+        self._model = model
+        self._max_output_tokens = max_output_tokens
+        self._schema = schema
+
+    def _augment(self, prompt: str) -> str:
+        schema_json = json.dumps(self._schema.model_json_schema(), indent=2)
+        return (
+            f"{prompt}\n\n"
+            "Respond with ONLY a single JSON object conforming to the JSON Schema "
+            "below. Do not wrap it in markdown code fences and do not add any prose "
+            f"before or after the JSON.\n\nJSON Schema:\n{schema_json}"
+        )
+
+    def invoke(self, prompt: str) -> object:
+        raw = self._provider.complete(  # type: ignore[attr-defined]
+            self._augment(prompt),
+            model=self._model,
+            max_output_tokens=self._max_output_tokens,
+        )
+        return self._schema.model_validate(_extract_json_object(raw))
+
+    async def ainvoke(self, prompt: str) -> object:
+        return await asyncio.to_thread(self.invoke, prompt)
+
+
+class AgentCLIChatModel:
+    """Minimal ``ChatOpenAI``-compatible adapter backed by a CLI provider.
+
+    Implements only the surface the analyzers use: ``invoke`` (returns an
+    object with ``.content``), ``ainvoke``, and ``with_structured_output``.
+    The rest of the ``BaseChatModel`` surface (``batch``, ``stream``,
+    callbacks) is intentionally unsupported; the stubs below make that boundary
+    explicit so a future analyzer reaching for it fails loudly with a clear
+    message rather than a confusing ``AttributeError``.
+    """
+
+    def __init__(self, provider: object, model: str, max_output_tokens: int) -> None:
+        self._provider = provider
+        self._model = model
+        self._max_output_tokens = max_output_tokens
+
+    def batch(self, *args: object, **kwargs: object) -> NoReturn:
+        raise NotImplementedError(
+            "AgentCLIChatModel supports only invoke/ainvoke/with_structured_output; "
+            "batch() is not available for CLI providers."
+        )
+
+    def stream(self, *args: object, **kwargs: object) -> NoReturn:
+        raise NotImplementedError(
+            "AgentCLIChatModel supports only invoke/ainvoke/with_structured_output; "
+            "stream() is not available for CLI providers."
+        )
+
+    def invoke(self, prompt: str) -> _AgentCLIMessage:
+        text = self._provider.complete(  # type: ignore[attr-defined]
+            prompt,
+            model=self._model,
+            max_output_tokens=self._max_output_tokens,
+        )
+        return _AgentCLIMessage(text)
+
+    async def ainvoke(self, prompt: str) -> _AgentCLIMessage:
+        return await asyncio.to_thread(self.invoke, prompt)
+
+    def with_structured_output(self, schema: type) -> _StructuredAgentCLIModel:
+        return _StructuredAgentCLIModel(
+            self._provider, self._model, self._max_output_tokens, schema
+        )
+
+
+def get_chat_model(model: str | None = None) -> BaseChatModel | AgentCLIChatModel:
+    """Return a chat model for the active provider.
+
+    For CLI providers (``claude_cli``, ``codex_cli``, ``gemini_cli``) this
+    returns an :class:`AgentCLIChatModel` adapter backed by the provider's
+    ``complete()`` subprocess transport — so the LLM analyzers (which use
+    ``.invoke()`` and ``.with_structured_output()``) work with no API key.
+
+    For HTTP providers it delegates to
+    :func:`skillspector.providers.create_chat_model`, which uses the
+    provider's own native client (e.g. ``ChatAnthropic`` for Anthropic) with
+    an ``OPENAI_API_KEY`` / ``ChatOpenAI`` fallback.
 
     Raises:
-        ValueError: when no API key is configured (see ``is_llm_available``).
+        ValueError: when an HTTP provider has no API key configured.
     """
+    provider = get_active_provider()
+    if has_cli_capability(provider):
+        resolved_model = model or provider.resolve_model()
+        return AgentCLIChatModel(provider, resolved_model, get_max_output_tokens(resolved_model))
+
     model = model or _resolve_default_chat_model()
     return create_chat_model(
         model=model,
@@ -100,9 +263,16 @@ def get_chat_model(model: str | None = None) -> BaseChatModel:
 
 
 def chat_completion(prompt: str, *, model: str | None = None) -> str:
-    """Request a single chat completion and return the assistant text."""
-    llm = get_chat_model(model=model)
-    response = llm.invoke(prompt)
-    if not isinstance(response, BaseMessage):
-        raise TypeError(f"Expected BaseMessage from chat model, got {type(response).__name__}")
-    return str(response.text)
+    """Request a single chat completion and return the assistant content.
+
+    Routes through :func:`get_chat_model`, which dispatches to the CLI adapter
+    for CLI providers and to the provider's native chat model for HTTP providers.
+
+    Uses ``.text`` when available (real LangChain ``BaseMessage`` objects,
+    which normalise content blocks to a single string) and falls back to
+    ``.content`` for the CLI adapter's ``_AgentCLIMessage``.
+    """
+    response = get_chat_model(model=model).invoke(prompt)
+    if hasattr(response, "text"):
+        return str(response.text)  # type: ignore[union-attr]
+    return response.content or ""  # type: ignore[union-attr]
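
Review note on the structured-output path: the "prompt for JSON, then validate" flow can be tried in isolation with the sketch below. `extract_json_object` is a standalone copy of the `_extract_json_object` logic from this diff. Everything else is hypothetical: `FakeCLIProvider` stands in for a real provider's `complete()`, and the `Verdict` dataclass stands in for the Pydantic schema so the example needs only the standard library.

```python
import json
from dataclasses import dataclass


def extract_json_object(raw: str) -> dict:
    """Standalone copy of the fence- and prose-tolerant extraction logic."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line and any closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        fence = text.rfind("```")
        if fence != -1:
            text = text[:fence]
        text = text.strip()
    try:
        obj = json.loads(text)
        if isinstance(obj, dict):
            return obj
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span when prose surrounds the object.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            obj = json.loads(text[start : end + 1])
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
    raise ValueError(f"could not extract a JSON object from CLI response: {raw[:200]!r}")


@dataclass
class Verdict:
    """Hypothetical schema; the real analyzers validate with Pydantic models."""

    risky: bool
    reason: str


class FakeCLIProvider:
    """Hypothetical provider that returns a canned response from complete()."""

    def __init__(self, canned: str) -> None:
        self._canned = canned

    def complete(self, prompt: str, *, model: str, max_output_tokens: int) -> str:
        return self._canned


def structured_invoke(provider: FakeCLIProvider, prompt: str) -> Verdict:
    raw = provider.complete(prompt, model="demo-model", max_output_tokens=256)
    # A missing or unexpected key raises TypeError here, so bad output fails closed.
    return Verdict(**extract_json_object(raw))


if __name__ == "__main__":
    fenced = '```json\n{"risky": true, "reason": "pipes curl into sh"}\n```'
    print(structured_invoke(FakeCLIProvider(fenced), "Is this skill risky?"))
```

The fallback to the outermost `{...}` span is lenient by design, but a response with no parseable object still raises `ValueError` rather than producing an empty result.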