Merged
26 changes: 26 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,26 @@
version: 2
updates:
  # Python dependencies (pyproject.toml + uv.lock)
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 5
    labels:
      - "dependencies"
      - "python"
    groups:
      python-dev-dependencies:
        dependency-type: "development"
        patterns:
          - "*"

  # GitHub Actions used in CI / release workflows
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 5
    labels:
      - "dependencies"
      - "github-actions"
13 changes: 13 additions & 0 deletions .pre-commit-config.yaml
@@ -50,3 +50,16 @@ repos:
        args: [-c, pyproject.toml]
        additional_dependencies: ["bandit[toml]"]
        exclude: ^(tests|docs|examples)/

  # Run mypy from the project venv (synced via `uv sync --dev`) so it resolves
  # real dependencies and honours the [tool.mypy] config in pyproject.toml.
  # Checks the whole package once (pass_filenames: false) rather than per-file.
  - repo: local
    hooks:
      - id: mypy
        name: mypy
        entry: uv run mypy agentflow/
        language: system
        types: [python]
        pass_filenames: false
        exclude: ^(tests|docs|examples|normal_tests)/
4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -24,7 +24,7 @@ persistence, tools, memory, evaluation, and event publishing. Inspired by LangGr
the README and several docstrings still show pre-refactor paths (see Known Doc Drift).
- **Surgical edits.** This is `Development Status :: 5 - Production/Stable`. Don't refactor
module boundaries or rename exports without checking every `__init__.py` that re-exports them.
- - **Keep coverage green.** `pytest` enforces `--cov-fail-under=70`. New code needs tests.
+ - **Keep coverage green.** `pytest` enforces `--cov-fail-under=80`. New code needs tests.
- **Optional deps are optional.** Provider SDKs, MCP, Postgres, Redis, Qdrant, Mem0, Kafka,
RabbitMQ, OTEL, a2a are all extras. Guard imports; never make core import a hard optional dep.

@@ -152,7 +152,7 @@ already present.

```bash
# from this folder (agentflow/)
-.venv/bin/python -m pytest # full suite (enforces coverage >= 70%)
+.venv/bin/python -m pytest # full suite (enforces coverage >= 80%)
.venv/bin/python -m pytest tests/graph # one area
ruff check . && ruff format . # lint + format (line-length 100, py312)
# editable install with extras for local dev:
112 changes: 112 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,112 @@
# Contributing to Agentflow

Thanks for your interest in improving `10xscale-agentflow`. This guide covers the
core Python framework that lives in this folder. For the API server, TypeScript
client, docs, or playground, see the `CONTRIBUTING.md`/`CLAUDE.md` in their
respective packages.

- Package (PyPI): `10xscale-agentflow`
- Requires: Python >= 3.12
- The importable package is the nested `agentflow/` directory; this folder is the
repo root for the core library.

## Getting set up

We use [`uv`](https://docs.astral.sh/uv/) for environment and dependency
management.

```bash
# from this folder (the core library root)
uv sync --dev # create .venv and install the package + dev tools
uv run pre-commit install # enable the git hooks (optional but recommended)
```

If you work on optional subsystems, install the matching extras, e.g.:

```bash
uv pip install -e ".[google-genai,openai,mcp,pg_checkpoint]"
```

## Before you open a pull request

Run the same checks CI runs. All must pass:

```bash
uv run pre-commit run --all-files # ruff format + lint, bandit, mypy, hooks
uv run pytest --cov --cov-branch # tests + coverage gate (>= 80%)
```

You can also run pieces individually:

```bash
uv run ruff check . && uv run ruff format .
uv run mypy agentflow/
uv run pytest tests/graph # one area
```

### What the gates enforce

- **Formatting & linting:** `ruff` (line length 100, target py312). Most issues
are auto-fixed by `ruff format` / `ruff check --fix`.
- **Types:** `mypy` runs in pre-commit. The codebase is on *phased* typing: a set
of modules with pre-existing errors is listed under `[[tool.mypy.overrides]]`
in `pyproject.toml` with `ignore_errors = true`. New code is type-checked.
Improving a listed module's types and removing it from that list is a welcome
contribution; please don't add new modules to it.
- **Security:** `bandit`.
- **Coverage:** `pytest` fails under 80% line coverage. New code needs tests.

## Tests

- Tests live in `tests/`, mirroring the package layout (`graph/`, `state/`,
`storage/`, `publisher/`, `prebuilt/`, `evaluation/`, `testing/`, plus
`chaos/`, `benchmarks/`, `integration/`).
- Markers: `asyncio`, `integration` (needs real databases — Redis/Postgres),
`slow`. Integration tests are skipped unless their backends are available.
- Prefer the in-repo test helpers in `agentflow.qa.testing` (`TestAgent`,
`MockMCPClient`, `MockToolRegistry`) to exercise graphs without live LLM calls.

## Import paths (read this before referencing symbols)

The package is organised into `core/`, `storage/`, `runtime/`, `qa/`. There are
**no** top-level `agentflow.graph` / `agentflow.state` / `agentflow.checkpointer`
shims — use the canonical paths:

```python
from agentflow.core.graph import StateGraph, Agent, ToolNode, CompiledGraph
from agentflow.core.state import AgentState, Message
from agentflow.core.llm import call_llm, create_llm_client, detect_provider
from agentflow.storage.checkpointer import InMemoryCheckpointer, PgCheckpointer
```

`examples/` uses current import paths and is the most reliable usage reference.

## Optional dependencies

Provider SDKs (OpenAI, Google GenAI), MCP, Postgres, Redis, Qdrant, Mem0, Kafka,
RabbitMQ, OTEL, and a2a are all **extras**. Guard their imports inside the
functions that need them so the core package never hard-imports an optional
dependency. See `agentflow/core/llm/client_factory.py` for the pattern.
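
The guarded-import rule above can be sketched as follows. This is only an
illustration of the idea, not the code in `client_factory.py`: the helper names
`require_extra` and `create_openai_client` are hypothetical, and the extra name
`openai` is taken from the install command earlier in this guide.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency lazily; fail with an actionable message.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "10xscale-agentflow[{extra}]"'
        ) from exc


def create_openai_client(api_key: str):
    """Hypothetical factory: the SDK is imported only when this is called."""
    # Importing inside the function keeps `import agentflow` working
    # even when the `openai` extra is not installed.
    openai = require_extra("openai", "openai")
    return openai.OpenAI(api_key=api_key)
```

Because the import happens at call time, a missing extra surfaces as a clear
error at the point of use rather than when the package is imported.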

## Commit and PR conventions

- Use clear, conventional-style commit subjects (`feat:`, `fix:`, `docs:`,
`refactor:`, `test:`, `chore:`), matching the existing history.
- Keep changes surgical. This package is `Development Status :: 5 -
Production/Stable`; avoid renaming exports or moving module boundaries without
checking every `__init__.py` that re-exports the symbol.
- Update docs/examples when you change public behaviour. Prefer fixing a stale
doc/example to match the code over the reverse.
- One logical change per PR. Describe the motivation and how you tested it.

## Reporting bugs and security issues

- **Bugs / feature requests:** open an issue at
https://github.com/10xHub/agentflow/issues with a minimal reproduction.
- **Security vulnerabilities:** do **not** open a public issue — follow
[`SECURITY.md`](SECURITY.md).

## License

By contributing, you agree that your contributions are licensed under the
project's [MIT License](LICENSE).
64 changes: 64 additions & 0 deletions SECURITY.md
@@ -0,0 +1,64 @@
# Security Policy

## Supported versions

`10xscale-agentflow` is pre-1.0 and ships from a single release line. Security
fixes are applied to the latest published release only. Pin a known-good version
in production and upgrade promptly when a security release is announced.

| Version | Supported |
| ------- | ------------------ |
| 0.7.x | :white_check_mark: |
| < 0.7 | :x: |

## Reporting a vulnerability

**Please do not open a public GitHub issue for security problems.**

Report privately through either channel:

- **GitHub Security Advisories** (preferred): open a private report at
https://github.com/10xHub/agentflow/security/advisories/new
- **Email:** contact@10xscale.ai (you may also CC shudiptotrafder@gmail.com)

Include as much of the following as you can:

- A description of the issue and the impact you believe it has.
- The affected version(s) and, if known, the affected module/import path
(e.g. `agentflow.core.llm.client_factory`).
- A minimal reproduction or proof of concept.
- Any suggested remediation.

### What to expect

- **Acknowledgement** within 3 business days.
- An initial assessment and severity triage within 7 business days.
- Coordinated disclosure: we will agree on a disclosure timeline with you and
credit you in the advisory unless you prefer to remain anonymous.

## Scope

This policy covers the `10xscale-agentflow` core Python package in this
repository. Issues in the API server (`10xscale-agentflow-cli`), the TypeScript
client, or third-party dependencies should be reported against their respective
projects, though we are happy to help route a report.

### Things that are expected behaviour, not vulnerabilities

- **Tools execute arbitrary code by design.** Tools you register with a
`ToolNode` run with the privileges of the host process. Only register trusted
tools and treat tool inputs derived from model output as untrusted.
- **Provider API keys are read from the environment** (`OPENAI_API_KEY`,
`GEMINI_API_KEY`, etc.). Protecting that environment is the deployer's
responsibility.
- **Prompt injection** against an LLM is a property of the model/application
design. Reports demonstrating a concrete privilege escalation or data
exfiltration path *through the framework* are in scope; generic "the model can
be jailbroken" reports are not.
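
As one way to treat model-derived tool inputs as untrusted, a file-reading tool
might confine paths to an allowed directory before touching the filesystem. The
sketch below is an assumption-laden illustration: `resolve_inside` and
`read_note` are hypothetical names, not part of the framework.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, requested: str) -> Path:
    """Resolve `requested` under `base_dir`, rejecting anything that escapes it."""
    base = base_dir.resolve()
    # Joining with an absolute path discards `base`, and `..` segments are
    # collapsed by resolve(), so both cases are caught by the check below.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes the allowed directory: {requested!r}")
    return candidate


def read_note(base_dir: Path, requested: str) -> str:
    """Hypothetical tool body: read a text file, but only from inside base_dir."""
    return resolve_inside(base_dir, requested).read_text(encoding="utf-8")
```

The same idea applies to other argument types: validate against an allow-list
or schema first, then act.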

## Good practice for deployers

- Keep `IS_DEBUG=false` and `MODE=production` in production.
- Never set `ORIGINS=*` in production.
- Use a secrets manager rather than committing `.env` files.
- Constrain which tools and MCP servers an agent can reach.
95 changes: 95 additions & 0 deletions agentflow/core/graph/agent_internal/circuit_breaker.py
@@ -0,0 +1,95 @@
"""A small circuit breaker for LLM calls.

Complements retry + fallback: once a model/provider has failed
``failure_threshold`` times in a row, its circuit *opens* and further calls to
it are short-circuited (skipped, moving straight to the next fallback) for
``reset_timeout`` seconds. After that cooldown a single trial is allowed
(*half-open*); success closes the circuit, another failure re-opens it.

This stops a dead provider from being retried on every single invocation.
"""

from __future__ import annotations

import time
from collections.abc import Callable
from enum import Enum


class CircuitState(str, Enum):
    """Lifecycle state of a :class:`CircuitBreaker`."""

    closed = "closed"  # normal operation, calls allowed
    open = "open"  # failing, calls skipped until the cooldown elapses
    half_open = "half_open"  # cooldown elapsed, one trial call allowed


class CircuitBreakerOpenError(RuntimeError):
    """Raised/used as the recorded error when a call is skipped by an open circuit."""

    def __init__(self, key: object, retry_after: float) -> None:
        self.key = key
        self.retry_after = retry_after
        super().__init__(f"Circuit breaker open for {key!r}; retry in {retry_after:.1f}s")


class CircuitBreaker:
    """Per-target failure tracker with open/half-open/closed states.

    Args:
        failure_threshold: Consecutive failures that trip the circuit (>= 1).
        reset_timeout: Seconds to stay open before allowing a half-open trial.
        time_func: Monotonic clock source; injectable for testing.
    """

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        time_func: Callable[[], float] = time.monotonic,
    ) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be >= 1")
        if reset_timeout <= 0:
            raise ValueError("reset_timeout must be > 0")
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._time = time_func
        self._failures = 0
        self._state = CircuitState.closed
        self._opened_at = 0.0

    @property
    def state(self) -> CircuitState:
        return self._state

    @property
    def failure_count(self) -> int:
        return self._failures

    def allow(self) -> bool:
        """Return True if a call may proceed, transitioning open -> half-open if due."""
        if self._state is CircuitState.open:
            if self._time() - self._opened_at >= self.reset_timeout:
                self._state = CircuitState.half_open
                return True
            return False
        return True

    def record_success(self) -> None:
        """Reset the breaker to closed after a successful call."""
        self._failures = 0
        self._state = CircuitState.closed

    def record_failure(self) -> None:
        """Register a failure, opening the circuit at/over threshold or from half-open."""
        self._failures += 1
        if self._state is CircuitState.half_open or self._failures >= self.failure_threshold:
            self._state = CircuitState.open
            self._opened_at = self._time()

    def retry_after(self) -> float:
        """Seconds remaining before an open circuit allows a half-open trial."""
        if self._state is not CircuitState.open:
            return 0.0
        return max(0.0, self.reset_timeout - (self._time() - self._opened_at))
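
A minimal sketch of the closed -> open -> half-open -> closed cycle described in
the module docstring, driven by a fake clock passed through `time_func`. To stay
standalone it uses a condensed copy of `CircuitBreaker` (same transition logic,
with argument validation, properties, and `retry_after` trimmed); `FakeClock`
and `run_scenario` are hypothetical helpers, not part of the package.

```python
import time
from collections.abc import Callable
from enum import Enum


class CircuitState(str, Enum):
    closed = "closed"
    open = "open"
    half_open = "half_open"


class CircuitBreaker:
    """Condensed copy of the class in this file (validation omitted)."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        time_func: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._time = time_func
        self._failures = 0
        self.state = CircuitState.closed
        self._opened_at = 0.0

    def allow(self) -> bool:
        if self.state is CircuitState.open:
            if self._time() - self._opened_at >= self.reset_timeout:
                self.state = CircuitState.half_open
                return True
            return False
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = CircuitState.closed

    def record_failure(self) -> None:
        self._failures += 1
        if self.state is CircuitState.half_open or self._failures >= self.failure_threshold:
            self.state = CircuitState.open
            self._opened_at = self._time()


class FakeClock:
    """Hypothetical manual clock so the test never sleeps."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


def run_scenario() -> list[str]:
    clock = FakeClock()
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, time_func=clock)
    seen = []
    breaker.record_failure()
    seen.append(breaker.state.value)  # one failure: still below the threshold
    breaker.record_failure()
    seen.append(breaker.state.value)  # threshold reached: circuit opens
    assert not breaker.allow()  # cooldown not elapsed, call is skipped
    clock.now = 10.0
    assert breaker.allow()  # cooldown elapsed, trial call allowed
    seen.append(breaker.state.value)
    breaker.record_failure()  # trial failed: re-open immediately
    seen.append(breaker.state.value)
    clock.now = 20.0
    assert breaker.allow()
    breaker.record_success()  # trial succeeded: circuit closes
    seen.append(breaker.state.value)
    return seen
```

Note that, as written, `allow()` returns True for every call made while the
breaker is half-open, so the "single trial" guarantee holds only if callers
record the trial's outcome before another call checks the breaker.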
10 changes: 10 additions & 0 deletions agentflow/core/graph/agent_internal/constants.py
@@ -16,6 +16,13 @@ class RetryConfig:
        max_delay: Upper-bound cap on exponential back-off delay (default ``30.0``).
        backoff_factor: Multiplier applied after each retry (default ``2.0``).
        retryable_status_codes: HTTP status codes considered transient/retryable.
        circuit_breaker_enabled: When True, track failures per (provider, model)
            and skip a target whose circuit is open, moving straight to the next
            fallback instead of retrying a known-dead provider (default ``False``).
        circuit_breaker_threshold: Consecutive failures that open a circuit
            (default ``5``).
        circuit_breaker_reset_timeout: Seconds a circuit stays open before a
            single half-open trial is allowed (default ``30.0``).
    """

    max_retries: int = 3
@@ -25,6 +32,9 @@ class RetryConfig:
    retryable_status_codes: frozenset[int] = field(
        default_factory=lambda: frozenset({429, 500, 502, 503, 529}),
    )
    circuit_breaker_enabled: bool = False
    circuit_breaker_threshold: int = 5
    circuit_breaker_reset_timeout: float = 30.0


DEFAULT_RETRY_CONFIG = RetryConfig()