
Commit a010aa4

feat: add agent runtime with turn loop and run command (#3)
* feat: add agent runtime with turn loop, session buffer, and run command

  Implement the core agent turn loop (Step 4) that ties providers and tools together. Adds an in-memory session buffer with system prompt injection, streaming response accumulation with fragmented tool call reassembly, safety-tier-aware tool dispatch (ReadOnly parallel, SideEffecting sequential), a context budget warning, and a `yantra run "prompt"` CLI subcommand. Includes 9 test cases covering all runtime paths with race detection.

* fix: wire WorkspaceDir, extend turn timeout to cover tool dispatch

  P1: ToolExecutionContext.WorkspaceDir was never set, causing all built-in file tools to fail security policy checks. Now wired through AgentRuntime via a workspaceDir field set at construction.

  P1: The turn timeout only covered provider streaming — dispatchTools ran on the parent context with no deadline. Now turnCtx covers both phases. Added classifyError to normalize context.DeadlineExceeded to ErrTimeout (turn budget exceeded) vs ErrCancelled (parent cancelled). Added TestRun_TurnTimeout and TestRun_TurnTimeoutDuringToolExecution (11 tests total, all passing with -race).

* fix: address Qodo review — dispatch ordering, signal handling, duplicate progress

  1. Tool dispatch ordering: rewrite dispatchTools to use a contiguous-block approach — iterate in model-provided order, accumulate contiguous ReadOnly calls into parallel blocks, and flush before any SideEffecting call. Preserves write_file → read_file ordering correctness.
  2. CLI signal handling: wire signal.NotifyContext (SIGINT/SIGTERM) into the root command via ExecuteContext, and pass cmd.Context() into runAgent so Ctrl-C propagates into provider streaming and tool execution.
  3. Duplicate progress: remove the ProgressToolExecution emission from runtime.executeTool — ToolRegistry.Execute already emits it.
* docs: update README and architecture docs for runtime layer

  - README: update the architecture diagram (runtime no longer planned), add yantra run usage to the quick start, add a runtime section explaining the turn loop, and add runtime/ to the project structure.
  - architecture.md: add Layer 4 (Runtime) covering the session buffer, turn loop, stream accumulation, tool dispatch ordering, error handling, and context budget. Update the "what's next" table. Fix the safety tier description to reflect contiguous-block parallel dispatch.
  - config.md: clarify that turn_timeout_secs covers both streaming and tools.
  - tools.md: update the SafetyTier dispatch description for accuracy.
1 parent af88b64 commit a010aa4

8 files changed

Lines changed: 1231 additions & 39 deletions


README.md

Lines changed: 36 additions & 4 deletions
@@ -14,9 +14,11 @@ Think of it as building your own Claude Code / Cursor agent from scratch.
 ┌─────────────────────────────────────────────┐
 │                     CLI                     │
 │             cmd/yantra/main.go              │
+│         yantra init | run | version         │
 ├─────────────────────────────────────────────┤
-│              Runtime (Step 4)               │
-│        the agent turn loop (planned)        │
+│                   Runtime                   │
+│          agent turn loop + session          │
+│    stream → think → act → observe → loop    │
 ├──────────────┬──────────────┬───────────────┤
 │   Provider   │    Tools     │    Memory     │
 │    Layer     │   System     │   (Step 5)    │
@@ -41,14 +43,25 @@ go build ./...
 # Generate default config
 go run ./cmd/yantra init
 
-# Edit yantra.toml with your API keys
+# Edit yantra.toml — set your API key
 $EDITOR yantra.toml
+
+# Set your provider API key
+export OPENAI_API_KEY=sk-...
+# Or for Anthropic:
+export ANTHROPIC_API_KEY=sk-ant-...
+
+# Run the agent
+go run ./cmd/yantra run "What is 2+2? Answer briefly."
+
+# Run with a custom system prompt and workspace
+go run ./cmd/yantra run --system "You are a Go expert" --workspace ./myproject "add tests for main.go"
 ```
 
 ## Project structure
 
 ```
-cmd/yantra/      CLI entry point (init, version, start, serve, tui)
+cmd/yantra/      CLI entry point (init, run, version, start, serve, tui)
 internal/
   types/         Shared interfaces and data types
     config.go    Configuration structs + defaults
@@ -67,6 +80,9 @@ internal/
     anthropic.go Anthropic Messages API
     gemini.go    Google Gemini GenerateContent
     reliable.go  Retry wrapper with exponential backoff
+  runtime/       Agent turn loop
+    session.go   In-memory conversation buffer
+    runtime.go   AgentRuntime, Run(), stream accumulation, tool dispatch
   tool/          Tool system
     schema.go    JSON Schema builder helpers
     security.go  SecurityPolicy + WorkspacePolicy
@@ -119,6 +135,22 @@ All tool execution goes through a `SecurityPolicy`:
 - **Operator blocking**: `|`, `&&`, `||`, `;`, `>` blocked by default (configurable)
 - Deny always overrides allow
 
+## Runtime
+
+The runtime is the core agent loop that ties providers and tools together:
+
+1. User message is added to an in-memory session
+2. Session context (system prompt + messages + tool schemas) is streamed to the provider
+3. Response is accumulated, including fragmented tool call deltas
+4. If the LLM returns tool calls, they're dispatched respecting safety tiers:
+   - **ReadOnly** tools in a contiguous block run in parallel
+   - **SideEffecting/Privileged** tools run sequentially at their original position
+   - Model-provided tool call order is preserved (e.g., `write_file` before `read_file`)
+5. Tool results are appended to the session, and the loop repeats
+6. When the LLM responds with text only (no tool calls), the loop ends
+
+The turn timeout covers both provider streaming and tool execution as a single budget. Ctrl-C (SIGINT/SIGTERM) propagates cleanly into the runtime via context cancellation.
+
 ## Tests
 
 ```bash
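The six-step loop documented in this README section can be sketched as a stub, with a fake provider standing in for the streaming layer. All names and types here are illustrative, not the real `internal/runtime` API.

```go
package main

import "fmt"

// Response is a stub of what stream accumulation produces per turn.
type Response struct {
	Content   string
	ToolCalls []string
}

// runTurnLoop drives think, act, observe until the model answers with
// text only (done) or the turn budget is exhausted.
func runTurnLoop(provider func(history []string) Response, user string, maxTurns int) (string, int, error) {
	history := []string{"user: " + user} // step 1: user message in session
	for turn := 1; turn <= maxTurns; turn++ {
		resp := provider(history) // steps 2-3: stream + accumulate
		if len(resp.ToolCalls) == 0 {
			return resp.Content, turn, nil // step 6: text-only, loop ends
		}
		for _, call := range resp.ToolCalls { // steps 4-5: dispatch, observe
			history = append(history, "tool result: "+call)
		}
	}
	return "", maxTurns, fmt.Errorf("max turns (%d) reached", maxTurns)
}

func main() {
	// Fake provider: one tool call on the first turn, then a final answer.
	provider := func(history []string) Response {
		if len(history) == 1 {
			return Response{ToolCalls: []string{"read_file"}}
		}
		return Response{Content: "done"}
	}
	answer, turns, _ := runTurnLoop(provider, "fix the bug", 8)
	fmt.Println(answer, turns) // done 2
}
```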

cmd/yantra/main.go

Lines changed: 84 additions & 2 deletions
@@ -1,9 +1,17 @@
 package main
 
 import (
+	"context"
 	"fmt"
 	"os"
-
+	"os/signal"
+	"path/filepath"
+	"syscall"
+
+	"github.com/hackertron/Yantra/internal/provider"
+	"github.com/hackertron/Yantra/internal/runtime"
+	"github.com/hackertron/Yantra/internal/tool"
+	"github.com/hackertron/Yantra/internal/types"
 	"github.com/spf13/cobra"
 )
 
@@ -30,13 +38,17 @@ Single binary. Zero config to get started.`,
 
 	root.AddCommand(
 		initCmd(),
+		runCmd(),
 		startCmd(),
 		tuiCmd(),
 		serveCmd(),
 		versionCmd(),
 	)
 
-	if err := root.Execute(); err != nil {
+	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
+	defer stop()
+
+	if err := root.ExecuteContext(ctx); err != nil {
 		fmt.Fprintf(os.Stderr, "error: %v\n", err)
 		os.Exit(1)
 	}
@@ -165,6 +177,76 @@ provider = "duckduckgo"
 	return nil
 }
 
+func runCmd() *cobra.Command {
+	var systemPrompt string
+	var workspace string
+
+	cmd := &cobra.Command{
+		Use:   "run [prompt]",
+		Short: "Run a single agent turn loop with the given prompt",
+		Args:  cobra.ExactArgs(1),
+		RunE: func(cmd *cobra.Command, args []string) error {
+			return runAgent(cmd.Context(), args[0], systemPrompt, workspace)
+		},
+	}
+	cmd.Flags().StringVar(&systemPrompt, "system", "You are a helpful AI assistant with access to tools.", "system prompt")
+	cmd.Flags().StringVar(&workspace, "workspace", ".", "workspace directory for tool execution")
+	return cmd
+}
+
+func runAgent(ctx context.Context, prompt, systemPrompt, workspace string) error {
+	cfg, err := types.LoadConfig(configPath)
+	if err != nil {
+		return fmt.Errorf("loading config: %w", err)
+	}
+
+	p, err := provider.BuildFromConfig(cfg)
+	if err != nil {
+		return fmt.Errorf("building provider: %w", err)
+	}
+	p = provider.NewReliable(p, provider.DefaultReliableConfig())
+
+	policy := tool.NewWorkspacePolicy(cfg.Tools.Shell)
+	reg := tool.NewRegistry(policy)
+	if err := tool.RegisterBuiltins(reg, cfg.Tools); err != nil {
+		return fmt.Errorf("registering tools: %w", err)
+	}
+
+	absWorkspace, err := filepath.Abs(workspace)
+	if err != nil {
+		return fmt.Errorf("resolving workspace: %w", err)
+	}
+
+	rt := runtime.New(p, reg, cfg.Runtime, absWorkspace)
+
+	progress := make(chan types.ProgressEvent, 32)
+	go func() {
+		for ev := range progress {
+			if ev.Tool != "" {
+				fmt.Fprintf(os.Stderr, "[%s] %s: %s\n", ev.Kind, ev.Tool, ev.Message)
+			} else {
+				fmt.Fprintf(os.Stderr, "[%s] %s\n", ev.Kind, ev.Message)
+			}
+		}
+	}()
+
+	result, err := rt.Run(ctx, systemPrompt, prompt, progress)
+	close(progress)
+	if err != nil {
+		return fmt.Errorf("agent run failed: %w", err)
+	}
+
+	fmt.Println(result.FinalContent)
+	fmt.Fprintf(os.Stderr, "\n--- stats ---\n")
+	fmt.Fprintf(os.Stderr, "turns: %d\n", result.TurnsUsed)
+	fmt.Fprintf(os.Stderr, "tokens: %d prompt, %d completion, %d total\n",
+		result.TotalUsage.PromptTokens,
+		result.TotalUsage.CompletionTokens,
+		result.TotalUsage.TotalTokens,
+	)
+	return nil
+}
+
 func runStart(cmd *cobra.Command, args []string) error {
 	fmt.Println("Starting Yantra daemon...")
 	// TODO: implement daemon startup

docs/architecture.md

Lines changed: 89 additions & 31 deletions
@@ -38,6 +38,7 @@ Everything in Yantra exists to make this loop work well:
 - **Tools** give the LLM hands
 - **Security** prevents the LLM from doing damage
 - **Config** makes it all customizable
+- **Runtime** runs the think → act → observe loop
 - **Memory** (planned) lets the agent remember across sessions
 - **Gateway** (planned) lets you control it remotely
 
@@ -129,10 +130,10 @@ const (
 )
 ```
 
-These tiers inform the runtime how to handle tools:
-- **ReadOnly** tools can run in parallel safely
-- **SideEffecting** tools should run sequentially (they change state)
-- **Privileged** tools need extra checks and may require user confirmation
+These tiers inform the runtime how to dispatch tools:
+- **ReadOnly** tools run in parallel when contiguous in the call list
+- **SideEffecting** tools run sequentially (they change state)
+- **Privileged** tools run sequentially and may require user confirmation in future
 
 ### Configuration
 
@@ -385,49 +386,106 @@ type ToolExecutionContext struct {
 `WorkspaceDir` is the most important — it's the root directory for all file operations. `Progress` is an optional channel for emitting status updates (the gateway can forward these to the UI).
 
 
-## How the pieces connect
+## Layer 4: Runtime (`internal/runtime/`)
+
+The runtime is the brain — it ties providers and tools together in a turn loop.
 
-Here's how everything flows when the runtime (Step 4) is built:
+### Session buffer
+
+`Session` is an in-memory conversation buffer. The system prompt is stored separately and injected by `Context()` when building the payload for the provider. This keeps the message list clean for turn counting and future summarization.
+
+```go
+session := NewSession("You are a helpful assistant.", toolSchemas)
+session.Append(Message{Role: "user", Content: "fix the bug"})
+
+ctx := session.Context()
+// → Messages: [system prompt, user message]
+// → Tools: [read_file, write_file, ...]
+```
+
+### The turn loop
+
+`AgentRuntime.Run()` is the main entry point:
 
 ```
 1. User runs: yantra run "add error handling to server.go"
 
-2. CLI loads config (yantra.toml + env vars)
-   → YantraConfig
+2. CLI loads config, builds provider + registry + runtime
+
+3. TURN LOOP (up to MaxTurns):
+   a. Per-turn timeout covers streaming + tool dispatch
+   b. Stream provider response, accumulate text + tool call deltas
+   c. If tool calls present:
+      - Dispatch respecting safety tiers and model-provided order
+      - Contiguous ReadOnly calls run in parallel
+      - SideEffecting/Privileged calls run sequentially at original position
+      - Tool results appended to session
+   d. If text-only response → return result (done)
+   e. Check context budget (log warning if approaching limit)
+
+4. Return: FinalContent, TurnsUsed, TotalUsage
+```
+
+### Stream accumulation
 
-3. Build provider from config
-   → ReliableProvider(OpenAIProvider{model: "gpt-4o"})
+The provider returns a channel of `StreamItem`. The runtime's `collectStream()` method:
+- Accumulates `StreamText` into the response content
+- Reassembles `StreamToolCallDelta` fragments into complete `ToolCall` objects (keyed by index)
+- Captures final `Usage` from the `StreamDone` event
+- Propagates `StreamError` as a Go error
 
-4. Create tool registry with workspace policy
-   → RegisterBuiltins(registry, config.Tools)
-   → registry has: read_file, write_file, list_files, shell_exec, web_fetch
+Tool call deltas arrive in chunks — the first delta for an index carries `ID` + `Name`, subsequent deltas append to `Arguments` via a `strings.Builder`. This handles all three providers (OpenAI, Anthropic, Gemini) uniformly.
 
-5. Get tool schemas for LLM
-   → registry.Schemas(nil) → []FunctionDecl
+### Tool dispatch ordering
 
-6. Build initial messages
-   → [system prompt, user message]
+Tools are dispatched in model-provided order with parallelism for contiguous ReadOnly blocks:
 
-7. AGENT LOOP:
-   a. Call provider.Complete(ctx, &Context{Messages, Tools})
-   b. LLM returns Message with ToolCalls
-   c. For each ToolCall:
-      - registry.Execute(ctx, name, args, execCtx)
-      - Policy check → timeout → execute → truncate
-      - Create tool result Message
-   d. Append assistant message + tool results to history
-   e. Check budget (turns, tokens, cost)
-   f. Go to step a
+```
+Call order from LLM: [read_file, read_file, write_file, read_file]
+                      ├─ parallel ─┤  sequential   sequential
+
+Block 1: read_file + read_file → parallel (both ReadOnly)
+Block 2: write_file            → sequential (SideEffecting)
+Block 3: read_file             → sequential (ReadOnly, but after a side effect)
+```
+
+This preserves correctness for patterns like `write_file → read_file` (verify what was written) while maximizing parallelism where safe.
+
+### Error handling
+
+The runtime classifies errors:
+- Parent context cancelled → `ErrCancelled` (user pressed Ctrl-C)
+- Turn context deadline exceeded → `ErrTimeout` (turn budget exhausted)
+- Max turns reached → `ErrMaxTurns`
+- Tool execution errors → placed in message content (the LLM sees them and can recover)
+
+### Context budget
 
-8. LLM returns text-only response → done
-   → Print final answer to user
+After each tool dispatch, the runtime estimates token usage (chars/4) and logs a warning if the session is approaching the context limit (`TriggerRatio * MaxContextTokens`). Actual summarization is deferred to Step 5 (Memory).
+
+## How the pieces connect
+
+```
+yantra run "add error handling to server.go"
+
+├── LoadConfig() → YantraConfig
+├── BuildFromConfig() → ReliableProvider(OpenAIProvider)
+├── NewWorkspacePolicy() → SecurityPolicy
+├── NewRegistry() + RegisterBuiltins() → ToolRegistry
└── runtime.New() + Run() → AgentRuntime turn loop

    ├── Session.Context() → system prompt + messages + tool schemas
    ├── provider.Stream() → channel of StreamItem
    ├── collectStream() → assembled Response with ToolCalls
    ├── dispatchTools() → tool results (parallel ReadOnly, sequential others)
    ├── checkContextBudget() → warning if approaching limit
    └── loop until text-only response or MaxTurns
 ```
 
 ## What's next
 
 | Step | What | Purpose |
 |------|------|---------|
-| 4 | Runtime | The agent turn loop — the brain that ties providers + tools together |
-| 5 | Memory | Persistent vector DB for cross-session recall |
+| 5 | Memory | Persistent vector DB for cross-session recall + rolling summarization |
 | 6 | Gateway | WebSocket server for remote control |
 | 7 | Multi-agent | Specialist subagents with delegation |
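The delta reassembly that the stream accumulation notes describe (first delta carries `ID` + `Name`, later deltas append `Arguments`, keyed by index, accumulated in a `strings.Builder`) can be sketched as follows. Types are simplified and names are illustrative; the real `collectStream()` also handles text, usage, and stream errors.

```go
package main

import (
	"fmt"
	"strings"
)

// Delta is a simplified stream fragment: the first delta for an index
// carries ID and Name; later ones only append argument text.
type Delta struct {
	Index int
	ID    string
	Name  string
	Args  string
}

type ToolCall struct {
	ID, Name, Arguments string
}

// assemble reassembles fragmented tool calls, keyed by index, using a
// strings.Builder per call to accumulate the argument JSON.
func assemble(deltas []Delta) []ToolCall {
	ids := map[int]string{}
	names := map[int]string{}
	args := map[int]*strings.Builder{}
	maxIdx := -1
	for _, d := range deltas {
		if d.Index > maxIdx {
			maxIdx = d.Index
		}
		if _, ok := args[d.Index]; !ok {
			args[d.Index] = &strings.Builder{}
		}
		if d.ID != "" {
			ids[d.Index] = d.ID
		}
		if d.Name != "" {
			names[d.Index] = d.Name
		}
		args[d.Index].WriteString(d.Args)
	}
	calls := make([]ToolCall, 0, maxIdx+1)
	for i := 0; i <= maxIdx; i++ {
		call := ToolCall{ID: ids[i], Name: names[i]}
		if b := args[i]; b != nil {
			call.Arguments = b.String()
		}
		calls = append(calls, call)
	}
	return calls
}

func main() {
	calls := assemble([]Delta{
		{Index: 0, ID: "c1", Name: "read_file", Args: `{"path":`},
		{Index: 0, Args: `"main.go"}`},
	})
	fmt.Printf("%+v\n", calls[0]) // {ID:c1 Name:read_file Arguments:{"path":"main.go"}}
}
```

Keying by index rather than ID is what lets the same accumulator serve providers whose later deltas omit the ID entirely.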

docs/config.md

Lines changed: 1 addition & 1 deletion
@@ -112,7 +112,7 @@ max_cost = 0.0 # Max dollar cost (0 = unlimited)
 
 **max_turns** prevents infinite loops. If the LLM keeps calling tools without converging on an answer, this stops it.
 
-**turn_timeout_secs** is the timeout for a single turn (LLM call + tool executions). Not per-tool — that's the tool's own Timeout().
+**turn_timeout_secs** is the timeout for a single turn. It covers both the provider streaming phase and tool execution as one budget. Individual tools also have their own Timeout() applied by the registry.
 
 **max_cost** tracks token usage cost and stops if exceeded. Useful for preventing runaway spend.

docs/tools.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ One of three values:
 - `SideEffecting` — changes state (writing files, making HTTP requests)
 - `Privileged` — potentially dangerous (running shell commands)
 
-The runtime uses these to decide execution strategy. ReadOnly tools can run in parallel. SideEffecting tools run sequentially. Privileged tools might prompt the user for confirmation.
+The runtime uses these to decide execution strategy. Contiguous ReadOnly tools run in parallel; SideEffecting and Privileged tools run sequentially at their original position in the call list. This preserves model-provided ordering for cross-tool dependencies (e.g., `write_file` then `read_file`) while maximizing parallelism where safe.
 
 ### Timeout()
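The contiguous-block dispatch strategy described above can be sketched as a pure partition step: runs of consecutive ReadOnly calls become one parallel block, and every other call becomes its own sequential block at its original position. The `SafetyTier` names match the docs; everything else is illustrative.

```go
package main

import "fmt"

type SafetyTier int

const (
	ReadOnly SafetyTier = iota
	SideEffecting
	Privileged
)

type Call struct {
	Name string
	Tier SafetyTier
}

// blocks partitions calls in model-provided order: contiguous ReadOnly
// calls accumulate into one parallel block, flushed before any
// non-ReadOnly call, which then forms a sequential block of its own.
func blocks(calls []Call) [][]Call {
	var out [][]Call
	var ro []Call // pending contiguous ReadOnly block
	flush := func() {
		if len(ro) > 0 {
			out = append(out, ro)
			ro = nil
		}
	}
	for _, c := range calls {
		if c.Tier == ReadOnly {
			ro = append(ro, c)
			continue
		}
		flush()
		out = append(out, []Call{c})
	}
	flush()
	return out
}

func main() {
	calls := []Call{
		{"read_file", ReadOnly}, {"read_file", ReadOnly},
		{"write_file", SideEffecting}, {"read_file", ReadOnly},
	}
	for i, b := range blocks(calls) {
		fmt.Println(i, len(b)) // 0 2, then 1 1, then 2 1
	}
}
```

Running each block to completion before the next one starts is what keeps a `read_file` issued after a `write_file` from observing stale contents.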
