| 1 | +# Wizard + Auto-Layout Integration Test |
| 2 | + |
| 3 | +## What Changed |
| 4 | + |
| 5 | +The Wizard now uses **deterministic auto-layout** instead of asking the LLM to reason about spatial positions. The LLM focuses on semantics (which nodes and edges to create), and your existing `graphLayoutService.js` handles all spatial positioning. |
| 6 | + |
| 7 | +## Architecture |
| 8 | + |
| 9 | +``` |
| 10 | +User: "Add cookie recipe components" |
| 11 | + ↓ |
| 12 | +LLM: { graphSpec: { nodes: [...], edges: [...], layoutAlgorithm: "hierarchical" } } |
| 13 | + ↓ |
| 14 | +Queue Goal → Planner → Executor |
| 15 | + ↓ |
| 16 | +Executor calls applyLayout(nodes, edges, "hierarchical") |
| 17 | + ↓ |
| 18 | +Positioned ops → Auditor → Committer → UI |
| 19 | + ↓ |
| 20 | +Graph appears with proper spatial layout! |
| 21 | +``` |
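
To make the split between semantics and positioning concrete, here is a minimal sketch of the idea, not the real `graphLayoutService.js`. The function `sketchHierarchicalLayout`, the `{ source, target }` edge shape, and the spacing options are all assumptions made for illustration; only the `graphSpec` fields `nodes`, `edges`, and `layoutAlgorithm` come from the flow above.

```javascript
// Hypothetical stand-in for a hierarchical layout; NOT the real applyLayout().
// Assumes unique node names and edges shaped as { source, target } (by name).
function sketchHierarchicalLayout(nodes, edges, { xGap = 200, yGap = 120 } = {}) {
  const names = nodes.map((n) => n.name);
  const children = new Map(names.map((name) => [name, []]));
  const hasParent = new Set();
  for (const { source, target } of edges) {
    if (!children.has(source) || !children.has(target)) continue; // skip dangling edges
    children.get(source).push(target);
    hasParent.add(target);
  }

  // Breadth-first search from the roots assigns each node a depth (its row).
  const depth = new Map();
  const queue = names.filter((name) => !hasParent.has(name));
  if (queue.length === 0 && names.length > 0) queue.push(names[0]); // fully cyclic graph
  queue.forEach((name) => depth.set(name, 0));
  while (queue.length > 0) {
    const current = queue.shift();
    for (const child of children.get(current)) {
      if (depth.has(child)) continue;
      depth.set(child, depth.get(current) + 1);
      queue.push(child);
    }
  }

  // Group nodes by row; nodes the search never reached go in the top row.
  const rows = new Map();
  for (const name of names) {
    const row = depth.get(name) ?? 0;
    if (!rows.has(row)) rows.set(row, []);
    rows.get(row).push(name);
  }

  // Spread each row horizontally, centered on x = 0.
  const positions = [];
  for (const [row, members] of rows) {
    members.forEach((name, i) => {
      positions.push({ name, x: (i - (members.length - 1) / 2) * xGap, y: row * yGap });
    });
  }
  return positions;
}

// The LLM output carries no coordinates; the layout step adds them.
const graphSpec = {
  nodes: [{ name: "Recipe" }, { name: "Flour" }, { name: "Sugar" }, { name: "Eggs" }],
  edges: [
    { source: "Recipe", target: "Flour" },
    { source: "Recipe", target: "Sugar" },
    { source: "Recipe", target: "Eggs" },
  ],
  layoutAlgorithm: "hierarchical",
};
const positioned = sketchHierarchicalLayout(graphSpec.nodes, graphSpec.edges);
console.log(positioned);
```

Because the layout step is a pure function of the node and edge lists, the same `graphSpec` always yields the same positions, which is what makes the result deterministic.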
| 22 | + |
| 23 | +## Setup |
| 24 | + |
| 25 | +1. **Start UI**: |
| 26 | + ```bash |
| 27 | + npm run dev |
| 28 | + ``` |
| 29 | + |
| 30 | +2. **Start Bridge** (separate terminal): |
| 31 | + ```bash |
| 32 | + npm run bridge |
| 33 | + ``` |
| 34 | + |
| 35 | +3. **Verify Health**: |
| 36 | + ```bash |
| 37 | + curl http://localhost:3001/health |
| 38 | + # Should return: {"status":"ok","source":"bridge-daemon",...} |
| 39 | + ``` |
| 40 | + |
| 41 | +4. **Store API Key**: |
| 42 | + - Open UI at http://localhost:4000 |
| 43 | + - Click 🔑 icon in AI panel |
| 44 | + - Paste your Anthropic or OpenRouter key |
| 45 | + - Click "Store API Key" |
| 46 | + |
| 47 | +## Test Cases |
| 48 | + |
| 49 | +### Test 1: Single Node (Baseline) |
| 50 | +**Input**: "Add Solar Energy" |
| 51 | + |
| 52 | +**Expected**: |
| 53 | +- LLM generates: `{ graphSpec: { nodes: [{ name: "Solar Energy" }], layoutAlgorithm: "force" } }` |
| 54 | +- Executor uses force-directed layout (default for single node) |
| 55 | +- Node appears in the active graph |
| 56 | + |
| 57 | +**Verify**: Check the browser console for `[Agent] Queued create_subgraph goal` and confirm the logged payload includes `layoutAlgorithm`. |
| 58 | + |
| 59 | +--- |
| 60 | + |
| 61 | +### Test 2: Simple Graph with Edges |
| 62 | +**Input**: "Add a recipe with Flour, Sugar, and Eggs" |
| 63 | + |
| 64 | +**Expected**: |
| 65 | +- LLM generates graphSpec with 4 nodes (Recipe + 3 ingredients) and 3 edges |
| 66 | +- LLM chooses `"hierarchical"` or `"radial"` layout (Recipe at center/top, ingredients around it) |
| 67 | +- Nodes appear with proper spacing and connections |
| 68 | + |
| 69 | +**Verify**: |
| 70 | +- Check `/telemetry?limit=50` for the `agent_queued` entry showing node/edge counts |
| 71 | +- Nodes should NOT overlap |
| 72 | +- Edges should be visible |
| 73 | + |
| 74 | +--- |
| 75 | + |
| 76 | +### Test 3: Complex Graph |
| 77 | +**Input**: "Fill out the components of a web application" |
| 78 | + |
| 79 | +**Expected**: |
| 80 | +- LLM generates 8-12 nodes (Frontend, Backend, Database, API, Auth, etc.) |
| 81 | +- LLM chooses appropriate layout (probably `"force"` for general graph) |
| 82 | +- Auto-layout positions nodes with collision avoidance |
| 83 | +- Network structure is clear and readable |
| 84 | + |
| 85 | +**Verify**: |
| 86 | +- No nodes stacked on top of each other |
| 87 | +- Related components are near each other (force-directed clustering) |
| 88 | +- Canvas feels balanced, not all nodes in one corner |
| 89 | + |
| 90 | +--- |
| 91 | + |
| 92 | +### Test 4: Layout Algorithm Selection |
| 93 | +Try different phrases to see if LLM picks appropriate layouts: |
| 94 | + |
| 95 | +- **Hierarchical**: "Create a company org chart with CEO, managers, and employees" |
| 96 | + - Should use `"hierarchical"` layout (top-down tree) |
| 97 | + |
| 98 | +- **Radial**: "Add planets orbiting the Sun" |
| 99 | + - Should use `"radial"` layout (Sun at center, planets in orbit) |
| 100 | + |
| 101 | +- **Grid**: "Create a periodic table with elements" |
| 102 | + - Might use `"grid"` layout (uniform spacing) |
| 103 | + |
| 104 | +**Verify**: Check telemetry for the chosen `layoutAlgorithm` in each case. |
| 105 | + |
| 106 | +--- |
| 107 | + |
| 108 | +## Debugging |
| 109 | + |
| 110 | +### Check Queue Flow |
| 111 | +```bash |
| 112 | +# See what's in the goal queue (quote the URL so the shell does not interpret ? and &) |
| 113 | +curl 'http://localhost:3001/queue/peek?name=goalQueue&head=5' |
| 114 | + |
| 115 | +# See what's in the task queue |
| 116 | +curl 'http://localhost:3001/queue/peek?name=taskQueue&head=5' |
| 117 | + |
| 118 | +# Check if patches are being generated |
| 119 | +curl 'http://localhost:3001/queue/metrics?name=patchQueue' |
| 120 | +``` |
| 121 | + |
| 122 | +### Check Telemetry |
| 123 | +```bash |
| 124 | +# See recent agent activity |
| 125 | +curl 'http://localhost:3001/telemetry?limit=20' | jq '.items[] | select(.type == "agent_queued" or .type == "agent_plan")' |
| 126 | + |
| 127 | +# Filter by correlation ID (cid from agent response) |
| 128 | +curl 'http://localhost:3001/telemetry?cid=cid-1731626400000-abc123' | jq |
| 129 | +``` |
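
If `jq` is not available, the same filtering can be done in JavaScript on the parsed response. This is a sketch under assumptions: the `{ items: [...] }` envelope and the `type` field are inferred from the `jq` filter above, the helper names `selectAgentEvents` and `findErrorEntries` are hypothetical, and the sample data is invented purely for illustration.

```javascript
// Keep only telemetry entries whose type is in the given list.
// Assumed response shape (inferred from the jq filter): { items: [{ type, ... }] }.
function selectAgentEvents(telemetry, types = ["agent_queued", "agent_plan"]) {
  const items = Array.isArray(telemetry?.items) ? telemetry.items : [];
  return items.filter((item) => types.includes(item.type));
}

// Return entries that carry an `error` field, which a clean run should not have.
function findErrorEntries(telemetry) {
  const items = Array.isArray(telemetry?.items) ? telemetry.items : [];
  return items.filter((item) => item.error !== undefined && item.error !== null);
}

// Invented sample payload, for illustration only.
const sampleTelemetry = {
  items: [
    { type: "agent_queued", cid: "cid-example-1" },
    { type: "agent_plan", cid: "cid-example-1" },
    { type: "tool_call", cid: "cid-example-1" },
    { type: "tool_call", cid: "cid-example-2", error: "example failure" },
  ],
};

const agentEvents = selectAgentEvents(sampleTelemetry);
const errorEntries = findErrorEntries(sampleTelemetry);
console.log(agentEvents.map((e) => e.type), errorEntries.length);
```

In practice the input would be the parsed JSON body of `/telemetry?limit=20` rather than the sample object.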
| 130 | + |
| 131 | +### Common Issues |
| 132 | + |
| 133 | +**1. "Something went wrong planning the graph"** |
| 134 | +- Check browser console for LLM API errors |
| 135 | +- Verify API key is stored (🔑 icon should show "Manage API Key" not "Setup API Key") |
| 136 | +- Check if LLM returned invalid JSON (telemetry will show parse errors) |
| 137 | + |
| 138 | +**2. Nodes appear but all at same position** |
| 139 | +- Auto-layout failed - check console for `[Executor] Task execution failed` |
| 140 | +- Verify `graphLayoutService.js` is being imported correctly |
| 141 | +- Check if `applyLayout` returned empty positions array |
| 142 | + |
| 143 | +**3. Nothing happens after sending message** |
| 144 | +- Bridge might be down - check if `npm run bridge` is still running |
| 145 | +- Verify BridgeClient is mounted - check browser console for "MCP Bridge: Connection" messages |
| 146 | +- Check if scheduler is running: `curl http://localhost:3001/orchestration/scheduler/status` |
| 147 | + |
| 148 | +**4. Scheduler not processing tasks** |
| 149 | +- Restart the bridge to start the scheduler manually (it should auto-start on the first goal enqueue) |
| 150 | +- Check: `curl http://localhost:3001/orchestration/scheduler/status` |
| 151 | +- If not enabled, something in `ensureSchedulerStarted()` failed |
| 152 | + |
| 153 | +## Success Criteria |
| 154 | + |
| 155 | +✅ LLM generates graphSpec **without** x/y coordinates |
| 156 | +✅ Executor logs show `[Executor]` creating positioned ops |
| 157 | +✅ Nodes appear in UI with proper spacing (no overlaps) |
| 158 | +✅ Different layout algorithms produce visually distinct results |
| 159 | +✅ Complex graphs (10+ nodes) are readable and well-structured |
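
Two of the criteria above (no x/y in the LLM output, no overlapping nodes) can be checked mechanically. The sketch below assumes positioned nodes look like `{ name, x, y }` with `x`/`y` at the node center, and that every node has the same bounding box; the 150×60 size and both helper names are hypothetical, not values taken from the codebase.

```javascript
// True if any node in the LLM's graphSpec already carries coordinates.
function specHasCoordinates(graphSpec) {
  return (graphSpec.nodes ?? []).some((node) => "x" in node || "y" in node);
}

// Return every pair of nodes whose bounding boxes intersect.
// Two equal-size boxes overlap when their centers are closer than the box
// size on BOTH axes. The default width/height are placeholder values.
function findOverlaps(positionedNodes, { width = 150, height = 60 } = {}) {
  const overlaps = [];
  for (let i = 0; i < positionedNodes.length; i++) {
    for (let j = i + 1; j < positionedNodes.length; j++) {
      const a = positionedNodes[i];
      const b = positionedNodes[j];
      if (Math.abs(a.x - b.x) < width && Math.abs(a.y - b.y) < height) {
        overlaps.push([a.name, b.name]);
      }
    }
  }
  return overlaps;
}

// Illustrative data: a clean layout and one with two stacked nodes.
const cleanLayout = [
  { name: "Recipe", x: 0, y: 0 },
  { name: "Flour", x: -200, y: 120 },
  { name: "Sugar", x: 0, y: 120 },
];
const stackedLayout = [
  { name: "Recipe", x: 0, y: 0 },
  { name: "Flour", x: 10, y: 5 },
];

console.log(specHasCoordinates({ nodes: [{ name: "Solar Energy" }] }));
console.log(findOverlaps(cleanLayout), findOverlaps(stackedLayout));
```

The pairwise check is quadratic in the node count, which is fine for the 10 to 20 node graphs discussed here.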
| 160 | + |
| 161 | +## What To Look For |
| 162 | + |
| 163 | +**In Browser Console**: |
| 164 | +- `[Agent] Queued create_subgraph goal: { cid, graphId, nodeCount, edgeCount, layoutAlgorithm }` |
| 165 | +- No `[Executor] Task execution failed` errors |
| 166 | +- BridgeClient logs showing pending actions being executed |
| 167 | + |
| 168 | +**In Telemetry** (`/telemetry?limit=50`): |
| 169 | +- `agent_queued` entries with correct node/edge counts |
| 170 | +- `tool_call` entries for `applyMutations` with positioned ops |
| 171 | +- No entries with `error` field |
| 172 | + |
| 173 | +**In UI**: |
| 174 | +- Nodes appear in active graph after 1-2 seconds |
| 175 | +- Edges connect the right nodes |
| 176 | +- Layout looks intentional (not random scatter) |
| 177 | +- Refresh button in AI panel stays green (connected) |
| 178 | + |
| 179 | +## Performance Expectations |
| 180 | + |
| 181 | +- **Simple graphs** (1-3 nodes): < 2 seconds end-to-end |
| 182 | +- **Medium graphs** (5-10 nodes): 2-4 seconds |
| 183 | +- **Complex graphs** (10-20 nodes): 4-6 seconds |
| 184 | + |
| 185 | +Most time is spent in: |
| 186 | +1. LLM API call (~1-2s for Claude/GPT-4) |
| 187 | +2. Auto-layout calculation (~0.1-0.5s depending on algorithm) |
| 188 | +3. UI applying mutations (~0.5-1s for rendering) |
| 189 | + |
| 190 | +--- |
| 191 | + |
| 192 | +## Next Steps After Testing |
| 193 | + |
| 194 | +Once basic flow works: |
| 195 | +1. **Enhance layout heuristics**: Teach LLM when to use each layout type |
| 196 | +2. **Add layout options to UI**: Let users override the chosen algorithm |
| 197 | +3. **Implement prototype reuse**: Check for existing prototypes before creating new ones |
| 198 | +4. **Add undo/redo**: For when Wizard creates unwanted nodes |
| 199 | +5. **Multi-graph support**: Let Wizard create new graphs and populate them |
| 200 | + |
| 201 | +The hard part (auto-layout integration) is now done. The rest is UX polish! 🎉 |
| 202 | + |