|
1 | | -# agentgui Comprehensive Testing Plan |
| 1 | +# agentgui Work Items |
2 | 2 |
|
3 | | -## STATUS: RESOLVED - SERVER RUNNING |
4 | | -**✅ Fixed:** Using `@anthropic-ai/claude-code` SDK directly |
5 | | -- Package: `@anthropic-ai/claude-code@^1.0.128` |
6 | | -- API: `query({ prompt, options })` for streaming responses |
7 | | -- acp-launcher.js: Refactored to use query() API |
8 | | -- Server Status: Running on http://localhost:3000/gm/ |
9 | | -- UI Status: Fully loaded, connected, showing chat history |
| 3 | +## COMPLETED ✅ |
| 4 | +- [x] Phase 1: Claude Code Conversation Import |
| 5 | + - Database schema migrated (9 new columns added) |
| 6 | + - 90 Claude Code conversations imported |
| 7 | + - Auto-import running every 30 seconds |
| 8 | + - Server running, UI fully loaded (177 conversations) |
| 9 | + - Commits: 340b7c1, fda2822 |
10 | 10 |
|
11 | | ---- |
| 11 | +## ACTIVE 🔄 |
12 | 12 |
|
13 | | -## NEW REQUIREMENT: Import OpenCode Conversations |
14 | | -**User Request:** "we must also import all the opencode conversations from the system and keep them in sync too" |
15 | | - |
16 | | -**Implementation Plan:** |
| 13 | +### Phase 2: OpenCode Integration (PRIORITY 2) |
| 14 | +**User Request:** Import all OpenCode conversations and keep them in sync |
17 | 15 |
|
18 | | -### Phase 1: Claude Code History Import (PRIORITY 1) - COMPLETE ✅ |
19 | | -- [x] Load `~/.claude/projects/*/sessions-index.json` on server startup |
20 | | -- [x] Store in database with agent_type='claude-code', source='imported' |
21 | | -- [x] Database schema migrated with 9 new columns (agentType, source, externalId, projectPath, gitBranch, sourcePath, lastSyncedAt, firstPrompt, messageCount) |
22 | | -- [x] Implement queries.importClaudeCodeConversations() |
23 | | -- [x] Auto-import on server startup and every 30 seconds |
24 | | -- [x] Successfully imported 90 Claude Code conversations |
25 | | -- [x] Test confirms import works correctly (177 total conversations: 87 native + 90 imported) |
26 | | -- [ ] FUTURE: Display in sidebar with [claude-code] prefix (client-side formatting - nice-to-have) |
| 16 | +**Blockers:** |
| 17 | +- [ ] Determine OpenCode conversation storage location |
27 | 18 |
|
28 | | -### Phase 2: OpenCode Integration (PRIORITY 2) |
29 | | -- [ ] Determine OpenCode conversation storage location (TBD) |
30 | | -- [ ] Implement OpenCode history loader (parallel to Claude Code) |
| 19 | +**Tasks:** |
| 20 | +- [ ] Implement OpenCode history loader |
31 | 21 | - [ ] Merge both sources in database |
32 | 22 | - [ ] Display merged history in sidebar |
33 | 23 | - [ ] Support agent type filtering |
|
39 | 29 | - [ ] Handle conflicts (same conversation in both systems) |
40 | 30 | - [ ] Implement read-only mode for imported conversations |
41 | 31 |
|
42 | | -### Database Schema Changes |
43 | | -```sql |
44 | | -ALTER TABLE conversations ADD COLUMN agent_type TEXT DEFAULT 'claude-code'; |
45 | | -ALTER TABLE conversations ADD COLUMN source TEXT DEFAULT 'gui'; -- 'gui', 'imported' |
46 | | -ALTER TABLE conversations ADD COLUMN source_path TEXT; -- path to original file |
47 | | -ALTER TABLE conversations ADD COLUMN last_synced_at INTEGER; -- Unix timestamp |
48 | | -``` |
49 | | - |
50 | | -**Status:** Phase 1 COMPLETE - Implementation continues |
51 | | -**Blocking:** None - system is functional |
52 | | - |
53 | | ---- |
54 | | - |
55 | | -## EXECUTION SUMMARY |
56 | | - |
57 | | -### What Was Accomplished This Session |
58 | | - |
59 | | -**1. Fixed Critical Blocker: Claude Code Integration** |
60 | | -- Discovered `@zed-industries/claude-code-acp` is an ACP server, not a client library |
61 | | -- Found correct API: `@anthropic-ai/claude-code` SDK with `query()` function |
62 | | -- Refactored acp-launcher.js to use direct SDK API |
63 | | -- Server now starts successfully without subprocess overhead |
64 | | - |
65 | | -**2. Implemented Claude Code Conversation Import (Phase 1)** |
66 | | -- Created conversation-importer.js module |
67 | | -- Added 9 database columns for conversation metadata |
68 | | -- Database schema migration: successfully alters existing conversations table |
69 | | -- Auto-import function on server startup and every 30 seconds |
70 | | -- Successfully imported 90 conversations from ~/.claude/projects/ |
71 | | -- Test verified: 177 total conversations (87 native + 90 imported) |
72 | | - |
73 | | -**3. Created Comprehensive Testing Plan** |
74 | | -- 12 test categories covering all system aspects |
75 | | -- Issue tracking template for bugs found |
76 | | -- Success criteria documented |
77 | | -- Testing framework in .prd file |
78 | | - |
79 | | -**4. Identified UI/Testing Blockers** |
80 | | -- Conversation click handler causes execution timeout |
81 | | -- Message submission flow needs clarification |
82 | | -- WebSocket sync connection working, but agent communication path unclear |
83 | | - |
84 | | -### Git Commit |
85 | | -- Commit 340b7c1: feat: Implement Claude Code conversation import from CLI |
86 | | -- All changes staged and committed successfully |
| 32 | +## ISSUES TO FIX ⚠️ |
87 | 33 |
|
88 | | -### Next Steps (For User Review) |
89 | | -1. **Investigate UI Click Handler** - Why does clicking conversations timeout? |
90 | | -2. **Verify Agent Communication** - Test message send/receive flow |
91 | | -3. **OpenCode Integration** - Phase 2 (pending structure clarification) |
92 | | -4. **Frontend Labeling** - Add [claude-code] prefix to imported conversations (nice-to-have) |
93 | | -5. **Full Test Execution** - Run remaining 10 test categories once blockers resolved |
94 | | - |
95 | | -### Project Status |
96 | | -- **Core Functionality**: ✅ Working (import, storage, retrieval) |
97 | | -- **UI Integration**: ⚠️ Needs debugging (click handlers) |
98 | | -- **Agent Communication**: ❓ Needs testing (message flow) |
99 | | -- **OpenCode Support**: 🔄 Pending (Phase 2) |
100 | | -- **Production Readiness**: 70% (import working, need to verify message flow) |
101 | | - |
102 | | ---- |
103 | | - |
104 | | -## TESTING PROGRESS SUMMARY |
105 | | - |
106 | | -### Completed Work |
107 | | -✅ **Phase 1: Claude Code Conversation Import - COMPLETE** |
108 | | -- Database schema migrated successfully (9 new columns added) |
109 | | -- 90 Claude Code conversations imported from ~/.claude/projects/*/sessions-index.json |
110 | | -- Auto-import running every 30 seconds on server startup |
111 | | -- Verified: 177 total conversations (87 native + 90 imported) |
112 | | -- Verified: Imported conversations have correct metadata (source, agentType, externalId) |
113 | | - |
114 | | -✅ **Server & Connectivity** |
115 | | -- Server running on http://localhost:3000/gm/ |
116 | | -- UI fully loads with 177 conversations displayed |
117 | | -- WebSocket sync connection active and working |
118 | | -- Status indicator shows "Connected" |
119 | | -- Page title shows conversation count (177) |
120 | | - |
121 | | -### Known Issues Found During Testing |
122 | | -⚠️ **Issue #1: Timeout on Conversation Selection** |
| 34 | +**Issue #1: Timeout on Conversation Selection** |
123 | 35 | - Clicking conversations causes code execution timeout |
124 | | -- Suggests potential performance issue or UI hang |
125 | 36 | - Needs investigation into click handlers |
126 | 37 |
|
127 | | -⚠️ **Issue #2: Message Input Field** |
128 | | -- Message input found but submission behavior unclear |
| 38 | +**Issue #2: Message Input Field** |
| 39 | +- Message submission behavior unclear |
129 | 40 | - May not be properly wired to agent communication |
130 | | -- Needs testing with agent actually selected |
131 | | - |
132 | | -### Testing Categories (12 Total) |
133 | | - |
134 | | -### Category 1: Server Startup & Connection (BLOCKED) |
135 | | -- [ ] Server starts without errors |
136 | | -- [ ] HTTP server listens on PORT (3000 or custom) |
137 | | -- [ ] Static files served correctly |
138 | | -- [ ] WebSocket endpoint available at /ws |
139 | | -- [ ] Initial page load completes without JS errors |
140 | | -- [ ] WebSocket connection establishes on page load |
141 | | -- [ ] Receives sync_connected message from server |
142 | | -- [ ] Console shows no errors or warnings |
143 | | -- **Blocker:** Server cannot start without claude-code-acp |
144 | | - |
145 | | -### Category 2: Real-Time Streaming with Persistence |
146 | | -- [ ] Send test message from UI |
147 | | -- [ ] Message appears in real-time before database confirmation |
148 | | -- [ ] StreamHandler.persistAndBroadcast executes atomically |
149 | | -- [ ] stream_updates table receives entry with correct sequence |
150 | | -- [ ] WebSocket broadcasts update before database write completes (write-before-broadcast) |
151 | | -- [ ] Sequence number increments correctly (no gaps) |
152 | | -- [ ] Update contains correct: sessionId, conversationId, updateType, timestamp |
153 | | -- [ ] Multiple rapid messages maintain order and sequence |
154 | | -- [ ] Very large messages (10MB+) handle gracefully |
155 | | -- [ ] Empty messages are rejected or handled safely |
156 | | - |
157 | | -### Category 3: HTML Rendering Without Text Mixing |
158 | | -- [ ] Agent response renders HTML blocks only (no plain text below) |
159 | | -- [ ] HTML code blocks extracted correctly: /\`\`\`html\n([\s\S]*?)\n\`\`\`/ |
160 | | -- [ ] Plain text fallback NEVER triggers |
161 | | -- [ ] RippleUI components render visually correct |
162 | | -- [ ] Inline HTML display shows no artifacts or escape sequences |
163 | | -- [ ] HTML content sanitized (no XSS vectors) |
164 | | -- [ ] SVG content within HTML renders correctly |
165 | | -- [ ] Nested HTML structures parse correctly |
166 | | -- [ ] HTML with special characters escapes properly |
167 | | -- [ ] Large HTML blocks (100KB+) render without lag |
168 | | - |
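The extraction pattern quoted in Category 3 can be exercised in isolation. The following is a minimal sketch, not the project's actual renderer: `extractHtmlBlocks` and `HTML_BLOCK` are hypothetical names, and the fence string is built with `repeat` only so the example does not contain a literal code fence.

```javascript
// Hypothetical helper illustrating the Category 3 pattern; not taken from agentgui.
// The fence is built programmatically to avoid embedding a literal code fence here.
const FENCE = '`'.repeat(3);

// Equivalent to the pattern in the checklist, with the global flag added so
// every html block in a response is found, not just the first.
const HTML_BLOCK = new RegExp(FENCE + 'html\\n([\\s\\S]*?)\\n' + FENCE, 'g');

// Returns the inner HTML of each fenced html block, in document order.
function extractHtmlBlocks(text) {
  return Array.from(text.matchAll(HTML_BLOCK), (match) => match[1]);
}
```

An empty result from this helper would correspond to the case where the plain-text fallback might trigger, which the checklist says must never happen.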
169 | | -### Category 4: Theme Compliance (Light/Dark) |
170 | | -- [ ] Page detects system dark/light preference on load |
171 | | -- [ ] Toggle between light/dark mode works |
172 | | -- [ ] Theme preference persists across page reloads |
173 | | -- [ ] RippleUI generated content uses Tailwind classes: text-gray-700 (light), text-gray-300 (dark) |
174 | | -- [ ] Background colors don't clash with theme: light bg-white/bg-gray-50, dark bg-gray-900/bg-gray-800 |
175 | | -- [ ] No hardcoded color hex codes in generated HTML |
176 | | -- [ ] Progress bars use theme-aware colors |
177 | | -- [ ] Cards/sections adapt to theme automatically |
178 | | -- [ ] Form inputs styled appropriately for theme |
179 | | -- [ ] Text contrast ratios meet accessibility standards (WCAG AA) |
180 | | - |
181 | | -### Category 5: Advanced RippleUI Components |
182 | | -- [ ] **Progress Bars:** Display correctly, update smoothly, show percentage text |
183 | | -- [ ] **Grid Layouts:** Cards arrange responsively, wrap on mobile |
184 | | -- [ ] **Collapsible Sections:** Toggle state persists visually, smooth animations |
185 | | -- [ ] **Two-Column Layouts:** Content splits evenly, responsive on narrow screens |
186 | | -- [ ] **Badge/Pill Labels:** Display with correct styling, multiple variants work |
187 | | -- [ ] **Timeline Visualizations:** Vertical/horizontal orientation, connection lines render |
188 | | -- [ ] **Icon + Text Combinations:** Icons align correctly, text wraps properly |
189 | | -- [ ] **Buttons:** Various states (default, hover, active, disabled) visible |
190 | | -- [ ] **Code Blocks:** Syntax highlighting works, scrollable for long code |
191 | | -- [ ] **Tables:** Content aligns, scrollable on mobile, alternating row colors |
192 | | - |
193 | | -### Category 6: Interactive Forms & Input Validation |
194 | | -- [ ] **Text Input:** Accepts input, displays typed text |
195 | | -- [ ] **Email Input:** Validates email format (basic HTML5 validation) |
196 | | -- [ ] **Textarea:** Accepts multi-line input, text wraps correctly |
197 | | -- [ ] **Select Dropdown:** Opens/closes, options selectable, shows selected value |
198 | | -- [ ] **Checkboxes:** Toggle on/off, multiple selections work, state persists visually |
199 | | -- [ ] **Radio Buttons:** Mutually exclusive selection, only one active at time |
200 | | -- [ ] **Form Submit:** Click triggers handler, data captured correctly |
201 | | -- [ ] **Form Data Capture:** Submitted data includes all field values |
202 | | -- [ ] **Validation Messages:** Error messages display when validation fails |
203 | | -- [ ] **Form Reset:** Reset button clears all fields, restores defaults |
204 | | - |
205 | | -### Category 7: Database Persistence & Recovery |
206 | | -- [ ] Session created in database on new connection |
207 | | -- [ ] Conversation created with correct sessionId |
208 | | -- [ ] Each stream_update persists with atomic sequence |
209 | | -- [ ] Messages survive database restart (data integrity) |
210 | | -- [ ] Refresh page shows all previous messages in order |
211 | | -- [ ] State checkpoints created for recovery (latest 5 versions) |
212 | | -- [ ] Gap detection identifies missing sequence numbers |
213 | | -- [ ] Full state recovery request fetches missing data |
214 | | -- [ ] Duplicate updates prevented (idempotency) |
215 | | -- [ ] Orphaned data cleaned up (no dangling references) |
216 | | - |
217 | | -### Category 8: State Consistency & Validation |
218 | | -- [ ] StateValidator.validateSession identifies state errors |
219 | | -- [ ] Checksum validation detects corrupted updates |
220 | | -- [ ] Sequence gaps trigger recovery request |
221 | | -- [ ] Session state returns complete picture for recovery |
222 | | -- [ ] Concurrent updates don't cause race conditions |
223 | | -- [ ] Update count matches database row count |
224 | | -- [ ] Block count accurate for all content types |
225 | | -- [ ] No duplicate messages in UI |
226 | | -- [ ] Message order matches sequence numbers exactly |
227 | | -- [ ] State validator runs async (doesn't block broadcast) |
228 | | - |
229 | | -### Category 9: Reconnection & Recovery |
230 | | -- [ ] WebSocket disconnect detected within 5 seconds |
231 | | -- [ ] Exponential backoff attempts: 1s, 2s, 4s, 8s, 16s, 32s, 60s |
232 | | -- [ ] Max 10 reconnect attempts before giving up |
233 | | -- [ ] Reconnect succeeds after network recovery |
234 | | -- [ ] Missed messages fetched on reconnect via gap detection |
235 | | -- [ ] State checkpoint provides recovery baseline |
236 | | -- [ ] Fast reconnect (< 100ms) shows no duplicate messages |
237 | | -- [ ] Slow network (high latency) handles gracefully |
238 | | -- [ ] Multiple rapid disconnect/connect cycles work |
239 | | -- [ ] Persistent connection maintained for 1+ hours |
240 | | - |
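The backoff schedule in Category 9 (1s, 2s, 4s, 8s, 16s, 32s, 60s, at most 10 attempts) amounts to doubling with a 60-second cap. A minimal sketch of that calculation follows; the function names and default parameters are assumptions for illustration and may not match the client's reconnect code.

```javascript
// Hypothetical helper: delay in milliseconds before reconnect attempt `attempt` (1-based).
// The delay doubles from baseMs on each attempt and is capped at capMs.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}

// Builds the full schedule for the checklist's limit of 10 attempts.
function backoffSchedule(maxAttempts = 10) {
  return Array.from({ length: maxAttempts }, (_, i) => backoffDelayMs(i + 1));
}
```

Under these assumptions the seventh attempt would be 64s uncapped, so it and every later attempt wait 60s, which matches the sequence listed in the checklist.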
241 | | -### Category 10: Error Handling & Edge Cases |
242 | | -- [ ] Malformed JSON in stream updates rejected |
243 | | -- [ ] Invalid session ID returns 404 |
244 | | -- [ ] Missing conversationId handled gracefully |
245 | | -- [ ] Null/undefined content doesn't crash |
246 | | -- [ ] Very large payloads (100MB) handled or rejected |
247 | | -- [ ] Rapid fire 1000 messages/second buffered correctly |
248 | | -- [ ] Single character message processes correctly |
249 | | -- [ ] Message with only whitespace processed |
250 | | -- [ ] HTML with syntax errors parsed safely |
251 | | -- [ ] Database locked by other process doesn't block |
252 | | - |
253 | | -### Category 11: Performance & Latency |
254 | | -- [ ] First message appears in UI within 50ms of broadcast |
255 | | -- [ ] Database write + broadcast takes < 100ms total |
256 | | -- [ ] Page load from blank to interactive < 2 seconds |
257 | | -- [ ] 1000 message history loads in < 1 second |
258 | | -- [ ] WebSocket messages deliver within 50ms (local network) |
259 | | -- [ ] Memory usage stays under 500MB (after 10k messages) |
260 | | -- [ ] CPU usage < 5% idle (no busy loops) |
261 | | -- [ ] Smooth scrolling with 1000+ messages rendered |
262 | | -- [ ] No jank when switching themes |
263 | | -- [ ] Form submission responsive (< 200ms) |
264 | | - |
265 | | -### Category 12: Multi-Agent & Configuration |
266 | | -- [ ] Agent type selection works (claude-code vs opencode) |
267 | | -- [ ] CLI config loaded from ~/.claude/config.json |
268 | | -- [ ] OpenCode config loaded from ~/.opencode/config.json |
269 | | -- [ ] Environment variables passed through (HOME, PATH, etc) |
270 | | -- [ ] OAuth tokens handled correctly |
271 | | -- [ ] Model preferences applied from config |
272 | | -- [ ] Multiple agents can run in same session |
273 | | -- [ ] Agent switching preserves conversation history |
274 | | -- [ ] Session isolation prevents cross-contamination |
275 | | -- [ ] Configuration changes apply without restart |
276 | | - |
277 | | ---- |
278 | | - |
279 | | -## Issue Categories & Fixes |
280 | | - |
281 | | -### Issue Tracking Template |
282 | | -``` |
283 | | -[ISSUE-###] Category: <1-12> |
284 | | -Severity: Critical | High | Medium | Low |
285 | | -Status: New | In Progress | Fixed | Verified |
286 | | -Description: <what fails> |
287 | | -Steps to Reproduce: <how to trigger> |
288 | | -Expected: <what should happen> |
289 | | -Actual: <what happens instead> |
290 | | -Fix Applied: <how fixed> |
291 | | -Verification: <how verified> |
292 | | -``` |
293 | | - |
294 | | -### Known Issues |
295 | | -1. **[BLOCKER] Missing claude-code-acp package** |
296 | | - - Status: Blocking all testing |
297 | | - - Needs: Clarify package source or replace with alternative |
298 | | - - Impact: Server cannot start |
299 | | - |
300 | | ---- |
301 | | - |
302 | | -## Testing Execution Flow |
303 | | - |
304 | | -1. **Resolve Dependency Blocker** |
305 | | - - Clarify `claude-code-acp` source |
306 | | - - Install package or replace import |
307 | | - - Verify server starts successfully |
308 | | - |
309 | | -2. **Category 1: Server Startup** (Foundation) |
310 | | - - If passes: Continue |
311 | | - - If fails: Fix blocker, retry |
312 | | - |
313 | | -3. **Categories 2-12: Feature Testing** (Parallel where possible) |
314 | | - - Execute in dependency order: |
315 | | - - Streaming (2) → Persistence (7) |
316 | | - - HTML Rendering (3) → Theme (4) → Components (5) |
317 | | - - Forms (6) → Validation (6) |
318 | | - - State (8) → Reconnection (9) |
319 | | - - Error Handling (10) |
320 | | - - Performance (11) |
321 | | - - Multi-Agent (12) |
322 | | - |
323 | | -4. **Issue Remediation** |
324 | | - - Fix each issue as discovered |
325 | | - - Re-run affected test categories |
326 | | - - Document root cause |
327 | | - |
328 | | -5. **Final Verification** |
329 | | - - Full end-to-end workflow test |
330 | | - - All 12 categories passing |
331 | | - - No blockers remaining |
332 | | - - Performance benchmarks met |
333 | | - |
334 | | ---- |
335 | | - |
336 | | -## Success Criteria (All Must Pass) |
337 | | -- [ ] Server starts and stays running |
338 | | -- [ ] All 12 test categories pass |
339 | | -- [ ] No critical or high severity issues |
340 | | -- [ ] Response time < 100ms average |
341 | | -- [ ] Zero data loss on disconnect/reconnect |
342 | | -- [ ] HTML rendering clean (no text mixing) |
343 | | -- [ ] Theme compliance verified |
344 | | -- [ ] Forms fully functional |
345 | | -- [ ] Database persistence verified |
346 | | -- [ ] Production ready |
347 | | - |
348 | | ---- |
349 | 41 |
|
350 | | -**Last Updated:** 2026-02-03 |
351 | | -**Testing Status:** BLOCKED - Awaiting Dependency Resolution |
352 | | -**Next Step:** Resolve `claude-code-acp` package availability |
 | 42 | +## Test Categories Remaining (11 of 12) |
| 43 | +- [ ] Category 2: Real-Time Streaming with Persistence |
| 44 | +- [ ] Category 3: HTML Rendering Without Text Mixing |
| 45 | +- [ ] Category 4: Theme Compliance (Light/Dark) |
| 46 | +- [ ] Category 5: Advanced RippleUI Components |
| 47 | +- [ ] Category 6: Interactive Forms & Input Validation |
| 48 | +- [ ] Category 7: Database Persistence & Recovery |
| 49 | +- [ ] Category 8: State Consistency & Validation |
| 50 | +- [ ] Category 9: Reconnection & Recovery |
| 51 | +- [ ] Category 10: Error Handling & Edge Cases |
| 52 | +- [ ] Category 11: Performance & Latency |
| 53 | +- [ ] Category 12: Multi-Agent & Configuration |