Commit 6ac346b

Document headed long-run validation and agent-browser evidence
1 parent c59d30f commit 6ac346b

3 files changed

Lines changed: 107 additions & 26 deletions

README.md

Lines changed: 15 additions & 0 deletions
@@ -54,6 +54,21 @@ Includes:
5454
- Unit tests for session engine transitions and IndexedDB retention/query boundaries.
5555
- Playwright dashboard tests for timeline filters, focus/open behavior, and settings action.
5656
- Timestamped screenshots and accessibility snapshots for long-run validation evidence.
57+
- Detailed evidence log:
58+
- `docs/testing/validation-evidence-2026-02-26.md`
59+
60+
## agent-browser Snapshot Proof
61+
62+
Example workflow (with explicit session name):
63+
64+
```powershell
65+
agent-browser --session car open https://www.wikipedia.org
66+
agent-browser --session car snapshot -i
67+
agent-browser --session car screenshot ".\\artifacts\\agent-browser-proof.png" --full
68+
agent-browser --session car close
69+
```
70+
71+
If the `default` session fails to start on Windows, use a non-default `--session` name.
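
As a rough illustration of the workflow above, the four invocations can be generated for any session name. This is only a sketch: `buildProofCommands` is a hypothetical helper, it uses only the subcommands and flags that appear in this document, and rejecting `default` outright is an assumption made for the example rather than documented `agent-browser` behavior.

```javascript
// Sketch: build the agent-browser argument lists for the proof workflow.
// `buildProofCommands` is a hypothetical helper, not part of agent-browser.
function buildProofCommands(session, url, screenshotPath) {
  if (!session || session === "default") {
    // Assumption for this sketch: always require an explicit session name,
    // because the default session is reported to fail on this Windows host.
    throw new Error("use an explicit non-default session name");
  }
  const base = ["agent-browser", "--session", session];
  return [
    [...base, "open", url],
    [...base, "snapshot", "-i"],
    [...base, "screenshot", screenshotPath, "--full"],
    [...base, "close"],
  ];
}

const commands = buildProofCommands(
  "car",
  "https://www.wikipedia.org",
  "artifacts/agent-browser-proof.png"
);
for (const argv of commands) {
  console.log(argv.join(" "));
}
```

Keeping each command as an argument array, rather than one string, avoids shell quoting issues if the lists are later passed to a process spawner.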
5772

5873
## Notes
5974

docs/testing/validation-evidence-2026-02-26.md

Lines changed: 61 additions & 26 deletions
@@ -1,14 +1,14 @@
11
# Validation Evidence - 2026-02-26
22

3-
This document records real extension-runtime validation runs with screenshot/snapshot evidence.
3+
This document records real extension-runtime validation runs with screenshot/snapshot artifacts.
44

5-
## Tooling Decision
5+
## Tooling Status
66

7-
- Intended skill for snapshots/screenshots: `agent-browser`.
8-
- Environment status: `agent-browser` CLI was not installed (`command not found`).
9-
- Fallback used: Playwright extension-runtime scripts:
10-
- `scripts/extension-smoke-test.mjs`
11-
- `scripts/long-duration-extension-validation.mjs`
7+
- `agent-browser` skill is installed and available in this environment.
8+
- `agent-browser` CLI is installed (`agent-browser 0.15.0`).
9+
- Runtime caveat on this machine:
10+
- Default session (`default`) maps to a blocked Windows TCP port.
11+
- Working approach: run `agent-browser` with an explicit non-default session (example: `--session car`).
1212

1313
## Run 1: Long-Duration Multi-Window (Headless)
1414

@@ -18,19 +18,19 @@ This document records real extension-runtime validation runs with screenshot/sna
1818
- Artifact root:
1919
- `artifacts/validation/20260226-162944/`
2020
- Evidence:
21-
- 40+ screenshots across step checkpoints
22-
- 8 structured ARIA snapshots
21+
- 40+ screenshots across checkpoints
22+
- ARIA snapshots
2323
- JSON run log + markdown report
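
The run IDs in this document look like local timestamps in `YYYYMMDD-HHMMSS` form, and each artifact root is the run ID under `artifacts/validation/`. Assuming that reading is right, a minimal sketch of deriving both might look like this; `formatRunId` and `artifactRoot` are hypothetical names, not functions confirmed to exist in the validation scripts.

```javascript
// Sketch: derive a run ID and artifact root from a Date (local time).
// Assumes run IDs are YYYYMMDD-HHMMSS timestamps; helper names are hypothetical.
function formatRunId(date) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${day}-${time}`;
}

function artifactRoot(runId) {
  return `artifacts/validation/${runId}/`;
}

// Month is zero-based in the Date constructor, so 1 means February.
const runId = formatRunId(new Date(2026, 1, 26, 16, 29, 44));
console.log(artifactRoot(runId)); // artifacts/validation/20260226-162944/
```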
2424

2525
Observed outcome:
26-
- Extension loaded and multi-window/tab switching actions executed.
26+
- Extension loaded and multi-window/tab switching executed.
2727
- Runtime status remained healthy (`ok=true`, `retentionDays=30`).
2828
- Final runtime idle state was `idle`, and final timeline count was `0`.
2929

3030
Interpretation:
31-
- In headless automation, idle-state behavior can suppress effective session capture.
31+
- In headless mode, idle-state behavior can suppress effective session capture.
3232

33-
## Run 2: Real Extension Runtime (Headed) Sanity Validation
33+
## Run 2: Real Extension Runtime (Headed) Sanity
3434

3535
- Run ID: `20260226-163821`
3636
- Duration: 2 minutes
@@ -39,29 +39,64 @@ Interpretation:
3939
- Artifact root:
4040
- `artifacts/validation/20260226-163821/`
4141
- Evidence:
42-
- Screenshots at each checkpoint
42+
- Screenshots at checkpoints
4343
- ARIA snapshots
4444
- JSON run log + markdown report
4545

4646
Observed outcome:
4747
- Extension loaded in headed Chromium with unpacked extension.
48-
- Timeline recorded at least one session (`timelineCount=1`).
48+
- Timeline recorded one session (`timelineCount=1`).
4949
- Runtime status healthy (`ok=true`, `paused=false`, `retentionDays=30`).
5050

51+
## Run 3: Full Long-Duration Multi-Window (Headed)
52+
53+
- Run ID: `20260226-165807`
54+
- Duration: 12 minutes
55+
- Command:
56+
- `$env:VALIDATION_HEADED='1'; $env:VALIDATION_DURATION_MINUTES='12'; npm run test:validate:long`
57+
- Artifact root:
58+
- `artifacts/validation/20260226-165807/`
59+
- Evidence:
60+
- 70+ screenshots across steps plus final dashboard/settings captures
61+
- ARIA snapshots every third step
62+
- JSON run log + markdown report
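
The command above configures the run through two environment variables. How the validation script actually reads them is not shown in this commit, so the following is only a sketch under assumptions: `readValidationOptions` and its `fallbackMinutes` parameter are hypothetical, and treating `'1'` as the only truthy value for `VALIDATION_HEADED` is a guess based on the command shown.

```javascript
// Sketch: parse the two environment variables used by the long-run command.
// Function name, fallback parameter, and return shape are hypothetical.
function readValidationOptions(env, fallbackMinutes) {
  const raw = env.VALIDATION_DURATION_MINUTES;
  const minutes = raw === undefined ? fallbackMinutes : Number(raw);
  if (!Number.isInteger(minutes) || minutes <= 0) {
    throw new Error("VALIDATION_DURATION_MINUTES must be a positive integer");
  }
  return {
    headed: env.VALIDATION_HEADED === "1",
    durationMs: minutes * 60 * 1000,
  };
}

const options = readValidationOptions(
  { VALIDATION_HEADED: "1", VALIDATION_DURATION_MINUTES: "12" },
  10
);
console.log(options); // { headed: true, durationMs: 720000 }
```

In a real script the first argument would typically be `process.env`; passing a plain object here keeps the example independent of the caller's environment.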
63+
64+
Observed outcome:
65+
- 36 multi-window steps executed.
66+
- Final dashboard 7d summary:
67+
- `timelineCount=10`
68+
- `summaryTotal=10 sessions`
69+
- `summaryDuration=10s total`
70+
- Runtime healthy at completion:
71+
- `ok=true`
72+
- `paused=false`
73+
- `retentionDays=30`
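
The Run 3 figures can be cross-checked with a little arithmetic. The sketch below assumes the 36 steps were evenly spaced over the 12 minutes and that "every third step" means steps 3, 6, 9, and so on with 1-based numbering; neither assumption is confirmed by the run log, and `summarizeRun` is a hypothetical helper.

```javascript
// Sketch: estimate step cadence and ARIA snapshot count for a run.
// Assumes evenly spaced, 1-based steps; the real script may differ.
function summarizeRun(durationMinutes, stepCount, snapshotEvery) {
  const secondsPerStep = (durationMinutes * 60) / stepCount;
  const snapshotSteps = [];
  for (let step = snapshotEvery; step <= stepCount; step += snapshotEvery) {
    snapshotSteps.push(step);
  }
  return { secondsPerStep, snapshotSteps };
}

const summary = summarizeRun(12, 36, 3);
console.log(summary.secondsPerStep); // 20
console.log(summary.snapshotSteps.length); // 12
```

Under these assumptions the run works out to one step every 20 seconds and 12 ARIA snapshots.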
74+
75+
## agent-browser Snapshot/Screenshot Proof
76+
77+
Command run (non-default session):
78+
79+
```powershell
80+
agent-browser --session car open https://www.wikipedia.org
81+
agent-browser --session car snapshot -i
82+
agent-browser --session car screenshot "artifacts/validation/20260226-165807/screenshots/agent-browser-wikipedia.png" --full
83+
agent-browser --session car close
84+
```
85+
86+
Proof artifact:
87+
- `artifacts/validation/20260226-165807/screenshots/agent-browser-wikipedia.png`
88+
5189
## Additional Runtime Proof
5290

53-
- Smoke command:
54-
- `npm run test:smoke:extension`
55-
- Evidence output (JSON):
56-
- dashboard heading found
57-
- timeline container present
58-
- runtime message responded with retention=30
59-
- settings page loaded
91+
- Full automated quality gate:
92+
- `npm run test:all` (unit + e2e): pass
93+
- Real extension smoke:
94+
- `npm run test:smoke:extension`: pass
95+
- Runtime response confirms `retentionDays=30`
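
The health criteria used throughout this document (`ok=true`, `paused=false`, `retentionDays=30`) can be expressed as a small check. This is a sketch only: the field names come from the values quoted above, but the exact shape of the runtime response is an assumption, and `checkRuntimeStatus` is a hypothetical helper.

```javascript
// Sketch: validate a runtime status payload against the expected values.
// The payload shape is assumed from the fields quoted in this document.
function checkRuntimeStatus(status, expectedRetentionDays) {
  const problems = [];
  if (status.ok !== true) {
    problems.push("ok is not true");
  }
  if (status.paused === true) {
    problems.push("runtime is paused");
  }
  if (status.retentionDays !== expectedRetentionDays) {
    problems.push(
      `retentionDays is ${status.retentionDays}, expected ${expectedRetentionDays}`
    );
  }
  return problems;
}

const healthy = checkRuntimeStatus({ ok: true, paused: false, retentionDays: 30 }, 30);
console.log(healthy.length === 0 ? "healthy" : healthy.join("; "));
```

Returning a list of problems, rather than a single boolean, makes a failing smoke run easier to diagnose from the JSON log.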
6096

6197
## Conclusion
6298

63-
- Automated quality gates are green (`test:unit`, `test:e2e`, `test:all`).
64-
- Real extension-runtime execution with screenshot evidence is complete.
65-
- For strict “real-user long-duration” sign-off, run headed long validation while actively using the machine:
66-
- `$env:VALIDATION_HEADED='1'; $env:VALIDATION_DURATION_MINUTES='10'; npm run test:validate:long`
67-
- then review artifacts in `artifacts/validation/<run-id>/`.
99+
- Automated test suites are green.
100+
- Extension runtime validation with long-duration multi-window activity is complete.
101+
- Screenshot/snapshot evidence is present under `artifacts/validation/*`.
102+
- `agent-browser` is usable for snapshot/screenshot capture with explicit session naming on this host.

project-history.md

Lines changed: 31 additions & 0 deletions
@@ -103,6 +103,26 @@ Chronological execution log:
103103

104104
16. Pushed final state to GitHub `main`.
105105

106+
17. Installed and verified `agent-browser` skill/tooling for screenshot/snapshot-driven browser validation.
107+
108+
18. Ran full verification loop again:
109+
- `npm run test:all`
110+
- `npm run test:smoke:extension`
111+
112+
19. Executed headed long-duration multi-window validation:
113+
- Run ID: `20260226-165807`
114+
- Duration: 12 minutes
115+
- Steps: 36
116+
- Artifacts: screenshots + ARIA snapshots + JSON/markdown reports under `artifacts/validation/20260226-165807/`
117+
118+
20. Executed direct `agent-browser` snapshot/screenshot proof flow:
119+
- Non-default session workaround (`--session car`) due to a local default-session port conflict.
120+
- Captured evidence screenshot:
121+
- `artifacts/validation/20260226-165807/screenshots/agent-browser-wikipedia.png`
122+
123+
21. Updated validation evidence document:
124+
- `docs/testing/validation-evidence-2026-02-26.md`
125+
106126
## 4. What Were The Decisions That We Took?
107127

108128
### Product/Architecture Decisions
@@ -192,13 +212,16 @@ Not in MVP (intentionally out of scope):
192212
- Automated tests: **green**
193213
- CI workflow: **configured**
194214
- Manual acceptance assets: **present**
215+
- Headed long-duration runtime validation with artifacts: **complete**
195216
- Repo pushed to GitHub: **yes**
196217

197218
### Quality Status
198219

199220
- `npm run test:unit`: passing
200221
- `npm run test:e2e`: passing
201222
- `npm run test:all`: passing
223+
- `npm run test:smoke:extension`: passing
224+
- Long-duration headed validation: passing (`runId=20260226-165807`, `timelineCount=10`, `retentionDays=30`)
202225

203226
### Branch/History Status
204227

@@ -207,6 +230,9 @@ Primary commits:
207230
- `0f2e6fc` Build MV3 Chrome Activity Reader MVP scaffold
208231
- `b1db2c7` Add Playwright test loop for dashboard behavior
209232
- `4fb4594` Complete MVP hardening, recovery, and full test loop
233+
- `0b49afb` Add full project documentation, CI workflow, and acceptance checklist
234+
- `0a14196` Rename project history doc and add real extension smoke test
235+
- `c59d30f` Add long-duration extension validation with screenshot evidence
210236

211237
## 9. File-Level Map Of What Exists
212238

@@ -272,4 +298,9 @@ This project has moved from concept to a tested MVP with:
272298
- automated and manual verification paths,
273299
- CI on GitHub.
274300

301+
Latest verification cycle confirms:
302+
- all automated suites are green,
303+
- real extension runtime works in headed long-duration multi-window flow,
304+
- screenshot/snapshot evidence is recorded.
305+
275306
The current codebase is production-ready for MVP-level internal use and ready for the next phase (publishing hardening, side panel UX, and optional richer activity intelligence).
