Commit 7ca7fda

chore: agents + style
1 parent 7643d04 commit 7ca7fda

2 files changed

Lines changed: 31 additions & 1 deletion

AGENTS.md

Lines changed: 30 additions & 0 deletions
@@ -68,6 +68,13 @@ Recommended sequence:
 4. Confirm the title and URL live inside that boundary.
 5. Record the final URL if the page redirects by locale or renders a different surface than expected.
 
+If Chrome MCP is unavailable (`Transport closed` or page-lock errors), do this recovery sequence:
+
+1. Kill stale Chrome MCP processes (`pkill -9 -f 'chrome-devtools-mcp|Chrome for Testing'`).
+2. Retry Chrome MCP once before continuing.
+3. If still unavailable, continue with `curl -I -L`, runtime `feed`, and HTML inspection in a temporary file.
+4. Explicitly report Chrome MCP outage in the final handoff.
+
 ## Browserless
 
 Use Browserless when:
Use Browserless when:
@@ -158,6 +165,20 @@ bundle exec rspec --tag fetch --example 'example.com/feed.yml' spec/html2rss/con
 - the chosen surface is too noisy or too dynamic
 - the candidate should be downgraded or dropped
 
+7. Cross-runtime mismatch check (required when core feed works but fetch specs fail):
+
+- confirm canonical URL with redirect tracing:
+
+```bash
+curl -I -L -s https://example.com | sed -n '1,20p'
+```
+
+- compare behavior in both runtimes:
+  - core repo (`../html2rss`) via `html2rss feed`
+  - configs repo fetch lane (`bundle exec rspec --tag fetch --example ...`)
+- if selectors are valid in core but fetch lane still returns zero items, treat this as request-strategy/runtime mismatch, not selector success.
+- in that case: prefer Browserless-backed verification if available; otherwise mark as downgraded/deferred with evidence.
+
 ## Runtime Debugging
 
 Use the core CLI as the authority for single-config debugging. The quickest loop is:
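
A possible refinement of the redirect-tracing step in the new item 7: rather than reading the first 20 header lines by eye, extract the last `Location` header in the chain. The helper name `last_location` is hypothetical, and the sketch assumes redirects carry absolute URLs; a relative `Location` value would be printed unchanged.

```bash
#!/usr/bin/env bash

# Read HTTP response headers on stdin (for example from `curl -I -L -s URL`)
# and print the value of the last Location header, if any.
last_location() {
  awk 'tolower($1) == "location:" {
         sub(/\r$/, "", $2)   # headers end in CRLF; drop the carriage return
         loc = $2
       }
       END { if (loc != "") print loc }'
}

# Intended use (not run here because it needs network access):
#   curl -I -L -s https://example.com | last_location
```

Empty output means no redirect was seen, so the requested URL is already the final one.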
@@ -170,6 +191,13 @@ Use the core CLI as the authority for single-config debugging. The quickest loop
 
 If Browserless works but Faraday does not, keep the config narrow and classify it as Browserless-backed instead of trying to rescue it with brittle tweaks.
 
+Additional high-value checks:
+
+- Always normalize `channel.url` to the final canonical host/path (`www` vs non-`www`, retired legacy paths).
+- Prefer selectors anchored to content links (`h3 a`, `a[href*='/article/']`) over container-only selectors.
+- Remove optional fields first when quality drops (`categories`, synthetic IDs, weak descriptions) before adding selector complexity.
+- Set `enhance: false` early if enhancement starts pulling nav/hero/market widgets.
+
 ## Auto-Source
 
 Use `auto` for reconnaissance, not as proof that a config is ready.
@@ -211,3 +239,5 @@ When finishing config work, report:
 - dropped or deferred candidates and why
 - commands actually run
 - residual risks, especially selector drift, localization dependence, or Browserless dependence
+- whether Chrome MCP was available during validation
+- whether focused fetch specs matched core runtime behavior

lib/html2rss/configs/mozilla.org/security-advisories.yml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ channel:
   ttl: 360
 selectors:
   items:
-    selector: 'main li'
+    selector: "main li"
     enhance: false
   title:
     selector: 'a[href*="/security/advisories/mfsa"]'
