|
2 | 2 | |
3 | 3 | ML-powered DOM pruning that reduces browser prompt tokens by **up to 99.8%** while preserving actionable elements. |
4 | 4 | |
| 5 | +## Quick Start |
| 6 | + |
| 7 | +### 1. Install the Skill |
| 8 | + |
| 9 | +**Via ClawHub (Recommended):** |
| 10 | +```bash |
| 11 | +npx clawdhub@latest install predicate-snapshot |
| 12 | +``` |
| 13 | + |
| 14 | +**Manual Installation:** |
| 15 | +```bash |
| 16 | +git clone https://github.com/PredicateSystems/openclaw-predicate-skill ~/.openclaw/skills/predicate-snapshot |
| 17 | +cd ~/.openclaw/skills/predicate-snapshot |
| 18 | +npm install |
| 19 | +npm run build |
| 20 | +``` |
| 21 | + |
| 22 | +### 2. Get Your API Key (Optional) |
| 23 | + |
| 24 | +For ML-powered ranking (~95% token reduction), get a free API key: |
| 25 | + |
| 26 | +1. Go to [PredicateSystems.ai](https://www.PredicateSystems.ai) |
| 27 | +2. Sign up for a **free account (includes 500 free credits/month)** |
| 28 | +3. Navigate to **Dashboard > API Keys** |
| 29 | +4. Click **Create New Key** and copy your key (starts with `sk-...`) |
| 30 | + |
| 31 | +- **Without API key:** Local heuristic-based pruning (~80% token reduction) |
| 32 | +- **With API key:** ML-powered ranking for cleaner output (~95% token reduction) |
| 33 | + |
| 34 | +### 3. Configure the API Key |
| 35 | + |
| 36 | +**Option A: Environment Variable (Recommended)** |
| 37 | +```bash |
| 38 | +# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.) |
| 39 | +export PREDICATE_API_KEY="sk-your-key-here" |
| 40 | +``` |
| 41 | + |
| 42 | +**Option B: OpenClaw Config File** |
| 43 | + |
| 44 | +Add to `~/.openclaw/config.yaml`: |
| 45 | +```yaml |
| 46 | +skills: |
| 47 | + predicate-snapshot: |
| 48 | + api_key: "sk-your-key-here" |
| 49 | + max_credits_per_session: 100 # Optional: limit credits per session |
| 50 | +``` |
| 51 | + |
| 52 | +### 4. Use the Skill |
| 53 | + |
| 54 | +```bash |
| 55 | +# In OpenClaw: |
| 56 | +/predicate-snapshot # Get ranked elements |
| 57 | +/predicate-act click 42 # Click element by ID |
| 58 | +/predicate-snapshot-local # Free local mode (no API) |
| 59 | +``` |
| 60 | + |
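The Quick Start above describes two modes: ML-powered ranking when an API key is configured, and local heuristic pruning when it is not. A minimal sketch of that choice follows, assuming the key is read from `PREDICATE_API_KEY` and starts with `sk-` as the steps state. `SnapshotMode`, `ModeChoice`, and `selectMode` are hypothetical names for illustration, not part of the skill's actual API, and the mapping from mode to command is an assumption.

```typescript
// Hypothetical helper, not taken from the skill's source code.
type SnapshotMode = "ml" | "local";

interface ModeChoice {
  mode: SnapshotMode;
  command: string; // OpenClaw command listed in the Quick Start
}

// Decide which snapshot mode applies, given a map of environment variables.
// Only the documented key shape is checked (non-empty, "sk-" prefix);
// whether the key is actually valid is not verified here.
function selectMode(env: Record<string, string | undefined>): ModeChoice {
  const key = (env["PREDICATE_API_KEY"] ?? "").trim();
  const looksLikeKey = key.length > 3 && key.slice(0, 3) === "sk-";
  if (looksLikeKey) {
    return { mode: "ml", command: "/predicate-snapshot" };
  }
  // Assumption: with no usable key, the free local command is the one to run.
  return { mode: "local", command: "/predicate-snapshot-local" };
}

console.log(selectMode({ PREDICATE_API_KEY: "sk-your-key-here" }).command);
console.log(selectMode({}).command);
```

Passing the environment in as a parameter, rather than reading it globally, keeps the sketch easy to test with different settings.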
5 | 61 | ## Overview |
6 | 62 | |
7 | 63 | This OpenClaw skill replaces the default accessibility tree snapshot with Predicate's ML-ranked DOM elements. Instead of sending 800+ elements (~18,000 tokens) to the LLM, it sends only the 50 most relevant elements (~500 tokens); the element count is configurable. |
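To make the quoted figures easier to compare, the reduction can be computed directly from the token counts. The sketch below uses only the approximate numbers stated in the README: about 18,000 tokens reduced to about 500 in the Overview, and about 150,000 reduced to between 500 and 1,300 in the comparison table. The helper names `tokenReduction` and `asPercent` are illustrative, not part of the skill.

```typescript
// Fraction of prompt tokens saved when a snapshot shrinks from `before` to `after`.
function tokenReduction(before: number, after: number): number {
  if (before <= 0 || after < 0) {
    throw new RangeError("before must be positive and after non-negative");
  }
  return 1 - after / before;
}

// Format a fraction as a percentage with one decimal place.
function asPercent(fraction: number): string {
  return (fraction * 100).toFixed(1) + "%";
}

// Overview example: ~18,000 tokens down to ~500.
console.log(asPercent(tokenReduction(18000, 500))); // "97.2%"

// Comparison table: ~150,000 tokens down to ~1,300 (upper bound) or ~500 (lower bound).
console.log(asPercent(tokenReduction(150000, 1300))); // "99.1%"
console.log(asPercent(tokenReduction(150000, 500))); // "99.7%"
```

Because the inputs are rounded estimates, the resulting percentages are indicative rather than exact.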
@@ -68,57 +124,6 @@ You might wonder: "Isn't 50 elements vs 24,567 elements comparing apples to oran |
68 | 124 | | Accessibility Tree | ~150,000+ | ~6,000+ | Low (noise) | |
69 | 125 | | Predicate Snapshot | ~500-1,300 | 50 | High (ML-ranked) | |
70 | 126 | |
71 | | -## Quick Start |
72 | | - |
73 | | -### 1. Install the Skill |
74 | | - |
75 | | -**Via ClawHub (Recommended):** |
76 | | -```bash |
77 | | -npx clawdhub@latest install predicate-snapshot |
78 | | -``` |
79 | | - |
80 | | -**Manual Installation:** |
81 | | -```bash |
82 | | -git clone https://github.com/PredicateSystems/openclaw-predicate-skill ~/.openclaw/skills/predicate-snapshot |
83 | | -cd ~/.openclaw/skills/predicate-snapshot |
84 | | -npm install |
85 | | -npm run build |
86 | | -``` |
87 | | - |
88 | | -### 2. Get Your API Key |
89 | | - |
90 | | -1. Go to [PredicateSystems.ai](https://www.PredicateSystems.ai) |
91 | | -2. Sign up for a **free account (includes 500 free credits/month)** |
92 | | -3. Navigate to **Dashboard > API Keys** |
93 | | -4. Click **Create New Key** and copy your key (starts with `sk-...`) |
94 | | - |
95 | | -### 3. Configure the API Key |
96 | | - |
97 | | -**Option A: Environment Variable (Recommended)** |
98 | | -```bash |
99 | | -# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.) |
100 | | -export PREDICATE_API_KEY="sk-your-key-here" |
101 | | -``` |
102 | | - |
103 | | -**Option B: OpenClaw Config File** |
104 | | - |
105 | | -Add to `~/.openclaw/config.yaml`: |
106 | | -```yaml |
107 | | -skills: |
108 | | - predicate-snapshot: |
109 | | - api_key: "sk-your-key-here" |
110 | | - max_credits_per_session: 100 # Optional: limit credits per session |
111 | | -``` |
112 | | - |
113 | | -### 4. Verify Installation |
114 | | - |
115 | | -```bash |
116 | | -# In OpenClaw, run: |
117 | | -/predicate-snapshot |
118 | | -``` |
119 | | - |
120 | | -If configured correctly, you'll see a ranked list of page elements. |
121 | | - |
122 | 127 | ## How It Works |
123 | 128 | |
124 | 129 | ### Does This Replace the Default A11y Tree? |
@@ -468,6 +473,32 @@ npm run build |
468 | 473 | npm test |
469 | 474 | ``` |
470 | 475 | |
| 476 | +## Why Predicate Snapshot Over Accessibility Tree? |
| 477 | + |
| 478 | +OpenClaw and similar browser automation frameworks default to the **Accessibility Tree (A11y)** for navigating websites. While A11y works for simple cases, it has fundamental limitations that make it unreliable for production LLM-driven automation: |
| 479 | + |
| 480 | +### A11y Tree Limitations |
| 481 | + |
| 482 | +| Problem | Description | Impact on LLM Agents | |
| 483 | +|---------|-------------|----------------------| |
| 484 | +| **Optimized for Consumption, Not Action** | A11y is designed for assistive technology (screen readers), not action verification or layout reasoning | Lacks precise semantic geometry and ordinality (e.g., "the first item in a list") that agents need for reliable reasoning | |
| 485 | +| **Hydration Lag & Structural Inconsistency** | In JS-heavy SPAs, A11y often lags behind hydration or misrepresents dynamic overlays and grouping | Snapshots miss interactive nodes or incorrectly label states (e.g., confusing `focused` with `active`) | |
| 486 | +| **Shadow DOM & Iframe Blind Spots** | A11y struggles to maintain global order across Shadow DOM and iframe boundaries | Cross-shadow ARIA delegation is inconsistent; iframe contents are often missing or lose spatial context | |
| 487 | +| **Token Inefficiency** | Extracting the entire A11y tree for small actions wastes context window and compute | Superfluous nodes (like `genericContainer`) consume tokens without helping the agent | |
| 488 | +| **Missing Visual/Layout Bugs** | A11y trees miss rendering-time issues like overlapping buttons or z-index conflicts | Agent reports elements as "correct" but cannot detect visual collisions | |
| 489 | + |
| 490 | +### Predicate Snapshot Advantages |
| 491 | + |
| 492 | +| Capability | How Predicate Solves It | |
| 493 | +|------------|------------------------| |
| 494 | +| **Post-Rendered Geometry** | Layers in actual bounding boxes and grouping missing from standard A11y representations | |
| 495 | +| **Live DOM Synchronization** | Anchors on the live, post-rendered DOM so the snapshot reflects the page state at capture time |
| 496 | +| **Unified Cross-Boundary Grounding** | Rust/WASM engine prunes and ranks elements across Shadow DOM and iframes, maintaining unified element ordering | |
| 497 | +| **Token-Efficient Pruning** | Prunes uninformative branches while preserving all interactive elements, with the stated goal of letting 3B-parameter models perform at the level of larger models |
| 498 | +| **Deterministic Verification** | Binds intent to deterministic outcomes via snapshot diff, providing an auditable "truth" layer rather than just a structural "report" | |
| 499 | + |
| 500 | +> **Bottom Line:** A11y trees tell you what *should* be there. Predicate Snapshots tell you what *is* there—and prove it. |
| 501 | + |
471 | 502 | ## Architecture |
472 | 503 | |
473 | 504 | ``` |
@@ -495,31 +526,7 @@ predicate-snapshot-skill/ |
495 | 526 | |
496 | 527 | ## Support |
497 | 528 | |
498 | | -- Documentation: [predicatesystems.ai/docs](https://predicatesystems.ai/docs) |
499 | | -- Issues: [GitHub Issues](https://github.com/PredicateSystems/openclaw-predicate-skill/issues) |
500 | | - |
501 | | -## Why Predicate Snapshot Over Accessibility Tree? |
502 | | - |
503 | | -OpenClaw and similar browser automation frameworks default to the **Accessibility Tree (A11y)** for navigating websites. While A11y works for simple cases, it has fundamental limitations that make it unreliable for production LLM-driven automation: |
504 | | - |
505 | | -### A11y Tree Limitations |
506 | | - |
507 | | -| Problem | Description | Impact on LLM Agents | |
508 | | -|---------|-------------|----------------------| |
509 | | -| **Optimized for Consumption, Not Action** | A11y is designed for assistive technology (screen readers), not action verification or layout reasoning | Lacks precise semantic geometry and ordinality (e.g., "the first item in a list") that agents need for reliable reasoning | |
510 | | -| **Hydration Lag & Structural Inconsistency** | In JS-heavy SPAs, A11y often lags behind hydration or misrepresents dynamic overlays and grouping | Snapshots miss interactive nodes or incorrectly label states (e.g., confusing `focused` with `active`) | |
511 | | -| **Shadow DOM & Iframe Blind Spots** | A11y struggles to maintain global order across Shadow DOM and iframe boundaries | Cross-shadow ARIA delegation is inconsistent; iframe contents are often missing or lose spatial context | |
512 | | -| **Token Inefficiency** | Extracting the entire A11y tree for small actions wastes context window and compute | Superfluous nodes (like `genericContainer`) consume tokens without helping the agent | |
513 | | -| **Missing Visual/Layout Bugs** | A11y trees miss rendering-time issues like overlapping buttons or z-index conflicts | Agent reports elements as "correct" but cannot detect visual collisions | |
514 | | - |
515 | | -### Predicate Snapshot Advantages |
516 | | - |
517 | | -| Capability | How Predicate Solves It | |
518 | | -|------------|------------------------| |
519 | | -| **Post-Rendered Geometry** | Layers in actual bounding boxes and grouping missing from standard A11y representations | |
520 | | -| **Live DOM Synchronization** | Anchors on the live, post-rendered DOM ensuring perfect sync with actual page state | |
521 | | -| **Unified Cross-Boundary Grounding** | Rust/WASM engine prunes and ranks elements across Shadow DOM and iframes, maintaining unified element ordering | |
522 | | -| **Token-Efficient Pruning** | Specifically prunes uninformative branches while preserving all interactive elements, enabling 3B parameter models to perform at larger model levels | |
523 | | -| **Deterministic Verification** | Binds intent to deterministic outcomes via snapshot diff, providing an auditable "truth" layer rather than just a structural "report" | |
524 | | - |
525 | | -> **Bottom Line:** A11y trees tell you what *should* be there. Predicate Snapshots tell you what *is* there—and prove it. |
| 529 | +- **ClawHub:** [clawhub.ai/rcholic/predicate-snapshot](https://clawhub.ai/rcholic/predicate-snapshot) |
| 530 | +- **GitHub:** [github.com/PredicateSystems/openclaw-predicate-skill](https://github.com/PredicateSystems/openclaw-predicate-skill) |
| 531 | +- **Documentation:** [predicatesystems.ai/docs](https://predicatesystems.ai/docs) |
| 532 | +- **Issues:** [GitHub Issues](https://github.com/PredicateSystems/openclaw-predicate-skill/issues) |