demo/BLOG_POST.md
33 additions & 5 deletions
@@ -95,9 +95,13 @@ Predicate Secure wraps your existing agent code in **3-5 lines** - no rewrites n
 The demo executes a simple but complete browser task:

 ✓ Navigate to https://www.example.com with policy check
+
 ✓ Take snapshot with visual element overlay
+
 ✓ Find and click "Learn more" link using semantic query
+
 ✓ Verify URL contains "example-domains" after navigation
+
 ✓ Upload trace to Predicate Studio (if API key provided)

 Each action goes through the full authorization + verification loop.
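
The authorization + verification loop mentioned in this hunk can be sketched roughly as below. This is only an illustration under stated assumptions, not Predicate Secure's actual API: `Action`, `ALLOWED`, `authorize`, `run_step`, and `fake_execute` are hypothetical names, the allowlist stands in for the YAML policy, and the returned URL is a placeholder.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Action:
    kind: str    # e.g. "navigate" or "click"
    target: str  # e.g. a URL or a semantic query


# Hypothetical allowlist standing in for the declarative policy.
ALLOWED = {("navigate", "https://www.example.com"), ("click", "Learn more")}


def authorize(action: Action) -> bool:
    """Deny by default: only explicitly listed actions pass."""
    return (action.kind, action.target) in ALLOWED


async def run_step(
    action: Action,
    execute: Callable[[Action], Awaitable[str]],
    verify: Callable[[str], bool],
) -> str:
    """Authorize, execute, then verify one action; returns a status string."""
    if not authorize(action):
        return "denied"
    observed_url = await execute(action)
    return "verified" if verify(observed_url) else "verification_failed"


async def fake_execute(action: Action) -> str:
    # Placeholder for a real browser call; returns the URL reached.
    return "https://example.invalid/example-domains"
```

An action missing from the allowlist never reaches `execute`, which is the property the post later calls fail-closed.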
@@ -174,17 +178,19 @@ Authorization rules are declarative YAML:

 > **Note:** The policy is fail-closed: any action not explicitly allowed is denied. This prevents agents from taking unexpected actions.

-### 3. Verification with Local LLM
+### 3. LLM-Generated Verification Predicates

-After each action, the local LLM generates verification predicates:
+After each action, the local LLM analyzes the state changes and generates **deterministic verification predicates** (assertions to check):
+
+> **Important:** The LLM is NOT doing visual verification. Instead, it generates structured assertions (like `url_contains`, `element_exists`) based on observed state changes. The actual verification execution is **deterministic** - predicates are evaluated as true/false checks.

 ```python
 # Capture pre and post snapshots
 pre_snapshot = await get_page_summary()
 result = await execute_action()
 post_snapshot = await get_page_summary()

-# LLM generates verification plan
+# LLM generates verification plan (what to check, not the check itself)
-The LLM sees both snapshots and generates appropriate checks:
+The LLM sees both snapshots and generates a structured verification plan:

 ```json
 {
@@ -222,6 +228,28 @@ The LLM sees both snapshots and generates appropriate checks:
 }
 ```

+**For Production Workflows:**
+
+For well-understood web flows (like QA testing flows or regular business processes), you can skip LLM generation and use **human-defined predicates** directly:
+This approach is **faster** (no LLM inference), **more predictable** (explicit assertions), and **ideal for regression testing** of known workflows. Use LLM-generated predicates for exploratory tasks or novel scenarios.
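
As a rough sketch of what human-defined predicates could look like when evaluated deterministically: the predicate names `url_contains` and `element_exists` come from the post, but the `PageState` shape, the `{"predicate": ..., "args": ...}` layout, and the `evaluate` helper are assumptions made for illustration and may not match the library's real API.

```python
from dataclasses import dataclass, field


@dataclass
class PageState:
    """Hypothetical post-action snapshot: current URL plus visible element labels."""
    url: str
    elements: set[str] = field(default_factory=set)


def url_contains(state: PageState, text: str) -> bool:
    return text in state.url


def element_exists(state: PageState, label: str) -> bool:
    return label in state.elements


PREDICATES = {"url_contains": url_contains, "element_exists": element_exists}


def evaluate(state: PageState, checks: list[dict]) -> list[tuple[str, bool]]:
    """Evaluate each check as a plain true/false test; unknown names fail."""
    results = []
    for check in checks:
        fn = PREDICATES.get(check["predicate"])
        passed = fn(state, *check["args"]) if fn else False
        results.append((check["predicate"], passed))
    return results


# Human-defined checks for the "Learn more" step of the demo flow.
checks = [
    {"predicate": "url_contains", "args": ["example-domains"]},
    {"predicate": "element_exists", "args": ["Learn more"]},
]
# Placeholder state; a real run would build this from the post-action snapshot.
state = PageState(url="https://example.invalid/example-domains", elements={"Heading"})
results = evaluate(state, checks)
```

Because the checks are plain data, they can sit alongside a regression suite and run without any model call; treating unknown predicate names as failures keeps the evaluation fail-closed, in the same spirit as the policy.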