
Commit 57c42f4

abrichr and claude committed
docs: qualify README claims for intellectual honesty
- Change "Core Innovation" to "Core Approach" (more accurate)
- Change "key differentiator" to "explores" (less marketing)
- Correct accuracy figure (46.7% -> 100%, not 33% -> 100%)
- Add context that all 45 tasks share same navigation entry point
- Link to publication roadmap for methodology and limitations
- Change "No technical expertise needed" to "Reduced prompt engineering"

The goal is accuracy over marketing appeal.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent fcef4c8 commit 57c42f4

1 file changed

Lines changed: 4 additions & 4 deletions

README.md

@@ -285,18 +285,18 @@ flowchart TB
 class L0,L1,L2 implemented
 ```
 
-### Core Innovation: Demo-Conditioned Prompting
+### Core Approach: Demo-Conditioned Prompting
 
-OpenAdapt's key differentiator is **demonstration-conditioned automation** - "show, don't tell":
+OpenAdapt explores **demonstration-conditioned automation** - "show, don't tell":
 
 | Traditional Agent | OpenAdapt Agent |
 |-------------------|-----------------|
 | User writes prompts | User records demonstration |
 | Ambiguous instructions | Grounded in actual UI |
-| Requires prompt engineering | No technical expertise needed |
+| Requires prompt engineering | Reduced prompt engineering |
 | Context-free | Context from similar demos |
 
-**Retrieval powers BOTH training AND evaluation**: Similar demonstrations are retrieved as context for the VLM, improving accuracy from 33% to 100% on first-action benchmarks.
+**Retrieval powers BOTH training AND evaluation**: Similar demonstrations are retrieved as context for the VLM. In early experiments on a controlled macOS benchmark, this improved first-action accuracy from 46.7% to 100% - though all 45 tasks in that benchmark share the same navigation entry point. See the [publication roadmap](docs/publication-roadmap.md) for methodology and limitations.
 
 ### Key Concepts
 
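The changed README paragraph describes retrieving similar demonstrations and passing them to the VLM as context. A minimal sketch of that general pattern is below, to make the idea concrete. Everything in it is an assumption for illustration: the `Demo` class, the `retrieve` and `build_prompt` helpers, the token-overlap similarity, and the example tasks are hypothetical and are not taken from OpenAdapt's code or its benchmark.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """Hypothetical record of a demonstration: a task plus the actions taken."""
    task: str
    actions: list[str]


def tokenize(text: str) -> set[str]:
    # Lowercased whitespace tokens; a real system would likely use embeddings.
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    # Overlap of the two token sets; 0.0 when both are empty.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, demos: list[Demo], k: int = 1) -> list[Demo]:
    # Rank demos by similarity to the query task and keep the top k.
    q = tokenize(query)
    ranked = sorted(demos, key=lambda d: jaccard(q, tokenize(d.task)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, demos: list[Demo]) -> str:
    # Place the retrieved demonstrations ahead of the new task as context.
    parts = []
    for d in demos:
        steps = "\n".join(f"  {a}" for a in d.actions)
        parts.append(f"Demonstration: {d.task}\n{steps}")
    parts.append(f"Task: {query}\nNext action:")
    return "\n\n".join(parts)


# Made-up demonstrations, for illustration only.
DEMOS = [
    Demo("turn off night shift in system settings",
         ["click Apple menu", "click System Settings"]),
    Demo("export a spreadsheet as csv", ["click File", "click Export"]),
]

top = retrieve("turn on night shift", DEMOS, k=1)
prompt = build_prompt("turn on night shift", top)
```

The resulting `prompt` string would be sent to the model together with the current screenshot. If every task starts from the same entry point, as the commit message notes for the 45-task benchmark, any retrieved demonstration supplies the correct first action, which is why the qualified wording matters.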