NEVER push directly to main. ALWAYS use feature branches and pull requests.
- Create a feature branch: `git checkout -b feat/description` or `fix/description`
- Make commits on the branch
- Push the branch: `git push -u origin branch-name`
- Create a PR: `gh pr create --title "..." --body "..."`
- Only merge via PR (never `git push origin main`)
This is a hard rule with NO exceptions, even for "small" changes.
Philosophy: "Less is more. 80/20 impact/complexity. Working code beats elegant design."
Before writing code: Can this be <100 lines? Does this provide 80% of value? Is this the simplest approach?
Avoid: Classes when functions work, abstractions before 3rd use, design docs for non-existent code.
See: /Users/abrichr/oa/src/openadapt-evals/SIMPLICITY_PRINCIPLES.md for full guidelines.
IMPORTANT: Check /Users/abrichr/oa/src/STATUS.md at session start for P0 priorities.
openadapt-ml (v0.5.0): Pure ML engine for GUI automation agents.
| What | Where |
|---|---|
| Schemas, VLM adapters, training, inference, grounding, ML agents | openadapt-ml |
| Evaluation infrastructure (VM management, pool orchestration, CLI) | openadapt-evals |
- `openadapt_ml/schema/` - Canonical trajectory schemas (episode, converters)
- `openadapt_ml/models/` - VLM adapters (Qwen3-VL, Qwen2.5-VL, API backends)
- `openadapt_ml/training/` - Supervised fine-tuning pipeline
- `openadapt_ml/runtime/` - Policy API
- `openadapt_ml/grounding/` - Grounding module (deprioritized; for real UIs without SoM overlays)
- `openadapt_ml/baselines/` - Baseline adapters
- `openadapt_ml/benchmarks/agent.py` - ML-specific agents (PolicyAgent, APIBenchmarkAgent, UnifiedBaselineAgent)
- `openadapt_ml/cloud/` - Cloud GPU training (Lambda Labs, Azure inference, SSH tunnels)
- `openadapt_ml/config.py` - Settings (pydantic-settings)
- `configs/` - Training YAML configs (qwen3vl_capture.yaml, etc.)
- SoM mode - Element IDs (`CLICK([1])`) instead of coordinates for accuracy on synthetic benchmarks
- Schema purity - Domain-agnostic; external systems adapt TO the schema, not vice versa
- Lossless preservation - Store raw benchmark configs in `raw_config`, `raw_observation`, `raw_action` fields
- DOM/AX mandatory in schema - For evaluator compatibility (WebArena, Mind2Web need DOM), even if agents use vision-only
- Cloud-first - Offload heavy compute to cloud GPUs (Azure, Lambda Labs)
- Stub training - Use `--stub` flag for rapid UI iteration without GPU
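As an illustration of the lossless-preservation principle, a canonical action can carry the benchmark's native payload verbatim alongside the normalized fields. This is a minimal sketch: the `Action` shape and field names other than `raw_action` are hypothetical; the real schema lives in `openadapt_ml/schema/`.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Action:
    """Hypothetical canonical action; illustrates the raw_action escape hatch."""
    kind: str                   # normalized action type, e.g. "click"
    target: str                 # SoM element ID, e.g. "[1]"
    raw_action: dict[str, Any] = field(default_factory=dict)  # benchmark-native payload, kept verbatim

# A WebArena-style click, preserved losslessly:
act = Action(
    kind="click",
    target="[1]",
    raw_action={"action_type": "click", "element_id": "1", "coords": [412, 96]},
)

assert act.raw_action["coords"] == [412, 96]  # original data survives normalization
```

Nothing is thrown away at ingest time, so converters can round-trip back to the benchmark's native format.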
Key insight: OpenAdapt's value is trajectory-conditioned disambiguation of UI affordances.
Validated: Demo-conditioned prompting improves accuracy (correct first actions: 33% zero-shot -> 100% with demo). See docs/experiments/demo_conditioned_prompting_results.md.
See docs/cloud_gpu_training.md for full documentation.
# Lambda Labs - automated pipeline
uv run python -m openadapt_ml.cloud.lambda_labs train --capture /path --goal "Task"
# Step by step
uv run python -m openadapt_ml.cloud.lambda_labs launch --type gpu_1x_a10
uv run python -m openadapt_ml.cloud.lambda_labs train-status
uv run python -m openadapt_ml.cloud.lambda_labs terminate <id>

# Train on capture
uv run python -m openadapt_ml.scripts.train \
--config configs/qwen3vl_capture.yaml \
--capture /path/to/capture \
--open
# Serve dashboard (auto-regenerates HTML)
uv run python -m openadapt_ml.cloud.local serve --port 8080 --open
# Regenerate viewer without serving
uv run python -m openadapt_ml.cloud.local viewer
# Compare human vs model
uv run python -m openadapt_ml.scripts.compare \
--capture /path/to/capture \
--checkpoint checkpoints/model \
    --open

scripts/setup_azure.py automates Azure resource creation (resource group, service principal, ML workspace, ACR, WAA Docker image import):
python scripts/setup_azure.py # Setup
python scripts/setup_azure.py --cleanup # Cleanup

Use config.settings, NOT os.environ:
# Good
from openadapt_ml.config import settings
api_key = settings.openai_api_key
# Bad
api_key = os.environ.get("OPENAI_API_KEY")

When adding new env vars:
- Add to `Settings` class in `config.py`
- Add to `.env.example`
Priority: --api-key flag > .env file > environment variable
<type>(<scope>): <subject>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Types: feat, fix, docs, style, refactor, perf, test, chore, ci
Rules: Imperative mood, no period, max 50 chars, lowercase after type
PR titles become the squash merge commit message on main. python-semantic-release parses these to decide version bumps. If the PR title doesn't follow the format, no release is created.
fix: short description → patch bump (0.0.x)
feat: short description → minor bump (0.x.0)
feat!: breaking change → major bump (x.0.0)
Examples:
fix: align PolicyAgent prompt with training format
feat: add Modal inference timeout config
Wrong (will NOT trigger a release):
Align PolicyAgent prompt with training format (no `fix:` prefix)
When merging with gh pr merge --squash, GitHub uses the PR title as the commit message — so the title format is what matters.
- Don't use `os.environ` - use `config.settings`
- Don't use `pip install` - use `uv add` or `uv sync`
- Don't tell the user to run commands - YOU run them
- Don't use broad pkill patterns (they kill unrelated apps)
- Don't add timelines/estimates to plans
- Don't mention specific clients by name
# WRONG (kills unrelated apps)
pkill -f "openadapt"
pkill -f "python"
# RIGHT (specific)
kill $(lsof -t -i :8765) 2>/dev/null
pkill -f "python.*-m openadapt_ml.cloud.local serve"
# Check before killing
pgrep -f "pattern" -l

Pre-approved read access to ~/oa/src/ (related projects like openadapt-capture, openadapt-evals).
After code changes:
- Regenerate: `uv run python -m openadapt_ml.cloud.local viewer`
- Hard-refresh browser: Cmd+Shift+R
| Symptom | Fix |
|---|---|
| Elapsed time shows 0 | Check training_log.json has elapsed_time |
| No comparison screenshots | Update capture_path in training_log.json |
| Stale data after code change | Hard refresh (Cmd+Shift+R) |
See docs/ for detailed troubleshooting guides.