Neo is a local operator for Apple iPhone Mirroring. It turns a phone-call transcript or local transcript payload into a concrete task, asks an OpenAI model what to do next, reads the mirrored iPhone screen with OmniParser, and executes safe computer actions through a small native macOS helper.
The project also includes:
- a webhook server for ElevenLabs post-call transcripts
- repo-managed ElevenLabs agent manifests
- a terminal run console for live operations
- run artifacts for replay, debugging, and evals
- a local OmniParser service with separately downloaded model weights
Requirements:
- macOS with Apple iPhone Mirroring available
- Node.js 22 or newer
- Python 3 with venv
- Xcode command line tools for building the native helper
- OPENAI_API_KEY in your shell or .env
- ELEVENLABS_API_KEY and ELEVENLABS_WEBHOOK_SECRET for ElevenLabs flows
Install JavaScript dependencies:
npm install
Build the native macOS helper:
npm run build:helper
Download the OmniParser weights described below, then start Neo:
npm run serve
In another terminal, open the operator UI:
./bin/neo ui
For a local replay run instead of the webhook server:
./bin/neo run path/to/transcript.json
Runtime configuration lives in neo.config.json.
Important defaults:
- server: 127.0.0.1:8787
- OpenAI model: gpt-5.4
- OmniParser service: 127.0.0.1:8000
- OmniParser weights: ./assets/omniparser/weights
- macOS target app: iPhone Mirroring
- native helper: ./bin/neo-helper
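As an illustration of how these defaults might combine with user overrides, here is a minimal sketch. The field names (`serverAddress`, `openaiModel`, and so on) and the `resolveConfig` helper are hypothetical; the real key names in neo.config.json may differ. Only the values come from the list above.

```typescript
// Hypothetical shape: field names are illustrative, not the real neo.config.json keys.
interface NeoDefaults {
  serverAddress: string;
  openaiModel: string;
  omniparserAddress: string;
  weightsDir: string;
  targetApp: string;
  helperPath: string;
}

// Values copied from the defaults listed above.
const DEFAULTS: NeoDefaults = {
  serverAddress: "127.0.0.1:8787",
  openaiModel: "gpt-5.4",
  omniparserAddress: "127.0.0.1:8000",
  weightsDir: "./assets/omniparser/weights",
  targetApp: "iPhone Mirroring",
  helperPath: "./bin/neo-helper",
};

// Shallow merge: any field present in `overrides` replaces the default.
function resolveConfig(overrides: Partial<NeoDefaults>): NeoDefaults {
  return { ...DEFAULTS, ...overrides };
}
```

For example, `resolveConfig({ openaiModel: "another-model" })` keeps every other default untouched.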
Local secrets should live in .env or your shell:
OPENAI_API_KEY=...
ELEVENLABS_API_KEY=...
ELEVENLABS_WEBHOOK_SECRET=...
ELEVENLABS_CALLBACK_PHONE_NUMBER_ID=...
.env, .cache/, runs/, native build output, and model weights are ignored by Git.
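A preflight check for these variables could look like the sketch below. The helper name `missingSecrets` is hypothetical, and treating OPENAI_API_KEY as always required and the two ElevenLabs variables as required only for ElevenLabs flows is an assumption based on the requirements list.

```typescript
// Variable names come from this README; which flow needs which is an assumption.
const ALWAYS_REQUIRED = ["OPENAI_API_KEY"];
const ELEVENLABS_REQUIRED = ["ELEVENLABS_API_KEY", "ELEVENLABS_WEBHOOK_SECRET"];

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables that are unset or blank.
function missingSecrets(env: Env, useElevenLabs: boolean): string[] {
  const required = useElevenLabs
    ? [...ALWAYS_REQUIRED, ...ELEVENLABS_REQUIRED]
    : [...ALWAYS_REQUIRED];
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Usage idea: missingSecrets(process.env, true) before starting the webhook server.
```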
OmniParser model weights are intentionally not committed. GitHub rejects large files, and the Florence caption checkpoint is over 1 GB.
Neo expects this layout:
assets/omniparser/weights/
├── icon_caption_florence/
│   ├── config.json
│   ├── generation_config.json
│   └── model.safetensors
└── icon_detect/
    ├── model.pt
    ├── model.yaml
    └── train_args.yaml
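The layout above can be verified before starting the parser. The following is a small sketch, not part of Neo itself: `missingWeightFiles` is a hypothetical helper, and the injectable `exists` parameter is only there to make the check easy to test.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// The six files from the expected layout, relative to the weights directory.
const EXPECTED_WEIGHT_FILES = [
  "icon_caption_florence/config.json",
  "icon_caption_florence/generation_config.json",
  "icon_caption_florence/model.safetensors",
  "icon_detect/model.pt",
  "icon_detect/model.yaml",
  "icon_detect/train_args.yaml",
];

// Hypothetical helper: lists the expected files that are absent under weightsDir.
function missingWeightFiles(
  weightsDir: string,
  exists: (path: string) => boolean = existsSync,
): string[] {
  return EXPECTED_WEIGHT_FILES.filter((rel) => !exists(join(weightsDir, rel)));
}
```

Calling `missingWeightFiles("assets/omniparser/weights")` from the repository root returns an empty array when all six files are in place.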
Create the directories:
mkdir -p assets/omniparser/weights/icon_detect
mkdir -p assets/omniparser/weights/icon_caption_florence
Install the Hugging Face CLI:
python3 -m pip install --upgrade "huggingface_hub[cli]"
Download detector weights from microsoft/OmniParser-v2.0:
huggingface-cli download microsoft/OmniParser-v2.0 icon_detect/model.pt --local-dir assets/omniparser/weights
huggingface-cli download microsoft/OmniParser-v2.0 icon_detect/model.yaml --local-dir assets/omniparser/weights
huggingface-cli download microsoft/OmniParser-v2.0 icon_detect/train_args.yaml --local-dir assets/omniparser/weights
Download Florence caption weights from microsoft/OmniParser:
huggingface-cli download microsoft/OmniParser icon_caption_florence/config.json --local-dir assets/omniparser/weights
huggingface-cli download microsoft/OmniParser icon_caption_florence/generation_config.json --local-dir assets/omniparser/weights
huggingface-cli download microsoft/OmniParser icon_caption_florence/model.safetensors --local-dir assets/omniparser/weights
On the first OmniParser-backed run, Neo creates a Python virtual environment at
.cache/neo-omniparser/venv and installs dependencies from
python/neo_omniparser/requirements.txt.
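For orientation, the first-run setup is roughly equivalent to the two commands produced by the sketch below. The two paths come from this README; the exact commands Neo runs are an assumption, and `venvBootstrapPlan` is a hypothetical helper that only builds the command list without executing anything.

```typescript
import { join } from "node:path";

interface Command {
  program: string;
  args: string[];
}

// Hypothetical helper: describes a typical venv bootstrap, assuming a POSIX venv layout (bin/pip).
function venvBootstrapPlan(repoRoot: string): Command[] {
  const venvDir = join(repoRoot, ".cache", "neo-omniparser", "venv");
  const requirements = join(repoRoot, "python", "neo_omniparser", "requirements.txt");
  return [
    // Create the virtual environment.
    { program: "python3", args: ["-m", "venv", venvDir] },
    // Install the parser's dependencies with the venv's own pip.
    { program: join(venvDir, "bin", "pip"), args: ["install", "-r", requirements] },
  ];
}
```

Running these two commands by hand is one way to surface dependency errors outside of a Neo run.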
Use ./bin/neo from the repository root.
neo serve
neo run <transcript-file>
neo ui [run-id]
neo runs list [--status <status>] [--limit N] [--json]
neo runs show <run-id> [--events N] [--json]
neo runs screenshots <run-id> [--latest] [--json]
neo runs enrich-clicks [<run-id>|--all] [--archive-completed]
neo runs eval <run-id> [--manifest|--turns|--grading] [--json]
neo runs active [--json]
neo runs stop <run-id>
neo runs retry <run-id>
neo skills list
neo skills save <spec-file>
neo skills eval prepare-run <run-id>
neo skills eval judge-run <run-id> --skill <skill-name> --spec <spec-file>
neo elevenlabs sync
neo elevenlabs status
neo elevenlabs inspect
neo elevenlabs webhook [--base-url <public-base-url>]
neo elevenlabs simulate <manager|callback>
Common workflows:
./bin/neo runs list --status failed --limit 5
./bin/neo runs show <run-id> --events 25
open "$(./bin/neo runs screenshots <run-id> --latest)"
./bin/neo runs active
./bin/neo runs stop <run-id>
./bin/neo runs retry <run-id>
See .docs/cli.md for the full command guide and .docs/ui.md for the terminal UI reference.
Neo keeps ElevenLabs desired state in the repository and resolved live state in the local cache.
Repo-managed files:
- .agents/elevenlabs/manager-agent.json
- .agents/elevenlabs/callback-agent.json
- .agents/elevenlabs/prompts/*.md
Local resolved state:
.cache/elevenlabs/agents.json
neo serve uses the persisted local state. It does not sync agent manifests on
startup. Run sync only when you intend to apply manifest changes:
./bin/neo elevenlabs sync
./bin/neo elevenlabs status
./bin/neo elevenlabs inspect
Print webhook URLs and signature details:
./bin/neo elevenlabs webhook
./bin/neo elevenlabs webhook --base-url https://your-public-tunnel.example
The transcript webhook path is:
/webhooks/elevenlabs/transcripts
The live call-start webhook path is:
/webhooks/elevenlabs/call-start
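To see how the two paths combine with a base URL, here is a minimal sketch using the standard `URL` class. The paths are the ones listed above; `webhookUrls` is a hypothetical helper and is not necessarily how `neo elevenlabs webhook` builds its output.

```typescript
const TRANSCRIPT_PATH = "/webhooks/elevenlabs/transcripts";
const CALL_START_PATH = "/webhooks/elevenlabs/call-start";

// Hypothetical helper: joins the documented paths onto a base URL.
// Because the paths are absolute, any path already on the base URL is replaced.
function webhookUrls(baseUrl: string): { transcripts: string; callStart: string } {
  const base = new URL(baseUrl);
  return {
    transcripts: new URL(TRANSCRIPT_PATH, base).toString(),
    callStart: new URL(CALL_START_PATH, base).toString(),
  };
}
```

With the default server address, `webhookUrls("http://127.0.0.1:8787").transcripts` is `http://127.0.0.1:8787/webhooks/elevenlabs/transcripts`.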
See .docs/elevenlabs-setup.md for dashboard setup, callback phone-number
selection, and end-to-end verification.
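When verifying end to end, it can help to reproduce the webhook signature check by hand. The sketch below is a generic HMAC-SHA256 verification and rests on assumptions that should be confirmed against the ElevenLabs documentation: a signature header of the form `t=<unix-seconds>,v0=<hex digest>`, a signed message of `<t>.<raw body>`, and a 30-minute tolerance window. The function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: the signed message is "<timestamp>.<raw body>", hex-encoded HMAC-SHA256.
function signPayload(secret: string, timestamp: string, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
}

// ASSUMPTION: header format "t=<unix-seconds>,v0=<hex digest>"; tolerance is illustrative.
function verifySignature(
  header: string,
  rawBody: string,
  secret: string,
  nowSeconds: number,
  toleranceSeconds = 1800,
): boolean {
  const parts = new Map<string, string>();
  for (const piece of header.split(",")) {
    const idx = piece.indexOf("=");
    if (idx > 0) parts.set(piece.slice(0, idx).trim(), piece.slice(idx + 1).trim());
  }
  const timestamp = parts.get("t");
  const received = parts.get("v0");
  if (!timestamp || !received) return false;

  // Reject stale or future-dated requests to limit replay.
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > toleranceSeconds) return false;

  // Constant-time comparison; timingSafeEqual requires equal lengths.
  const expected = Buffer.from(signPayload(secret, timestamp, rawBody), "utf8");
  const actual = Buffer.from(received, "utf8");
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

The check must run on the raw request body, before any JSON parsing, because re-serializing the payload can change the bytes that were signed.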
- Transcript intake receives a webhook or local transcript file.
- The task builder converts the transcript into a target app, intent, and operator-facing task summary.
- The runner builds a prompt from task state, screenshots, parsed UI targets, previous actions, and active skills.
- The OpenAI model proposes actions.
- The action pipeline checks safety, target validity, recovery state, and progress.
- The native helper focuses iPhone Mirroring and executes clicks, text input, scrolling, keypresses, drags, waits, or screenshots.
- Neo captures screenshots, parses them with OmniParser, writes artifacts, and continues until the task completes, blocks, fails, or is stopped.
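The steps above amount to an observe, propose, check, execute loop. The sketch below shows that control flow with injected dependencies. Every name in it (`Deps`, `runLoop`, `maxTurns`, the outcome strings) is hypothetical, it is synchronous for brevity, and treating a rejected action as a blocked run is a simplifying assumption; the real pipeline also tracks recovery state and progress.

```typescript
type Outcome = "completed" | "blocked" | "failed" | "stopped";

interface Action {
  kind: string; // e.g. "click", "type", "scroll"
  target?: string;
}

interface Observation {
  screenshot: string; // path to the captured screenshot
  targets: string[]; // UI targets parsed from the screenshot
}

// Hypothetical seams standing in for OmniParser, the model, the safety checks, and the helper.
interface Deps {
  observe(): Observation;
  propose(obs: Observation, history: Action[]): Action | Outcome;
  isSafe(action: Action, obs: Observation): boolean;
  execute(action: Action): void;
  stopRequested(): boolean;
}

function runLoop(deps: Deps, maxTurns: number): { outcome: Outcome; history: Action[] } {
  const history: Action[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    if (deps.stopRequested()) return { outcome: "stopped", history };
    const obs = deps.observe();
    const next = deps.propose(obs, history);
    // A string means the model declared a terminal state instead of an action.
    if (typeof next === "string") return { outcome: next, history };
    // Simplification: an unsafe or invalid action ends the run as blocked.
    if (!deps.isSafe(next, obs)) return { outcome: "blocked", history };
    deps.execute(next);
    history.push(next);
  }
  // Turn budget exhausted without reaching a terminal state.
  return { outcome: "failed", history };
}
```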
Only one run executes at a time. The active worker owns .neo-run.lock.
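A common way to enforce this kind of single-worker rule is an exclusive-create lock file. The sketch below shows that generic pattern with Node's `fs` module. It is not a description of how Neo manages .neo-run.lock: the file contents, the `owner` string, and both helper names are assumptions.

```typescript
import { closeSync, openSync, unlinkSync, writeSync } from "node:fs";

// Hypothetical helper: returns true if this process created the lock file.
function tryAcquireLock(lockPath: string, owner: string): boolean {
  try {
    // The "wx" flag creates the file and fails with EEXIST if it already exists.
    const fd = openSync(lockPath, "wx");
    writeSync(fd, owner); // record who holds the lock, for debugging
    closeSync(fd);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err; // permission problems and similar errors should surface
  }
}

// Hypothetical helper: removes the lock file when the run ends.
function releaseLock(lockPath: string): void {
  unlinkSync(lockPath);
}
```

A crashed worker leaves the file behind in this pattern, so a real implementation usually also needs a way to detect and clear stale locks.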
Every run writes to runs/<id>/:
runs/<id>/
  run.json
  task.json
  transcript.json
  prompt.txt
  summary.txt
  events.jsonl
  control.json
  responses/*.json
  screenshots/*.png
  screenshots/.latest
  eval/manifest.json
  eval/turns.json
  eval/artifacts.json
  eval/grading.json
These files are the source of truth for the CLI, terminal UI, debugging, and eval preparation.
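Because events.jsonl is line-delimited JSON, custom tooling can read it with a few lines of code. The sketch below parses JSONL text and keeps the last N entries, similar in spirit to `neo runs show <run-id> --events N`. The helper names are hypothetical, and the shape of each event is not documented here, so entries are left as `unknown`.

```typescript
// Parses JSON Lines text: one JSON value per non-empty line.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .map((line) => JSON.parse(line) as unknown);
}

// Returns the last n parsed entries (none when n is zero or negative).
function lastEvents(text: string, n: number): unknown[] {
  if (n <= 0) return [];
  return parseJsonl(text).slice(-n);
}

// Usage idea: lastEvents(readFileSync("runs/<id>/events.jsonl", "utf8"), 25)
```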
src/application/     orchestration, runner, run queue, CLI-facing services
src/domain/          prompting, task planning, safety, skills, shared types
src/infrastructure/  config, server, OpenAI client, run store, macOS harness
src/elevenlabs/      ElevenLabs manifest sync and callback support
python/              local OmniParser FastAPI service
native/              Swift helper for iPhone Mirroring automation
bin/                 CLI launcher and built helper
.agents/             ElevenLabs manifests and local task skills
.docs/               detailed operator guides
test/                Node test suite
Run tests:
npm test
Build the helper after changing native/neo_helper.swift:
npm run build:helper
Run the CLI help:
./bin/neo help
The CLI runs TypeScript directly with Node's --experimental-strip-types, so there is no separate JavaScript build step.
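Type stripping only removes annotations, so source files need to stick to erasable TypeScript syntax. The sketch below illustrates the distinction. The status names and `isRunStatus` are hypothetical and are not taken from Neo's source.

```typescript
// Erasable: a type alias disappears entirely when types are stripped.
type RunStatus = "completed" | "blocked" | "failed" | "stopped";

// A plain array carries the runtime values that an enum would otherwise provide.
const RUN_STATUSES: readonly RunStatus[] = ["completed", "blocked", "failed", "stopped"];

// Erasable: the annotations and the type predicate are removed, the body is plain JavaScript.
function isRunStatus(value: string): value is RunStatus {
  return (RUN_STATUSES as readonly string[]).indexOf(value) !== -1;
}

// Not erasable, because these constructs generate runtime code:
//   enum Status { Completed, Failed }
//   class Run { constructor(private id: string) {} }
// Under Node 22 they need --experimental-transform-types rather than plain type stripping.
```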
If the parser fails to start, confirm that all six OmniParser files exist under
assets/omniparser/weights.
If huggingface-cli asks for authentication, run:
huggingface-cli login
If Neo cannot control iPhone Mirroring, rebuild the helper and make sure macOS has granted the necessary Accessibility and screen-recording permissions to the terminal app you are using.
If a run appears stuck, inspect the lock and active run state:
./bin/neo runs active
If a run is live but should stop, request a graceful stop:
./bin/neo runs stop <run-id>
If ElevenLabs behavior looks stale after editing manifests, run:
./bin/neo elevenlabs sync
./bin/neo elevenlabs status