CLI Guide

The Memorizz CLI turns the library into a tool: an interactive, Claude-Code-style terminal agent backed by the full Memorizz harness (memory backends, providers, tools). It streams replies token-by-token, supports / slash commands, and — by default — keeps one persistent agent whose memory carries across sessions.

It also runs a 100% local stack (Ollama LLM + Ollama embeddings + on-disk memory, no API key) and can launch the Local UI.

Install

The CLI ships in the base package, so the memorizz command works straight away:

uv tool install memorizz       # recommended
pipx install memorizz
pip install memorizz

For the fully-local Ollama stack (Ollama SDK + FAISS filesystem vector store), add the local extra:

uv tool install "memorizz[local]"     # == memorizz[ollama,filesystem]

Other providers are extras: memorizz[anthropic], memorizz[mongodb], memorizz[oracle], memorizz[ui], or everything with memorizz[all].

!!! note "Homebrew & npm"
    A Homebrew tap (brew install RichmondAlake/memorizz/memorizz) and an npm bootstrapper (npm i -g memorizz, which installs the real tool via uv) are also available for non-Python-first workflows.

Quickstart

Option A — Local, no API key (Ollama)

# 1. install + run Ollama, then pull a tool-capable chat model + an embedder
ollama pull qwen2.5:7b
ollama pull nomic-embed-text

# 2. launch the REPL
memorizz

With no cloud key set and an Ollama daemon running, Memorizz auto-selects Ollama for the LLM, Ollama (nomic-embed-text) for embeddings, and an on-disk filesystem store under ~/.memorizz/memory.

Option B — Cloud (OpenAI / Anthropic)

export OPENAI_API_KEY=sk-...      # or ANTHROPIC_API_KEY=...
memorizz

Providers are auto-detected in this order: Anthropic → OpenAI → Azure → local Ollama. You can also save a key from inside the REPL with /login.
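
The detection order above can be pictured as a simple first-match chain. This is only a sketch of the documented behaviour, not Memorizz's actual code: the function name `detect_provider` is hypothetical, `AZURE_OPENAI_API_KEY` is an assumed variable name (the guide does not name the Azure key), and the caller is assumed to pass `yes` or `no` to say whether an Ollama daemon is reachable.

```shell
# Hypothetical illustration of the documented order:
# Anthropic -> OpenAI -> Azure -> local Ollama.
# $1 is "yes" when an Ollama daemon is reachable, "no" otherwise.
detect_provider() (
  ollama_running="${1:-no}"
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo anthropic
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo openai
  elif [ -n "${AZURE_OPENAI_API_KEY:-}" ]; then   # assumed variable name
    echo azure
  elif [ "$ollama_running" = "yes" ]; then
    echo ollama
  else
    echo none                                     # nothing configured
  fi
)
```

One way to obtain the `yes`/`no` argument is to probe Ollama's REST endpoint, for example by checking whether `curl -sf http://localhost:11434/api/tags` succeeds on a default install.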

The REPL

Running memorizz with no arguments launches the interactive loop:

- Type plain text to chat; the reply streams live and renders as Markdown.
- For reasoning models (e.g. qwen3, deepseek-r1), the model's thinking is shown dimmed above the answer, and tool activity is shown as it happens.
- Ctrl-C during a reply aborts just that reply (you stay in the REPL).
- Ctrl-C at the prompt, Ctrl-D, or /exit saves the agent and quits.
- Press Tab to autocomplete slash commands.

Slash commands

| Command | Description |
| --- | --- |
| `/help` | List all commands + the current mode/model. |
| `/model [name]` | Show or switch the chat model (keeps the provider). |
| `/provider [name]` | Switch provider: `openai`/`anthropic`/`ollama`/`azure`/`huggingface`/`mlx`. |
| `/ollama [list\|pull <tag>\|host <url>]` | List/pull Ollama models or set `OLLAMA_HOST`. |
| `/web [on\|off\|tavily\|firecrawl]` | Enable/disable internet search (Tavily/Firecrawl). |
| `/code [on\|off]` | Toggle coding tools (file read/write + bounded commands, scoped to cwd). |
| `/memory [id]` | Show or switch the active memory id. |
| `/history` | Print the current conversation history. |
| `/forget <id>` | Delete a single stored memory by id. |
| `/new` | Start a fresh conversation thread (keeps long-term memory). |
| `/clear` | Erase the agent's entire stored memory (asks to confirm). |
| `/cls` | Clear the terminal screen. |
| `/agents` | List saved agents. |
| `/agent <id>` | Load a saved agent by id. |
| `/persona [name \| goals \| background]` | Show or set the agent's persona. |
| `/persona-reset` | Clear the persona (revert to default). |
| `/tools` | List the agent's registered tools. |
| `/ingest <file>` | Ingest a file into the knowledge base. |
| `/ui [--port N] [--host H]` | Launch the local web UI. |
| `/login [provider]` | Log in / save an API key; lists platforms to pick from if none is given. |
| `/config` | Show resolved config + paths. |
| `/docs [cli\|ui]` | Open the documentation in your browser. |
| `/exit` | Save the agent and quit. |

Modes

Memory assistant (default). A conversational agent with persistent long-term memory — it remembers facts you share and recalls them in later turns and later sessions.

Coding mode. Launch with memorizz --code, or type /code in the REPL, to enable the agent's coding tools: reading and writing files and running a bounded set of commands, all scoped to the current working directory (writes on, deletes off).

Internet access

Give the agent web search + page reading via Tavily or Firecrawl. No extra install is needed — the providers call the REST APIs directly.

export TAVILY_API_KEY=tvly-...      # or FIRECRAWL_API_KEY=fc-...
memorizz                            # internet tools auto-enable when a key is set

Or manage it from the REPL:

/login tavily       # save the key AND enable internet immediately
/web                # show status  (also: /web on | off | tavily | firecrawl)

When enabled, the agent gains internet_search (web search) and open_web_page (fetch + read a full page). It can search, then open the most relevant result for deeper detail (Tavily runs at advanced search depth for ~5x richer results). Use a 7B+ model for reliable multi-step web use.

Persistent agent & memory

Unlike a stateless chat, the CLI reuses one persistent agent across launches. The default agent id and the rolling memory id are stored in ~/.memorizz/state.json, so a fact you teach it today is recalled tomorrow.

| Action | Command | Effect |
| --- | --- | --- |
| New conversation, keep long-term memory | `/new` | Starts a fresh thread; past facts still recalled semantically. |
| Forget one memory | `/forget <id>` | Deletes a single stored entry. |
| Wipe everything | `/clear` | Erases all stored memory after confirmation; keeps persona + tools. |

One-shot mode

Run a single prompt and print the reply (pipe-friendly, no REPL):

memorizz run "Summarize what you remember about my project."
memorizz run --code "Add a docstring to utils.py and run the tests."

One-shot turns share the same persistent agent + memory as the REPL.
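
Because each `memorizz run` call prints a single reply, several prompts can be chained from a script. The helper below is a minimal sketch under stated assumptions: `ask_all` and the `MZ_CMD` override are hypothetical names introduced here (the override only exists so the loop can be dry-run without the real CLI), and it assumes `memorizz run` exits non-zero on failure, which the guide does not state.

```shell
# Hypothetical helper: send each argument as its own one-shot prompt.
# MZ_CMD is an illustrative override, not a Memorizz setting.
ask_all() (
  cmd="${MZ_CMD:-memorizz}"
  for prompt in "$@"; do
    # Stop at the first failing call (assumes a non-zero exit on error).
    "$cmd" run "$prompt" || exit 1
  done
)

# Example (not executed here):
#   ask_all "What do you remember about my project?" "List open tasks." > notes.md
```

Since one-shot turns share the persistent agent, anything taught by an earlier prompt in the loop should be available to later ones.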

Configuration

Memorizz centralizes config under ~/.memorizz/:

| Path | Purpose |
| --- | --- |
| `~/.memorizz/.env` | API keys and `MEMORIZZ_*` defaults. |
| `~/.memorizz/memory/` | Default filesystem memory store. |
| `~/.memorizz/state.json` | Persistent agent id + rolling memory id. |
| `~/.memorizz/history` | REPL input history. |

Overrides: MEMORIZZ_HOME (the home dir) and MEMORIZZ_ENV_FILE (the env file). A project-local ./.env is still honored for backwards compatibility. The CLI and the Local UI read/write the same .env, so configuring once applies to both.
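
The two overrides can be read as a pair of fallbacks. The sketch below is an illustration rather than Memorizz's actual resolution code: the function names are hypothetical, and it assumes that the default env file sits inside whichever home directory is in effect. Running `memorizz config` remains the authoritative way to see the resolved paths.

```shell
# Hypothetical illustration of the override rules.
memorizz_home() (
  # MEMORIZZ_HOME wins; otherwise fall back to ~/.memorizz.
  echo "${MEMORIZZ_HOME:-$HOME/.memorizz}"
)

memorizz_env_file() (
  # MEMORIZZ_ENV_FILE wins; otherwise assume .env inside the home dir.
  echo "${MEMORIZZ_ENV_FILE:-$(memorizz_home)/.env}"
)
```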

Useful commands:

memorizz init           # interactive key wizard
memorizz init --local   # configure the local Ollama stack
memorizz config         # show resolved paths + detected provider

Launch the Local UI

memorizz ui                         # http://127.0.0.1:8765
memorizz ui --host 0.0.0.0 --port 9000

See the Local UI Guide for details.

Command reference

memorizz                 # launch the interactive REPL (default)
memorizz chat [--code] [--provider P] [--model M]
memorizz run "<prompt>"  # one-shot
memorizz ui [--host H] [--port N]
memorizz init [--local]
memorizz config
memorizz --version
memorizz oracle install|setup|setup-schema|teardown
memorizz automations run [--poll-interval N] [--lease-seconds N] [--concurrency N]

!!! note "Back-compatible commands"
    The earlier forms still work with a deprecation notice: memorizz run local → memorizz ui, memorizz run automations → memorizz automations run, and memorizz install-oracle → memorizz oracle install (etc.).

Choosing an Ollama model

The agent sends tool definitions whenever tools are enabled (for example in coding mode or with web search), so for those features the local model must support tool-calling:

- Recommended: qwen2.5:7b, qwen2.5:3b, or llama3.1:8b (tool-capable, non-reasoning, good quality).
- Reasoning models (qwen3, deepseek-r1, qwq, magistral) work: Memorizz auto-enables their "thinking" so reasoning is surfaced and answers aren't truncated. For a snappy default, though, a non-reasoning instruct model is better.
- Smaller models (3b, even 0.5b) give simpler answers but stay snappy. Plain chat exposes no tools by default, so small models do not loop on tool calls. gemma models lack tool-calling in Ollama, so use them only for plain chat (not /code).
- For web search + memory reasoning, prefer 7B+ (qwen2.5:7b / llama3.1:8b): smaller models tend to answer from guesswork instead of reading their memory or chaining internet_search → open_web_page.

Zero-config auto-selection already prefers tool-capable, non-reasoning families.
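
As a rough aid for picking among models you already have, the families named above can be sorted by tag prefix. This is a sketch only: `classify_model` is a hypothetical helper, the prefix lists are copied from this guide rather than from Memorizz's selection logic, and any family not mentioned here is reported as `unknown`.

```shell
# Hypothetical helper: classify an Ollama tag by the families named in this guide.
classify_model() (
  case "$1" in
    qwen3*|deepseek-r1*|qwq*|magistral*) echo reasoning ;;    # works, but slower
    gemma*)                              echo no-tools ;;     # plain chat only
    qwen2.5*|llama3.1*)                  echo recommended ;;  # tool-capable, non-reasoning
    *)                                   echo unknown ;;
  esac
)

# Example (not executed here), assuming `ollama list` prints a header row
# followed by one model per line with the tag in the first column:
#   ollama list | awk 'NR > 1 { print $1 }' | while read -r tag; do
#     printf '%s\t%s\n' "$tag" "$(classify_model "$tag")"
#   done
```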

Troubleshooting

"This model can't tool-call." Your Ollama model lacks tool support (e.g. gemma). Pull a tool-capable model and switch:

ollama pull llama3.1:8b
# then in the REPL:
/model llama3.1:8b

No provider configured. Set a key (/login or export OPENAI_API_KEY=...) or start Ollama and pull a model, then relaunch.

Semantic recall seems weak (local stack). Pull the embedder so memory uses vectors instead of brute-force text matching:

ollama pull nomic-embed-text
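
To avoid re-pulling a model that is already present, the output of `ollama list` can be checked first. The helper below is a minimal sketch: `has_model` is a hypothetical name, and it assumes the listing has one header row followed by lines whose first column is the model tag.

```shell
# Hypothetical helper: read `ollama list`-style output on stdin and succeed
# if any tag (first column, header skipped) starts with the given name.
has_model() (
  awk -v want="$1" '
    NR > 1 && index($1, want) == 1 { found = 1 }
    END { exit !found }
  '
)

# Example (not executed here):
#   ollama list | has_model nomic-embed-text || ollama pull nomic-embed-text
```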

Slow startup. A default install is lean (no PyTorch). If import memorizz feels heavy, check whether the memorizz[huggingface] extra is installed; it is only needed for local HuggingFace models/embeddings.
