# llamapad

A terminal UI for launching llama.cpp's llama-server with your local GGUF models.
```
llamapad  stopped

Model        > ~/.cache/huggingface/hub/models--nohurry--gemma-4-.../gemma-4-26b-a4b-it-heretic.q4_k_m.gguf
               (1/2) nohurry/gemma-4-26B-A4B-it-heretic-GUFF / gemma-4-26b-a4b-it-heretic.q4_k_m.gguf
Host         > 0.0.0.0
Port         > 11434
GPU Layers   > 99
Context Size   2048  4096  8192  16384  [32768]  65536  131072
--swa-full     [ off ]
--flash-attn   [ on ]

[ Launch ]   [ Stop ]
```
## Features

- Auto-discovers local GGUF models from the HuggingFace hub cache, the llama.cpp cache, and `~/models`
- Browse models with arrow keys -- up/down on the Model field cycles through discovered files
- Configure everything -- host, port, GPU layers, context size, SWA, flash attention
- Live log streaming -- stdout/stderr from `llama-server` displayed in real time
- Favorites -- save and recall frequently used models with `ctrl+f`
- Persistent config -- settings saved to `~/.config/llamapad/config.json`
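Launching from the UI is conceptually the same as invoking llama-server by hand with the corresponding flags. A rough sketch (the model path is illustrative, and exact flag spellings vary across llama.cpp versions):

```shell
llama-server \
  --model ~/models/example.q4_k_m.gguf \
  --host 0.0.0.0 \
  --port 11434 \
  --n-gpu-layers 99 \
  --ctx-size 32768 \
  --flash-attn
```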
## Requirements

- `llama-server` in your `PATH`
- Go 1.21+ (to build)
## Installation

```shell
git clone https://github.com/witong42/llamapad.git
cd llamapad
make install
```

This builds the binary and copies it to `/usr/local/bin` (requires sudo). To install elsewhere:

```shell
make install PREFIX=~/.local/bin
```

To uninstall:

```shell
make uninstall
```

## Usage

Start the UI with:

```shell
llamapad
```

| Key | Action |
|---|---|
| `tab` / `shift+tab` | Move between fields |
| `up` / `down` | Browse local models (on Model field) |
| `up` / `down` | Toggle options (on toggle fields) |
| `left` / `right` | Change context size |
| `enter` | Launch / Stop server |
| `ctrl+f` | Open favorites |
| `ctrl+c` | Quit |
## Model discovery

llamapad scans these directories for `.gguf` files on startup:
| Directory | Description |
|---|---|
| `~/.cache/huggingface/hub/` | HuggingFace hub cache (default) |
| `~/Library/Caches/llama.cpp/` | llama.cpp native cache (macOS) |
| `~/.cache/llama.cpp/` | llama.cpp native cache (Linux) |
| `~/models/` | User models directory |
Respects the `LLAMA_CACHE`, `HF_HUB_CACHE`, `HUGGINGFACE_HUB_CACHE`, `HF_HOME`, and `XDG_CACHE_HOME` environment variables.
## Development

```shell
make        # build
make clean  # remove binary
```

## License

MIT