Find the best local AI model (LLM, Image, Speech, Multimodal) that runs on your hardware.
WhichModel is an intelligent CLI tool that automatically detects your hardware configuration and recommends the best locally runnable AI models. It covers multiple categories including LLMs, image generation, speech recognition, and multimodal models, with special optimization for the Chinese model ecosystem.
WhichModel is inspired by whichllm and extends its LLM-only scope to a full AI model recommendation engine.
- 🖥️ Hardware Auto-Detection — Automatically detects NVIDIA/AMD/Apple Silicon/Intel Arc GPUs, CPU, RAM
- 🧠 Multi-Category Coverage — LLM, Image Generation, Speech Recognition, Multimodal models
- 🇨🇳 Chinese Model Priority — Deep support for Qwen, DeepSeek, ChatGLM, Baichuan, and other domestic models
- 🐳 One-Click Deployment — Generates Docker Compose / systemd / shell startup scripts
- 📊 Model Comparison Matrix — Side-by-side comparison of performance, quality, and resource usage
- 📡 JSON Output — Supports piping and automated integration
- 🎯 GPU Simulation — Test before you buy: simulate any GPU to see recommended models
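
This README does not describe how hardware detection is implemented. As a rough illustration of one common approach for NVIDIA cards only, the sketch below asks `nvidia-smi` for each GPU's name and total memory. The function names are hypothetical, and this is not WhichModel's actual detection code.

```python
import shutil
import subprocess


def parse_nvidia_smi_csv(text):
    """Parse 'name, memory.total' CSV rows into (name, vram_gb) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        if "," not in line:
            continue
        # Split on the last comma so a GPU name containing commas survives.
        name, mem_mib = line.rsplit(",", 1)
        try:
            vram_gb = round(float(mem_mib) / 1024, 1)  # MiB -> GiB
        except ValueError:
            continue  # e.g. "[N/A]" reported for memory
        gpus.append((name.strip(), vram_gb))
    return gpus


def detect_nvidia_gpus():
    """Return detected NVIDIA GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_nvidia_smi_csv(result.stdout)
```

On a machine without an NVIDIA driver, `detect_nvidia_gpus()` simply returns an empty list; AMD, Apple Silicon, and Intel Arc would each need their own probe.
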

Install with pip, then try the commands below:

    pip install whichmodel

    # Auto-detect hardware and recommend models
    whichmodel recommend

    # Simulate a specific GPU
    whichmodel recommend --gpu "RTX 4090"

    # Recommend image generation models
    whichmodel recommend --category image --top 3

    # Output as JSON
    whichmodel recommend --json

    # Show hardware info
    whichmodel hardware-info

Example output:

    Detected Hardware:
    CPU: INTEL(R) XEON(R) PLATINUM 8582C (3 cores)
    RAM: 5.8 GB
    GPU: NVIDIA GeForce RTX 4090 (24.0 GB VRAM)
    Recommended AI Models
    ┏━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
    ┃ Rank ┃ Model ┃ Category ┃ Score ┃ Fit ┃ VRAM ┃ Quant ┃ Descripti… ┃
    ┡━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
    │ 1 │ 🇨🇳 DeepSeek…│ LLM │ 115.3 │ ✅ full │ 22.0G │ Q8_0 │ DeepSeek's │
    │ 2 │ 🇨🇳 Qwen3-32B│ LLM │ 100.3 │ ✅ full │ 20.0G │ FP16 │ Alibaba's │
    │ 3 │ Llama-3-70B │ LLM │ 95.8 │ ✅ full │ 22.0G │ Q8_0 │ Meta's │
    └──────┴─────────────┴──────────┴───────┴──────────┴───────┴───────┴────────────┘

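
For automated integration, the `--json` flag can be consumed from a script. The sketch below is a minimal example that builds the command line from the documented `--category`, `--top`, and `--gpu` options and decodes the result. The helper names are hypothetical, and because the JSON schema is not documented here, the code makes no assumption about the payload's fields.

```python
import json
import subprocess


def build_recommend_args(category=None, top=None, gpu=None):
    """Build argv for `whichmodel recommend --json` from the documented options."""
    args = ["whichmodel", "recommend", "--json"]
    if category is not None:
        args += ["--category", category]
    if top is not None:
        args += ["--top", str(top)]
    if gpu is not None:
        args += ["--gpu", gpu]
    return args


def recommend(**options):
    """Run the CLI and return the decoded JSON payload, whatever its shape."""
    result = subprocess.run(
        build_recommend_args(**options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling `recommend(category="image", top=3, gpu="RTX 4090")` requires `whichmodel` to be installed and on `PATH`; inspect the returned object before relying on any particular key.
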

| Command | Description |
|---|---|
| `whichmodel recommend` | Recommend models for your hardware |
| `whichmodel hardware-info` | Show detected hardware information |
| `whichmodel deploy-script <model>` | Generate deployment script |
| `whichmodel compare <model1> <model2>` | Compare models side by side |
| `whichmodel version` | Show version info |


Options for `whichmodel recommend`:

    whichmodel recommend [OPTIONS]

    Options:
      -c, --category TEXT   Model category: llm, image, speech, multimodal, all [default: llm]
      -n, --top INTEGER     Number of recommendations [default: 5]
          --json            Output as JSON
      -g, --gpu TEXT        Simulate a GPU (e.g., "RTX 4090")
          --min-vram FLOAT  Minimum VRAM in GB
      -q, --quant TEXT      Quantization: Q4_K_M, Q5_K_M, Q6_K, Q8_0, FP16
          --chinese-first   Prioritize Chinese models [default: True]

Deployment script examples:

    # Generate Docker Compose for a model
    whichmodel deploy-script qwen3-8b --backend docker -o docker-compose.yml

    # Generate systemd service
    whichmodel deploy-script whisper-large-v3 --backend systemd -o whisper.service

- Hardware-First: Recommendations are based on actual hardware capabilities, not just model popularity
- Evidence-Based Scoring: Benchmark scores, VRAM fit, and quantization quality all factor into rankings
- Chinese Model Ecosystem: Prioritizes domestic models to better serve Chinese developers
- Production-Ready: Generates deployment scripts that work out of the box
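
To make the idea of VRAM fit concrete, here is a minimal sketch of how a fit check could work. It is an assumption for illustration, not WhichModel's scoring code: the bits-per-weight figures are rounded values for common GGUF quantization levels, and the flat `overhead_gb` allowance for KV cache and runtime buffers is a hypothetical parameter.

```python
# Approximate bits per weight for the quantization levels listed in the options.
# Assumption: rounded figures for GGUF formats, not WhichModel's own table.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.69,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def estimate_vram_gb(params_billion, quant, overhead_gb=1.5):
    """Estimate VRAM as weight size plus a flat allowance for cache and buffers."""
    # billions of parameters * bits per weight / 8 bits per byte = gigabytes
    weights_gb = params_billion * BITS_PER_WEIGHT[quant] / 8
    return round(weights_gb + overhead_gb, 1)


def best_quant_that_fits(params_billion, vram_gb):
    """Return the highest-precision quantization whose estimate fits, or None."""
    by_precision = sorted(BITS_PER_WEIGHT, key=BITS_PER_WEIGHT.get, reverse=True)
    for quant in by_precision:
        if estimate_vram_gb(params_billion, quant) <= vram_gb:
            return quant
    return None
```

For example, `best_quant_that_fits(8, 24.0)` walks from FP16 downward and stops at the first level whose estimate fits in 24 GB. A real ranking would also weigh benchmark scores and context length, which this sketch ignores.
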

Planned:

- HuggingFace Hub API integration for live model data
- Support for more quantization formats (AWQ, GPTQ, EXL2)
- Web UI for non-CLI users
- Model download and auto-setup
- Benchmark suite integration

To set up a development environment, run the tests, and build a release:

    git clone https://github.com/gitstq/whichmodel.git
    cd whichmodel
    pip install -e ".[dev]"
    pytest

    python -m build
    twine upload dist/*

Contributions are welcome! Please see our Contributing Guide for details.

- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request

This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by the WhichModel Team
