import { Badge, Tab, Tabs } from "rspress/theme";
AIScript supports running AI models locally through Ollama, providing privacy, cost-effectiveness, and offline capabilities.
- Install Ollama from [ollama.ai](https://ollama.ai)
- Start Ollama service (runs on port 11434 by default)
- Pull models you want to use:
```bash
# Popular general-purpose models
ollama pull llama3.2

# Coding models
ollama pull codellama
ollama pull deepseek-coder

# Specialized models
ollama pull deepseek-r1
ollama pull gemma2
ollama pull qwen2.5
```

Check that Ollama is running and the models are available:
```bash
# List installed models
ollama list

# Test a model
ollama run llama3.2 "Hello, how are you?"
```

Configure AIScript to use Ollama models:
```toml
[ai.ollama]
api_endpoint = "http://localhost:11434/v1"
model = "llama3.2" # Default model
```

Alternatively, set the endpoint through an environment variable:

```bash
export OLLAMA_API_ENDPOINT="http://localhost:11434/v1"
```
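Before wiring AIScript to Ollama, it can help to confirm the OpenAI-compatible endpoint is actually reachable. A minimal sketch, assuming a default local install; Ollama exposes the installed models at `/v1/models`:

```bash
# Sketch: confirm the OpenAI-compatible endpoint is reachable
# (assumes the default local install on port 11434)
curl http://localhost:11434/v1/models
```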
input: "Explain how neural networks work",
model: "llama3.2"
};Ollama not responding:
```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Restart Ollama service
ollama serve
```
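To automate this check, here is a minimal sketch that fails fast when the service is unreachable (it assumes the default port and reuses the `/api/tags` endpoint from above):

```bash
# Sketch: exit non-zero if Ollama is not reachable (assumes default port 11434)
if curl -sf http://localhost:11434/api/tags > /dev/null; then
  echo "Ollama is running"
else
  echo "Ollama is not reachable; try 'ollama serve'" >&2
  exit 1
fi
```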
**Model not found:**

```bash
# List available models
ollama list

# Pull missing model
ollama pull llama3.2
```
**Out of memory:**

- Try a smaller model (e.g., `phi3` instead of `llama3.1`)
- Close other applications
- Use quantized models (see the sketch below)
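As a hedged example of the first and last suggestions, smaller or quantized variants can be pulled by tag; the exact tags vary per model, so verify them against the model's page in the Ollama library:

```bash
# Sketch: pull smaller or quantized variants to reduce memory pressure
# (available tags vary by model; these two are illustrative, verify before use)
ollama pull phi3          # compact general-purpose model
ollama pull llama3.2:1b   # smaller parameter-count variant
```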
Other common performance issues:

- Slow responses: Try smaller models or increase hardware resources
- High memory usage: Monitor with `htop` or Activity Monitor (see the sketch below)
- GPU not utilized: Ensure GPU drivers are properly installed
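For the last two items, recent Ollama versions also ship an `ollama ps` command that reports each loaded model's memory footprint and whether it is running on CPU or GPU:

```bash
# Check which models are loaded and whether they run on CPU or GPU
# (requires a recent Ollama version)
ollama ps
```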
| Aspect | Local Models | Cloud Models |
|---|---|---|
| Privacy | ✅ Complete | ❌ Data sent externally |
| Cost | ✅ One-time setup | ❌ Per-token billing |
| Internet | ✅ Works offline | ❌ Requires connection |
| Latency | ✅ Low (local) | ❌ Network dependent |
| Quality | ❌ Varies by model | ✅ Usually higher |
| Maintenance | ❌ Self-managed | ✅ Fully managed |
| Scaling | ❌ Hardware limited | ✅ Unlimited |
Local models with Ollama provide a powerful way to integrate AI capabilities into your AIScript applications while maintaining full control over your data and costs.