@@ -125,6 +125,9 @@ State-of-the-art proprietary AI models with cutting-edge capabilities from leadi
125125| ** Kimi K2.5** | Moonshot AI | 256K | Native multimodal, thinking & agent tasks | $0.60 / $3.00 |
126126| ** DeepSeek-V4** | DeepSeek | 1M+ | Engram memory, coding focus | Pay-per-token |
127127| ** Qwen3.5-Max** | Alibaba | 128K | Hybrid attention, native VLM | Pay-per-token |
128+ | ** Amazon Nova Pro** | AWS | 2M | Multimodal, 128K output, low latency | $0.60 / $2.40 |
129+ | ** Amazon Nova Lite** | AWS | 1M | Cost-efficient multimodal | $0.06 / $0.24 |
130+ | ** Amazon Nova Micro** | AWS | 1M | Fast, lightweight | $0.04 / $0.12 |
128131| ** Gemini 3 Pro** | Google | 1M+ | PhD-level reasoning, agentic tool-use | Tiered pricing |
129132| ** Gemini 3 Flash** | Google | 10M | Pro-grade reasoning, Flash speed | $0.30 / $2.50 |
130133| ** Gemini 3.1 Flash-Lite** | Google | 1M | Fast, cost-efficient multimodal model for high-volume tasks | $0.25 / $1.50 |
@@ -136,6 +139,7 @@ State-of-the-art proprietary AI models with cutting-edge capabilities from leadi
136139| ** Llama 4 Maverick** | Meta | 128K | 400B params, multimodal | Free (self-host) |
137140| ** Grok 4** | xAI | 128K | First-principles reasoning | $3 / $15 |
138141| ** Grok 4 Fast** | xAI | 128K | Cost-efficient variant | $0.20 / $1.50 |
142+ | ** Command R+** | Cohere | 128K | Enterprise-grade, multilingual, tool use | Pay-per-token |
139143
140144#### Top Models by Category
141145
@@ -157,6 +161,8 @@ Self-hostable models with permissive licenses or open weights for privacy, cost
157161| ** Qwen3.5-Max** | Alibaba | 1T+ | 128K | Apache 2.0 |
158162| ** Qwen3-Max-Thinking** | Alibaba | 1T+ | 128K | Apache 2.0 |
159163| ** Mistral Large 3** | Mistral AI | 675B (MoE) | 128K | Apache 2.0 |
164+ | ** Gemma 4 31B Dense** (SOTA open model, 1452 LM Arena) | Google DeepMind | 31B | 256K | Apache 2.0 |
165+ | ** Gemma 4 26B A4B** (MoE, efficient, multimodal) | Google DeepMind | 26B (4B active) | 256K | Apache 2.0 |
160166| ** Llama 4 Scout** | Meta | 109B | 10M | Community |
161167| ** Llama 4 Maverick** | Meta | 400B | 128K | Community |
162168| ** GPT-OSS-120B** | OpenAI | 117B | 128K | Apache 2.0 |
@@ -167,6 +173,30 @@ Self-hostable models with permissive licenses or open weights for privacy, cost
167173| ** Granite 4.0** | IBM | 8B-3B | 128K | Apache 2.0 |
168174| ** DeepSeek-Coder-V2** | DeepSeek | 236B | 128K | MIT |
169175| ** Yi-Coder** | 01.AI | 9B/1.5B | 128K | Apache 2.0 |
176+ | ** Mistral Small 3.1** | Mistral AI | 24B | 128K | Apache 2.0 |
177+ | ** Mistral Nemo** | Mistral AI | 12B | 128K | Apache 2.0 |
178+
179+ ### Small Language Models (SLM) 📱
180+
181+ Compact models optimized for on-device deployment, edge devices, and resource-constrained environments.
182+
183+ | Model | Developer | Params | Context | License | Key Features |
184+ | -------| -----------| --------| ---------| ---------| --------------|
185+ | ** Gemma 4 E2B** | Google DeepMind | 2.3B (5.1B with embeddings) | 128K | Apache 2.0 | On-device MoE, multimodal (text/image/audio/video), function calling |
186+ | ** Gemma 4 E4B** | Google DeepMind | 4.5B (8B with embeddings) | 128K | Apache 2.0 | On-device MoE, multimodal, better reasoning than E2B |
187+ | ** Gemma 3 4B** | Google DeepMind | 4B | 128K | Apache 2.0 | Lightweight, efficient, multimodal |
188+ | ** Phi-4** | Microsoft | 14B | 128K | MIT | Strong reasoning, code generation |
189+ | ** Mistral Small 3.1** | Mistral AI | 24B | 128K | Apache 2.0 | Best-in-class for size, multimodal |
190+ | ** Mistral Nemo** | Mistral AI | 12B | 128K | Apache 2.0 | Cost-efficient, open-weight |
191+ | ** Granite 4.0** | IBM | 8B-3B | 128K | Apache 2.0 | Enterprise-ready, multilingual |
192+
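For the resource-constrained deployments these compact models target, a first-order sizing check is weights ≈ parameter count × bytes per weight at the chosen quantization. The sketch below is a rough rule of thumb only (KV cache, activations, and runtime overhead add to it, and the example sizes are taken from the table above, not vendor-published memory figures):

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate weight memory (decimal GB) at a given quantization level.

    First-order estimate only: KV cache, activations, and runtime
    overhead are not included.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# A 2.3B-parameter model (the Gemma 4 E2B size above) at 4-bit quantization:
print(round(weight_memory_gb(2.3, 4), 2))   # → 1.15 (GB)
# The same model with 16-bit (bf16) weights:
print(round(weight_memory_gb(2.3, 16), 2))  # → 4.6 (GB)
```

The same arithmetic explains why 4-bit quantization is the default for on-device use: it cuts weight memory to a quarter of bf16.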
193+ #### On-Device SLM Benchmarks
194+
195+ | Model | MMLU Pro | GPQA Diamond | AIME 2026 | LiveCodeBench |
196+ | -------| ----------| ---------------| -----------| ---------------|
197+ | ** Gemma 4 E4B** | 69.4% | 58.6% | 42.5% | 52.0% |
198+ | ** Gemma 4 E2B** | 60.0% | 43.4% | 37.5% | 44.0% |
199+ | ** Mistral Small 3.1** | ~ 67% | ~ 45% | ~ 35% | ~ 40% |
170200
171201#### Deployment Options
172202
@@ -916,6 +946,11 @@ Side-by-side comparisons of AI models sorted by various criteria.
916946| 🏢 Company | 🤖 Model | 📦 Version | 📅 Release Date | 🔄 Latest Updated | 💻 Coding | 📊 Benchmarks | 💰 Price | 🖥️ Self-Host | 🔗 Official Site |
917947| :---:| ---| ---| ---| ---| :---:| ---| ---| :---:| :---:|
918- | 🤖 OpenAI | GPT-5 | 5.4 mini | 2026-03-17 00:00 UTC | 2026-03-17 00:00 UTC ⭐ | ✅ | N/A | $0.75 / $4.50 | ❌ | [ 🔗] ( https://openai.com/news/?display=list ) |
948+ | 🌐 Google DeepMind | Gemma 4 | 31B Dense | 2026-04-02 00:00 UTC | 2026-04-02 00:00 UTC ⭐ | ✅ | GPQA 84.3% | Free (self-host) | ✅ | [ 🔗] ( https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/ ) |
949+ | 🌐 Google DeepMind | Gemma 4 | E2B | 2026-04-02 00:00 UTC | 2026-04-02 00:00 UTC ⭐ | ✅ | MMLU Pro 60% | Free (self-host) | ✅ | [ 🔗] ( https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/ ) |
950+ | 🤖 OpenAI | GPT-5 | 5.4 mini | 2026-03-17 00:00 UTC | 2026-03-17 00:00 UTC ⭐ | ✅ | N/A | $0.75 / $4.50 | ❌ | [ 🔗] ( https://openai.com/news/?display=list ) |
919951| 🤖 OpenAI | GPT-5 | 5.4 | 2026-03-05 00:00 UTC | 2026-03-05 00:00 UTC ⭐ | ✅ | N/A | $2.50 / $15.00 | ❌ | [ 🔗] ( https://openai.com/research/index/release/ ) |
920952| 🌐 Google DeepMind | Gemini 3.1 | Flash-Lite | 2026-03-03 00:00 UTC | 2026-03-03 00:00 UTC ⭐ | ✅ | N/A | $0.25 / $1.50 | ❌ | [ 🔗] ( https://deepmind.google/models/model-cards/gemini-3-1-flash-lite/ ) |
921953| 🔬 DeepSeek | DeepSeek | V4 | 2026-02-17 00:00 UTC | 2026-02-17 00:00 UTC | ✅ | N/A | Pay-per-token | ✅ | [ 🔗] ( https://www.deepseek.com/ ) |
954+ | ☁️ AWS | Amazon Nova | Pro | 2025-09-02 00:00 UTC | 2025-09-02 00:00 UTC | ✅ | N/A | $0.60 / $2.40 | ❌ | [ 🔗] ( https://docs.aws.amazon.com/ai/responsible-ai/nova-micro-lite-pro/overview.html ) |
955+ | ☁️ AWS | Amazon Nova | Lite | 2025-09-02 00:00 UTC | 2025-09-02 00:00 UTC | ✅ | N/A | $0.06 / $0.24 | ❌ | [ 🔗] ( https://docs.aws.amazon.com/ai/responsible-ai/nova-micro-lite-pro/overview.html ) |
956+ | 🔬 Cohere | Command | R+ | 2024-08-30 00:00 UTC | 2024-08-30 00:00 UTC | ✅ | N/A | Pay-per-token | ❌ | [ 🔗] ( https://cohere.com/ ) |
@@ -931,6 +966,10 @@ Side-by-side comparisons of AI models sorted by various criteria.
931966| :---:| ---| ---| ---| :---:|
932967| 🧠 MiniMax | MiniMax M2.5 | 2026-02 | $0.30 / $1.20 | [ 🔗] ( https://platform.minimax.io/docs/guides/models-intro ) |
933968| 🇨🇳 Alibaba/Qwen | Qwen 3.5-Max | 2026-02 | Open-source release window | [ 🔗] ( https://qwenlm.github.io/ ) |
969+ | 🌐 Google DeepMind | Gemma 4 | 2026-04 | Apache 2.0 (E2B, E4B, 31B, 26B) | [ 🔗] ( https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/ ) |
970+ | 💻 Mistral AI | Mistral Small 3.1 | 2025-03 | Apache 2.0 (24B) | [ 🔗] ( https://mistral.ai/news/mistral-small-3-1 ) |
971+ | 💻 Mistral AI | Mistral Nemo | 2024-07 | Apache 2.0 (12B) | [ 🔗] ( https://mistral.ai/news/mistral-nemo ) |
972+ | ☁️ AWS | Amazon Nova | 2025-09 | Pro/Lite/Micro | [ 🔗] ( https://docs.aws.amazon.com/ai/responsible-ai/nova-micro-lite-pro/overview.html ) |
934973| 🌐 Google DeepMind | Gemini 3.1 Flash-Lite | 2026-03 | $0.25 / $1.50 | [ 🔗] ( https://deepmind.google/models/model-cards/gemini-3-1-flash-lite/ ) |
935974| 🌐 Google DeepMind | Gemini 3 Pro | 2026-01 | Tiered pricing | [ 🔗] ( https://deepmind.google/models/gemini/ ) |
936975| 🤖 OpenAI | GPT-5.4 family | 2026-03 | GPT-5.4, GPT-5.4 mini, GPT-5.4 nano | [ 🔗] ( https://openai.com/news/?display=list ) |
@@ -947,7 +986,9 @@ Side-by-side comparisons of AI models sorted by various criteria.
947986| 5 | ** Yi-Lightning** | $0.14 | $0.42 | Apache 2.0 |
948987| 6 | ** GPT-5.4 nano** | $0.20 | $1.25 | Proprietary |
949988| 7 | ** Gemini 3.1 Flash-Lite** | $0.25 | $1.50 | Proprietary |
950- | 8 | ** DeepSeek-V3.1** | $0.27 | $0.41 | MIT |
989+ | 8 | ** Amazon Nova Micro** | $0.04 | $0.12 | Proprietary |
990+ | 9 | ** Amazon Nova Lite** | $0.06 | $0.24 | Proprietary |
991+ | 10 | ** DeepSeek-V3.1** | $0.27 | $0.41 | MIT |
951- | 9 | ** Gemini 3 Flash** | $0.30 | $2.50 | Proprietary |
952- | 10 | ** MiniMax-M2.5** | $0.30 | $1.20 | Proprietary |
992+ | 11 | ** Gemini 3 Flash** | $0.30 | $2.50 | Proprietary |
993+ | 12 | ** MiniMax-M2.5** | $0.30 | $1.20 | Proprietary |
953994
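The prices in this table are quoted in USD per million input / output tokens, so a per-request cost works out as below (a minimal sketch; the Amazon Nova Micro rates from the table are used as the worked example, and the token counts are illustrative):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     price_in: float, price_out: float) -> float:
    """Cost of one request, given per-million-token input/output prices."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# 10K input + 1K output tokens at Amazon Nova Micro rates ($0.04 / $0.12):
cost = request_cost_usd(10_000, 1_000, 0.04, 0.12)
print(f"${cost:.5f}")  # → $0.00052
```

Output tokens usually cost several times more than input tokens, so for long-generation workloads the second column of the table dominates.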
@@ -967,8 +1008,11 @@ Side-by-side comparisons of AI models sorted by various criteria.
9671008| ------| -------| ---------| ----------|
9681009| 1 | ** Gemini 3 Flash** | 10M | Entire libraries |
9691010| 2 | ** Llama 4 Scout** | 10M | Long-document RAG |
970- | 3 | ** Gemini 3 Pro** | 1M+ | Research papers |
971- | 4 | ** Kimi K2.5** | 256K | Large codebases |
1011+ | 3 | ** Gemini 3 Pro** | 1M+ | Research papers |
1012+ | 4 | ** Gemma 4 31B Dense** | 256K | Large context apps |
1013+ | 5 | ** Gemma 4 26B A4B** | 256K | Efficient large context |
1014+ | 6 | ** Kimi K2.5** | 256K | Large codebases |
1015+ | 7 | ** Mistral Small 3.1** | 128K | Compact on-device |
9721016
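A quick way to check a real workload against the context figures above is the common heuristic of roughly 4 characters per token for English text (a rough sketch only; actual token counts vary by tokenizer and language):

```python
def estimated_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4-chars-per-token rule of thumb."""
    return int(len(text) / chars_per_token)

def fits_in_context(text: str, context_tokens: int) -> bool:
    """Whether a document's rough token estimate fits a context window."""
    return estimated_tokens(text) <= context_tokens

doc = "x" * 2_000_000  # stand-in for a ~2M-character document dump
print(estimated_tokens(doc))             # → 500000
print(fits_in_context(doc, 128_000))     # → False: overflows a 128K window
print(fits_in_context(doc, 10_000_000))  # → True: fits a 10M window
```

For anything near the limit, count tokens with the provider's own tokenizer before committing to a model tier.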
9731017### Data Sources 📚
9741018
@@ -1003,6 +1047,12 @@ Attribution, verification sources, and methodology.
10031047| ** Moonshot AI** | Developer Documentation | [ platform.moonshot.ai] ( https://platform.moonshot.ai/docs/overview ) |
10041048| ** Moonshot AI** | Models & Pricing | [ platform.moonshot.ai] ( https://platform.moonshot.ai/docs/pricing/chat ) |
10051049| ** Cohere** | Developer Documentation | [ docs.cohere.com] ( https://docs.cohere.com ) |
1050+ | ** Cohere** | Command R+ Model Card | [ cohere.com] ( https://cohere.com/models/command ) |
1051+ | ** AWS** | Amazon Nova Service Cards | [ docs.aws.amazon.com] ( https://docs.aws.amazon.com/ai/responsible-ai/nova-micro-lite-pro/overview.html ) |
1052+ | ** Mistral AI** | Mistral Small 3.1 Release | [ mistral.ai] ( https://mistral.ai/news/mistral-small-3-1 ) |
1053+ | ** Mistral AI** | Mistral Nemo Release | [ mistral.ai] ( https://mistral.ai/news/mistral-nemo ) |
1054+ | ** Google DeepMind** | Gemma 4 Release | [ blog.google] ( https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/ ) |
1055+ | ** Google DeepMind** | Gemma 4 on Hugging Face | [ huggingface.co/blog/gemma4] ( https://huggingface.co/blog/gemma4 ) |
10061056| ** AI21 Labs** | Developer Documentation | [ docs.ai21.com] ( https://docs.ai21.com/docs/jamba-foundation-models ) |
10071057| ** Perplexity** | Developer Documentation | [ docs.perplexity.ai] ( https://docs.perplexity.ai/docs/getting-started/pricing ) |
10081058| ** ByteDance (Volcengine)** | Developer Documentation | [ volcengine.com] ( https://www.volcengine.com/docs/82379/1263482 ) |