Cost leaderboard ranked using input price, output price, blended price, and value index.
The domain score averages each model's relative percentile across the included metrics.
| Rank | Model | Creator | Domain Score | Speed | Blended Price |
|---|---|---|---|---|---|
| #1 | Qwen3.5 0.8B (Non-reasoning) | Alibaba | 99.8 | 88 tok/s | $0.020/M |
| #2 | | Alibaba | 99.6 | n/a | $0.020/M |
| #3 | Qwen3.5 2B (Non-reasoning) | Alibaba | 98.7 | 318.9 tok/s | $0.040/M |
| #4 | Gemma 3n E4B Instruct | Google | 98.7 | 50 tok/s | $0.025/M |
| #5 | Qwen3.5 2B (Reasoning) | Alibaba | 98.6 | n/a | $0.040/M |
| #6 | Sarvam 30B (high) | Sarvam | 97.7 | 147.9 tok/s | $0.047/M |
| #7 | Qwen3.5 4B (Non-reasoning) | Alibaba | 97.4 | 208.9 tok/s | $0.060/M |
| #8 | Qwen3.5 4B (Reasoning) | Alibaba | 97.3 | 195.8 tok/s | $0.060/M |
| #9 | LFM2 24B A2B | Liquid AI | 96.5 | 118.3 tok/s | $0.052/M |
| #10 | Granite 4.1 8B | IBM | 95.7 | 134.2 tok/s | $0.063/M |
| #11 | Nova Micro | Amazon | 95.5 | 302 tok/s | $0.061/M |
| #12 | Gemma 3 4B Instruct | Google | 95.5 | n/a | $0.050/M |
| #13 | NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | 95.3 | 118 tok/s | $0.070/M |
| #14 | Sarvam 105B (high) | Sarvam | 95.3 | 100.7 tok/s | $0.074/M |
| #15 | gpt-oss-20B (high) | OpenAI | 95.1 | 272.3 tok/s | $0.088/M |
| #16 | Llama 3.2 Instruct 1B | Meta | 94.9 | 92.9 tok/s | $0.050/M |
| #17 | gpt-oss-20B (low) | OpenAI | 93.5 | 274.1 tok/s | $0.095/M |
| #18 | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | 93.3 | 133.6 tok/s | $0.096/M |
| #19 | NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | 93.1 | 133.6 tok/s | $0.086/M |
| #20 | NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | NVIDIA | 92.6 | 87.3 tok/s | $0.088/M |
| #21 | Qwen3.5 9B (Reasoning) | Alibaba | 92.5 | 69.4 tok/s | $0.113/M |
| #22 | Qwen2.5 Turbo | Alibaba | 92.0 | 66.4 tok/s | $0.088/M |
| #23 | Llama 3 Instruct 8B | Meta | 91.9 | 88.3 tok/s | $0.070/M |
| #24 | Llama 3.1 Instruct 8B | Meta | 91.3 | 201.5 tok/s | $0.100/M |
| #25 | Mistral Small 3 | Mistral | 91.2 | 153.7 tok/s | $0.104/M |
| #26 | Ministral 3 3B | Mistral | 90.6 | 155.6 tok/s | $0.100/M |
| #27 | Nova Lite | Amazon | 90.3 | 191.7 tok/s | $0.105/M |
| #28 | Granite 3.3 8B (Non-reasoning) | IBM | 89.9 | 400.9 tok/s | $0.085/M |
| #29 | GPT-5 nano (high) | OpenAI | 89.8 | 150.4 tok/s | $0.138/M |
| #30 | Llama 2 Chat 7B | Meta | 89.6 | 100.6 tok/s | $0.100/M |
| #31 | GPT-5 nano (medium) | OpenAI | 89.5 | 167 tok/s | $0.138/M |
| #32 | Nemotron 3 Nano Omni 30B A3B Reasoning | NVIDIA | 89.3 | 276.7 tok/s | $0.131/M |
| #33 | MiMo-V2-Flash (Feb 2026) | Xiaomi | 89.3 | 124.9 tok/s | $0.150/M |
| #34 | Granite 4.0 H Small | IBM | 89.2 | 454.2 tok/s | $0.107/M |
| #35 | Mistral Small 3.2 | Mistral | 89.0 | 127.1 tok/s | $0.128/M |
| #36 | MiMo-V2-Flash (Reasoning) | Xiaomi | 88.6 | 129.5 tok/s | $0.150/M |
| #37 | Ling 2.6 Flash | InclusionAI | 88.3 | n/a | $0.150/M |
| #38 | MiMo-V2-Flash (Non-reasoning) | Xiaomi | 88.1 | 122.8 tok/s | $0.150/M |
| #39 | GLM-4.7-Flash (Reasoning) | Z AI | 87.8 | 94.1 tok/s | $0.153/M |
| #40 | Step 3.5 Flash 2603 | StepFun | 87.6 | 231 tok/s | $0.150/M |
| #41 | Step 3.5 Flash | StepFun | 87.6 | 217.5 tok/s | $0.150/M |
| #42 | GLM-4.7-Flash (Non-reasoning) | Z AI | 87.2 | 105.6 tok/s | $0.153/M |
| #43 | DeepSeek V4 Flash (Reasoning, High Effort) | DeepSeek | 87.1 | n/a | $0.175/M |
| #44 | GPT-5 nano (minimal) | OpenAI | 87.1 | 153.9 tok/s | $0.138/M |
| #45 | Qwen3 30B A3B (Non-reasoning) | Alibaba | 87.0 | 68.6 tok/s | $0.133/M |
| #46 | DeepSeek V4 Flash (Reasoning, Max Effort) | DeepSeek | 87.0 | 107.8 tok/s | $0.175/M |
| #47 | Mistral Small 3.1 | Mistral | 87.0 | 158.2 tok/s | $0.138/M |
| #48 | Devstral Small (Jul '25) | Mistral | 87.0 | 42.3 tok/s | $0.150/M |
| #49 | DeepSeek V4 Flash (Non-reasoning) | DeepSeek | 86.6 | 120.2 tok/s | $0.175/M |
| #50 | MiMo-V2.5 | Xiaomi | 86.4 | 77.4 tok/s | $0.175/M |
| #51 | Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | Google | 85.4 | n/a | $0.175/M |
| #52 | Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | Google | 85.2 | n/a | $0.175/M |
| #53 | Ministral 3 8B | Mistral | 85.2 | 119.9 tok/s | $0.150/M |
| #54 | Gemini 2.5 Flash-Lite (Reasoning) | Google | 84.9 | 265.2 tok/s | $0.175/M |
| #55 | Olmo 3 7B Instruct | Allen Institute for AI | 84.8 | n/a | $0.125/M |
| #56 | Apertus 8B Instruct | Swiss AI Initiative | 84.3 | n/a | $0.125/M |
| #57 | Gemma 3 12B Instruct | Google | 83.9 | n/a | $0.140/M |
| #58 | Gemma 3 27B Instruct | Google | 83.7 | n/a | $0.145/M |
| #59 | Gemma 4 26B A4B (Reasoning) | Google | 83.4 | n/a | $0.198/M |
| #60 | Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | 83.3 | 44.2 tok/s | $0.175/M |
| #61 | Gemma 4 26B A4B (Non-reasoning) | Google | 83.2 | 82.3 tok/s | $0.198/M |
| #62 | Gemini 2.5 Flash-Lite (Non-reasoning) | Google | 82.7 | 229.5 tok/s | $0.175/M |
| #63 | Llama 3.2 Instruct 3B | Meta | 82.7 | 52.3 tok/s | $0.150/M |
| #64 | Hy3-preview (Reasoning) | Tencent | 82.6 | 96 tok/s | $0.200/M |
| #65 | Solar Mini | Upstage | 82.6 | 75.9 tok/s | $0.150/M |
| #66 | Gemma 4 31B (Non-reasoning) | Google | 82.4 | 56.9 tok/s | $0.205/M |
| #67 | Hy3-preview (Non-reasoning) | Tencent | 82.1 | 83.9 tok/s | $0.200/M |
| #68 | GPT-4.1 nano | OpenAI | 81.5 | 118.2 tok/s | $0.175/M |
| #69 | Llama Nemotron Super 49B v1.5 (Non-reasoning) | NVIDIA | 81.5 | 43.9 tok/s | $0.175/M |
| #70 | Qwen3 30B A3B (Reasoning) | Alibaba | 81.3 | 68.5 tok/s | $0.180/M |
| #71 | Ministral 3 14B | Mistral | 79.2 | 86 tok/s | $0.200/M |
| #72 | Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | 78.8 | 87.2 tok/s | $0.198/M |
| #73 | Qwen3 8B (Non-reasoning) | Alibaba | 78.6 | 64.4 tok/s | $0.185/M |
| #74 | gpt-oss-120b (high) | OpenAI | 78.1 | 358.8 tok/s | $0.262/M |
| #75 | Qwen3 4B (Non-reasoning) | Alibaba | 77.7 | n/a | $0.188/M |
| #76 | Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | 77.6 | 84.9 tok/s | $0.198/M |
| #77 | Grok 4 Fast (Reasoning) | xAI | 77.1 | n/a | $0.275/M |
| #78 | Qwen3.5 Omni Flash | Alibaba | 76.7 | 224.4 tok/s | $0.275/M |
| #79 | Mistral Small 4 (Reasoning) | Mistral | 76.3 | 183.5 tok/s | $0.262/M |
| #80 | Qwen3 30B A3B 2507 Instruct | Alibaba | 75.9 | 105.2 tok/s | $0.213/M |
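
The domain score in the table is described as an average of relative percentiles across the included metrics. The exact percentile definition, metric weights, and tie handling are not stated here, so the following is only a minimal sketch under assumptions: every metric is weighted equally, a model's relative percentile is the share of other models it beats or ties, and lower is better for price metrics. The function names, the three models, and all of their values are hypothetical and are not taken from the leaderboard.

```python
from typing import Dict, List


def relative_percentile(values: List[float], higher_is_better: bool) -> List[float]:
    """For each value, return the share (0-100) of the other entries it beats or ties.

    Assumption: this "beats or ties" definition is illustrative; the leaderboard
    does not state how its relative percentile is computed.
    """
    n = len(values)
    if n < 2:
        return [100.0] * n
    result = []
    for i, v in enumerate(values):
        if higher_is_better:
            beaten = sum(1 for j, o in enumerate(values) if j != i and v >= o)
        else:
            beaten = sum(1 for j, o in enumerate(values) if j != i and v <= o)
        result.append(100.0 * beaten / (n - 1))
    return result


def domain_scores(
    models: List[str],
    metrics: Dict[str, List[float]],
    higher_is_better: Dict[str, bool],
) -> Dict[str, float]:
    """Average each model's relative percentile over all metrics (equal weights assumed)."""
    per_metric = [
        relative_percentile(values, higher_is_better[name])
        for name, values in metrics.items()
    ]
    return {
        model: sum(column[i] for column in per_metric) / len(per_metric)
        for i, model in enumerate(models)
    }


# Hypothetical inputs: three made-up models, two made-up metrics.
models = ["model-a", "model-b", "model-c"]
metrics = {
    "blended_price": [0.02, 0.04, 0.10],  # USD per million tokens, lower is better
    "value_index": [90.0, 80.0, 20.0],    # higher is better
}
directions = {"blended_price": False, "value_index": True}

scores = domain_scores(models, metrics, directions)
print(scores)  # {'model-a': 100.0, 'model-b': 50.0, 'model-c': 0.0}
```

Under these assumptions a model that leads on every included metric scores 100, and one that trails on every metric scores 0. A different percentile definition or unequal metric weights would shift the values in between.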