Easy Benchmarks

Speed

Runtime leaderboard using output speed and latency.

Leader

The domain score averages each model's relative percentile across the included metrics.
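One way to read that definition is sketched below. The page does not publish its formula, so everything here is an assumption made for illustration: the two metrics are taken to be output speed (higher is better) and time to first token (lower is better), a model's percentile on a metric is the share of other models it strictly beats, and the helper names `percentile_ranks` and `domain_scores` and the sample values are hypothetical.

```python
from statistics import mean


def percentile_ranks(values, higher_is_better=True):
    """Return a 0-100 percentile for each value: the share of *other*
    entries it strictly beats. This ranking rule is an assumption, not
    the leaderboard's documented method."""
    n = len(values)
    if n < 2:
        return [100.0] * n
    ranks = []
    for v in values:
        if higher_is_better:
            worse = sum(1 for other in values if other < v)
        else:
            worse = sum(1 for other in values if other > v)
        ranks.append(100.0 * worse / (n - 1))
    return ranks


def domain_scores(models):
    """Average the per-metric percentiles into one score per model."""
    speed = percentile_ranks([m["speed_tok_s"] for m in models], higher_is_better=True)
    ttft = percentile_ranks([m["ttft_s"] for m in models], higher_is_better=False)
    return {m["name"]: mean(pair) for m, pair in zip(models, zip(speed, ttft))}


# Made-up sample data, not taken from the leaderboard.
models = [
    {"name": "model-a", "speed_tok_s": 300.0, "ttft_s": 0.4},
    {"name": "model-b", "speed_tok_s": 200.0, "ttft_s": 0.2},
    {"name": "model-c", "speed_tok_s": 100.0, "ttft_s": 0.8},
]

scores = domain_scores(models)
for name, score in scores.items():
    print(f"{name}: {score:.1f}")
# model-a: 75.0
# model-b: 75.0
# model-c: 0.0
```

Under these assumptions the model with the highest throughput does not automatically lead: `model-a` is fastest on output speed, but `model-b` has the lower time to first token, so the two tie on the averaged score.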

Qwen3.5 2B (Non-reasoning)

Domain score 99

Metrics: Speed, TTFT

Top Speed Models

Domain Leaderboard

Rank | Model | Creator | Domain Score | Speed | Blended Price
#1 | Qwen3.5 2B (Non-reasoning) | Alibaba | 98.8 | 318.9 tok/s | $0.040/M
#2 | Llama 3.1 Nemotron Instruct 70B | NVIDIA | 97.8 | 290.5 tok/s | $1.20/M
#3 | Command A+ | Cohere | 95.9 | 211.8 tok/s | -
#4 | Gemini 2.5 Flash-Lite (Non-reasoning) | Google | 94.9 | 229.5 tok/s | $0.175/M
#5 | Qwen3.5 4B (Non-reasoning) | Alibaba | 94.7 | 208.9 tok/s | $0.060/M
#6 | gpt-oss-20B (high) | OpenAI | 94.2 | 272.3 tok/s | $0.088/M
#7 | Qwen3.5 4B (Reasoning) | Alibaba | 94.0 | 195.8 tok/s | $0.060/M
#8 | gpt-oss-20B (low) | OpenAI | 93.4 | 274.1 tok/s | $0.095/M
#9 | gpt-oss-120b (low) | OpenAI | 92.7 | 373.3 tok/s | $0.262/M
#10 | gpt-oss-120b (high) | OpenAI | 91.0 | 358.8 tok/s | $0.262/M
#11 | Llama 3.1 Instruct 8B | Meta | 87.8 | 201.5 tok/s | $0.100/M
#12 | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | NVIDIA | 87.8 | 223.7 tok/s | $0.300/M
#13 | Nemotron 3 Nano Omni 30B A3B Reasoning | NVIDIA | 87.4 | 276.7 tok/s | $0.131/M
#14 | Ministral 3 3B | Mistral | 86.1 | 155.6 tok/s | $0.100/M
#15 | Mistral Small 4 (Non-reasoning) | Mistral | 85.7 | 171.9 tok/s | $0.262/M
#16 | Nova Micro | Amazon | 85.4 | 302 tok/s | $0.061/M
#17 | GPT-5.4 mini (Non-Reasoning) | OpenAI | 84.4 | 173.9 tok/s | $1.69/M
#18 | Mistral Small 3.1 | Mistral | 84.2 | 158.2 tok/s | $0.138/M
#19 | Step 3.7 Flash | StepFun | 84.0 | 385.5 tok/s | $0.438/M
#20 | Gemini 2.5 Flash (Non-reasoning) | Google | 83.8 | 185.1 tok/s | $0.850/M
#21 | Mistral Small (Feb '24) | Mistral | 83.7 | 157.3 tok/s | $1.50/M
#22 | Mistral Small (Sep '24) | Mistral | 83.0 | 151.5 tok/s | $0.300/M
#23 | Mistral Small 3 | Mistral | 82.8 | 153.7 tok/s | $0.104/M
#24 | GPT-5 (ChatGPT) | OpenAI | 82.5 | 167.3 tok/s | $3.44/M
#25 | Mistral Small 4 (Reasoning) | Mistral | 82.0 | 183.5 tok/s | $0.262/M
#26 | Grok 4.3 (Non-reasoning) | xAI | 81.8 | 173.6 tok/s | $1.56/M
#27 | Granite 4.1 8B | IBM | 81.3 | 134.2 tok/s | $0.063/M
#28 | Nova Lite | Amazon | 81.1 | 191.7 tok/s | $0.105/M
#29 | Tiny Aya Global | Cohere | 80.4 | 123.9 tok/s | -
#30 | Step 3.5 Flash 2603 | StepFun | 80.3 | 231 tok/s | $0.150/M
#31 | Mistral Small 3.2 | Mistral | 80.1 | 127.1 tok/s | $0.128/M
#32 | NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | 79.8 | 118 tok/s | $0.070/M
#33 | Step 3.5 Flash | StepFun | 79.1 | 217.5 tok/s | $0.150/M
#34 | LFM2 24B A2B | Liquid AI | 78.6 | 118.3 tok/s | $0.052/M
#35 | GPT-3.5 Turbo | OpenAI | 78.4 | 132.3 tok/s | $0.750/M
#36 | Jamba 1.6 Mini | AI21 Labs | 77.6 | 185.9 tok/s | $0.250/M
#37 | Ministral 3 8B | Mistral | 77.6 | 119.9 tok/s | $0.150/M
#38 | Gemini 3.5 Flash (minimal) | Google | 77.2 | 199.1 tok/s | $3.38/M
#39 | GPT-5.4 nano (Non-Reasoning) | OpenAI | 77.2 | 157.9 tok/s | $0.463/M
#40 | Mistral 7B Instruct | Mistral | 76.2 | 110.4 tok/s | $0.206/M
#41 | Nova 2.0 Lite (Non-reasoning) | Amazon | 75.7 | 202.2 tok/s | $0.850/M
#42 | Grok 4.20 0309 v2 (Non-reasoning) | xAI | 75.5 | 160.7 tok/s | $3.00/M
#43 | Magistral Small 1.2 | Mistral | 75.5 | 111.6 tok/s | $0.750/M
#44 | Trinity Large Thinking | Arcee AI | 75.3 | 162.3 tok/s | $0.395/M
#45 | Grok 4.20 0309 (Non-reasoning) | xAI | 75.3 | 158.9 tok/s | $3.00/M
#46 | GPT-5 nano (minimal) | OpenAI | 74.7 | 153.9 tok/s | $0.138/M
#47 | GPT-4o (Nov '24) | OpenAI | 74.3 | 142.1 tok/s | $4.38/M
#48 | Nova 2.0 Pro Preview (Non-reasoning) | Amazon | 74.3 | 159.7 tok/s | $3.44/M
#49 | GPT-4.1 nano | OpenAI | 74.1 | 118.2 tok/s | $0.175/M
#50 | Qwen3.5 0.8B (Non-reasoning) | Alibaba | 74.1 | 88 tok/s | $0.020/M
#51 | NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | NVIDIA | 73.5 | 87.3 tok/s | $0.088/M
#52 | Qwen3.5 Omni Flash | Alibaba | 73.5 | 224.4 tok/s | $0.275/M
#53 | Gemini 3 Flash Preview (Non-reasoning) | Google | 73.1 | 181.3 tok/s | $1.13/M
#54 | GPT-4.1 | OpenAI | 73.1 | 128.3 tok/s | $3.50/M
#55 | Claude 4.5 Haiku (Non-reasoning) | Anthropic | 72.3 | 127.2 tok/s | $2.19/M
#56 | Ministral 3 14B | Mistral | 71.1 | 86 tok/s | $0.200/M
#57 | Llama 3 Instruct 8B | Meta | 69.4 | 88.3 tok/s | $0.070/M
#58 | NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | 69.4 | 133.6 tok/s | $0.086/M
#59 | Llama 3.2 Instruct 11B (Vision) | Meta | 69.2 | 87.5 tok/s | $0.245/M
#60 | Llama 4 Maverick | Meta | 68.4 | 120.6 tok/s | $0.475/M
#61 | GPT-4o (May '24) | OpenAI | 68.0 | 92.6 tok/s | $7.50/M
#62 | Llama 4 Scout | Meta | 67.9 | 111.9 tok/s | $0.292/M
#63 | GPT-4o (Aug '24) | OpenAI | 67.2 | 94.1 tok/s | $4.38/M
#64 | Qwen3 VL 8B Instruct | Alibaba | 66.3 | 147.4 tok/s | $0.310/M
#65 | Command A | Cohere | 66.0 | 65.6 tok/s | $4.38/M
#66 | Kimi K2 Thinking | Kimi | 65.8 | 131.1 tok/s | $1.08/M
#67 | Llama 3.2 Instruct 1B | Meta | 65.5 | 92.9 tok/s | $0.050/M
#68 | Llama 3.3 Instruct 70B | Meta | 64.6 | 95 tok/s | $0.612/M
#69 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | NVIDIA | 64.5 | 149.6 tok/s | $0.412/M
#70 | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | 64.3 | 133.6 tok/s | $0.096/M
#71 | GPT-4.1 mini | OpenAI | 64.1 | 79.3 tok/s | $0.700/M
#72 | Qwen3 30B A3B 2507 (Reasoning) | Alibaba | 64.1 | 139.3 tok/s | $0.673/M
#73 | Mistral Medium 3.1 | Mistral | 63.4 | 70 tok/s | $0.800/M
#74 | GPT-5.1 (Non-reasoning) | OpenAI | 62.4 | 93.1 tok/s | $3.44/M
#75 | Cogito v2.1 (Reasoning) | Deep Cogito | 62.4 | 62.8 tok/s | $1.25/M
#76 | Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | 61.9 | 84.9 tok/s | $0.198/M
#77 | Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | 61.9 | 87.2 tok/s | $0.198/M
#78 | DeepSeek V4 Flash (Non-reasoning) | DeepSeek | 61.6 | 120.2 tok/s | $0.175/M
#79 | Qwen3 VL 30B A3B (Reasoning) | Alibaba | 61.6 | 126.8 tok/s | $0.338/M
#80 | Qwen3 Next 80B A3B (Reasoning) | Alibaba | 61.4 | 135.7 tok/s | $1.88/M