
Time to First Token

Median seconds until the first output token.

Time to First Token (TTFT) measures the perceived responsiveness of streamed responses. Artificial Analysis defines it as the elapsed time from request submission to the first received token; lower values rank better in this app.

Test type: Hosted endpoint latency measurement. Lower values mean the response starts sooner.
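The measurement can be sketched in a few lines: time how long the first iteration of a token stream blocks, repeat the request several times, and report the median. This is a minimal illustration, not Artificial Analysis's harness; `fake_stream` is a stand-in for a real streaming endpoint, with its 50 ms sleep playing the role of server-side latency.

```python
import statistics
import time


def time_to_first_token(stream):
    """Seconds from request start until the first token arrives."""
    start = time.perf_counter()
    for _token in stream:  # the first iteration blocks until the first token
        return time.perf_counter() - start
    return None  # stream produced no tokens


def median_ttft(make_stream, trials=5):
    """Median TTFT over repeated requests, matching the 'median seconds' definition."""
    samples = [time_to_first_token(make_stream()) for _ in range(trials)]
    return statistics.median(samples)


# Hypothetical streaming endpoint: ~50 ms of simulated latency before tokens flow.
def fake_stream():
    time.sleep(0.05)
    yield from ["Hello", ",", " world"]


print(f"median TTFT: {median_ttft(fake_stream):.3f}s")
```

Note that only the first `next()` on the stream is timed; total generation time is a separate metric (output speed, the tok/s column below).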

Coverage

293 models have this metric.

Current leader: NVIDIA Nemotron Nano 9B V2 (Reasoning) at 0.24s.

Project links

Values come from Artificial Analysis performance data in the committed snapshot.

Artificial Analysis methodology

Top TTFT Models

Top models ranked by TTFT, lowest first.

Leaderboard

| Rank | Model | Creator | TTFT | Output Speed | Blended Price |
| --- | --- | --- | --- | --- | --- |
| #1 | NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | 0.24s | 121.6 tok/s | $0.070/M |
| #2 | NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | NVIDIA | 0.26s | 125 tok/s | $0.300/M |
| #3 | Ministral 3 3B | Mistral | 0.30s | 287.6 tok/s | $0.100/M |
| #4 | Llama Nemotron Super 49B v1.5 (Non-reasoning) | NVIDIA | 0.30s | 51.3 tok/s | $0.175/M |
| #5 | LFM2 24B A2B | Liquid AI | 0.31s | 196.9 tok/s | $0.052/M |
| #6 | Gemma 3n E2B Instruct | Google | 0.31s | 58.7 tok/s | - |
| #7 | gpt-oss-20B (high) | OpenAI | 0.32s | 242.3 tok/s | $0.100/M |
| #8 | Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | 0.33s | 50.8 tok/s | $0.175/M |
| #9 | Ministral 3 8B | Mistral | 0.33s | 157.6 tok/s | $0.150/M |
| #10 | Gemma 3n E4B Instruct | Google | 0.34s | 15.3 tok/s | $0.025/M |
| #11 | Phi-4 Mini Instruct | Microsoft | 0.35s | 44.2 tok/s | - |
| #12 | Ministral 3 14B | Mistral | 0.35s | 121.6 tok/s | $0.200/M |
| #13 | Magistral Small 1.2 | Mistral | 0.35s | 100.3 tok/s | $0.750/M |
| #14 | Mistral 7B Instruct | Mistral | 0.36s | 156.9 tok/s | $0.250/M |
| #15 | Qwen3.5 9B (Reasoning) | Alibaba | 0.36s | 62.9 tok/s | $0.113/M |
| #16 | Phi-4 Multimodal Instruct | Microsoft | 0.37s | 16.7 tok/s | - |
| #17 | Mistral Small 3.2 | Mistral | 0.38s | 153.8 tok/s | $0.150/M |
| #18 | Devstral Small (Jul '25) | Mistral | 0.39s | 194.2 tok/s | $0.150/M |
| #19 | Olmo 3.1 32B Instruct | Allen Institute for AI | 0.39s | 49.3 tok/s | $0.300/M |
| #20 | Grok 3 mini Reasoning (high) | xAI | 0.41s | 215.5 tok/s | $0.350/M |
| #21 | gpt-oss-20B (low) | OpenAI | 0.42s | 249.7 tok/s | $0.108/M |
| #22 | Llama 3.1 Nemotron Instruct 70B | NVIDIA | 0.42s | 36.4 tok/s | $1.20/M |
| #23 | Grok 4.1 Fast (Non-reasoning) | xAI | 0.42s | 112.1 tok/s | $0.275/M |
| #24 | Mistral Medium 3 | Mistral | 0.42s | 56.8 tok/s | $0.800/M |
| #25 | Qwen3.5 0.8B (Non-reasoning) | Alibaba | 0.42s | 273.6 tok/s | $0.020/M |
| #26 | QwQ 32B | Alibaba | 0.43s | 30.4 tok/s | $0.745/M |
| #27 | Hermes 3 - Llama-3.1 70B | Nous Research | 0.44s | 28.8 tok/s | $0.300/M |
| #28 | Command A | Cohere | 0.44s | 50.7 tok/s | $4.38/M |
| #29 | GPT-3.5 Turbo | OpenAI | 0.45s | 92 tok/s | $0.750/M |
| #30 | Grok 4.20 0309 (Non-reasoning) | xAI | 0.45s | 77.1 tok/s | $3.00/M |
| #31 | Grok 4 Fast (Non-reasoning) | xAI | 0.45s | 77.4 tok/s | $0.275/M |
| #32 | Llama 3.1 Instruct 8B | Meta | 0.45s | 164.4 tok/s | $0.100/M |
| #33 | DeepSeek R1 Distill Llama 70B | DeepSeek | 0.46s | 44 tok/s | $0.875/M |
| #34 | GPT-4.1 nano | OpenAI | 0.46s | 125.2 tok/s | $0.175/M |
| #35 | Qwen3.5 4B (Non-reasoning) | Alibaba | 0.46s | 200.2 tok/s | $0.060/M |
| #36 | Gemini 2.5 Flash-Lite (Non-reasoning) | Google | 0.46s | 239.9 tok/s | $0.175/M |
| #37 | Llama 3.2 Instruct 11B (Vision) | Meta | 0.46s | 77.4 tok/s | $0.245/M |
| #38 | Granite 4.1 8B | IBM | 0.47s | 134.6 tok/s | $0.063/M |
| #39 | Qwen3.5 4B (Reasoning) | Alibaba | 0.47s | 204.8 tok/s | $0.060/M |
| #40 | GPT-4o (Nov '24) | OpenAI | 0.47s | 107.3 tok/s | $4.38/M |
| #41 | Magistral Medium 1.2 | Mistral | 0.48s | 42 tok/s | $2.75/M |
| #42 | gpt-oss-120B (high) | OpenAI | 0.49s | 212.3 tok/s | $0.263/M |
| #43 | Grok 3 | xAI | 0.50s | 52.2 tok/s | $6.00/M |
| #44 | Devstral Medium | Mistral | 0.50s | 71.4 tok/s | $0.800/M |
| #45 | Grok 4.20 0309 v2 (Non-reasoning) | xAI | 0.50s | 86.6 tok/s | $3.00/M |
| #46 | Llama 3 Instruct 8B | Meta | 0.50s | 82.2 tok/s | $0.070/M |
| #47 | gpt-oss-120B (low) | OpenAI | 0.50s | 216.3 tok/s | $0.263/M |
| #48 | Phi-4 | Microsoft | 0.51s | 41.6 tok/s | $0.219/M |
| #49 | Qwen3.5 2B (Non-reasoning) | Alibaba | 0.52s | 227 tok/s | $0.040/M |
| #50 | Cogito v2.1 (Reasoning) | Deep Cogito | 0.52s | 51.1 tok/s | $1.25/M |
| #51 | Mistral Small 3.1 | Mistral | 0.52s | 138.8 tok/s | $0.150/M |
| #52 | GPT-4o mini | OpenAI | 0.52s | 59.9 tok/s | $0.263/M |
| #53 | Llama 4 Scout | Meta | 0.52s | 109.2 tok/s | $0.292/M |
| #54 | Gemini 2.5 Flash (Non-reasoning) | Google | 0.53s | 189.1 tok/s | $0.850/M |
| #55 | GPT-4.1 mini | OpenAI | 0.53s | 78.4 tok/s | $0.700/M |
| #56 | Devstral Small 2 | Mistral | 0.53s | 72.5 tok/s | - |
| #57 | Mistral Small 4 (Non-reasoning) | Mistral | 0.53s | 139.5 tok/s | $0.263/M |
| #58 | Llama 3.2 Instruct 90B (Vision) | Meta | 0.54s | 45.8 tok/s | $1.38/M |
| #59 | Pixtral Large | Mistral | 0.54s | 55.5 tok/s | $3.00/M |
| #60 | Claude 4.5 Haiku (Non-reasoning) | Anthropic | 0.57s | 99.9 tok/s | $2.00/M |
| #61 | Trinity Large Thinking | Arcee AI | 0.57s | 124.6 tok/s | $0.395/M |
| #62 | GPT-5.4 (Non-reasoning) | OpenAI | 0.59s | 57.2 tok/s | $5.63/M |
| #63 | Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | 0.59s | 83.5 tok/s | $0.198/M |
| #64 | Mistral Small (Sep '24) | Mistral | 0.59s | 135 tok/s | $0.300/M |
| #65 | Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | 0.59s | 78.6 tok/s | $0.198/M |
| #66 | Mistral Medium 3.1 | Mistral | 0.59s | 56.3 tok/s | $0.800/M |
| #67 | GPT-5.4 mini (Non-Reasoning) | OpenAI | 0.60s | 152.7 tok/s | $1.69/M |
| #68 | Llama 3.1 Instruct 70B | Meta | 0.60s | 32.2 tok/s | $0.560/M |
| #69 | Llama 3.3 Instruct 70B | Meta | 0.60s | 90.1 tok/s | $0.675/M |
| #70 | GPT-5.4 nano (Non-Reasoning) | OpenAI | 0.61s | 148.5 tok/s | $0.463/M |
| #71 | Mistral Small (Feb '24) | Mistral | 0.61s | 130.6 tok/s | $1.50/M |
| #72 | Mistral Large 3 | Mistral | 0.61s | 53.3 tok/s | $0.750/M |
| #73 | Nova Lite | Amazon | 0.63s | 186.8 tok/s | $0.105/M |
| #74 | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | NVIDIA | 0.63s | 163.1 tok/s | $0.300/M |
| #75 | GPT-4o (May '24) | OpenAI | 0.64s | 91.6 tok/s | $7.50/M |
| #76 | Mistral Small 4 (Reasoning) | Mistral | 0.64s | 149.5 tok/s | $0.263/M |
| #77 | Nova Micro | Amazon | 0.64s | 332.1 tok/s | $0.061/M |
| #78 | GLM-5.1 (Reasoning) | Z AI | 0.65s | 45.7 tok/s | $2.15/M |
| #79 | Llama 3.2 Instruct 1B | Meta | 0.65s | 97.7 tok/s | $0.100/M |
| #80 | Llama 3.2 Instruct 3B | Meta | 0.66s | 52.2 tok/s | $0.150/M |
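The ranking rule is simple enough to sketch: sort ascending by TTFT and number from 1, since lower values rank better here. The `rows` below are a hand-copied subset of the leaderboard, used purely for illustration.

```python
# (model, ttft_seconds) pairs taken from the leaderboard, in arbitrary order.
rows = [
    ("Ministral 3 3B", 0.30),
    ("NVIDIA Nemotron Nano 9B V2 (Reasoning)", 0.24),
    ("gpt-oss-20B (high)", 0.32),
    ("NVIDIA Nemotron Nano 12B v2 VL (Reasoning)", 0.26),
]

# Lower TTFT ranks better, so sort ascending and assign ranks from #1.
leaderboard = sorted(rows, key=lambda r: r[1])
for rank, (model, ttft) in enumerate(leaderboard, start=1):
    print(f"#{rank} {model}: {ttft:.2f}s")
```

Ties (several models share the same rounded TTFT in the full table, e.g. four models at 0.42s) are presumably broken by the app in some fixed order; the snapshot does not say how.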