Median seconds until the first output token.
Time to First Token (TTFT) measures the perceived responsiveness of streamed responses. Artificial Analysis defines it as the time from request submission to the first received token; this app ranks models ascending, since lower values mean the response starts sooner.
Test type: hosted endpoint latency measurement.
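The measurement above can be sketched in a few lines. This is a minimal illustration, not Artificial Analysis's actual harness: `simulated_stream` is a hypothetical stand-in for a hosted streaming endpoint, and the median over trials mirrors how the leaderboard value is defined.

```python
import statistics
import time


def time_to_first_token(stream):
    """Seconds from request start until the first token arrives."""
    start = time.perf_counter()
    for _ in stream():
        return time.perf_counter() - start  # first token received
    return None  # stream produced no tokens


def simulated_stream(delay=0.05):
    """Stand-in for a hosted streaming endpoint (hypothetical)."""
    def run():
        time.sleep(delay)  # network + prefill latency before token 1
        yield "Hello"
        yield " world"
    return run


# The leaderboard value is the median over repeated trials.
trials = [time_to_first_token(simulated_stream()) for _ in range(5)]
ttft = statistics.median(trials)
print(f"median TTFT: {ttft:.3f}s")
```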
293 models have this metric.
Current leader: NVIDIA Nemotron Nano 9B V2 (Reasoning)
Values come from Artificial Analysis performance data in the committed snapshot.
Top models ranked by TTFT.
| Rank | Model | Creator | TTFT | Output Speed | Blended Price ($/M tokens) |
|---|---|---|---|---|---|
| #1 | NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | 0.24s | 121.6 tok/s | $0.070/M |
| #2 | - | NVIDIA | 0.26s | 125 tok/s | $0.300/M |
| #3 | Ministral 3 3B | Mistral | 0.30s | 287.6 tok/s | $0.100/M |
| #4 | Llama Nemotron Super 49B v1.5 (Non-reasoning) | NVIDIA | 0.30s | 51.3 tok/s | $0.175/M |
| #5 | LFM2 24B A2B | Liquid AI | 0.31s | 196.9 tok/s | $0.052/M |
| #6 | Gemma 3n E2B Instruct | Google | 0.31s | 58.7 tok/s | - |
| #7 | gpt-oss-20B (high) | OpenAI | 0.32s | 242.3 tok/s | $0.100/M |
| #8 | Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | 0.33s | 50.8 tok/s | $0.175/M |
| #9 | Ministral 3 8B | Mistral | 0.33s | 157.6 tok/s | $0.150/M |
| #10 | Gemma 3n E4B Instruct | Google | 0.34s | 15.3 tok/s | $0.025/M |
| #11 | Phi-4 Mini Instruct | Microsoft | 0.35s | 44.2 tok/s | - |
| #12 | Ministral 3 14B | Mistral | 0.35s | 121.6 tok/s | $0.200/M |
| #13 | Magistral Small 1.2 | Mistral | 0.35s | 100.3 tok/s | $0.750/M |
| #14 | Mistral 7B Instruct | Mistral | 0.36s | 156.9 tok/s | $0.250/M |
| #15 | Qwen3.5 9B (Reasoning) | Alibaba | 0.36s | 62.9 tok/s | $0.113/M |
| #16 | Phi-4 Multimodal Instruct | Microsoft | 0.37s | 16.7 tok/s | - |
| #17 | Mistral Small 3.2 | Mistral | 0.38s | 153.8 tok/s | $0.150/M |
| #18 | Devstral Small (Jul '25) | Mistral | 0.39s | 194.2 tok/s | $0.150/M |
| #19 | Olmo 3.1 32B Instruct | Allen Institute for AI | 0.39s | 49.3 tok/s | $0.300/M |
| #20 | Grok 3 mini Reasoning (high) | xAI | 0.41s | 215.5 tok/s | $0.350/M |
| #21 | gpt-oss-20B (low) | OpenAI | 0.42s | 249.7 tok/s | $0.108/M |
| #22 | Llama 3.1 Nemotron Instruct 70B | NVIDIA | 0.42s | 36.4 tok/s | $1.20/M |
| #23 | Grok 4.1 Fast (Non-reasoning) | xAI | 0.42s | 112.1 tok/s | $0.275/M |
| #24 | Mistral Medium 3 | Mistral | 0.42s | 56.8 tok/s | $0.800/M |
| #25 | Qwen3.5 0.8B (Non-reasoning) | Alibaba | 0.42s | 273.6 tok/s | $0.020/M |
| #26 | QwQ 32B | Alibaba | 0.43s | 30.4 tok/s | $0.745/M |
| #27 | Hermes 3 - Llama-3.1 70B | Nous Research | 0.44s | 28.8 tok/s | $0.300/M |
| #28 | Command A | Cohere | 0.44s | 50.7 tok/s | $4.38/M |
| #29 | GPT-3.5 Turbo | OpenAI | 0.45s | 92 tok/s | $0.750/M |
| #30 | Grok 4.20 0309 (Non-reasoning) | xAI | 0.45s | 77.1 tok/s | $3.00/M |
| #31 | Grok 4 Fast (Non-reasoning) | xAI | 0.45s | 77.4 tok/s | $0.275/M |
| #32 | Llama 3.1 Instruct 8B | Meta | 0.45s | 164.4 tok/s | $0.100/M |
| #33 | DeepSeek R1 Distill Llama 70B | DeepSeek | 0.46s | 44 tok/s | $0.875/M |
| #34 | GPT-4.1 nano | OpenAI | 0.46s | 125.2 tok/s | $0.175/M |
| #35 | Qwen3.5 4B (Non-reasoning) | Alibaba | 0.46s | 200.2 tok/s | $0.060/M |
| #36 | Gemini 2.5 Flash-Lite (Non-reasoning) | Google | 0.46s | 239.9 tok/s | $0.175/M |
| #37 | Llama 3.2 Instruct 11B (Vision) | Meta | 0.46s | 77.4 tok/s | $0.245/M |
| #38 | Granite 4.1 8B | IBM | 0.47s | 134.6 tok/s | $0.063/M |
| #39 | Qwen3.5 4B (Reasoning) | Alibaba | 0.47s | 204.8 tok/s | $0.060/M |
| #40 | GPT-4o (Nov '24) | OpenAI | 0.47s | 107.3 tok/s | $4.38/M |
| #41 | Magistral Medium 1.2 | Mistral | 0.48s | 42 tok/s | $2.75/M |
| #42 | gpt-oss-120B (high) | OpenAI | 0.49s | 212.3 tok/s | $0.263/M |
| #43 | Grok 3 | xAI | 0.50s | 52.2 tok/s | $6.00/M |
| #44 | Devstral Medium | Mistral | 0.50s | 71.4 tok/s | $0.800/M |
| #45 | Grok 4.20 0309 v2 (Non-reasoning) | xAI | 0.50s | 86.6 tok/s | $3.00/M |
| #46 | Llama 3 Instruct 8B | Meta | 0.50s | 82.2 tok/s | $0.070/M |
| #47 | gpt-oss-120B (low) | OpenAI | 0.50s | 216.3 tok/s | $0.263/M |
| #48 | Phi-4 | Microsoft | 0.51s | 41.6 tok/s | $0.219/M |
| #49 | Qwen3.5 2B (Non-reasoning) | Alibaba | 0.52s | 227 tok/s | $0.040/M |
| #50 | Cogito v2.1 (Reasoning) | Deep Cogito | 0.52s | 51.1 tok/s | $1.25/M |
| #51 | Mistral Small 3.1 | Mistral | 0.52s | 138.8 tok/s | $0.150/M |
| #52 | GPT-4o mini | OpenAI | 0.52s | 59.9 tok/s | $0.263/M |
| #53 | Llama 4 Scout | Meta | 0.52s | 109.2 tok/s | $0.292/M |
| #54 | Gemini 2.5 Flash (Non-reasoning) | Google | 0.53s | 189.1 tok/s | $0.850/M |
| #55 | GPT-4.1 mini | OpenAI | 0.53s | 78.4 tok/s | $0.700/M |
| #56 | Devstral Small 2 | Mistral | 0.53s | 72.5 tok/s | - |
| #57 | Mistral Small 4 (Non-reasoning) | Mistral | 0.53s | 139.5 tok/s | $0.263/M |
| #58 | Llama 3.2 Instruct 90B (Vision) | Meta | 0.54s | 45.8 tok/s | $1.38/M |
| #59 | Pixtral Large | Mistral | 0.54s | 55.5 tok/s | $3.00/M |
| #60 | Claude 4.5 Haiku (Non-reasoning) | Anthropic | 0.57s | 99.9 tok/s | $2.00/M |
| #61 | Trinity Large Thinking | Arcee AI | 0.57s | 124.6 tok/s | $0.395/M |
| #62 | GPT-5.4 (Non-reasoning) | OpenAI | 0.59s | 57.2 tok/s | $5.63/M |
| #63 | Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | 0.59s | 83.5 tok/s | $0.198/M |
| #64 | Mistral Small (Sep '24) | Mistral | 0.59s | 135 tok/s | $0.300/M |
| #65 | Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | 0.59s | 78.6 tok/s | $0.198/M |
| #66 | Mistral Medium 3.1 | Mistral | 0.59s | 56.3 tok/s | $0.800/M |
| #67 | GPT-5.4 mini (Non-Reasoning) | OpenAI | 0.60s | 152.7 tok/s | $1.69/M |
| #68 | Llama 3.1 Instruct 70B | Meta | 0.60s | 32.2 tok/s | $0.560/M |
| #69 | Llama 3.3 Instruct 70B | Meta | 0.60s | 90.1 tok/s | $0.675/M |
| #70 | GPT-5.4 nano (Non-Reasoning) | OpenAI | 0.61s | 148.5 tok/s | $0.463/M |
| #71 | Mistral Small (Feb '24) | Mistral | 0.61s | 130.6 tok/s | $1.50/M |
| #72 | Mistral Large 3 | Mistral | 0.61s | 53.3 tok/s | $0.750/M |
| #73 | Nova Lite | Amazon | 0.63s | 186.8 tok/s | $0.105/M |
| #74 | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | NVIDIA | 0.63s | 163.1 tok/s | $0.300/M |
| #75 | GPT-4o (May '24) | OpenAI | 0.64s | 91.6 tok/s | $7.50/M |
| #76 | Mistral Small 4 (Reasoning) | Mistral | 0.64s | 149.5 tok/s | $0.263/M |
| #77 | Nova Micro | Amazon | 0.64s | 332.1 tok/s | $0.061/M |
| #78 | GLM-5.1 (Reasoning) | Z AI | 0.65s | 45.7 tok/s | $2.15/M |
| #79 | Llama 3.2 Instruct 1B | Meta | 0.65s | 97.7 tok/s | $0.100/M |
| #80 | Llama 3.2 Instruct 3B | Meta | 0.66s | 52.2 tok/s | $0.150/M |
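A ranking like the table above can be derived from the snapshot with a simple ascending sort. The entries and field names below are illustrative stand-ins, not the committed snapshot's actual schema.

```python
# Illustrative stand-in for a few snapshot entries (values taken from
# the table above); the "ttft_s" field name is an assumption.
snapshot = [
    {"model": "Ministral 3 3B", "ttft_s": 0.30},
    {"model": "NVIDIA Nemotron Nano 9B V2 (Reasoning)", "ttft_s": 0.24},
    {"model": "LFM2 24B A2B", "ttft_s": 0.31},
]

# Lower TTFT ranks better, so sort ascending.
ranked = sorted(snapshot, key=lambda m: m["ttft_s"])
for rank, entry in enumerate(ranked, start=1):
    print(f"#{rank} {entry['model']}: {entry['ttft_s']:.2f}s")
```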