Humanity's Last Exam

Difficult broad knowledge and reasoning benchmark score.

Humanity's Last Exam is a broad expert benchmark from CAIS and Scale AI. The public project describes 2,500 difficult questions across many subjects, with closed-ended answers for automatic grading and held-out questions to monitor overfitting.

Test type: Closed-ended expert reasoning and knowledge benchmark with automatic grading.

44.7%

Current leader: Gemini 3.1 Pro Preview

Project links

This app ranks the HLE score exposed by the Artificial Analysis snapshot.

Official website GitHub

Leaderboard

Rank	Model	Creator	Value	Speed	Blended Price
#1	Gemini 3.1 Pro Preview	Google	44.7%	131.2 tok/s	$4.50/M
#2

Humanity's Last Exam

Coverage

Top HLE Models

Leaderboard

Humanity's Last Exam

Coverage

Top HLE Models

Leaderboard