OpenAI
o3-mini (high) is one of OpenAI's reasoning-focused models, built for harder multi-step tasks where deliberate problem solving matters more than simple chat completion. The benchmark snapshot shows how that reasoning emphasis translates into scores, latency, and value versus general-purpose models: strong math results (98.5% on MATH-500, 86.0% on AIME) alongside a slow time to first token (22.53s) and low cost rankings.
Introducing o3 and o4-mini
Queryable facts extracted from the upstream model payload.
MATH-500: Rank #10 across 201 models.
AIME: Rank #14 across 194 models.
LiveCodeBench: Rank #55 across 343 models.
Time to First Token: Rank #277 across 293 models.
Input Price: Rank #222 across 325 models.
Value Index: Rank #219 across 323 models.
Percentile score by analysis domain.
* Cost is inverted: lower input, output, and blended prices rank higher.
Higher bars mean stronger relative placement.
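The page does not say how a rank is turned into a percentile score, so the following is only a sketch under an assumed convention: rank 1 maps to 100, the last rank maps to 0, and everything in between is linear. The function name `rank_percentile` is hypothetical. Because the page already ranks cheaper prices higher, cost ranks need no extra inversion under this assumption.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank among `total` models to a 0-100 percentile.

    Assumed convention (not stated in the source): rank 1 -> 100.0,
    rank == total -> 0.0, linear in between.
    """
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must satisfy 1 <= rank <= total")
    if total == 1:
        return 100.0  # a single model is trivially at the top
    return 100.0 * (total - rank) / (total - 1)


# Rank/total pairs taken from the rank lines above.
math_500 = rank_percentile(10, 201)
time_to_first_token = rank_percentile(277, 293)
print(f"rank 10 of 201:  {math_500:.1f}")
print(f"rank 277 of 293: {time_to_first_token:.1f}")
```

Under this convention a rank of #10 across 201 models sits near the top of the scale, while #277 across 293 sits near the bottom, which matches the "higher bars mean stronger relative placement" reading.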
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 25.2 | #187 |
| Artificial Analysis Coding Index | coding | 17.3 | #218 |
| MMLU-Pro | reasoning | 80.2% | #112 |
| GPQA | reasoning | 77.3% | #129 |
| Humanity's Last Exam | reasoning | 12.3% | #120 |
| LiveCodeBench | coding | 73.4% | #55 |
| SciCode | coding, reasoning | 39.8% | #102 |
| MATH-500 | math | 98.5% | #10 |
| AIME | math | 86.0% | #14 |
| Output Speed | speed | 140 tok/s | #65 |
| Time to First Token | speed | 22.53s | #277 |
| Blended Price | cost | $1.93/M | #220 |
| Input Price | cost | $1.10/M | #222 |
| Output Price | cost | $4.40/M | #220 |
| Value Index | cost, overall | 13.1 | #219 |
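The table does not state how the blended price is derived from the input and output prices. The listed figures are consistent with a 3:1 input-to-output token weighting, which is an assumption here rather than something the source confirms: (3 × 1.10 + 1 × 4.40) / 4 = 1.925, which rounds to the listed $1.93/M. A minimal sketch of that arithmetic, with the hypothetical helper `blended_price`:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of per-million-token prices.

    The default 3:1 input:output weighting is an assumption chosen because
    it reproduces the table's blended figure; the source does not state it.
    """
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices in USD per million tokens, taken from the table.
blend = blended_price(1.10, 4.40)
print(f"blended price: ${blend:.3f}/M")
```

A workload that generates far more output than input would pay closer to the $4.40/M output price, so the blended figure is best read as a comparison aid rather than a cost estimate for any particular workload.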