OpenAI
o3 is one of OpenAI's reasoning-focused models, built for harder multi-step tasks where deliberate problem solving matters more than simple chat completion. The benchmark snapshot highlights how that reasoning emphasis translates into scores, latency, and value versus general-purpose models.
Introducing o3 and o4-mini

Overall: Rank #143 across 526
Coding: Rank #67 across 436
Cost: Rank #297 across 357

Chart: Percentile score by analysis domain. Higher bars mean stronger relative placement. Cost is inverted: lower input, output, and blended prices rank higher.

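
To make the "rank across N" figures and the percentile chart easier to relate, here is a minimal sketch of one common way to turn a rank into a percentile. The page does not state its formula, so the linear mapping below (rank 1 maps to 100, last place maps to 0) is an assumption, and `rank_to_percentile` is a hypothetical helper name. The domain labels are inferred from the ranks that match rows in the table.

```python
def rank_to_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank among `total` models to a 0-100 percentile.

    Assumed convention (not confirmed by the page): rank 1 -> 100.0,
    rank == total -> 0.0, linear in between.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    if total == 1:
        return 100.0
    return 100.0 * (total - rank) / (total - 1)


# Rank figures quoted on the page; labels inferred from the matching table rows.
placements = {
    "overall": (143, 526),
    "coding": (67, 436),
    "cost": (297, 357),
}

for domain, (rank, total) in placements.items():
    print(f"{domain}: {rank_to_percentile(rank, total):.1f}")
```

Under this assumed mapping, the cost placement comes out far lower than the overall and coding placements, which fits the page's note that cheaper models rank higher on cost.
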
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 30.4 | #143 |
| Artificial Analysis Coding Index | coding | 38.4 | #67 |
| Artificial Analysis Math Index | math | 88.3 | #35 |
| MMLU-Pro | reasoning | 85.3% | #29 |
|  | reasoning | 82.7% | #92 |
| Humanity's Last Exam | reasoning | 20.0% | #79 |
| LiveCodeBench | coding | 80.8% | #26 |
| SciCode | coding, reasoning | 41.0% | #91 |
| MATH-500 | math | 99.2% | #3 |
| AIME | math | 90.3% | #8 |
| Output Speed | speed | 122.3 tok/s | #113 |
| Time to First Token | speed | 6.48s | #256 |
| Blended Price | cost | $3.50/M | #297 |
| Input Price | cost | $2.00/M | #304 |
| Output Price | cost | $8.00/M | #280 |
| Value Index | cost, overall | 8.7 | #282 |
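
The three price rows are numerically consistent with a 3:1 input-to-output weighting: (3 × $2.00 + 1 × $8.00) / 4 = $3.50. The page does not say how the blended price is derived, so the 3:1 ratio is an assumption inferred from the figures, and `blended_price` with its parameters is a hypothetical helper used for illustration only.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,   # assumed weighting, not stated on the page
    output_weight: float = 1.0,  # assumed weighting, not stated on the page
) -> float:
    """Weighted average of input and output prices (USD per million tokens)."""
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices taken from the table: $2.00/M input, $8.00/M output.
o3_blended = blended_price(2.00, 8.00)
print(f"${o3_blended:.2f}/M")  # prints "$3.50/M"
```

A workload that generates more output per request than this assumed ratio will see a higher effective price, since output tokens cost four times as much as input tokens here.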