Artificial Analysis data

OpenAI

o3-mini

o3-mini is one of OpenAI's reasoning-focused models, built for harder multi-step tasks where deliberate problem solving matters more than simple chat completion. In this benchmark snapshot, that emphasis shows up as strong math scores (MATH-500 97.3%, AIME 77.0%), paired with a slow time to first token (9.18s) and a below-median value ranking (#218 of 323).

Introducing o3 and o4-mini

Operational Metrics

Output Speed: 140.1 tok/s
First Token: 9.18s
Blended Price: $1.93/M
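
To make the two speed figures concrete, the sketch below estimates how long a full response might take. It assumes a simple linear model (time to first token, then a constant output rate), which the page does not state; the function name and the 500-token example are hypothetical.

```python
# Rough end-to-end latency estimate from the two operational metrics above.
# Assumption: total time = time to first token + output tokens / output speed.
# Real latency varies with load, prompt length, and reasoning effort.

TTFT_SECONDS = 9.18        # "First Token" from the page
TOKENS_PER_SECOND = 140.1  # "Output Speed" from the page


def estimated_response_seconds(output_tokens: int,
                               ttft_s: float = TTFT_SECONDS,
                               tokens_per_s: float = TOKENS_PER_SECOND) -> float:
    """Return the estimated seconds to receive `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Hypothetical 500-token answer: the wait for the first token dominates.
    print(f"{estimated_response_seconds(500):.2f} s")
```

Under this model, the 9.18s wait for the first token accounts for most of the total time on short answers, which is consistent with TTFT being listed as a watch area despite the relatively high output speed.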

Model Metadata

Queryable facts extracted from the upstream model payload.

Release: Jan 31, 2025
Context Window: n/a
Modalities: n/a
API fields: release_date

Strength: MATH-500

Rank #24 across 201 models.

97.3%

Strength: AIME

Rank #27 across 194 models.

77.0%

Strength: LCB

Rank #61 across 343 models.

71.7%

Watch Area: TTFT

Rank #258 across 293 models.

9.18s

Watch Area: Input $

Rank #221 across 325 models.

$1.10/M

Watch Area: Value

Rank #218 across 323 models.

13.5

Strength Profile

Percentile score by analysis domain.

* Cost is inverted: lower input, output, and blended prices rank higher.

Benchmark Percentiles

Higher bars mean stronger relative placement.
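
The page does not say how a rank is converted into a percentile. As a minimal sketch, one common convention maps rank 1 to the 100th percentile and the last rank to 0; the formula and function name below are assumptions, not the site's documented method.

```python
# Convert a leaderboard rank into a percentile score.
# Assumption: linear mapping where rank 1 -> 100.0 and rank == total -> 0.0.
# The site may use a different formula.


def rank_to_percentile(rank: int, total: int) -> float:
    """Return a 0-100 percentile for `rank` out of `total` ranked models."""
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must satisfy 1 <= rank <= total")
    if total == 1:
        return 100.0
    return 100.0 * (total - rank) / (total - 1)


if __name__ == "__main__":
    # Ranks taken from the cards on this page.
    for name, rank, total in [("MATH-500", 24, 201),
                              ("AIME", 27, 194),
                              ("TTFT", 258, 293)]:
        print(f"{name}: {rank_to_percentile(rank, total):.1f}")
```

Because the ranks on this page already treat lower prices and lower latency as better, no extra inversion is needed when feeding them into a function like this one.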

All Benchmarks

Metric                                 | Domain            | Value       | Rank
---------------------------------------|-------------------|-------------|-----
Artificial Analysis Intelligence Index | overall           | 25.9        | #180
Artificial Analysis Coding Index       | coding            | 17.9        | #211
MMLU-Pro                               | reasoning         | 79.1%       | #129
GPQA                                   | reasoning         | 74.8%       | #161
Humanity's Last Exam                   | reasoning         | 8.7%        | #174
LiveCodeBench                          | coding            | 71.7%       | #61
SciCode                                | coding, reasoning | 39.9%       | #98
MATH-500                               | math              | 97.3%       | #24
AIME                                   | math              | 77.0%       | #27
Output Speed                           | speed             | 140.1 tok/s | #64
Time to First Token                    | speed             | 9.18s       | #258
Blended Price                          | cost              | $1.93/M     | #219
Input Price                            | cost              | $1.10/M     | #221
Output Price                           | cost              | $4.40/M     | #219
Value Index                            | cost, overall     | 13.5        | #218
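
The page does not explain how the blended price or the Value Index is derived. The sketch below shows one set of assumptions that happens to reproduce the listed figures: a 3:1 input-to-output token weighting for the blended price, and the Value Index taken as the Intelligence Index divided by the unrounded blended price. Both formulas and all function names are assumptions for illustration.

```python
# Reproduce the listed Blended Price and Value Index from the table values.
# Assumptions (not stated on the page):
#   blended price = (3 * input price + 1 * output price) / 4
#   value index   = intelligence index / blended price (unrounded)

INPUT_PRICE = 1.10          # USD per million input tokens
OUTPUT_PRICE = 4.40         # USD per million output tokens
INTELLIGENCE_INDEX = 25.9   # Artificial Analysis Intelligence Index


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per million tokens."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def value_index(intelligence: float, price_per_million: float) -> float:
    """Intelligence points per dollar of blended price."""
    return intelligence / price_per_million


if __name__ == "__main__":
    price = blended_price(INPUT_PRICE, OUTPUT_PRICE)
    print(f"blended price: about ${price:.3f}/M")  # page lists $1.93/M
    print(f"value index: {value_index(INTELLIGENCE_INDEX, price):.1f}")  # page lists 13.5
```

Dividing by the rounded $1.93 instead gives roughly 13.4, so if this is the method used, the listed 13.5 would have been computed from the unrounded price.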