
OpenAI

o4-mini (high)

o4-mini (high) is one of OpenAI's reasoning-focused models, built for harder multi-step tasks where deliberate problem solving matters more than simple chat completion. The benchmark snapshot below, drawn from Artificial Analysis data, shows how that reasoning emphasis translates into scores, latency, and value relative to general-purpose models.

Announcement: Introducing o3 and o4-mini

Operational Metrics

Output Speed: 124.5 tok/s
First Token: 17.05 s
Blended Price: $1.93/M
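The blended price can be reproduced from the input and output prices listed further down this page. A minimal sketch, assuming the common convention of weighting input and output tokens at a 3:1 ratio (the exact ratio used upstream is not stated here):

```python
def blended_price(input_price: float, output_price: float,
                  input_ratio: float = 3.0, output_ratio: float = 1.0) -> float:
    """Weighted average of per-million-token prices.

    Assumes a 3:1 input:output token mix, a common blending convention.
    """
    total = input_ratio + output_ratio
    return (input_ratio * input_price + output_ratio * output_price) / total

# o4-mini (high) prices from this page: $1.10/M input, $4.40/M output
price = blended_price(1.10, 4.40)  # ~1.93, matching the listed $1.93/M
```

With these inputs the formula gives (3 × 1.10 + 1 × 4.40) / 4 = 1.925, consistent with the $1.93/M figure shown.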

Model Metadata

Queryable facts extracted from the upstream model payload.

Release: Apr 16, 2025
Context Window: n/a
Modalities: n/a
API fields: release_date

Strength: AIME

Rank #3 across 194 models.

94.0%

Strength: MATH-500

Rank #7 across 201 models.

98.9%

Strength: LCB

Rank #12 across 343 models.

85.9%

Watch Area: TTFT (Time to First Token)

Rank #268 across 293 models.

17.05s

Watch Area: Input $

Rank #223 across 325 models.

$1.10/M

Watch Area: Blended $

Rank #221 across 325 models.

$1.93/M

Strength Profile

Percentile score by analysis domain.

* Cost is inverted: lower input, output, and blended prices rank higher.
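The percentile bars can be reconstructed from the rank and cohort size shown on each card. A sketch, assuming percentile is the share of the cohort ranked below the model (the site's exact formula is not stated); note that cost metrics need no extra inversion here, since their ranks already place cheaper models first:

```python
def rank_to_percentile(rank: int, total: int) -> float:
    """Percentage of the cohort ranked strictly below this model.

    Rank #1 maps to 100.0; last place maps to 0.0.
    """
    return 100.0 * (total - rank) / (total - 1)

aime = rank_to_percentile(3, 194)    # ~99: a strength
ttft = rank_to_percentile(268, 293)  # ~9: a watch area
```

Applied to this page, AIME (rank #3 of 194) lands near the top of the distribution while TTFT (rank #268 of 293) lands near the bottom, matching the strength/watch-area split above.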

Benchmark Percentiles

Higher bars mean stronger relative placement.

All Benchmarks

Metric | Domain | Value | Rank
Artificial Analysis Intelligence Index | overall | 33.1 | #118
Artificial Analysis Coding Index | coding | 25.6 | #144
Artificial Analysis Math Index | math | 90.7 | #24
MMLU-Pro | reasoning | 83.2% | #58
GPQA | reasoning | 78.4% | #117
Humanity's Last Exam | reasoning | 17.5% | #82
LiveCodeBench | coding | 85.9% | #12
SciCode | coding, reasoning | 46.5% | #33
MATH-500 | math | 98.9% | #7
AIME | math | 94.0% | #3
Output Speed | speed | 124.5 tok/s | #88
Time to First Token | speed | 17.05 s | #268
Blended Price | cost | $1.93/M | #221
Input Price | cost | $1.10/M | #223
Output Price | cost | $4.40/M | #221
Value Index | cost, overall | 17.2 | #201