
xAI

Grok 4

Grok 4 is part of xAI's Grok family of assistant and reasoning models, built for real-time, coding, and general-purpose AI workflows. This benchmark snapshot, drawn from Artificial Analysis data, shows how the model compares with competing frontier and open-weight models on capability, speed, latency, and cost.


Operational Metrics

Output Speed: 50.3 tok/s
Time to First Token: 7.89 s
Blended Price: $6.00/M

Model Metadata

Queryable facts extracted from the upstream model payload.

Release Date: Jul 10, 2025
Context Window: n/a
Modalities: n/a
API fields: release_date

Strengths

AIME: 94.3% (rank #2 across 194 models)
MATH-500: 99.0% (rank #6 across 201 models)
MMLU-Pro: 86.6% (rank #15 across 345 models)

Watch Areas

Output Price: $15.00/M (rank #302 across 325 models)
Blended Price: $6.00/M (rank #300 across 325 models)
Input Price: $3.00/M (rank #300 across 325 models)
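The blended price above is consistent with a 3:1 input-to-output token weighting of the listed per-token prices. That weighting is an assumption here (the page does not state how the blend is computed), but the arithmetic checks out:

```python
# Assumed weighting: 3 parts input tokens to 1 part output tokens.
# The 3:1 ratio is a guess; the page does not document the blend formula.
input_price = 3.00    # USD per million input tokens
output_price = 15.00  # USD per million output tokens

blended = (3 * input_price + output_price) / 4
print(f"${blended:.2f}/M")  # $6.00/M, matching the listed blended price
```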

Strength Profile

Percentile score by analysis domain. Note that cost is inverted: lower input, output, and blended prices rank higher.

Benchmark Percentiles

Higher bars mean stronger relative placement.
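As a rough illustration of how a rank could map onto these percentile bars (an assumption; the site's exact method is not documented here), a 1-based rank can be placed linearly on a 0-100 scale. The cost "inversion" then simply means price metrics are ranked ascending, so cheaper models receive better ranks before the same conversion is applied:

```python
def rank_to_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank among `total` models to a 0-100 percentile.

    Rank #1 maps to 100, the last rank maps to 0. This linear mapping
    is an assumed stand-in for the site's undocumented formula.
    """
    return 100.0 * (total - rank) / (total - 1)

# AIME: rank #2 across 194 models -> near the top of the bar chart
print(round(rank_to_percentile(2, 194), 1))    # 99.5
# Blended price: rank #300 across 325 models -> near the bottom
print(round(rank_to_percentile(300, 325), 1))  # 7.7
```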

All Benchmarks

Metric                                  | Domain            | Value      | Rank
----------------------------------------|-------------------|------------|-----
Artificial Analysis Intelligence Index  | overall           | 41.5       | #65
Artificial Analysis Coding Index        | coding            | 40.5       | #44
Artificial Analysis Math Index          | math              | 92.7       | #16
MMLU-Pro                                | reasoning         | 86.6%      | #15
GPQA                                    | reasoning         | 87.7%      | #27
Humanity's Last Exam                    | reasoning         | 23.9%      | #51
LiveCodeBench                           | coding            | 81.9%      | #23
SciCode                                 | coding, reasoning | 45.7%      | #38
MATH-500                                | math              | 99.0%      | #6
AIME                                    | math              | 94.3%      | #2
Output Speed                            | speed             | 50.3 tok/s | #226
Time to First Token                     | speed             | 7.89 s     | #250
Blended Price                           | cost              | $6.00/M    | #300
Input Price                             | cost              | $3.00/M    | #300
Output Price                            | cost              | $15.00/M   | #302
Value Index                             | cost, overall     | 6.9        | #270