
Alibaba

Qwen3 235B A22B 2507 (Reasoning)

Qwen3 235B A22B 2507 (Reasoning) is an Alibaba language-model profile in the Easy Benchmarks snapshot. Use this page to compare its measured Artificial Analysis scores, output speed, time to first token, and pricing, and to see how it ranks against other models in the local catalog.

Qwen model releases

Operational Metrics

Output Speed: 56 tok/s
First Token: 1.28s
Blended Price: $2.63/M

Model Metadata

Queryable facts extracted from the upstream model payload.

Release: Jul 25, 2025
Context Window: n/a
Modalities: n/a
API fields: release_date

Strength: AIME

Rank #4 across 194 models.

94.0%

Strength: MATH-500

Rank #11 across 201 models.

98.4%

Strength: Math

Rank #22 across 269 models.

91.0

Watch Area: Output $

Rank #253 across 325 models.

$8.40/M

Watch Area: Value

Rank #234 across 323 models.

11.2

Watch Area: Blended $

Rank #235 across 325 models.

$2.63/M

Strength Profile

Percentile score by analysis domain.

* Cost is inverted: lower input, output, and blended prices rank higher.

Benchmark Percentiles

Higher bars mean stronger relative placement.
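
The page does not say how a rank is turned into a percentile bar. As a minimal sketch, assume a linear mapping in which rank 1 scores 100 and the last rank scores 0. The function name `rank_to_percentile` and the formula itself are illustrative assumptions, not the site's documented method. Since the page says lower prices already rank higher, the cost ranks are passed through unchanged.

```python
def rank_to_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank among `total` models to a 0-100 percentile.

    Assumed formula (not confirmed by the source):
    rank 1 -> 100, last rank -> 0, linear in between.
    """
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    if total == 1:
        return 100.0
    return 100.0 * (total - rank) / (total - 1)


# (rank, number of ranked models) pairs as listed on this page.
listed = {
    "AIME": (4, 194),
    "MATH-500": (11, 201),
    "Math": (22, 269),
    "Output $": (253, 325),
}

percentiles = {
    name: round(rank_to_percentile(rank, total), 1)
    for name, (rank, total) in listed.items()
}

for name, pct in percentiles.items():
    print(f"{name}: {pct}")
```

Under this assumption the three listed strengths land well above the 90th percentile, while the output price sits in the bottom quarter, which matches how the page labels them.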

All Benchmarks

Metric                                  Domain             Value     Rank
Artificial Analysis Intelligence Index  overall            29.5      #149
Artificial Analysis Coding Index        coding             23.2      #168
Artificial Analysis Math Index          math               91.0      #22
MMLU-Pro                                reasoning          84.3%     #39
GPQA                                    reasoning          79.0%     #111
Humanity's Last Exam                    reasoning          15.0%     #94
LiveCodeBench                           coding             78.8%     #33
SciCode                                 coding, reasoning  42.4%     #66
MATH-500                                math               98.4%     #11
AIME                                    math               94.0%     #4
Output Speed                            speed              56 tok/s  #206
Time to First Token                     speed              1.28s     #180
Blended Price                           cost               $2.63/M   #235
Input Price                             cost               $0.700/M  #198
Output Price                            cost               $8.40/M   #253
Value Index                             cost, overall      11.2      #234
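
The price and value rows appear to be linked by simple arithmetic. The sketch below makes two assumptions, neither of which is stated on this page: the blended price weights input and output tokens 3:1, and the Value Index is the Intelligence Index divided by the blended price. The function names are illustrative.

```python
INPUT_PRICE = 0.700        # $ per million input tokens, from the table
OUTPUT_PRICE = 8.40        # $ per million output tokens, from the table
INTELLIGENCE_INDEX = 29.5  # from the table


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per million tokens.

    The 3:1 input:output weighting is an assumption, not taken from this page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def value_index(score: float, price: float) -> float:
    """Assumed definition: index points per blended dollar per million tokens."""
    return score / price


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
value = value_index(INTELLIGENCE_INDEX, blended)
print(f"blended ${blended:.3f}/M, value {value:.1f}")
```

With these assumptions the blended price works out to about $2.625/M, consistent with the listed $2.63/M, and the value figure rounds to the listed 11.2. That agreement suggests, but does not confirm, that the site derives the two rows this way.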