Alibaba
Compared with the September 23, 2025 snapshot, the Qwen-3 series Max model in this release integrates thinking and non-thinking modes, which substantially improves overall performance. In thinking mode, the model can use web search, web information extraction, and a code interpreter at the same time, so it can combine external tools with slow, deliberate reasoning to solve more complex problems more accurately. This version is based on a snapshot taken on January 23, 2026.
Rank #77 across 526 (Artificial Analysis Intelligence Index)
Rank #128 across 436 (Artificial Analysis Coding Index)
Rank #259 across 357 (Blended Price)
Chart: percentile score by analysis domain. Higher bars mean stronger relative placement. Cost is inverted: lower input, output, and blended prices rank higher.
Streaming speed is not measured for this model yet.
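The rank figures listed above can be turned into a rough percentile for comparison across domains of different sizes. The sketch below uses one common convention (rank 1 maps to 100, last place maps to 0). The page does not say how the chart computes its percentiles, so the formula and the function name `rank_to_percentile` are assumptions for illustration only.

```python
def rank_to_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank among `total` models to a 0-100 percentile.

    Assumed convention (not confirmed by the page): rank 1 -> 100.0,
    rank == total -> 0.0, linear in between.
    """
    if total < 2 or not (1 <= rank <= total):
        raise ValueError("rank must be in 1..total and total must be >= 2")
    return 100.0 * (total - rank) / (total - 1)


# Rank and field size as listed on the page.
listed_ranks = {
    "intelligence": (77, 526),
    "coding": (128, 436),
    "blended_price": (259, 357),  # cost rank is already inverted: cheaper ranks higher
}

percentiles = {
    name: rank_to_percentile(rank, total)
    for name, (rank, total) in listed_ranks.items()
}

for name, value in percentiles.items():
    print(f"{name}: {value:.1f}")
```

Because the cost rank already places cheaper models higher, the same formula applies to it without a further inversion.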
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 39.8 | #77 |
| Artificial Analysis Coding Index | coding | 30.5 | #128 |
| GPQA | reasoning | 86.1% | #50 |
| Humanity's Last Exam | reasoning | 26.2% | #49 |
| SciCode | coding, reasoning | 43.1% | #67 |
| Blended Price | cost | $2.40/M | #259 |
| Input Price | cost | $1.20/M | #251 |
| Output Price | cost | $6.00/M | #266 |
| Value Index | cost, overall | 16.6 | #229 |
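The three price rows are consistent with a weighted average of the input and output prices. The sketch below assumes a 3:1 input-to-output token weighting and reads "/M" as "per million tokens". Neither is stated on this page, but the listed figures agree with that weighting. The function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per million tokens.

    The default 3:1 input:output weighting is an assumption,
    not something the page confirms.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# Prices from the table, in USD per million tokens (assumed unit).
INPUT_PRICE = 1.20
OUTPUT_PRICE = 6.00

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"${blended:.2f}/M")  # matches the table's $2.40/M under the 3:1 assumption
```

With a different traffic mix, for example output-heavy reasoning workloads, the effective price shifts toward the $6.00/M output rate, so the weights are worth adjusting to the actual usage pattern.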