Alibaba
Qwen3-Max-Preview shows substantial gains over the 2.5 series in overall capability, with significant improvements in Chinese-English text understanding, complex instruction following, subjective open-ended tasks, multilingual ability, and tool invocation. It also produces fewer knowledge hallucinations.
Qwen model releases
Rank #183 across 526
Rank #164 across 436
Rank #258 across 357
Percentile score by analysis domain.
* Cost is inverted: lower input, output, and blended prices rank higher.
Higher bars mean stronger relative placement.
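
The page does not say how a rank becomes a percentile bar. A minimal sketch, assuming a simple linear mapping in which rank 1 scores 100 and the last rank scores 0, is shown below; the function name `rank_percentile` and the formula are illustrative assumptions, not the site's documented method. Because the cost ranks already place lower prices higher, no extra inversion is applied here.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank to a 0-100 score (assumed linear mapping).

    Rank 1 maps to 100.0 and rank == total maps to 0.0.
    """
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must satisfy 1 <= rank <= total")
    if total == 1:
        return 100.0
    return 100.0 * (total - rank) / (total - 1)


# The three rank badges listed on this page, as (rank, total) pairs.
badges = [(183, 526), (164, 436), (258, 357)]

for rank, total in badges:
    score = rank_percentile(rank, total)
    print(f"Rank #{rank} across {total}: {score:.1f}")
```

Under this assumption a higher score means a stronger relative placement, matching the chart's reading direction.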
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 26.1 | #183 |
| Artificial Analysis Coding Index | coding | 25.5 | #164 |
| Artificial Analysis Math Index | math | 75.0 | #80 |
| MMLU-Pro | reasoning | 83.8% | #45 |
| | reasoning | 76.4% | #159 |
| Humanity's Last Exam | reasoning | 9.3% | #184 |
| LiveCodeBench | coding | 65.1% | #98 |
| SciCode | coding, reasoning | 37.0% | #169 |
| Output Speed | speed | 47.1 tok/s | #250 |
| Time to First Token | speed | 1.91s | #225 |
| Blended Price | cost | $2.40/M | #258 |
| Input Price | cost | $1.20/M | #250 |
| Output Price | cost | $6.00/M | #265 |
| Value Index | cost, overall | 10.9 | #263 |
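
The cost rows can be cross-checked with a little arithmetic. The sketch below assumes the blended price weights input and output tokens 3:1, and that the Value Index is the Intelligence Index divided by the blended price. Neither definition is stated on this page; both are assumptions that happen to reproduce the listed figures, and the function names are hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per million tokens (assumed 3:1 input:output)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def value_index(intelligence_index: float, price_per_million: float) -> float:
    """Assumed definition: intelligence points per dollar of blended price."""
    return intelligence_index / price_per_million


# Figures taken from the table above.
blended = blended_price(input_price=1.20, output_price=6.00)
value = value_index(intelligence_index=26.1, price_per_million=blended)

print(f"Blended price: ${blended:.2f}/M")
print(f"Value index:   {value:.1f}")
```

With the table's inputs, the results round to the listed $2.40/M and 10.9, which supports the assumed definitions but does not confirm them.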