DeepSeek
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built from the original V3 base checkpoint through a two-phase long-context extension approach that follows the methodology outlined in the original DeepSeek-V3 report. DeepSeek expanded its dataset by collecting additional long documents and substantially extended both training phases.
Source: DeepSeek release notes

Overall: Rank #167 across 526
Coding: Rank #144 across 436
Cost: Rank #181 across 357
Percentile score by analysis domain.
* Cost is inverted: lower input, output, and blended prices rank higher.
Higher bars mean stronger relative placement.
Streaming speed is not measured for this model yet.
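
The percentile placement described above can be approximated from a rank and the size of the ranked field. The sketch below is illustrative only: the page does not state its formula, so `percentile_from_rank` is a hypothetical helper that assumes the percentile is the share of the field ranked at or below the model. The domain labels are inferred by matching each rank to the table rows. Because cheaper models already rank higher on cost, no separate inversion step is needed when starting from ranks.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Share of the ranked field placed at or below this rank, as a percentage.

    Assumed formula (not confirmed by the page): rank 1 maps to 100.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (1 - (rank - 1) / total) * 100


# (rank, field size) pairs as listed on this page; the domain labels are
# inferred by matching each rank to the benchmark table.
placements = {
    "overall": (167, 526),
    "coding": (144, 436),
    "cost": (181, 357),
}

percentiles = {
    domain: percentile_from_rank(rank, total)
    for domain, (rank, total) in placements.items()
}

for domain, pct in percentiles.items():
    print(f"{domain}: {pct:.1f}")
```

Other conventions exist, for example dividing by `total - 1` so that the last place maps to 0, so the numbers here may differ slightly from the bars on the original chart.
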
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 28.1 | #167 |
| Artificial Analysis Coding Index | coding | 28.4 | #144 |
| Artificial Analysis Math Index | math | 49.7 | #141 |
| MMLU-Pro | reasoning | 83.3% | #56 |
| | reasoning | 73.5% | #188 |
| Humanity's Last Exam | reasoning | 6.3% | #247 |
| LiveCodeBench | coding | 57.7% | #124 |
| SciCode | coding, reasoning | 36.7% | #174 |
| Blended Price | cost | $0.834/M | #181 |
| Input Price | cost | $0.555/M | #201 |
| Output Price | cost | $1.67/M | #168 |
| Value Index | cost, overall | 33.7 | #161 |
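
The blended price in the table is consistent with a weighted average of the input and output prices. The sketch below assumes a 3:1 input-to-output weighting. The page does not state that ratio, but it reproduces the listed $0.834/M from the listed $0.555/M and $1.67/M. The function name and parameters are hypothetical.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,   # assumed weighting, not stated on the page
    output_weight: float = 1.0,
) -> float:
    """Weighted average price per million tokens."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Input and output prices (USD per million tokens) from the table above.
blended = blended_price(0.555, 1.67)
print(f"${blended:.3f}/M")
```

A workload with a different input-to-output token mix can pass its own weights, for example `blended_price(0.555, 1.67, input_weight=1, output_weight=1)` for an even split.
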