DeepSeek
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which extends the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. DeepSeek expanded its dataset by collecting additional long documents and substantially extended both training phases.
DeepSeek release notes
Rank #168 across 526
Rank #135 across 436
Rank #195 across 357
Percentile score by analysis domain.
* Cost is inverted: lower input, output, and blended prices rank higher.
Higher bars mean stronger relative placement.
Streaming speed is not measured for this model yet.
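The page does not state how a rank is converted into a percentile score, so the following is only a minimal sketch under one common convention: the share of ranked models placed at or below the given rank. The function name `percentile_from_rank` and the formula itself are assumptions for illustration, not the site's documented method. The input pairs are the three rank figures listed above.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Percentage of ranked models placed at or below `rank` (rank 1 is best).

    Assumed convention for illustration; the page's exact formula is not given.
    """
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank + 1) / total


# (rank, number of ranked models) pairs as listed on the page.
placements = [(168, 526), (135, 436), (195, 357)]

for rank, total in placements:
    score = percentile_from_rank(rank, total)
    print(f"rank #{rank} of {total}: {score:.1f}")
```

Under this convention a rank of #1 always maps to 100, and the last place maps to a small positive value rather than 0. Other conventions (for example dividing by `total - 1`) would shift the numbers slightly.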
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 27.7 | #168 |
| Artificial Analysis Coding Index | coding | 29.7 | #135 |
| Artificial Analysis Math Index | math | 89.7 | #26 |
| MMLU-Pro | reasoning | 85.1% | #30 |
|  | reasoning | 77.9% | #137 |
| Humanity's Last Exam | reasoning | 13.0% | #126 |
| LiveCodeBench | coding | 78.4% | #34 |
| SciCode | coding, reasoning | 39.1% | #130 |
| Blended Price | cost | $0.865/M | #195 |
| Input Price | cost | $0.590/M | #206 |
| Output Price | cost | $1.69/M | #169 |
| Value Index | cost, overall | 32.0 | #170 |
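The table does not say how the blended price is derived from the input and output prices. A minimal sketch, assuming a 3:1 input-to-output token weighting (an assumption, not something stated on the page), reproduces the listed $0.865/M from the $0.590/M input and $1.69/M output prices. The function name `blended_price` and its weight parameters are hypothetical.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,   # assumed weighting, not stated in the source
    output_weight: float = 1.0,  # assumed weighting, not stated in the source
) -> float:
    """Weighted average price per million tokens."""
    total_weight = input_weight + output_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices in USD per million tokens, taken from the table above.
blended = blended_price(input_price=0.590, output_price=1.69)
print(f"${blended:.3f}/M")  # prints $0.865/M
```

The agreement with the table suggests, but does not confirm, that a 3:1 weighting is what the page uses. A workload with a different mix of input and output tokens can pass its own weights to estimate an effective price.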