NVIDIA
NVIDIA Nemotron 3 Nano is an open reasoning model optimized for fast, cost-efficient inference. Built with a hybrid MoE and Mamba architecture and trained on NVIDIA-curated synthetic reasoning data, it delivers strong multi-step reasoning with stable latency and predictable performance for agentic and production workloads.
Rank #203 across 526 (Artificial Analysis Intelligence Index)
Rank #222 across 436 (Artificial Analysis Coding Index)
Rank #23 across 357 (Blended Price)
Chart: percentile score by analysis domain. Higher bars mean stronger relative placement. Cost is inverted: lower input, output, and blended prices rank higher.
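The page does not say how a rank becomes a percentile score. As a rough sketch, assuming a simple linear mapping where rank 1 scores 100 and the last rank scores 0, the conversion could look like the following. The function name `rank_to_percentile` and the formula itself are illustrative assumptions, not the catalog's documented method.

```python
def rank_to_percentile(rank: int, total: int) -> float:
    """Map a 1-based rank to a 0-100 percentile (rank 1 -> 100, last rank -> 0).

    Assumption: a linear mapping. The catalog's actual formula is not given.
    """
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    if total == 1:
        return 100.0
    return 100.0 * (total - rank) / (total - 1)


# Rank/total pairs as listed on this page.
for rank, total in [(203, 526), (222, 436), (23, 357)]:
    print(f"#{rank} of {total}: {rank_to_percentile(rank, total):.1f}")
```

Because cost ranks are already inverted (cheaper models rank higher), the same mapping applies to the price rows without a special case.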
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 24.3 | #203 |
| Artificial Analysis Coding Index | coding | 19.0 | #222 |
| Artificial Analysis Math Index | math | 91.0 | #21 |
| MMLU-Pro | reasoning | 79.4% | #125 |
| (unnamed metric) | reasoning | 75.7% | #168 |
| Humanity's Last Exam | reasoning | 10.2% | #164 |
| LiveCodeBench | coding | 74.1% | #50 |
| SciCode | coding, reasoning | 29.6% | #270 |
| Output Speed | speed | 133.6 tok/s | #94 |
| Time to First Token | speed | 0.95s | #118 |
| Blended Price | cost | $0.096/M | #23 |
| Input Price | cost | $0.055/M | #26 |
| Output Price | cost | $0.220/M | #33 |
| Value Index | cost, overall | 253.1 | #14 |
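The price and speed rows can be cross-checked with a little arithmetic. The sketch below assumes the blended price is a 3:1 weighted average of input and output prices, and that the Value Index is the Intelligence Index divided by the rounded blended price. Both assumptions reproduce the listed figures but are not stated on this page. The helper names are hypothetical, and the latency estimate ignores network overhead and treats any reasoning tokens as ordinary output tokens.

```python
# Figures copied from the table above.
INPUT_PRICE = 0.055         # USD per 1M input tokens
OUTPUT_PRICE = 0.220        # USD per 1M output tokens
INTELLIGENCE_INDEX = 24.3
OUTPUT_SPEED = 133.6        # output tokens per second
TIME_TO_FIRST_TOKEN = 0.95  # seconds


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens (3:1 input:output is an assumption)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-million prices."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


def response_time(output_tokens: int) -> float:
    """Estimated seconds to stream a full response: first-token delay plus generation."""
    return TIME_TO_FIRST_TOKEN + output_tokens / OUTPUT_SPEED


blended = round(blended_price(INPUT_PRICE, OUTPUT_PRICE), 3)
value_index = INTELLIGENCE_INDEX / blended  # assumed definition of the Value Index

print(f"Blended price: ${blended:.3f}/M")
print(f"Value index:   {value_index:.1f}")
print(f"Cost of 10k in / 1k out: ${request_cost(10_000, 1_000):.5f}")
print(f"Time for 500 output tokens: {response_time(500):.2f}s")
```

Under these assumptions the blended price comes to $0.096/M and the value index to 253.1, matching the table.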