NVIDIA
The model is an autoregressive vision-language model built on an optimized transformer architecture. It supports multi-image reasoning and video understanding, along with strong document intelligence, visual Q&A, and summarization capabilities.
NVIDIA model catalog
Overall: rank #327 of 526
Coding: rank #314 of 436
Cost: rank #102 of 357
Percentile score by analysis domain. Higher bars mean stronger relative placement.
Note: cost is inverted, so lower input, output, and blended prices rank higher.
Streaming speed is not measured for this model yet.
| Metric | Domain | Value | Rank |
|---|---|---|---|
| Artificial Analysis Intelligence Index | overall | 14.9 | #327 |
| Artificial Analysis Coding Index | coding | 11.7 | #314 |
| Artificial Analysis Math Index | math | 75.0 | #79 |
| MMLU-Pro | reasoning | 75.9% | #169 |
| | reasoning | 57.2% | #318 |
| Humanity's Last Exam | reasoning | 5.3% | #291 |
| LiveCodeBench | coding | 69.4% | #76 |
| SciCode | coding, reasoning | 26.2% | #327 |
| Blended Price | cost | $0.300/M | #102 |
| Input Price | cost | $0.200/M | #111 |
| Output Price | cost | $0.600/M | #107 |
| Value Index | cost, overall | 49.7 | #126 |
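
The figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the page does not state how its percentile scores or the blended price are computed, so both formulas are assumptions. `rank_percentile` assumes a percentile means the share of the ranked field at or below a given position, and `blended_price` assumes a 3:1 input-to-output token weighting, which happens to reproduce the listed $0.300/M from the $0.200/M input and $0.600/M output prices. Both function names are hypothetical.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percent of the ranked field at or below this position.

    Assumed definition; the page does not document its percentile formula.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank + 1) / total


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per million tokens.

    The 3:1 default weighting is an assumption, not taken from the page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


if __name__ == "__main__":
    # Headline ranks from the page: (rank, number of models ranked).
    # The labels follow the matching table rows.
    headline_ranks = {
        "overall": (327, 526),
        "coding": (314, 436),
        "cost": (102, 357),
    }
    for domain, (rank, total) in headline_ranks.items():
        print(f"{domain}: {rank_percentile(rank, total):.1f}")

    # Input and output prices in USD per million tokens, from the table.
    print(f"blended: ${blended_price(0.200, 0.600):.3f}/M")
```

Under these assumptions the cost rank lands in a much higher percentile than the overall and coding ranks, consistent with the inverted cost scale described above. A different percentile convention, such as excluding the model itself from the count, would shift each value slightly.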