Artificial Analysis aggregate math score.
The Math Index is the source dataset's aggregate math signal. It is useful for comparing models on tasks that need reliable numerical reasoning, but it should still be read alongside direct math benchmarks such as MATH-500 and AIME.
Test type: Aggregate math evaluation that combines math-focused benchmark results exposed by Artificial Analysis.
269 models have this metric.
Current leader: GPT-5.2 (xhigh)
Scores come from the Artificial Analysis LLM snapshot committed in this app.
Top models ranked by Math Index.
| Rank | Model | Creator | Value | Speed | Blended Price |
|---|---|---|---|---|---|
| #1 | GPT-5.2 (xhigh) | OpenAI | 99.0 | 71.8 tok/s | $4.81/M |
| #2 | | OpenAI | 98.7 | 166.8 tok/s | $3.44/M |
| #3 | Gemini 3 Flash Preview (Reasoning) | Google | 97.0 | 193.2 tok/s | $1.13/M |
| #4 | DeepSeek V3.2 Speciale | DeepSeek | 96.7 | n/a | - |
| #5 | GPT-5.2 (medium) | OpenAI | 96.7 | n/a | $4.81/M |
| #6 | MiMo-V2-Flash (Reasoning) | Xiaomi | 96.3 | 118.8 tok/s | $0.150/M |
| #7 | Gemini 3 Pro Preview (high) | Google | 95.7 | 128.7 tok/s | $4.50/M |
| #8 | GPT-5.1 Codex (high) | OpenAI | 95.7 | 162.7 tok/s | $3.44/M |
| #9 | GLM-4.7 (Reasoning) | Z AI | 95.0 | 90.3 tok/s | $1.00/M |
| #10 | KAT-Coder-Pro V1 | KwaiKAT | 94.7 | 117.1 tok/s | $0.525/M |
| #11 | Kimi K2 Thinking | Kimi | 94.7 | 99 tok/s | $1.08/M |
| #12 | GPT-5 (high) | OpenAI | 94.3 | 84.2 tok/s | $3.44/M |
| #13 | Nova 2.0 Lite (high) | Amazon | 94.3 | 170.7 tok/s | $0.850/M |
| #14 | GPT-5.1 (high) | OpenAI | 94.0 | 123.3 tok/s | $3.44/M |
| #15 | gpt-oss-120B (high) | OpenAI | 93.4 | 212.3 tok/s | $0.263/M |
| #16 | Grok 4 | xAI | 92.7 | 50.3 tok/s | $6.00/M |
| #17 | DeepSeek V3.2 (Reasoning) | DeepSeek | 92.0 | n/a | $0.315/M |
| #18 | GPT-5 (medium) | OpenAI | 91.7 | 82.3 tok/s | $3.44/M |
| #19 | GPT-5.1 Codex mini (high) | OpenAI | 91.7 | 207.2 tok/s | $0.688/M |
| #20 | Claude Opus 4.5 (Reasoning) | Anthropic | 91.3 | 57 tok/s | $10.00/M |
| #21 | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | 91.0 | 154.8 tok/s | $0.096/M |
| #22 | Qwen3 235B A22B 2507 (Reasoning) | Alibaba | 91.0 | 56 tok/s | $2.63/M |
| #23 | GPT-5 mini (high) | OpenAI | 90.7 | 85.7 tok/s | $0.688/M |
| #24 | o4-mini (high) | OpenAI | 90.7 | 124.5 tok/s | $1.93/M |
| #25 | K-EXAONE (Reasoning) | LG AI Research | 90.3 | n/a | - |
| #26 | DeepSeek V3.1 (Reasoning) | DeepSeek | 89.7 | n/a | $0.865/M |
| #27 | DeepSeek V3.1 Terminus (Reasoning) | DeepSeek | 89.7 | n/a | $1.91/M |
| #28 | Grok 4 Fast (Reasoning) | xAI | 89.7 | 76.2 tok/s | $0.275/M |
| #29 | Nova 2.0 Omni (medium) | Amazon | 89.7 | n/a | $0.850/M |
| #30 | gpt-oss-20B (high) | OpenAI | 89.3 | 242.3 tok/s | $0.100/M |
| #31 | Grok 4.1 Fast (Reasoning) | xAI | 89.3 | 140.9 tok/s | $0.275/M |
| #32 | Ring-1T | InclusionAI | 89.3 | n/a | - |
| #33 | Nova 2.0 Pro Preview (medium) | Amazon | 89.0 | 112.7 tok/s | $3.44/M |
| #34 | Nova 2.0 Lite (medium) | Amazon | 88.7 | 170.5 tok/s | $0.850/M |
| #35 | o3 | OpenAI | 88.3 | 72.7 tok/s | $3.50/M |
| #36 | Qwen3 VL 235B A22B (Reasoning) | Alibaba | 88.3 | 46.2 tok/s | $2.63/M |
| #37 | Apriel-v1.6-15B-Thinker | ServiceNow | 88.0 | n/a | - |
| #38 | Claude 4.5 Sonnet (Reasoning) | Anthropic | 88.0 | 43.8 tok/s | $6.00/M |
| #39 | INTELLECT-3 | Prime Intellect | 88.0 | n/a | - |
| #40 | DeepSeek V3.2 Exp (Reasoning) | DeepSeek | 87.7 | n/a | $0.315/M |
| #41 | Gemini 2.5 Pro | Google | 87.7 | 120.2 tok/s | $3.44/M |
| #42 | Apriel-v1.5-15B-Thinker | ServiceNow | 87.5 | n/a | - |
| #43 | Gemini 3 Pro Preview (low) | Google | 86.7 | n/a | $4.50/M |
| #44 | GLM-4.6 (Reasoning) | Z AI | 86.0 | 26.3 tok/s | $0.963/M |
| #45 | GLM-4.6V (Reasoning) | Z AI | 85.3 | 34.1 tok/s | $0.450/M |
| #46 | ERNIE 5.0 Thinking Preview | Baidu | 85.0 | n/a | - |
| #47 | GPT-5 mini (medium) | OpenAI | 85.0 | 77.2 tok/s | $0.688/M |
| #48 | Grok 3 mini Reasoning (high) | xAI | 84.7 | 215.5 tok/s | $0.350/M |
| #49 | Qwen3 VL 32B (Reasoning) | Alibaba | 84.7 | 94.5 tok/s | $2.63/M |
| #50 | Seed-OSS-36B-Instruct | ByteDance Seed | 84.7 | 40 tok/s | $0.300/M |
| #51 | Qwen3 Next 80B A3B (Reasoning) | Alibaba | 84.3 | 172.2 tok/s | $1.88/M |
| #52 | Claude 4.5 Haiku (Reasoning) | Anthropic | 83.7 | 103.8 tok/s | $2.00/M |
| #53 | GPT-5 nano (high) | OpenAI | 83.7 | 136 tok/s | $0.138/M |
| #54 | Ring-flash-2.0 | InclusionAI | 83.7 | 91 tok/s | $0.247/M |
| #55 | GPT-5 (low) | OpenAI | 83.0 | 65.8 tok/s | $3.44/M |
| #56 | MiniMax-M2.1 | MiniMax | 82.7 | 84.8 tok/s | $0.525/M |
| #57 | Qwen3 4B 2507 (Reasoning) | Alibaba | 82.7 | n/a | - |
| #58 | Qwen3 Max Thinking (Preview) | Alibaba | 82.3 | 40.8 tok/s | $2.40/M |
| #59 | Qwen3 VL 30B A3B (Reasoning) | Alibaba | 82.3 | 121.9 tok/s | $0.750/M |
| #60 | Magistral Medium 1.2 | Mistral | 82.0 | 42 tok/s | $2.75/M |
| #61 | Qwen3 235B A22B (Reasoning) | Alibaba | 82.0 | 61.4 tok/s | $2.63/M |
| #62 | GLM-4.5-Air | Z AI | 80.7 | 72.9 tok/s | $0.372/M |
| #63 | Qwen3 Max | Alibaba | 80.7 | 32.2 tok/s | $2.40/M |
| #64 | Claude 4.1 Opus (Reasoning) | Anthropic | 80.3 | 35.8 tok/s | $30.00/M |
| #65 | Magistral Small 1.2 | Mistral | 80.3 | 100.3 tok/s | $0.750/M |
| #66 | Motif-2-12.7B-Reasoning | Motif Technologies | 80.3 | n/a | - |
| #67 | EXAONE 4.0 32B (Reasoning) | LG AI Research | 80.0 | n/a | - |
| #68 | Falcon-H1R-7B | TII UAE | 80.0 | n/a | - |
| #69 | Doubao Seed Code | ByteDance Seed | 79.3 | n/a | - |
| #70 | Mi:dm K 2.5 Pro Preview | Korea Telecom | 78.7 | n/a | - |
| #71 | Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Google | 78.3 | n/a | - |
| #72 | GPT-5 nano (medium) | OpenAI | 78.3 | 150.3 tok/s | $0.138/M |
| #73 | K2-V2 (high) | MBZUAI Institute of Foundation Models | 78.3 | n/a | - |
| #74 | MiniMax-M2 | MiniMax | 78.3 | 83.5 tok/s | $0.525/M |
| #75 | Olmo 3.1 32B Think | Allen Institute for AI | 77.3 | n/a | - |
| #76 | Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | 76.7 | 50.8 tok/s | $0.175/M |
| #77 | Mi:dm K 2.5 Pro | Korea Telecom | 76.7 | n/a | - |
| #78 | DeepSeek R1 0528 (May '25) | DeepSeek | 76.0 | n/a | $2.36/M |
| #79 | NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | NVIDIA | 75.0 | 125 tok/s | $0.300/M |
| #80 | Qwen3 Max (Preview) | Alibaba | 75.0 | 45.1 tok/s | $2.40/M |
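
The table mixes numeric cells with placeholders (`n/a` for missing speed, `-` for missing price), so it helps to normalize rows before comparing them. The following is a minimal sketch of one way to do that, not the app's own code. The names `Row`, `parse_optional`, and `parse_row` are hypothetical, the three sample rows are copied from the table above, and "score per dollar" is an illustrative derived figure, not a metric published by Artificial Analysis.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One leaderboard row (hypothetical structure for illustration)."""
    rank: int
    model: str
    creator: str
    value: float
    speed_tok_s: Optional[float]      # None when the table shows "n/a"
    price_usd_per_m: Optional[float]  # None when the table shows "-"


def parse_optional(cell: str, *affixes: str) -> Optional[float]:
    """Strip unit affixes and return a float, or None for a placeholder."""
    text = cell.strip()
    if text in ("", "-", "n/a"):
        return None
    for affix in affixes:
        text = text.replace(affix, "")
    return float(text.strip())


def parse_row(line: str) -> Row:
    """Parse one six-cell Markdown table row into a Row."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) != 6:
        raise ValueError(f"expected 6 cells, got {len(cells)}: {line!r}")
    rank, model, creator, value, speed, price = cells
    return Row(
        rank=int(rank.lstrip("#")),
        model=model,
        creator=creator,
        value=float(value),
        speed_tok_s=parse_optional(speed, "tok/s"),
        price_usd_per_m=parse_optional(price, "$", "/M"),
    )


# Three rows copied verbatim from the table above.
SAMPLE_ROWS = [
    "| #1 | GPT-5.2 (xhigh) | OpenAI | 99.0 | 71.8 tok/s | $4.81/M |",
    "| #4 | DeepSeek V3.2 Speciale | DeepSeek | 96.7 | n/a | - |",
    "| #6 | MiMo-V2-Flash (Reasoning) | Xiaomi | 96.3 | 118.8 tok/s | $0.150/M |",
]

rows = [parse_row(line) for line in SAMPLE_ROWS]

# Illustrative only: rank priced models by Math Index points per blended dollar.
priced = [r for r in rows if r.price_usd_per_m is not None]
by_value_per_dollar = sorted(
    priced, key=lambda r: r.value / r.price_usd_per_m, reverse=True
)

for r in by_value_per_dollar:
    print(f"{r.model}: {r.value / r.price_usd_per_m:.1f} points per $/M")
```

Rows without a price are excluded from the ratio instead of being treated as free, since `-` only indicates that no blended price is listed.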