Local LLM Performance on Apple Silicon

Detailed per-chip benchmark pages for 107 model and chip combinations. See how fast each model runs on your Mac, from an M1 with 8 GB of RAM to an M5 Max with 128 GB.

Model                 Size    Speed             Chips
Qwen 3.5 4B           4B      up to 148 tok/s   2
Ministral 8B          8B      up to 98 tok/s    3
Phi-4 14B             14B     up to 62 tok/s    3
Ministral 14B         14B     up to 58 tok/s    3
Qwen 3.6-35B-A3B      35B     up to 55 tok/s    3
Mistral Small 4       119B    up to 45 tok/s    2
Nemotron Cascade 2    30B     up to 35 tok/s    3
Llama 4 Scout         109B    up to 30 tok/s    2
DeepSeek R1 32B       32B     up to 27 tok/s    3
Qwen3-235B-A22B       235B    up to 22 tok/s    2
Llama 3.3 70B         70B     up to 18 tok/s    2
DeepSeek R1 70B       70B     up to 16 tok/s    2
Qwen 2.5 72B          72B     up to 15 tok/s    2
GPT-oss 120B          120B    up to 10 tok/s    2