Run DeepSeek R2 on M5 Max

Yes. DeepSeek R2, an open-source LLM with 671B parameters, runs at an estimated 8 tok/s on an M5 Max with 128 GB of unified memory, using Q2_K quantization via MLX. Time to first token is about 2.5 s.

Speed: 8 tok/s

First token: 2.5 seconds

RAM needed: 128 GB minimum

Engine: MLX

Benchmark Details

The LLMCheck index estimates DeepSeek R2 on M5 Max using our published methodology: Q4_K_M quantization, memory-bandwidth scaling, and cross-referenced third-party benchmarks where available. The figures below are estimates, not measured results.
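As a rough illustration of what "memory-bandwidth scaling" can mean, the sketch below scales a reference throughput by the ratio of memory bandwidths. This is one common heuristic, not LLMCheck's exact formula, which is not given here. The function name and all numbers in the example are hypothetical placeholders, not M5 Max specifications.

```python
def scale_by_bandwidth(ref_tok_s: float, ref_bw_gb_s: float, target_bw_gb_s: float) -> float:
    """Estimate decode speed on a target chip from a reference measurement.

    Assumes token generation is memory-bandwidth bound, so throughput
    scales linearly with bandwidth. This is a heuristic, not a benchmark.
    """
    if ref_tok_s <= 0 or ref_bw_gb_s <= 0 or target_bw_gb_s <= 0:
        raise ValueError("all inputs must be positive")
    return ref_tok_s * (target_bw_gb_s / ref_bw_gb_s)


# Placeholder values for illustration only (not real chip specs):
# 10 tok/s measured at 400 GB/s, scaled to a 600 GB/s chip.
estimate = scale_by_bandwidth(10.0, 400.0, 600.0)
print(f"{estimate:.1f} tok/s")
```

Real-world results usually fall below a linear estimate like this, which is one reason the page also cross-references third-party benchmarks.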

Metric	Value
Tokens per second	8 tok/s
Time to first token	2.5s
Quantization	Q2_K
Minimum RAM	128 GB
Recommended engine	MLX
Parameters	671B
Benchmark date	2026-05


Setup Guide: Run DeepSeek R2 on M5 Max

The recommended engine for DeepSeek R2 on M5 Max is MLX. Install with pip and pull the model:

pip install mlx-lm

mlx_lm.generate --model mlx-community/deepseek-r2-q2_k --prompt "Hello!"
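The same generation can be driven from Python with the `load` and `generate` functions that `mlx-lm` exports. This is a minimal sketch: the model repository name is copied from the command above and is assumed to exist, and `run_prompt` and the `max_tokens` default are illustrative choices, not part of the guide.

```python
# Repository name taken from the CLI command above; assumed to be available.
DEFAULT_MODEL = "mlx-community/deepseek-r2-q2_k"


def run_prompt(prompt: str, model_id: str = DEFAULT_MODEL, max_tokens: int = 256) -> str:
    """Load the model with mlx-lm and return the generated text."""
    # Imported lazily so this module can be imported on machines without MLX.
    from mlx_lm import load, generate

    # load() fetches the weights on first use, then reads them from the local cache.
    model, tokenizer = load(model_id)
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `run_prompt("Hello!")` mirrors the command-line invocation. Expect the first call to take a long time while the weights download.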

Alternatively, you can use Ollama for a simpler setup:

ollama run deepseek-r2

System Requirements

To run DeepSeek R2 on M5 Max you need:

• Mac with M5 Max chip (or newer)
• 128 GB unified memory minimum
• macOS 13 Ventura or later
• ~108-128 GB free disk space for the model file
• MLX installed (pip install mlx-lm), or Ollama (see our Ollama install guide)
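Before starting a download of this size, it can help to check the memory and disk requirements programmatically. The sketch below uses only the Python standard library. The thresholds come from the list above (taking the upper end of the disk range), and the function names are hypothetical.

```python
import os
import shutil

GIB = 1024 ** 3
MIN_MEMORY_GIB = 128      # unified memory minimum from the list above
MIN_FREE_DISK_GIB = 128   # upper end of the ~108-128 GB range above


def check_requirements(memory_gib: float, free_disk_gib: float) -> list[str]:
    """Return a list of unmet requirements (empty if everything passes)."""
    problems = []
    if memory_gib < MIN_MEMORY_GIB:
        problems.append(f"memory: {memory_gib:.0f} GiB < {MIN_MEMORY_GIB} GiB")
    if free_disk_gib < MIN_FREE_DISK_GIB:
        problems.append(f"free disk: {free_disk_gib:.0f} GiB < {MIN_FREE_DISK_GIB} GiB")
    return problems


def detect_memory_gib() -> float:
    """Total physical memory, read via POSIX sysconf (macOS and Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB


def detect_free_disk_gib(path: str = ".") -> float:
    """Free space on the volume containing `path`."""
    return shutil.disk_usage(path).free / GIB
```

A typical call is `check_requirements(detect_memory_gib(), detect_free_disk_gib())`. Any strings it returns describe what is missing.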

