Run Qwen 3.5 9B on M4 Pro

Yes. Qwen 3.5 9B runs at an estimated 92 tok/s on an M4 Pro with 24 GB RAM using Q4_K_M quantization via MLX, with a first-token latency of 0.4 s. Alibaba's 9B model is a top pick for 8 GB and 16 GB Macs.

Speed: 92 tok/s
First Token: 0.4 seconds
RAM Needed: 24 GB minimum
Engine: MLX

Benchmark Details

The LLMCheck index estimates Qwen 3.5 9B performance on M4 Pro using our published methodology: Q4_K_M quantization, memory-bandwidth scaling, and cross-referenced third-party benchmarks where available. These figures are transparent estimates, not measurements. If you own this configuration, you can submit a real benchmark.
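The exact formula behind "memory-bandwidth scaling" is not given on this page. One common reading is that decode speed is assumed to scale roughly in proportion to memory bandwidth, so a speed known for one chip is multiplied by the bandwidth ratio to estimate another. The sketch below illustrates that assumption only; the function name and the numbers in the usage line are hypothetical placeholders, not LLMCheck data.

```python
def scale_by_bandwidth(ref_tok_s: float, ref_bw_gbs: float, target_bw_gbs: float) -> float:
    """Estimate decode speed on a target chip from a reference chip.

    Assumption (not confirmed by the source): tokens per second scale
    linearly with memory bandwidth when model and quantization are fixed.
    """
    if ref_tok_s <= 0 or ref_bw_gbs <= 0 or target_bw_gbs <= 0:
        raise ValueError("all inputs must be positive")
    return ref_tok_s * (target_bw_gbs / ref_bw_gbs)


# Placeholder numbers for illustration: 50 tok/s measured at 100 GB/s,
# target chip with 200 GB/s of bandwidth.
estimate = scale_by_bandwidth(50.0, 100.0, 200.0)
print(f"{estimate:.1f} tok/s")  # prints "100.0 tok/s"
```

A purely linear model tends to overestimate, since compute limits and other overheads also matter, which is presumably why the methodology cross-references third-party benchmarks where they exist.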

Metric	Value
Tokens per second	92 tok/s
Time to first token	0.4s
Quantization	Q4_K_M
Minimum RAM	24 GB
Recommended engine	MLX
Parameters	9B
Benchmark date	2026-03

Setup Guide: Run Qwen 3.5 9B on M4 Pro

The recommended engine for Qwen 3.5 9B on M4 Pro is MLX. Install mlx-lm with pip, then run a prompt; the model weights are downloaded on first run:

pip install mlx-lm

mlx_lm.generate --model mlx-community/qwen-35-9b-q4_k_m --prompt "Hello!"

Alternatively, you can use Ollama for a simpler setup:

ollama run qwen3.5:9b
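To compare your own machine against the estimates above, you can time a streamed generation yourself. The sketch below is a minimal, engine-agnostic timer: it takes any iterable that yields tokens as they are produced and reports time to first token and decode speed. The function name `measure_stream` and the injectable `clock` parameter are illustrative choices, and wiring it to a particular engine's streaming interface is left to you.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, int]:
    """Return (time_to_first_token_s, tokens_per_second, token_count)."""
    start = clock()
    first = None
    last = start
    count = 0
    for _ in tokens:
        last = clock()          # timestamp when this token arrived
        if first is None:
            first = last        # arrival of the very first token
        count += 1
    if first is None:
        raise ValueError("stream produced no tokens")
    ttft = first - start
    decode_time = last - first
    # Decode speed counts only tokens after the first, so prompt
    # processing time does not distort the tok/s figure.
    tok_s = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0
    return ttft, tok_s, count
```

Pass in the token generator from whichever engine you use and run it a few times; the first run often includes model loading and is not representative.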

Performance on Other Apple Silicon Chips

Chip	Speed	First Token	Min RAM	Engine
M5 Max	105 tok/s	0.5s	64 GB	Ollama
M4	72 tok/s	0.6s	16 GB	LM Studio
M3	58 tok/s	0.7s	16 GB	Ollama
M1	35 tok/s	1.1s	16 GB	Ollama

System Requirements

To run Qwen 3.5 9B on M4 Pro you need:

• Mac with M4 Pro chip (or newer)
• 24 GB unified memory minimum
• macOS 13 Ventura or later
• ~20-24 GB free disk space for the model file
• MLX (the mlx-lm package) installed, or Ollama as an alternative (see our Ollama install guide)
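The memory, disk, and macOS items in this list can be checked with a short script. This is a sketch under stated assumptions: the thresholds are taken from the list above (24 GB is used as the upper end of the disk range), the function names are hypothetical, the chip check is omitted, and `probe_mac` relies on the macOS `sysctl -n hw.memsize` command, so it only makes sense on a Mac.

```python
import platform
import shutil
import subprocess
from typing import List


def check_requirements(
    mem_gb: float,
    free_disk_gb: float,
    macos_major: int,
    min_mem_gb: float = 24.0,    # from the list above
    min_disk_gb: float = 24.0,   # upper end of the "~20-24 GB" range
    min_macos_major: int = 13,   # macOS 13 Ventura
) -> List[str]:
    """Return a list of unmet requirements (empty means all checks pass)."""
    problems = []
    if mem_gb < min_mem_gb:
        problems.append(f"memory: {mem_gb:.0f} GB < {min_mem_gb:.0f} GB")
    if free_disk_gb < min_disk_gb:
        problems.append(f"free disk: {free_disk_gb:.0f} GB < {min_disk_gb:.0f} GB")
    if macos_major < min_macos_major:
        problems.append(f"macOS: {macos_major} < {min_macos_major}")
    return problems


def probe_mac() -> List[str]:
    """Read the values on macOS and run the checks (macOS only)."""
    mem_bytes = int(subprocess.check_output(["sysctl", "-n", "hw.memsize"]).strip())
    free_bytes = shutil.disk_usage("/").free
    macos_major = int(platform.mac_ver()[0].split(".")[0])
    # Installed memory is reported in binary units; disk space in decimal GB.
    return check_requirements(mem_bytes / 2**30, free_bytes / 1e9, macos_major)
```

On a Mac, `print(probe_mac() or "all checks passed")` gives a quick summary.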
