Best Local LLMs for the Mac mini M4 (24 GB)

The best local LLM for a Mac mini M4 (24 GB) is Qwen 4.1 32B-A3B at an estimated 12 tok/s. With 24 GB of unified memory, the machine runs 43 of the 79 models we benchmark, from compact options up to 41B-class models. For everyday chat and coding, Qwen 4.1 32B-A3B is the sweet spot. Full ranking below.

Unified memory: 24 GB
Mem. bandwidth: 120 GB/s
Models that fit: 43 of 79
Top speed: 92 tok/s

Top 3 picks for the Mac mini M4 (24 GB)

⭐ Best overall: Qwen 4.1 32B-A3B (32B · Apache 2.0 · cap 46/50), 12 tok/s

⚡ Fastest: Qwen 3.5 4B (4B · Apache 2.0 · cap 12/50), 92 tok/s

🧠 Runner-up: Qwen 4 (32B · Apache 2.0 · cap 45/50), 12 tok/s

Every model ranked for a Mac mini M4 (24 GB)

Ranked by LLMCheck suitability, which balances capability against real speed on the M4. Each model has its own page with the full benchmark and setup. Speeds marked est. are scaled from measured runs by memory bandwidth.

#	Model	Size	License	Speed	Capability
1	Qwen 4.1 32B-A3B	32B	Apache 2.0	12 tok/s est.	46/50
2	Qwen 4	32B	Apache 2.0	12 tok/s est.	45/50
3	Qwen 4 Coder	32B	Apache 2.0	12 tok/s est.	44/50
4	Qwen 4 Preview 32B-A3B	32B	Apache 2.0	12 tok/s est.	42/50
5	Gemma 4 31B	31B	Apache 2.0	5 tok/s est.	40/50
6	Qwen 3.6-35B-A3B	35B	Apache 2.0	10 tok/s est.	38/50
7	Phi-5 Large 28B	28B	MIT	8 tok/s est.	36/50
8	Gemma 4 26B-A4B	26B	Apache 2.0	10 tok/s est.	35/50
9	Mistral Medium 4	41B	Apache 2.0	10 tok/s est.	34/50
10	Gemma 4.5 27B	27B	Gemma	8 tok/s est.	32/50
11	Nemotron Cascade 2	30B	N/A	7 tok/s est.	30/50
12	Gemma 4.5 12B	12B	Gemma	15 tok/s est.	28/50

Showing the top 12 of 43 models that fit in 24 GB. See the full leaderboard or all benchmarks.
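
The est. speeds in the table are described as scaled from measured runs by memory bandwidth. Here is a minimal sketch of what such scaling could look like, assuming token generation is memory-bandwidth-bound so that speed scales linearly with bandwidth. The function name and the reference measurement (24 tok/s on a 240 GB/s machine) are hypothetical illustrations, not LLMCheck's actual method or data; only the 120 GB/s figure comes from this page.

```python
def scale_speed_by_bandwidth(measured_tok_s, measured_bw_gbs, target_bw_gbs):
    """Estimate tok/s on a target machine from a run measured elsewhere.

    Assumes generation speed is limited by memory bandwidth, so the
    estimate is a simple linear rescale (an assumption, not a guarantee).
    """
    if measured_bw_gbs <= 0 or target_bw_gbs <= 0:
        raise ValueError("bandwidths must be positive")
    return measured_tok_s * target_bw_gbs / measured_bw_gbs


M4_BANDWIDTH_GBS = 120.0  # Mac mini M4 memory bandwidth, from this page

# Hypothetical reference run: 24 tok/s measured on a 240 GB/s machine.
estimate = scale_speed_by_bandwidth(24.0, 240.0, M4_BANDWIDTH_GBS)
print(f"{estimate:.1f} tok/s est.")
```

A linear rescale like this ignores compute limits, thermal throttling, and context length, which is one reason to treat est. figures as rough guides rather than measurements.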

Quick start: run Qwen 4.1 32B-A3B on your Mac mini M4

The fastest way to get started is Ollama. Install it, then pull the top pick for your Mac:

brew install ollama

ollama run qwen-41-32b-a3b

Prefer a GUI? LM Studio gives you a one-click download and chat window. For step-by-step help see our Ollama install guide, or open the Qwen 4.1 32B-A3B on M4 benchmark page for exact settings.
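
Once the model is running, you can check the speed on your own machine instead of relying on an estimate. The sketch below assumes an Ollama server is listening on its default local port (11434) and uses the `/api/generate` endpoint with streaming turned off. The helper names `generate` and `tokens_per_second` are hypothetical, and the model tag is simply the one from the command above.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumes the server is already running).
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(result):
    """Generation speed from Ollama's eval_count and eval_duration fields.

    eval_duration is reported in nanoseconds.
    """
    return result["eval_count"] / (result["eval_duration"] / 1e9)


def generate(model, prompt):
    """Send one non-streaming prompt and return the parsed JSON reply."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Usage (needs a running server and a pulled model):
# result = generate("qwen-41-32b-a3b", "Explain unified memory in one sentence.")
# print(result["response"])
# print(f"{tokens_per_second(result):.1f} tok/s")
```

The first request after loading a model includes load time in its total duration, so the `eval_count` / `eval_duration` pair is the better basis for a tok/s figure.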

FAQ: local LLMs on the Mac mini M4

What is the best local LLM for a Mac mini M4 (24 GB)?

Qwen 4.1 32B-A3B (32B, Apache 2.0) is the best all-round pick, with the top capability score (46/50) at an estimated 12 tok/s on the M4. If you want maximum speed, Qwen 3.5 4B hits 92 tok/s. The runner-up, Qwen 4 (45/50), also fits in 24 GB.

How many models can a Mac mini M4 with 24 GB run?

43 of the 79 models in the LLMCheck leaderboard fit in 24 GB of unified memory, from compact models up to Mistral Medium 4 (41B).

Can a Mac mini M4 run a 70B model?

Not comfortably. A 70B model in Q4 needs ~40–44 GB; with 24 GB you should stick to models whose quantized weights take up to ~18 GB, such as Qwen 4.1 32B-A3B. For 70B, look at a 48 GB+ Mac.
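
The memory figures in this answer follow from simple arithmetic on parameter count and bits per weight. The sketch below assumes Q4-style quantization averages roughly 4.5 to 5 bits per weight once scaling factors are included (an assumption for illustration), and it counts weights only, not the KV cache or runtime overhead. The function name is hypothetical.

```python
def weight_size_gb(params_billions, bits_per_weight):
    """Approximate size of the model weights in GB (weights only)."""
    # billions of params * bits per weight / 8 bits per byte = gigabytes
    return params_billions * bits_per_weight / 8


# Assumed range of effective bits per weight for Q4-style quantization.
LOW_BITS, HIGH_BITS = 4.5, 5.0

for label, params in [("70B", 70), ("32B", 32)]:
    low = weight_size_gb(params, LOW_BITS)
    high = weight_size_gb(params, HIGH_BITS)
    print(f"{label}: about {low:.1f} to {high:.1f} GB of weights")
```

Under these assumptions a 70B model lands at roughly 39 to 44 GB and a 32B model at roughly 18 to 20 GB, which is why the former is out of reach on 24 GB while the latter fits with little headroom.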

Is 24 GB of RAM enough to run LLMs locally?

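24 GB is comfortable for small-to-mid models (up to ~14B). Models in the 30B class, like those at the top of the ranking above, do fit, but with little headroom; for more room with 30B+ models you'll want 32 GB or more. Because Apple Silicon uses unified memory, that figure is both your system RAM and your VRAM.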