About LLM Check
We help Mac users find the right local AI model for their exact hardware — no guesswork, no wasted downloads, no subscriptions.
What We Do
LLM Check is a free, independent compatibility tool for Apple Silicon Macs. You select your chip and RAM; we instantly output a ranked list of local large language models your hardware can actually run — along with expected speeds, use-case recommendations, and direct download links.
Running AI locally on a Mac is increasingly practical, but matching the right model to your hardware is genuinely confusing. Quantization levels, parameter counts, memory bandwidth requirements, and MoE architectures all interact in ways that aren't obvious. We do that work so you don't have to.
Our Methodology
Every compatibility recommendation on LLM Check is based on three inputs: the model's file size at a given quantization level, the estimated system overhead (OS plus background apps, roughly 2–3GB), and the Unified Memory bandwidth of the target chip. The first two determine whether a model fits in memory at all; bandwidth determines how quickly it can generate tokens.
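The fit check described above can be sketched in a few lines. This is a simplified illustration, not LLM Check's actual database logic; the model file sizes and headroom figure are illustrative assumptions.

```python
# Simplified sketch of the memory-fit check: a model is viable only if its
# file, the system overhead, and a little headroom all fit in unified memory.

SYSTEM_OVERHEAD_GB = 3.0  # OS + background apps (upper end of the 2-3 GB estimate)

def fits(model_file_gb: float, ram_gb: float, headroom_gb: float = 1.0) -> bool:
    """Return True if the model file plus overhead fits in unified memory."""
    return model_file_gb + SYSTEM_OVERHEAD_GB + headroom_gb <= ram_gb

# Illustrative file sizes: a 7B model at Q4 quantization is roughly 4.4 GB;
# a 70B model at Q4 is roughly 40 GB.
print(fits(4.4, 16))   # 7B Q4 on a 16 GB Mac -> True
print(fits(40.0, 16))  # 70B Q4 on a 16 GB Mac -> False
```

The headroom term matters in practice: a model that technically fits with zero free memory will swap and crawl, so a checker should reject borderline configurations.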
We test and update our database as new models are released — typically within days of a major open-source release like Qwen 3.5, Llama 3, or Mistral. Speed estimates (tokens per second) are sourced from community benchmarks on Apple Silicon and cross-referenced with our own testing.
Our Principles
Always Free
The compatibility checker and all guides are free. No account, no paywall, no subscription.
Independent
We are not affiliated with Apple, Ollama, LM Studio, or any model creator. Recommendations are unsponsored.
Mac-Focused
We specialise exclusively in Apple Silicon. Every recommendation is tested and validated for Unified Memory architectures.
Kept Current
The AI model landscape moves fast. We update our database within days of major new model releases.
Why Local AI on Mac?
Apple Silicon's Unified Memory architecture is uniquely suited for LLM inference. Unlike PCs with discrete GPUs, where the CPU and GPU have separate memory pools, a Mac with 128GB of Unified Memory lets the GPU address nearly all of that 128GB as "VRAM" — equivalent capacity in dedicated GPU hardware costs tens of thousands of dollars.
This means a Mac Studio with 128GB can run Llama 3 70B at 8-bit quantization (a roughly 75GB file) with memory to spare — a task that previously required datacenter-class hardware. The combination of high memory bandwidth (~600 GB/s on M5 Max), Apple's Neural Engine, and optimised frameworks like MLX and Ollama makes Macs the best consumer hardware for private, offline AI in 2026.
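Memory bandwidth sets a hard ceiling on generation speed: during decoding, roughly the whole model must be streamed from memory for every token, so tokens per second can never exceed bandwidth divided by model size. A back-of-envelope sketch (the 40GB figure is an illustrative size for a 70B model at Q4 quantization, not a measured value):

```python
def tokens_per_second_upper_bound(bandwidth_gb_s: float, model_file_gb: float) -> float:
    """Ceiling on decode speed when generation is memory-bandwidth bound:
    each token requires reading roughly the entire model from memory."""
    return bandwidth_gb_s / model_file_gb

# ~600 GB/s chip streaming a ~40 GB model file:
print(tokens_per_second_upper_bound(600, 40))  # -> 15.0 tokens/sec ceiling
```

Real-world speeds land below this bound (compute overhead, KV-cache reads, memory contention), which is why community benchmarks are still needed to calibrate the estimates.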
Find Your Perfect Local LLM
Select your Mac's chip and RAM. Get a personalised list of models in seconds.
→ Run the Free Checker