Submit a Benchmark
LLMCheck is a community-fed index. If you own a Mac, you can add a verified speed data point in about 5 minutes: run a model with Ollama or LM Studio, note the tokens-per-second eval rate and your chip + RAM, and submit it via GitHub or email. Reviewed rows appear on /benchmarks with your credit.
Every real-world number makes the index more accurate — especially for Mac configurations our estimation formula covers less precisely.
How to contribute (4 steps, ~5 minutes)
Install a runtime
Install Ollama (command line) or LM Studio (graphical app). Both are free — see our software guide if you're choosing. If you already run local models, skip ahead.
Run a model and note the eval rate
In Terminal, run any model with the --verbose flag. After the response, Ollama prints timing stats. The number you want is eval rate (generation speed in tokens per second), not prompt eval rate, which measures how fast the prompt was processed:
    $ ollama run qwen4.1:32b-a3b --verbose
    >>> Write a haiku about unified memory.
    Weights and thoughts share one
    silicon lake — no copies,
    just tokens flowing.

    total duration: 4.512s
    load duration: 1.104s
    prompt eval rate: 312.40 tokens/s
    eval rate: 61.87 tokens/s   ← this number
Prefer a more rigorous run? llama-bench (ships with llama.cpp) works too — paste its table into your submission. In LM Studio, the tok/s figure is shown directly under each response.
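If you save the timing stats to a text file or paste them into a script, the eval rate can be pulled out automatically. The following is a minimal sketch, assuming the stats use the same line labels as the sample above; `parse_eval_rate` is a hypothetical helper name, not part of Ollama or LLMCheck.

```python
import re

# Anchored at the start of a line so that "prompt eval rate" is never matched.
_EVAL_RATE = re.compile(
    r"^\s*eval rate:\s*([0-9]+(?:\.[0-9]+)?)\s*tokens/s", re.MULTILINE
)


def parse_eval_rate(verbose_output: str) -> float:
    """Return the generation speed (tokens/s) from Ollama --verbose stats."""
    match = _EVAL_RATE.search(verbose_output)
    if match is None:
        raise ValueError("no 'eval rate' line found in the given text")
    return float(match.group(1))


# Sample text copied from the run shown above.
sample = """\
total duration: 4.512s
load duration: 1.104s
prompt eval rate: 312.40 tokens/s
eval rate: 61.87 tokens/s
"""

print(parse_eval_rate(sample))  # 61.87
```

Anchoring the pattern to the start of a line is what keeps the prompt-processing figure (312.40 in the sample) from being reported by mistake.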
Grab your Mac's specs
Open the Apple menu → About This Mac and note your chip (e.g. Apple M4 Pro) and memory (e.g. 24 GB). Also note your macOS version and the runtime version (ollama --version, or LM Studio's About screen).
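The same details can also be read from the command line, which avoids transcription mistakes. This is a rough sketch, assuming the standard macOS tools `sysctl` and `sw_vers` are available; `mac_specs` and its `run` parameter are hypothetical names introduced here for illustration, and the `run` hook exists only so the function can be exercised without a Mac.

```python
import subprocess


def _run(cmd):
    """Run a command and return its trimmed standard output."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def mac_specs(run=_run):
    """Collect chip name, RAM in GB, and macOS version."""
    chip = run(["sysctl", "-n", "machdep.cpu.brand_string"])
    mem_bytes = int(run(["sysctl", "-n", "hw.memsize"]))  # reported in bytes
    macos = run(["sw_vers", "-productVersion"])
    return {"chip": chip, "ram_gb": mem_bytes // 2**30, "macos": macos}


# On a Mac, call it with no arguments:
#   print(mac_specs())
```

The runtime version is not covered here; `ollama --version` or LM Studio's About screen remains the place to read it.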
Submit your data point
Open a pre-filled GitHub issue — the template asks for exactly the fields above:
Submit on GitHub, or email data@llmcheck.net.
No GitHub account? Email works just as well — paste the same details (model, quantization, runtime + version, chip, RAM, macOS version, tok/s).
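For an email submission, it can help to assemble the details in a fixed order so that nothing is left out. Below is a minimal sketch; the field names in `REQUIRED_FIELDS` and the helper `format_submission` are assumptions made for illustration and may not match the labels in the actual GitHub issue template. The quantization and macOS values in the example row are placeholders.

```python
# Hypothetical field names mirroring the list above:
# model, quantization, runtime + version, chip, RAM, macOS version, tok/s.
REQUIRED_FIELDS = (
    "model", "quantization", "runtime", "chip", "ram_gb", "macos", "tok_s",
)


def format_submission(row: dict) -> str:
    """Return a plain-text block with one 'field: value' line per field."""
    missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return "\n".join(f"{field}: {row[field]}" for field in REQUIRED_FIELDS)


example_row = {
    "model": "qwen4.1:32b-a3b",
    "quantization": "Q4_K_M",   # placeholder value
    "runtime": "Ollama 0.9.x",
    "chip": "Apple M4 Pro",
    "ram_gb": 24,
    "macos": "15.0",            # placeholder value
    "tok_s": 61.87,
}

print(format_submission(example_row))
```

Raising an error on a missing field is a deliberate choice here: an incomplete row is likely to be held up in review, so it is cheaper to catch the gap before sending.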
BEFORE YOU RUN — QUICK GUIDELINES
- Plugged in: run on power, not battery (macOS can reduce performance on battery, for example in Low Power Mode).
- Quiet machine: close other heavy apps — browsers with 50 tabs count.
- Note versions: macOS version and runtime version (e.g. Ollama 0.9.x) — speeds shift between releases.
- Default context: use the runtime's default context length unless you note otherwise.
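The "plugged in" guideline can be checked from a script before starting a run. This is a small sketch that inspects the text printed by the macOS command `pmset -g batt`; it assumes the first line of that output names the current power source, as in the illustrative sample string below, and `on_ac_power` is a hypothetical helper name.

```python
def on_ac_power(pmset_output: str) -> bool:
    """Return True if the first line of `pmset -g batt` output reports AC power."""
    lines = pmset_output.strip().splitlines()
    return bool(lines) and "AC Power" in lines[0]


# Illustrative samples of the assumed output format, not captured from a real run.
plugged_in = "Now drawing from 'AC Power'\n -InternalBattery-0\t100%; charged"
on_battery = "Now drawing from 'Battery Power'\n -InternalBattery-0\t82%; discharging"

print(on_ac_power(plugged_in))  # True
print(on_ac_power(on_battery))  # False
```

On a Mac, the input text could come from `subprocess.run(["pmset", "-g", "batt"], capture_output=True, text=True).stdout`.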
What happens next
Submitted rows are reviewed for plausibility and completeness, then merged into the next index update. Your data point is published with provenance "community" and your GitHub handle as source credit, and appears on the benchmarks page and in the open data downloads. Community rows are labeled distinctly from estimated and sourced figures — see the methodology page for how the three provenance types work.
FAQ
Do I need to be technical?
Not really. If you can copy one command into Terminal (or read a number off the LM Studio screen) and fill in a short form, you can contribute. The whole process takes about 5 minutes, and the GitHub issue template walks you through every field.
What data do you collect?
Only what you paste into the form: your Mac's chip and RAM, the model and quantization you ran, the runtime and its version, and the tokens-per-second figure. No personal data, no telemetry, no tracking — consistent with the rest of the site, everything you run stays on your Mac.
Why contribute?
Community data points make the index more accurate for everyone — especially for chip and RAM combinations we can't estimate as precisely. Your row is published with provenance "community" and your GitHub handle as source credit, so your contribution is visible and citable on the benchmarks page.
How long until my benchmark appears?
Submissions are reviewed for plausibility and completeness, then merged into the next index update. Updates ship monthly, so your data point typically appears on the benchmarks page within a few weeks.