How to contribute (4 steps, ~5 minutes)

1. Install a runtime

Install Ollama (command line) or LM Studio (graphical app). Both are free — see our software guide if you're choosing. If you already run local models, skip ahead.

2. Run a model and note the eval rate

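In Terminal, run any model in Ollama with the --verbose flag. After the response, Ollama prints timing stats. The number you want is eval rate, the generation speed in tokens per second (not prompt eval rate, which measures how fast the prompt was processed):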

Terminal — ollama
$ ollama run qwen4.1:32b-a3b --verbose
>>> Write a haiku about unified memory.
Weights and thoughts share one
silicon lake — no copies,
just tokens flowing.

total duration:       4.512s
load duration:        1.104s
prompt eval rate:     312.40 tokens/s
eval rate:            61.87 tokens/s  ← this number

Prefer a more rigorous run? llama-bench (ships with llama.cpp) works too — paste its table into your submission. In LM Studio, the tok/s figure is shown directly under each response.
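If you log several runs, the eval rate can be pulled out of the stats text with a few lines of Python. This is a minimal sketch, assuming the stats look like the sample above; parse_eval_rate is a hypothetical helper name, not part of Ollama.

```python
import re

# Anchored to the start of a line so "prompt eval rate" is never matched.
EVAL_RATE = re.compile(
    r"^[ \t]*eval rate:[ \t]*([\d.]+)[ \t]*tokens/s", re.MULTILINE
)


def parse_eval_rate(stats: str) -> float:
    """Return the generation speed (tokens/s) from Ollama --verbose stats text."""
    match = EVAL_RATE.search(stats)
    if match is None:
        raise ValueError("no 'eval rate' line found in the stats text")
    return float(match.group(1))


# Sample copied from the terminal output shown above.
sample = """\
total duration:       4.512s
load duration:        1.104s
prompt eval rate:     312.40 tokens/s
eval rate:            61.87 tokens/s
"""

print(parse_eval_rate(sample))  # prints 61.87
```

The pattern only accepts a line that begins with "eval rate", so the larger prompt eval rate figure is not picked up by mistake.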

3. Grab your Mac's specs

Open  → About This Mac and note your chip (e.g. Apple M4 Pro) and memory (e.g. 24 GB). Also note your macOS version and the runtime version (ollama --version, or LM Studio's About screen).

4. Submit your data point

Open a pre-filled GitHub issue — the template asks for exactly the fields above:

→ Submit on GitHub, or email data@llmcheck.net

No GitHub account? Email works just as well — paste the same details (model, quantization, runtime + version, chip, RAM, macOS version, tok/s).
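For an email submission, a small script can check that no field is missing before you paste. This is only a sketch: the field labels follow the list above, but the exact wording of the GitHub template is not assumed, and format_submission is a hypothetical helper. The example values are illustrative placeholders taken from this page, not a real measurement.

```python
# Field labels taken from the list above.
FIELDS = [
    "model",
    "quantization",
    "runtime + version",
    "chip",
    "RAM",
    "macOS version",
    "tok/s",
]


def format_submission(data: dict[str, str]) -> str:
    """Return one 'field: value' line per field, or raise if any is empty."""
    missing = [name for name in FIELDS if not data.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return "\n".join(f"{name}: {data[name]}" for name in FIELDS)


# Illustrative placeholders only; replace with your own values.
example = {
    "model": "qwen4.1:32b-a3b",
    "quantization": "Q4_K_M",        # placeholder quantization label
    "runtime + version": "Ollama 0.9.x",
    "chip": "Apple M4 Pro",
    "RAM": "24 GB",
    "macOS version": "15.x",         # placeholder version
    "tok/s": "61.87",
}

print(format_submission(example))
```

Raising an error on an empty field mirrors the completeness review described on this page, so gaps are caught before you send anything.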

BEFORE YOU RUN — QUICK GUIDELINES

  • Plugged in: run on power, not battery (macOS can reduce performance on battery, for example in Low Power Mode).
  • Quiet machine: close other heavy apps — browsers with 50 tabs count.
  • Note versions: macOS version and runtime version (e.g. Ollama 0.9.x) — speeds shift between releases.
  • Default context: use the runtime's default context length unless you note otherwise.
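The "plugged in" guideline can be checked from a script before a run. The sketch below assumes that pmset -g batt reports the power source on a line such as "Now drawing from 'AC Power'", which is the usual macOS output; the function names are hypothetical.

```python
import subprocess


def on_ac_power(pmset_output: str) -> bool:
    """Return True if pmset's battery report says the Mac is on AC power."""
    # Assumption: the report quotes the source as 'AC Power' or 'Battery Power'.
    return "'AC Power'" in pmset_output


def check_power() -> bool:
    """Query macOS for the current power source (macOS only)."""
    result = subprocess.run(
        ["pmset", "-g", "batt"], capture_output=True, text=True, check=True
    )
    return on_ac_power(result.stdout)
```

On a Mac, calling check_power() before starting the model gives a quick yes or no. The parsing is kept in its own function so it can be tried on saved output without running pmset.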

What happens next

Submitted rows are reviewed for plausibility and completeness, then merged into the next index update. Your data point is published with provenance "community" and your GitHub handle as source credit, and appears on the benchmarks page and in the open data downloads. Community rows are labeled distinctly from estimated and sourced figures — see the methodology page for how the three provenance types work.


FAQ

Do I need to be technical?

Not really. If you can copy one command into Terminal (or read a number off the LM Studio screen) and fill in a short form, you can contribute. The whole process takes about 5 minutes, and the GitHub issue template walks you through every field.

What data do you collect?

Only what you paste into the form: your Mac's chip and RAM, the model and quantization you ran, the runtime and its version, and the tokens-per-second figure. No personal data, no telemetry, no tracking — consistent with the rest of the site, everything you run stays on your Mac.

Why contribute?

Community data points make the index more accurate for everyone — especially for chip and RAM combinations we can't estimate as precisely. Your row is published with provenance "community" and your GitHub handle as source credit, so your contribution is visible and citable on the benchmarks page.

How long until my benchmark appears?

Submissions are reviewed for plausibility and completeness, then merged into the next index update. Updates ship monthly, so your data point typically appears on the benchmarks page within a few weeks.