❤️ Donate
Brainy is a tiny, open-source research sidekick that lives at askbrainy.com and in Telegram, built with free tools, a shoestring budget, and a lot of love. It currently runs on a hilariously old Mac mini A1347 (Late 2012, MD387D/A) I rescued off eBay for €56 (plus delivery). That old champ keeps the lights on, but it can't run Brainy's language models locally.
It has 16 GB RAM (nice) and an SSD, but the Intel Core i5-3210M and Intel HD Graphics 4000 mean local LLMs are… let's say "historical reenactments." So Brainy leans on Together AI for inference. That works, but:
- Context is tight on the free endpoints (8,193 tokens, input and output combined), so complex multi-document research and long chat histories hit hard limits.
- Requests-per-minute and tokens-per-minute (RPM/TPM) caps kick in quickly as more people use Brainy.
- Free model pools get congested and sometimes refuse to serve.
Also: the old Mini still draws power 24/7 (≈ €10/month electricity). Your donation makes an immediate, measurable difference.
Brainy will remain free and open-source forever. Donations prevent paywalls and keep the code open.
🎯 Funding goals
1) Goal 1 — $50 (micro)
Top up Together AI for higher throughput. Target: Build Tier 2, which cuts "429 busy" errors and queue time and lets Brainy serve more users concurrently.
2) Goal 2 — $750 (macro)
Buy a Mac mini (M4, 10-core CPU / 10-core GPU, 16 GB unified) — for example: www.computeruniverse.net
Why? It’s shockingly efficient and fast enough to run ~14B models locally (quantized), offloading a lot of traffic from Together while keeping Together for bigger contexts and specialty models. Apple lists 4 W idle / 65 W max for the base M4 Mini; the old 2012 Mini peaks at 85 W.
⏳ Live progress
Target: $750
$0 [>-----------------------] 0%
Ads/sponsorships/collabs will also count toward the goal.
✉️ Contact: [email protected]
🧠 Why the M4 upgrade
1) Model throughput: old Mini vs. M4 Mini vs. RTX 3060 (14B, quantized)
Let's use a popular 14B family (e.g., Qwen 2.5 14B in Q4/Q5 quantization) and llama.cpp/MLX-style local inference as the yardstick.
| Machine | Stack | Model / Quant | Tokens/s (text gen) | Notes |
|---|---|---|---|---|
| Old Mac mini (2012), i5-3210M, CPU-only | llama.cpp (CPU) | Qwen 2.5 14B, Q4 | ~0.5–1.5 t/s (estimated) | CPU-only 13–34B community reports land at ~1.5–4 t/s on much newer multi-channel CPUs; an old dual-core Ivy Bridge is substantially slower. Order-of-magnitude only. |
| Mac mini (M4, 16 GB, 10-core GPU) | llama.cpp (Metal) / MLX | Qwen 2.5 14B, Q4/Q5 | ~15–20 t/s (estimated) | Community M4 Pro (64 GB) reports 30–35 t/s for Qwen 2.5 14B (MLX + speculative decoding). The base M4 (fewer GPU cores, less memory headroom) should land lower; a conservative estimate is shown. |
| PC w/ RTX 3060 (12 GB) | llama.cpp (CUDA) | Qwen 2.5 14B, Q5_K_M | 28.9 t/s (measured) | Example benchmark shows 28.88 t/s with a ~9.8 GiB model file that fits easily in 12 GB VRAM. |
Takeaway: the M4 Mini should be roughly an order of magnitude faster than the 2012 Mini and competitive with a 3060-class PC for 14B-ish INT4/INT5 workloads, while using a fraction of the power and heat budget.
Why this is realistic:
- llama.cpp has a first-class Metal backend, and Apple Silicon is explicitly supported. Most local LLM pipelines on macOS (Ollama, MLX) run compute on the GPU via Metal; a minimal example follows this list.
- Measured data points exist for M-series (e.g., M3 Max): LLaMA-3-70B Q4_K_M text-gen ≈ 7.5 t/s, showing how far Metal has come; 14B is far lighter than 70B.
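To make the stack concrete, here is a minimal sketch of local inference through llama-cpp-python (one option among llama.cpp, Ollama, and MLX). The model filename, context size, and prompt are illustrative assumptions, not Brainy's actual configuration:

```python
from llama_cpp import Llama

# Hypothetical model path; any Q4/Q5 GGUF of a 14B model that fits in 16 GB works.
llm = Llama(
    model_path="models/qwen2.5-14b-instruct-q4_k_m.gguf",
    n_gpu_layers=-1,  # offload every layer to the GPU (Metal on Apple Silicon)
    n_ctx=8192,       # roughly the context Brainy gets from the free endpoints today
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the key claims of this paper: ..."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```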
2) Energy & thermals (tokens per watt)
Apple publishes official power numbers:
- Mac mini (M4, base): 4 W idle / 65 W max
- Mac mini (Late 2012): up to 85 W max
And for the PC baseline:
- RTX 3060 TGP ≈ 170 W (GPU alone; system draw is higher).
Very rough efficiency math (generation phase, not prompt eval; reproduced as a runnable snippet after this list):
- Old Mini (2012, CPU-only): ~1 t/s ÷ 85 W ≈ 0.012 t/s/W (pain).
- M4 Mini: ~18 t/s ÷ 65 W ≈ 0.28 t/s/W (quiet + cool).
- RTX 3060 PC (GPU only): 28.9 t/s ÷ 170 W ≈ 0.17 t/s/W (doesn’t include CPU/system overhead).
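The same back-of-envelope arithmetic as a snippet, so the numbers are easy to rerun with your own measurements (all figures are the estimates above, not new benchmarks):

```python
# Tokens-per-watt, generation phase only; throughput and power figures from above.
machines = {
    "Mac mini 2012 (CPU-only)": (1.0, 85),    # ~1 t/s at 85 W max
    "Mac mini M4 (estimated)":  (18.0, 65),   # ~18 t/s at 65 W max
    "RTX 3060 (GPU only)":      (28.9, 170),  # measured t/s at 170 W TGP
}
for name, (tokens_per_s, watts) in machines.items():
    print(f"{name}: {tokens_per_s / watts:.3f} t/s per watt")
```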
Bottom line: the M4 Mini gives up some raw speed vs. a 3060 in exchange for ~1.6× better tokens/W (GPU-only comparison) and dramatically lower whole-system draw. For a 24/7 community tool, that's greener and cheaper to run.
🔧 Together AI: what’s great, what hurts
I love Together's model buffet and pricing, but Brainy's usage hits tier math and per-model caps fast. Some free endpoints enforce stricter per-model limits (e.g., the 70B-class "free" endpoints), and congestion can throttle you below your nominal tier, so you see 429s even when you think you're under the limit.
Real-world example I see frequently:
```
together.error.RateLimitError: Error code: 429
{"message":"You have reached the rate limit specific to this model meta-llama/Llama-3.3-70B-Instruct-Turbo-Free.
The maximum rate limit for this model is 6.0 queries and 180000 tokens per minute."}
```
This happens even when Brainy isn't hammering at 6 RPM, because the per-model pool for the free 70B is tiny and often saturated. (Topping up to Build Tier 2 gives Brainy a much less crowded lane.)
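In the meantime, Brainy has to absorb these 429s gracefully. A minimal sketch of the kind of backoff wrapper that helps, using the official together Python SDK (the helper name and retry budget are illustrative assumptions, not Brainy's exact code):

```python
import time

import together
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

def chat_with_backoff(messages, model="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free",
                      max_retries=5):
    """Call Together chat completions, backing off exponentially on 429s."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except together.error.RateLimitError:
            time.sleep(2 ** attempt)  # wait 1, 2, 4, 8, 16 s between attempts
    raise RuntimeError("Free-model pool still saturated after retries")
```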
📦 What donations unlock for Brainy
- $50 (Build Tier 2 top-up):
  - Jump to 60 RPM for free models → dramatically fewer 429s and more headroom for concurrent users. (A client-side limiter that keeps Brainy under the cap is sketched after this list.)
- $750 (M4 Mini):
  - Local 14B inference at usable speeds (see the table above), cutting API calls dramatically.
  - A far better energy profile than GPU PC builds, and silent enough to run next to my coffee.
  - Together stays in the picture for 70B+ needs.
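Whatever the tier, staying just under the cap on the client side avoids burning requests on 429s. A minimal sliding-window limiter sketch (the class name and the 60 RPM figure are illustrative):

```python
import time
from collections import deque

class RpmLimiter:
    """Sliding-window limiter: block until a request fits under the RPM cap."""

    def __init__(self, rpm=60):
        self.rpm = rpm
        self.calls = deque()  # timestamps of requests in the last 60 s

    def wait(self):
        now = time.monotonic()
        while self.calls and now - self.calls[0] >= 60:
            self.calls.popleft()  # drop requests older than the window
        if len(self.calls) >= self.rpm:
            time.sleep(60 - (now - self.calls[0]))  # wait for the oldest to expire
        self.calls.append(time.monotonic())

limiter = RpmLimiter(rpm=60)
# limiter.wait()  # call before every Together request
```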
📦 What you'll unlock for everyone
- More context memory for longer, more thoughtful answers.
- Higher reliability during traffic spikes (fewer rate-limit hiccups).
- Faster iteration on new and existing features.
💸 How to donate (crypto only for now)
- USDT (TRC-20):
TK5uyyAbuchtBS4hwWwtQA4G15MA54RDkG
- USDT (BSC):
0x942891F9a02632d67C496305c9746ACedfC0eb2D
- USDT (SOL):
5yzcNUo8r7goHZMzwF9hPS8MVqXevwuyT4S8hhyHQVqK
If you’d like to run an ad, sponsor, or collaborate—those count as donations too.
✉️ [email protected]
📝 About Brainy
- Brainy is free, open-source, and built on free-tier tooling wherever possible. Donations are voluntary and go 100% into compute credits & hardware for Brainy.
- I will keep this page updated with live progress and receipts (screenshots/links) as milestones are hit.
- If you want to earmark funds (e.g., “Only for Build Tier 2 top-ups”), add a memo in your email and I’ll honor it.
- Donations are typically not tax-deductible and are generally non-refundable. See the Terms of Service on the site.
Thanks for reading this far ❤️ If you can chip in—even a few USDT—it directly translates into fewer 429s, faster responses, and a greener Brainy that we can all use.