Logging · 001
Engineering's view on local frontier AI — what actually runs on hardware you control.
A weekly engineering journal benchmarking open-weights models on consumer silicon. Real harnesses, real failures, no vendor decks.
Recent dispatches
- 1 What to Buy for Local LLM Inference: Strix Halo, Mac Studio, DGX Spark, or a GPU Rig
  A buyer's guide to local LLM hardware, ranked by the one spec that actually decides generation speed: memory bandwidth. Plus where ROCm really stands on Strix Halo.
- 2 AMD Is Selling "First-Class ROCm" on Strix Halo. I've Run the Same Chip for Six Months.
  On June 8, AMD opened pre-orders for a $3,999 box built around the exact chip I've run in production since the start of the year, marketed on full ROCm support, with one of its own demos running on the exact model my board can't load under ROCm.
- 3 A BIOS Update Won't Fix #6182 — I Tried the Newest One
  The Bosgame M5's ROCm bug is board-specific, not chip-specific, so firmware is the obvious lever. I flashed Bosgame's newest official BIOS hoping to dodge it. It didn't work, and the negative narrows where the fault actually lives.
- 4 Full Context on a Vulkan-Only Strix Halo: The Decode-Drop Reproduces, but the Sweet Spot Moves
  kmarble showed ROCm decode collapses 64% at full context on Strix Halo, and ROCm+MTP cures it. My board can't run ROCm. The Vulkan half reproduces the drop, but the MTP sweet spot from last week walks left at depth: by 76k, drafting too deep is slower than no speculation at all.
- 5 MTP Defaults Are a Trap: What 260 Runs Showed About Speculative Decoding on Qwen3.6
  Until May 19, the llama.cpp speculative-decoding default was 16. On Qwen3.6's single MTP head, that default cost up to 75% of generation throughput. Here's where the real sweet spots are, and why they're architecture-specific.
- 6 ROCm 7.x on the Bosgame M5: 14 Configurations, 14 Failures
  We promised a ROCm 7.x revisit. We got a comprehensive workaround sweep instead. Both are useful.
- 7 Vulkan/RADV vs ROCm 6.4 on Strix Halo: What 128 Benchmark Runs Actually Showed
  The headline isn't where Vulkan wins. It's where ROCm doesn't run at all.
- 8 What 96GB of VRAM on Unified-Memory Hardware Actually Gets You for Local LLM Inference
  An honest practitioner take from a Bosgame M5 running Strix Halo at full BIOS allocation.