Jun 3, 2026 Developing
A pruning pipeline running on my own hardware. 50M tokens in, a checkpoint to dissect, and a live question: where does uniform expert pruning stop being free, and what's the real VRAM-vs-quality sweet spot?
Read the developing notes →
Jun 1, 2026 Origin
Summer 2025, one GPU bought on a hunch, a €9,000 receipt and the regret that came with it — and the project that turned it all into four Blackwells. The build, in photos.
Read the origin story →
May 31, 2026 Validated
I went looking for decode speed and found something else: quantized MoE inference loses determinism in proportion to how aggressively you quantize — and a leaked timestamp in my own harness was faking half the signal. A controlled walk across seven configurations.
Read the finding →