▍ Origin
How this started
Summer 2025. I’d been using Claude Code daily for about five months, building things faster than I ever thought possible. I can’t remember exactly what triggered it — maybe boredom, maybe a YouTube video about running Llama 70B locally — a thought kept creeping in, what if I owned the model instead of renting it?
I pulled the trigger and bought my first RTX 6000 Pro. The system itself was an Intel consumer flagship build — 128GB of RAM, an Asus ProArt motherboard with two x16 slots. The parts arrived, I assembled them, and I plugged in the card.
I got Llama 70B running, played with it for a while, and then… didn’t really use it. I read a lot, experimented with smaller projects, but no real use case clicked. It sat there.
What followed was short period of quiet, buyers remorse, regret. I had dropped €9,000 on what was, viewed cynically, just a gamer card with extra memory. I knew the memory was the entire point — that it was the key to unlocking the AI world — but knowing that didn’t lessen the sting of the receipt sitting in my inbox.
Regardless, it did not stop me, I kept going and kept tinkering.
Things shifted more at the end of 2025. Everything I had touched this far culminated with a more serious project: an agentic, multi-turn RAG system that indexed 100GB of Estonian legal text with embeddings, agents could query and cross-reference the corpus, hit public services like the property registry and court records, and synthesize answers to real legal questions. It worked. Building and serving that system finally put my single GPU to real work, and taught me how the whole stack fit together.
But there was a problem. Every interaction cost real money. The heavy lifting relied on the Anthropic API, and I quickly burned through €2,000 just on testing and debugging.
That project was the wake-up call. It showed me exactly what these systems can do when you build them properly, and exactly what they cost when you don’t own the infrastructure. Suddenly, the decision wasn’t difficult anymore.
In December 2025, I said eff it and bought my second RTX 6000 Pro. The goal was to localize the entire build of my project and stop bleeding cash on API calls.
Limited understanding, limited capability to imagine what’s possible — only way to find out is to arrive.
Today is June 1st, 2026. I’m now running four RTX 6000 Pro cards …
So it was summer 2025, cannot recall where i got an idea, i had used claude code actively for about 5 months. I had no specific reason, was it boredom? maybe. I pulled the trigger and bought my first rtx pro 6000. I had watched earlier a youtube video about how to run 70b parameter model locally, i was excited, i knew few things, open source AI models existed. I got the card, system itself was intel consumer flagship setup, 128GB ram, asus proart motherboard with 2 x16 slots, parts arrived, assembled, plugged in the card. And downloaded the llama 70b parameter model, got it running, never actually used actively for anything lol. Cannot remember what else i tried but it was purely learning and experimentation. I ran out of ideas what to try, a period of quiet followed, regret, i remember i spent 9k euros essentially on what is just a gamer card with extra memory, but i understood, the memory unlocks the AI world and it is not cheap. Can’t blame nvidia for cashing in on demand.
So on and off i built different kind of things, sometimes leveraged the card, sometimes didn’t .. by the end of 2025 i built my first project, agentic, multi turn nature, RAG. I downloaded the entire Estonian legal corpus, 100GB. The things i learned about the whole pipeline, how to put it all together, digital lawyer. The problem though, it was mostly using anthropic api and i spent at least 2000 euros just on testing and debugging the whole thing. My single blackwell gpu was useful just building and serving the rag. This is was the project which opened my eyes to what is possible, so December 2025, eff it, second rtx pro 6000. Could i localize the entire build of my project? not spending fortune on api calls?
Today June 1st, i am building and experimenting with second system, and i got now four blackwells. Early regret i had, poof, long gone. If i had means, i would build a second identical system and cluster them together …