What Is an Agentic Runtime? My Mental Model (And Why I Care)
I have been building agents without a runtime. It works until something breaks and you have no idea why. Here is the runtime model I wish I had from the start.
Notes, thoughts, and technical musings — 14 posts.
I have been building agents without a runtime. It works until something breaks and you have no idea why. Here is the runtime model I wish I had from the start.
I built an autonomous remediation agent. It worked for two weeks, then it restarted a production database pod during a backup window. Here is what I learned about where agents actually belong.
I ran the numbers on my own setup. Self-hosted AI is not always cheaper, and the hidden costs are not where you think they are.
I almost sent customer data to a public LLM API because my pipeline was not checking the data flow. Here is the security model I use now.
I lost a day's worth of data because n8n does not replay state. That is why I now run Temporal for anything that matters.
I gave an AI agent kubectl access once. It deleted the wrong namespace. Here is why I now believe MCP servers are the only safe way to let agents touch infrastructure.
I spent a weekend getting Ollama to actually see my GPU on Kubernetes. Here is what broke and what I learned.
I built an AI pipeline without a queue. It worked until a marketing campaign sent 500 events in an hour. Here is what I learned about queues, durability, and why n8n alone is not enough.
I had a workflow that processed AI-generated content. It failed mid-run and I lost half a day's work. Temporal would have prevented that. Here is why I now use it for anything that matters.
I had five API keys scattered across laptops, env files, and one unfortunate screenshot. LiteLLM fixed that. Here is what I actually run and what it actually costs me.
I thought vLLM would be a simple upgrade from Ollama. I was wrong. Here is what actually happened when I tried to run it on my GTX 1080.
I use multiple AI coding tools through OpenRouter and Kimi Code. Here is what each one actually does for me, what it cannot do, and why I keep using them.
I moved from Ollama to vLLM expecting a simple upgrade. I had to re-download models, learn new quantization formats, and debug NCCL errors. Here is what the migration actually cost me.
I built an AI SRE agent. It can read logs and list pods. It cannot fix anything yet. Here is why that is exactly the right scope.