MCP
The protocol that lets agents reach out of the chat window. What it does, what it costs, and where it leaks.
What I write about here
MCP, the Model Context Protocol, is how an agent reaches outside its own conversation. Files, databases, browsers, your terminal, a remote service. The protocol gives a model a way to call tools the people who built the model never had to think about.
That's the promise. The reality is more interesting.
Posts under this tag look at MCP from two angles. The practical side: which servers earn their keep, and how to wire them in without giving an agent more authority than your most senior engineer. And the critical side: what happens when a protocol designed for convenience inherits all the trust assumptions of the user who runs it.
I am not anti-MCP. The protocol is useful and most of my workflow depends on it. I am sceptical of the way it is being adopted, which is roughly the same way every powerful tool gets adopted at first. Convenience first, audit later, surprise eventually.
Read these posts expecting opinions on specific servers and patterns. Some MCP servers earn their stars. Some do not. The protocol itself is fine. What people build on top of it is the question.
// Best entry points
- Caveman vs context-mode: small mouth, or smaller room?
Two MCP servers that try to solve the same problem differently. A worked comparison and what it tells you about choosing instruments.
- Getting the Best Out of Claude Code
Where MCP fits in the day-to-day. Less protocol theory, more how a real agent uses these tools to actually do work.
- Benchmarks Said Frontier. Developers Said "Dumb."
What MCP looks like through a benchmark lens. Useful as a counterweight to demo culture.
Caveman vs context-mode: small mouth, or smaller room?
One Claude Code plugin has 63k stars and asks you to talk like a caveman. The other has 15k stars and sandboxes your tool output. The internet picked the funny one. Whether you should depends on which token leak you are actually trying to fix.
Benchmarks Said Frontier. Developers Said "Dumb."
Gemini 3.5 Flash topped MCP Atlas, Toolathlon and CharXiv on day one. By the next morning a developer on Google's own forum had documented the model looping for 776 steps. The gap between the benchmark and the work is not a bug.
Getting the Best Out of Claude Code
From status line to custom skills, how to transform Claude Code from a simple CLI into a full-blown development environment.