For the past two months this blog has mostly been me telling you to be careful with MCP. An agent wiped my production database in May. I wrote about the twelve and a half thousand MCP servers sitting open on the internet. This week I published a checklist for vetting a server before it ever touches your config.
So here's the fair question a reader could ask: would I let an agent anywhere near my own databases?
I do. Daily. Through an MCP server I built before I wrote any of those essays. This post is about the guardrails that turn "an agent on your database" from a gamble into a considered decision.
Why you'd want this at all
Asking questions of your data in plain language is one of the few agent use cases that pays off immediately. "How many signups in the last 24 hours, and do the errors cluster anywhere?" is a thirty-second answer instead of a fifteen-minute detour through a query console. Debugging a production issue while the agent checks the actual rows involved beats guessing from the logs.
The pull is real, which is exactly why so many people wire an agent straight into their database with a full-privilege account and move on. That's the part we need to talk about.
What read-only doesn't buy you
The reflex answer to database safety is "just make it read-only". Good start. Now look at what a SELECT still does.
Everything a query returns lands in the model's context, and everything in the context goes to the API. Run one innocent SELECT * FROM users LIMIT 50 and fifty customers' emails, phone numbers and addresses have left your infrastructure. Nobody attacked you. You did it yourself, politely, over TLS.
Data is also untrusted input. A row in your database can carry an injected instruction the same way a poisoned tool description can, and your agent reads query results with the same trusting eyes. The supply-chain problem doesn't stop at the package boundary.
And an agent writes queries like an enthusiastic intern: a nine-way join across your two biggest tables is one plausible-looking answer away. Read-only does nothing against a query that takes your production database out back for twenty minutes.
So the threat model has three legs: leakage, injection, and load. Permissions address none of them.
Guardrails before permissions
This is what the server enforces, layer by layer, and each one maps to a leg of that threat model:
- SELECT-only, per database. Production gets
select_only=true, no exceptions. A local scratch SQLite can stay writable. The point is that the policy lives in config you review, per connection, instead of in the agent's judgment. - Query validation and complexity caps. Every query is parsed and analysed before it runs: injection patterns rejected, and hard limits on joins, subqueries and a composite complexity score. The nine-way join dies at the gate, with an explanation the agent can act on.
- Row limits and timeouts. A result set is capped (a thousand rows by default) and a query gets a deadline. Leakage and load both shrink when "the whole table" is never a possible answer.
- Field redaction. The underrated one. Sensitive columns are masked before results reach the model:
john.doe@example.comarrives asj******.e@*****.com, an SSN arrives as[PROTECTED]. The agent can still count, group and reason over the shape of the data. It just never holds the PII. - SSH tunnels, no open ports. The server reaches remote databases through a bastion host over SSH. Your database never meets the internet, which is more than twelve and a half thousand MCP servers can say.
- Audit logging. Every query, every connection, every redacted-field access, logged. When something looks off you replay what the agent asked, not what you hope it asked.
In config, the interesting parts look like this:
[database.production]
type=postgresql
host=internal-db.company.local
database=production_app
username=readonly_user
select_only=true
redaction_enabled=true
redaction_rules=*email*:partial_mask,*phone*:full_mask,ssn:replace:[PROTECTED]
ssh_host=bastion.company.com
ssh_private_key=/secure/path/ssh_key
[security]
max_joins=5
max_subqueries=3
max_complexity_score=50Note the username. The account itself is readonly_user: even if every guardrail in the server failed at once, the wall the query hits next is one the database enforces. Guardrails in the tool, permissions at the source. Belt and suspenders, deliberately in that order.
It has to pass its own interview
Fair is fair: run this server through the vetting checklist I published this week. Publisher matches the code, because both are me, and every line is on GitHub for you to disagree with. The verbs match the job: query, schema and monitoring tools, nothing that shells out, nothing that writes files.
You install it from source and pin it there, so nobody can ship you a surprise sixteenth release. And the credential you hand it should be the readonly account from the config above, revocable without a bad afternoon.
It's TypeScript, MIT-licensed, with 1,181 tests behind it, and it exists because I asked the question I later turned into a post: is a server the right shape for this problem? For database access it is, precisely because the value is the guardrail layer, not the plumbing.
Treat the agent like a contractor
You wouldn't hand an outside contractor the root password on their first day. They get their own account, scoped to the job, they don't see customer PII they don't need, and the building keeps a log of which doors they opened.
An agent deserves exactly that arrangement. Its own readonly account. Masked columns for everything sensitive. A log you can replay. Then let it work, because within those walls it is genuinely useful.
The agent that wiped my database in May had access it never needed. The one reading my databases today gets a scoped account, redacted PII and an audit trail. Same technology. Different contract.