I Audited 10 Popular MCP Servers and Mapped Their Default Blast Radius

I connected to the 10 most widely referenced MCP servers, enumerated every tool they expose, and mapped what a default installation actually authorizes. Most people never look.

124 tools enumerated

Each server was started as a subprocess, taken through the MCP initialization handshake, and queried with tools/list. Servers were selected from the official MCP repo, Anthropic's docs, and the mcp.directory top 20. Full methodology and data →

At a glance

Server	Tools	Risk	Top risk found
desktop-commander	26	CRITICAL	`start_process` executes arbitrary shell commands on host
playwright	22	CRITICAL	`browser_evaluate` runs JS with your session cookies
filesystem	14	HIGH	`write_file` + `edit_file` can overwrite `~/.ssh`, `~/.aws`
memory	9	HIGH	Any server can read or delete the entire knowledge graph
sqlite	5	HIGH	`write_query` runs arbitrary SQL; dynamic tool loading enabled
github	26	MEDIUM	Full repo access with your PAT; can push code and merge PRs
slack	8	MEDIUM	Can post messages and read DM history as your bot
postgres	1	MEDIUM	Runs queries on your database with connection-string privileges
git	12	MEDIUM	Can commit and reset on any local repo
fetch	1	MEDIUM	Fetches any URL; content feeds into agent context

Findings

1. Two servers expose arbitrary code execution (critical)
2. Cross-server chains create emergent risk (critical)
3. Filesystem boundary depends on your config
4. Memory server has no access boundaries
5. SQLite can add tools after connection

Two servers hand your agent arbitrary code execution, with no approval gate

Desktop Commander's start_process runs shell commands on your host. Playwright's browser_evaluate runs JavaScript in the browser, with your session cookies.

Desktop Commander self-annotates 9 of its 26 tools as DESTRUCTIVE. Playwright marks 17 of 22. The server authors know these are dangerous. But MCP defines no mechanism for requiring human confirmation before a destructive tool fires. That's left to the host app; most don't implement it.
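What a host-side gate could look like, as a minimal sketch: treat the server's own annotations as the trigger for a confirmation prompt. The function names and the `invoke`/`confirm` callbacks are hypothetical. The defaults (destructiveHint treated as true unless the tool declares readOnlyHint) follow my reading of the MCP tool annotation spec, so check them against the revision you target.

```python
def needs_approval(tool):
    """Return True when a tool should require human confirmation.

    Assumption: a tool with no annotations is treated as potentially
    destructive, and readOnlyHint=True is the only way to skip the gate.
    """
    annotations = tool.get("annotations") or {}
    if annotations.get("readOnlyHint", False):
        return False
    return bool(annotations.get("destructiveHint", True))


def guarded_call(tool, arguments, invoke, confirm):
    """Call a tool only after confirm() approves it, when approval is needed.

    invoke(name, arguments) performs the real tools/call (hypothetical hook);
    confirm(name, arguments) asks the human and returns True or False.
    """
    if needs_approval(tool) and not confirm(tool["name"], arguments):
        raise PermissionError(f"user declined {tool['name']}")
    return invoke(tool["name"], arguments)
```

Annotations are hints supplied by the server, so a gate like this only helps against honest servers. It is a floor, not a boundary.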

The real risk is the combination, not the individual servers

Each server looks reasonable in isolation. The risk shows up when you analyze them together.

The analysis found 84 structural pairings in which a data-reading capability on one server could feed into a code-execution capability on another. Every server that reads external data (filesystem, GitHub, Slack, fetch, Postgres, SQLite, git, memory) has at least one capability that structurally pairs with desktop-commander's start_process.

These are inferred from declared capabilities, not demonstrated exploits. But the structural exposure is real: if the agent bridges them, the blast radius is code execution with your user permissions. No individual server is misconfigured. The risk is emergent.
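To make "structural pairing" concrete, here is a minimal sketch of the idea: tag each tool with what it can do, then pair every data-reading tool with every code-executing tool on a different server. The tags, the toy inventory, and the tool names `fetch` and `read_file` are my illustrative assumptions. This is not the scoring rubric used in the audit.

```python
from itertools import product

# Toy inventory: server -> tool -> capability tags (all tags are assumptions).
INVENTORY = {
    "fetch": {"fetch": {"reads_external"}},
    "filesystem": {"read_file": {"reads_external"}, "write_file": set()},
    "desktop-commander": {"start_process": {"executes_code"}},
    "playwright": {"browser_evaluate": {"executes_code", "reads_external"}},
}


def tools_with(inventory, tag):
    """List (server, tool) pairs whose tags include the given capability."""
    return [
        (server, tool)
        for server, tools in inventory.items()
        for tool, tags in tools.items()
        if tag in tags
    ]


def structural_pairings(inventory):
    """Pair each reader with each executor that lives on a different server."""
    readers = tools_with(inventory, "reads_external")
    executors = tools_with(inventory, "executes_code")
    return [
        (reader, executor)
        for reader, executor in product(readers, executors)
        if reader[0] != executor[0]
    ]


if __name__ == "__main__":
    for (src_server, src_tool), (dst_server, dst_tool) in structural_pairings(INVENTORY):
        print(f"{src_server}.{src_tool} -> {dst_server}.{dst_tool}")
```

The count grows with the product of readers and executors, which is why adding one more server to a config can add many pairings at once.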

The filesystem server's boundary is whatever you configured, and it's been bypassed before

The most popular MCP server does exactly one thing for access control: it checks each requested path against the directory you passed as an argument. Common guides use /Users/you. That gives your agent read/write access to ~/.ssh/, ~/.aws/, and everything else in your home directory.

Two CVEs in 2025 (CVE-2025-53109, CVE-2025-53110) showed that even scoped configurations could be bypassed via symlinks and prefix matching. Both were fixed in 2025.7.1, but they illustrate how thin the boundary is.
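To show why prefix matching is a weak boundary in general, here is a minimal Python sketch contrasting a naive string-prefix check with a check on resolved paths. It illustrates the class of bug only. It is not the filesystem server's code, and the function names and example paths are hypothetical.

```python
import os


def naive_is_allowed(path, allowed_root):
    """String-prefix check: /srv/allowed_evil passes for root /srv/allowed."""
    return os.path.abspath(path).startswith(os.path.abspath(allowed_root))


def strict_is_allowed(path, allowed_root):
    """Resolve symlinks and '..' first, then compare whole path components."""
    root = os.path.realpath(allowed_root)
    target = os.path.realpath(path)
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


if __name__ == "__main__":
    root = "/srv/allowed"
    for candidate in ("/srv/allowed/notes.txt", "/srv/allowed_evil/id_rsa"):
        print(candidate, naive_is_allowed(candidate, root), strict_is_allowed(candidate, root))
```

Even the stricter check has a gap: a symlink created between the check and the actual file open is not caught. That gap is part of why a path argument alone makes a thin boundary.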

The memory server has no boundaries at all

The memory server keeps a persistent knowledge graph across sessions. Any server in the same MCP session can read the entire graph or delete entities from it. There is no scoping, no access control, and no per-server isolation. If an agent is influenced by content from another server, it can access or destroy the full store.

SQLite can add tools after you've connected

SQLite is the only server in this audit that declares listChanged: true, which means new tools can appear after the initial handshake. The tool surface you reviewed at connection time may not be the tool surface five minutes later.
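A host can at least notice when this happens. The sketch below checks the capability in the initialize result and diffs a reviewed tool list against a fresh tools/list response, for example one fetched after a notifications/tools/list_changed message. The function names are hypothetical, and the tool names other than write_query are made up for illustration.

```python
def declares_list_changed(capabilities):
    """True when the server's capabilities include tools.listChanged."""
    tools_caps = capabilities.get("tools") or {}
    return bool(tools_caps.get("listChanged", False))


def tool_drift(reviewed, current):
    """Compare two tools/list results by tool name."""
    reviewed_names = {tool["name"] for tool in reviewed}
    current_names = {tool["name"] for tool in current}
    return {
        "added": sorted(current_names - reviewed_names),
        "removed": sorted(reviewed_names - current_names),
    }


if __name__ == "__main__":
    reviewed = [{"name": "read_query"}, {"name": "write_query"}]
    current = reviewed + [{"name": "surprise_tool"}]  # hypothetical late arrival
    print(tool_drift(reviewed, current))
```

Comparing names only is the simplest version. A stricter variant would also hash each tool's description and input schema, since those can change without the name changing.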

The structural takeaway

These servers aren't broken. Most are well-built. The issue is the protocol:

No per-tool auth. You authorize every tool a server exposes in one step. No way to say "screenshot yes, evaluate no."
No cross-server visibility. Nothing surfaces the compound risk of running multiple servers together.
No approval gates. destructiveHint exists but nothing enforces it.
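The "screenshot yes, evaluate no" case can be approximated on the host side today. Here is a minimal sketch of a default-deny, per-tool allowlist applied to a server's tool list before the agent ever sees it. The policy format and function names are hypothetical, and browser_take_screenshot is used as an illustrative tool name, so check your server's actual tools/list output.

```python
# Hypothetical policy: server name -> set of tool names the agent may use.
POLICY = {
    "playwright": {"browser_take_screenshot"},
}


def is_allowed(policy, server, tool_name):
    """Default deny: a tool is usable only if explicitly listed for its server."""
    return tool_name in policy.get(server, set())


def filter_tools(policy, server, tools):
    """Drop every tool the policy does not list before exposing the rest."""
    return [tool for tool in tools if is_allowed(policy, server, tool["name"])]


if __name__ == "__main__":
    exposed = [{"name": "browser_take_screenshot"}, {"name": "browser_evaluate"}]
    print(filter_tools(POLICY, "playwright", exposed))
```

Filtering the list is only half of it. The host also has to reject tools/call requests for names that were filtered out, otherwise the agent can still invoke a tool it learned about elsewhere.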

Static analysis tells you what's possible. Runtime enforcement tells you what's allowed. Right now, most MCP setups have the first and none of the second.

Full methodology, scoring rubric, and complete 84-pairing chain listing →

The audit used AgentWard, a source-available permission scanner for MCP tool calls. To run the same audit on your own config:

pip install agentward
agentward scan ~/.cursor/mcp.json