Real-World Incidents

How AgentWard responds to real attacks

A factual record of known AI agent security incidents and exactly what AgentWard catches — and what it doesn't. We're not trying to claim coverage we don't have.

6 incidents caught
2 outside AgentWard's layer
CAUGHT
LiteLLM supply chain attack (TeamPCP)
LiteLLM, a package with 3.4 million daily downloads, was backdoored via a hidden file that ran malicious code on every Python startup. Downstream victims included Microsoft GraphRAG, Google ADK, CrewAI, and DSPy.
Mar 2026
How AgentWard catches this

The attackers planted a file called litellm_init.pth in Python's site-packages directory. Python's site module executes any line in a .pth file that starts with import (followed by a space or tab) every time the interpreter starts; no one ever calls it, imports it, or sees it in normal use. This one used double-encoded base64 to hide a credential-stealing payload.

agentward scan --scan-site-packages walks every site-packages directory and flags .pth files containing executable code. The double base64 pattern triggers a CRITICAL finding immediately, before any Python process runs the payload.
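
To make the mechanism concrete, here is a minimal sketch of this kind of check. It is not AgentWard's implementation: the function names, the severity labels on each finding, and the SUSPICIOUS heuristic are assumptions chosen for illustration, and a real scanner would need much more careful decoding logic.

```python
import re
import site
from pathlib import Path

# Assumed heuristic: an explicit decode/exec call, or a long run of
# base64-alphabet characters, on a line that Python will execute.
SUSPICIOUS = re.compile(r"b64decode|exec\(|eval\(|[A-Za-z0-9+/=]{80,}")


def executable_pth_lines(pth_path):
    """Return (line_number, text) for lines the site module would execute."""
    hits = []
    text = Path(pth_path).read_text(encoding="utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        # site.py executes lines that begin with "import" plus a space or tab.
        if line.startswith(("import ", "import\t")):
            hits.append((number, line))
    return hits


def scan_site_packages(directories=None):
    """Yield (path, line_number, severity, text) for executable .pth lines."""
    if directories is None:
        directories = list(site.getsitepackages()) + [site.getusersitepackages()]
    for directory in directories:
        if not Path(directory).is_dir():
            continue
        for pth in sorted(Path(directory).glob("*.pth")):
            for number, line in executable_pth_lines(pth):
                severity = "CRITICAL" if SUSPICIOUS.search(line) else "WARNING"
                yield str(pth), number, severity, line[:80]


if __name__ == "__main__":
    for finding in scan_site_packages():
        print(finding)
```

Some legitimate packages also ship .pth files with import lines, which is why this sketch reports a plain import as WARNING and reserves CRITICAL for lines that also look obfuscated.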

Supply chain · .pth backdoor · Credential theft
CAUGHT
ClawHavoc — malicious skills on ClawHub
~1,200 malicious skills published to ClawHub (roughly 20% of the ecosystem). The #1 downloaded skill installed AMOS credential-stealing malware. Snyk's audit found malicious payloads in 76 skills.
Jan–Feb 2026
How AgentWard catches this

Before installing any skill from a marketplace, run agentward scan ./skill-directory/. The pre-install scanner checks lifecycle scripts (postinstall, preinstall hooks), shell commands, and code patterns associated with credential theft — flagging them as CRITICAL before the skill ever runs on your machine.

This is the same principle as reading a package's source before running npm install — except automated and threat-aware.
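
As a rough illustration of the lifecycle-script part of that check, the sketch below reads a skill's manifest and flags install hooks. It assumes the skill ships an npm-style package.json; the pattern list, severity labels, and function name are hypothetical and are not taken from AgentWard.

```python
import json
import re
from pathlib import Path

LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")

# Illustrative patterns only; a real scanner would need a far broader rule set.
RISKY_PATTERNS = {
    "pipes a download into a shell": re.compile(r"(curl|wget)\b[^|;&]*\|\s*(ba|z)?sh\b"),
    "decodes base64 at install time": re.compile(r"base64\s+(-d|--decode)\b"),
    "touches credential locations": re.compile(r"\.ssh|\.aws|\.npmrc|keychain", re.IGNORECASE),
}


def scan_lifecycle_scripts(skill_dir):
    """Return a list of (hook, severity, reason) for a skill's package.json."""
    manifest = Path(skill_dir) / "package.json"
    if not manifest.is_file():
        return []
    scripts = json.loads(manifest.read_text(encoding="utf-8")).get("scripts", {})
    findings = []
    for hook in LIFECYCLE_HOOKS:
        command = scripts.get(hook)
        if not command:
            continue
        reasons = [why for why, rx in RISKY_PATTERNS.items() if rx.search(command)]
        if reasons:
            findings.extend((hook, "CRITICAL", why) for why in reasons)
        else:
            # Any install hook runs code automatically, so surface it for review.
            findings.append((hook, "REVIEW", "runs code automatically on install"))
    return findings
```

The point of the design is ordering: the manifest is parsed as data, so nothing in the skill executes while it is being inspected.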

Supply chain · Malicious marketplace skill · Malware dropper
CAUGHT
MCP tool poisoning & rug pull attacks
Malicious MCP servers embed hidden instructions in tool descriptions that are invisible to users but followed by the AI. Variants include cross-server credential theft, a fake Postmark server that BCC'd every outgoing email to an attacker, and post-approval "rug pulls" where a tool's behavior changes after it passes vetting.
2025–ongoing
How AgentWard catches this

Every tool call is evaluated against your policy before it leaves your machine — regardless of what instructions the AI received. An agent manipulated into forwarding credentials to an attacker's server is blocked at the point of the call, not at the point of the instruction.

The semantic intent layer adds a second check: it flags tool calls where the arguments don't match what the tool claims to do. A send_email call that adds an unexpected BCC recipient, or a file tool suddenly pointed at ~/.ssh/, triggers a review before the action completes.
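
The two checks described above can be sketched as a single evaluation function. This is a simplified assumption of how such a layer might look, not AgentWard's actual policy format: the POLICY table, the decision names, and the rule that any BCC triggers review are all hypothetical.

```python
# Hypothetical policy table: tool name -> default action.
POLICY = {"send_email": "allow", "read_file": "allow", "run_shell": "block"}

# Assumed list of locations a file tool should not reach without review.
SENSITIVE_DIRS = ("~/.ssh", "~/.aws")


def evaluate_call(tool, args, policy=POLICY):
    """Return (decision, reason) for one tool call; unknown tools are blocked."""
    action = policy.get(tool, "block")
    if action == "block":
        return "block", f"policy does not permit {tool}"
    # Argument checks run even when the tool itself is allowed.
    if tool == "send_email" and args.get("bcc"):
        return "review", "send_email call carries a BCC recipient"
    path = args.get("path")
    if path and any(path == d or path.startswith(d + "/") for d in SENSITIVE_DIRS):
        return "review", f"file tool pointed at {path}"
    return "allow", "within policy"
```

Because the function looks only at the call and its arguments, the decision is the same whether the call came from the user's request or from instructions hidden in a tool description.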

Prompt injection · Tool description manipulation · Credential theft
CAUGHT
GitHub MCP prompt injection — private repo exfiltration
An attacker creates a malicious GitHub issue in a public repo. An AI agent reads it, gets hijacked by the embedded instructions, and starts exfiltrating data from the developer's private repositories to a public one.
May 2025
How AgentWard catches this

AgentWard enforces that tools can only write to destinations you've approved. An agent reading a public issue cannot suddenly start writing to unrelated repositories — that cross-repo move is blocked as a chaining violation the moment it's attempted.

Capability scoping lets you restrict which repositories a tool can write to, turning the agent's broad GitHub token into a narrow, policy-bounded permission set.
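
A write allowlist of this kind can be sketched in a few lines. The tool names, argument keys, and function name below are assumptions for illustration; real GitHub MCP tool names and AgentWard's policy syntax may differ.

```python
# Hypothetical tool names treated as writes.
WRITE_TOOLS = {"create_issue", "push_files", "create_pull_request"}


def check_repo_scope(tool, args, writable_repos):
    """Return (decision, reason); writes are allowed only to approved repos."""
    if tool not in WRITE_TOOLS:
        return "allow", "read-only call"
    target = f"{args.get('owner')}/{args.get('repo')}"
    if target in writable_repos:
        return "allow", f"{target} is an approved write destination"
    return "block", f"{target} is not an approved write destination"
```

The token itself may still be able to write anywhere; the scope is enforced on the call, so a hijacked agent is limited to the repositories in writable_repos.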

Prompt injection · Cross-repo data exfiltration · Skill chaining
CAUGHT
Anthropic Git MCP server — CVE-2025-68143/44/45
Three chained vulnerabilities in Anthropic's official Git MCP server. A malicious README triggers prompt injection → path traversal → remote code execution via git's own file processing filters. No direct system access required.
Jan 2026
How AgentWard catches this

Path traversal in tool arguments — ../ sequences, tilde expansion reaching outside the repository — is blocked by capability scoping before the call reaches the server. The semantic layer also flags file operations pointed at locations inconsistent with the agent's stated task.
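
One common way to implement this kind of argument check is to resolve the requested path against the repository root and confirm it never leaves it. The sketch below is an assumption about the general technique, not AgentWard's code; the function name is hypothetical, and tilde paths are rejected outright here instead of being expanded.

```python
from pathlib import Path


def is_inside_repo(repo_root, candidate):
    """Return True only if candidate resolves to a location under repo_root."""
    if candidate.startswith("~"):
        # Tilde paths point at a home directory, never at the repository.
        return False
    root = Path(repo_root).resolve()
    # Joining with an absolute candidate discards root, so "/etc/passwd"
    # is checked as-is; resolve() collapses any "../" segments.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Resolving before comparing matters: a plain string prefix check would accept a path such as repo/../../etc/passwd, which starts inside the repository and ends outside it.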

Note: AgentWard doesn't patch the server-side CVEs. You still need to update mcp-server-git to a patched version. What AgentWard blocks is the exploitation path — the argument values that trigger the vulnerability.

Prompt injection · Path traversal · RCE via git filters
CAUGHT
CVE-2026-25253 — OpenClaw gateway token leak
A malicious link tricks an authenticated user into handing over their OpenClaw gateway auth token, giving an attacker full administrative control over the gateway. CVSS 8.8.
Jan 2026
How AgentWard catches this

AgentWard can't prevent the token theft itself; that's a browser-level attack. But the stolen token doesn't let the attacker do anything your existing policy forbids. Every tool call through the gateway is still evaluated: the attacker still can't exfiltrate files you haven't permitted, send email without approval, or execute shell commands you've blocked.

Stolen credentials are a serious problem. What AgentWard provides is a bounded blast radius for your agent even after the theft.

Auth token exfiltration · Gateway compromise · Credential theft
NOT CAUGHT
LangGrinch — CVE-2025-68664 (LangChain serialization injection)
Prompt injection via an LLM response poisons LangChain's internal serialization. When the application deserializes the output (during logging, streaming, or memory operations), secrets leak from environment variables or arbitrary code runs. CVSS 9.3.
Dec 2025
Why AgentWard doesn't catch this

This vulnerability lives inside LangChain's own serialization logic — below the tool call layer where AgentWard operates. By the time AgentWard sees a tool call, the exploit has already run inside the application. The fix is updating to langchain-core ≥ 0.3.81, which patches the serialization allowlist and disables secrets-from-env by default.

Serialization injection · Secrets exfiltration · RCE
NOT CAUGHT
mcp-remote CVE-2025-6514 — OAuth command injection
A malicious MCP server injects OS commands through the OAuth authorization URL. The mcp-remote proxy passes it unsanitized to a shell call, giving the attacker full code execution on the client machine. CVSS 9.6.
2025
Why AgentWard doesn't catch this

The attack lands during the OAuth handshake, before a connection is established and before AgentWard sees any traffic. It's a vulnerability in the connection layer, not the tool call layer. Update mcp-remote to v0.1.16 or later.

OS command injection · OAuth layer · Full RCE
On coverage claims: "Caught" means AgentWard has a specific mechanism that intercepts or flags this class of attack. It doesn't mean AgentWard is a complete solution — supply chain attacks evolve, and a sophisticated attacker who avoids known detection patterns may still get through. The "Not Caught" entries are here because the honest answer matters more than the marketing one. For those, the right fix is listed directly.