Publish date
May 13, 2026
{x} minute read
Written by
Reviewed by
Table of contents

In July 2025, an AI agent reviewed a support ticket, queried a production database, and leaked integration tokens directly to the attacker watching the thread. Months earlier, another AI followed "hidden instructions" in a public repository, exfiltrating private code into a visible pull request. 

In both cases, the AI wasn't broken; it simply obeyed the attacker instead of the developer. By late 2025, these incidents formed a clear pattern of how the MCP (Model Context Protocol) attack surface is being exploited. 

These weren't acts of negligence, but the growing pains of early adopters using what are now standard developer tools. The six documented cases below — covering everything from data exfiltration to destruction — offer essential lessons for any mid-market organization leveraging MCP technology. 

How to read these incidents

Each incident is described using the same structure: what happened, how it happened, and the specific lesson for a lean security team. Three of the six are covered in more technical depth in earlier posts in this series. Those entries are summarized here with cross-references, and the three incidents not covered are explored in full.

The common thread across all six, which will be visible by the end of this post, is that traditional security monitoring was not designed to detect any of these attacks. No firewall rule, no signature-based endpoint tool, and no SIEM alert covered the attack surface that each incident exploited. 

That is not a failure of specific tools; it is a gap in the security model that predates MCP's existence.

Incident 1: SmartLoader clones the oura ring MCP server

Date: February 2026
Attack type: Supply chain: registry poisoning and social engineering
Documented by: Straiker STAR Labs

An established malware operation called SmartLoader spent three months constructing a fake developer ecosystem — five fake GitHub accounts with AI-generated personas, cross-forked to simulate an active community — before submitting a trojanized Oura Ring MCP server to a legitimate MCP market registry.

The malicious fork was functionally identical to the legitimate version. The payload, a StealC infostealer, executed silently and exfiltrated browser passwords, cloud session tokens, Discord credentials, SSH keys, cryptocurrency wallet files, and API keys. The persistence mechanism was disguised as a legitimate Windows audio process.

Full technical details in Post 2 of this series >

The lesson for mid-market teams: SmartLoader was not a sophisticated nation-state actor. It was a criminal operation with established infrastructure that made a deliberate business decision to retarget toward the MCP ecosystem. 

The implication: Mid-market organizations (whose developers use the same tools as enterprise engineers) are in scope. The attack required no technical vulnerability. It required only patience and access to a public registry with no moderation.

Incident 2: CVE-2025-6514 — mcp-remote OAuth remote code execution

Date: July 2025
CVSS Score: 9.6 (Critical)
Attack type: Supply chain: vulnerable community adapter enabling RCE
Documented by: Docker Security Research

mcp-remote is a widely used proxy package that bridges local AI clients — Claude Desktop, VS Code with Copilot, and Cursor — to remote MCP servers when the client does not natively support remote connections. It was referenced in integration guides published by Cloudflare, Hugging Face, and Auth0, and had been downloaded more than 437,000 times.

The package trusted the OAuth authorization URLs provided by remote servers without validation. A malicious MCP server could deliver a crafted OAuth redirect URL that, when followed during the connection handshake, executed arbitrary shell commands on the developer's local machine.

The attack chain:

  1. The developer installs mcp-remote following guidance from a major platform's documentation.
  2. Developer connects to a malicious MCP server — either intentionally (thinking it legitimate) or inadvertently (through a typosquatted server name).
  3. During the OAuth connection handshake, the malicious server delivers a crafted authorization URL.
  4. mcp-remote executes the URL via the system shell without validation.
  5. Arbitrary code executes on the developer's machine within seconds of connection.

The payload potential: Once RCE is achieved, the attacker has full access to everything the developer's account can reach: environment variables, stored credentials, cloud CLI configuration files, SSH keys, and any active browser sessions.

The lesson for mid-market teams: 437,000 downloads means this was not a niche package — it was infrastructure. It was cited in official onboarding guides from credible vendors. No individual developer reviewing the integration guide had reason to question it. 

This is the definition of supply chain risk: the vulnerability exists not in the code you wrote or the tools your security team approved, but in the adapter layer between those approved tools and the ecosystem they connect to.

When evaluating MCP deployments, the package list to audit is not just the MCP servers themselves; It includes every proxy, adapter, SDK, and transport layer that makes those connections possible.

Incident 3: GitHub MCP private repository theft

Date: May 2025
Attack type: Indirect prompt injection via GitHub issues
Documented by: Invariant Labs

A researcher at Invariant Labs demonstrated that a free GitHub account and a single malicious issue in a public repository were sufficient to exfiltrate private repository contents, personal data, and salary information from a developer using Claude with the GitHub MCP server.

The attack required the developer to do nothing unusual — only ask their AI assistant to review open issues in a repository. The hidden prompt injection payload in a malicious issue redirected the agent to access private repositories and post the contents publicly.

Full technical details in Post 3 of this series >

The lesson for mid-market teams: Private does not mean protected when an AI agent with access reads untrusted external content. Any AI agent that has GitHub access and reviews issues, PR comments, or commit messages from public repositories is potentially exposed to this attack class. The vulnerability cannot be patched — it requires a change in how permissions are scoped and how external content is treated.

Incident 4: Supabase/Cursor SQL database exfiltration

Date: July 2025
Attack type: Prompt injection via support ticket; the "lethal trifecta."
Documented by: Simon Willison, PVML Security

A development team using Cursor connected to a Supabase database with service_role credentials, a common configuration chosen for development speed, received a customer support ticket containing embedded AI instructions. When a developer reviewed the ticket through Cursor, the AI agent executed the injected instruction: query the integration tokens table and post the complete results back into the support ticket thread.

Full technical details in Post 3 of this series >

The lesson for mid-market teams: The lethal trifecta — untrusted input, privileged access, external communication channel — is not a rare configuration. It describes how a large proportion of modern development environments are set up by default. Running the lethal trifecta test on your AI tool deployments takes minutes and requires no technical scanning.

Incident 5: Postmark email MCP impersonation

Date: September 2025
Attack type: Package impersonation: silent BCC data exfiltration
Documented by: Independent security researchers; reported by Bleeping Computer and OWASP

This was the first malicious MCP server discovered in the wild operating through the npm package registry.

The package impersonated Postmark, a legitimate transactional email service widely used for sending application notifications, password resets, and customer communications. The malicious package functioned completely correctly as an email-sending MCP server: developers who installed it could send emails normally, and the emails were delivered as expected.

The attack was in what else the server did: every email sent through it was silently BCC'd to an attacker-controlled address.

For organizations using AI agents to automate transactional email — customer notifications, invoice confirmations, and onboarding messages — the implications were significant:

  • Customer email addresses were exposed to the attacker.
  • Email content, including any sensitive information in notifications, was copied.
  • The BCC was invisible to both sender and recipient.
  • The tool continued functioning normally, providing no indication of compromise.

The malicious package was not discovered through monitoring or alerting. It was found through independent security research that happened to examine the package's network behavior.

The lesson for mid-market teams: A tool that works correctly is not a safe tool. This is the most counterintuitive lesson in MCP security, and it is the most important one to internalize. Malicious functionality can coexist with legitimate functionality indefinitely. 

The user experience provides no signal. Detection requires monitoring at the registry level — knowing what was published and whether it has been independently verified — and at the network level, where unexpected outbound connections become visible.

If your AI workflows include any email sending, document processing, or communication automation connected via MCP, the question is not whether the tool works; it is whether anyone has independently verified what else the tool does.

Incident 6: Amazon Q VS Code filesystem wipe

Date: 2025
Attack type: Prompt injection via GitHub pull request: destructive payload
Documented by: Noma Security / Vectara

This incident differs from every other in this list in one important respect: the attacker's goal was not data theft. It was destruction.

A threat actor injected a destructive prompt into the Amazon Q VS Code extension through a GitHub pull request. The embedded instruction directed the AI agent to wipe the local filesystem and delete associated AWS cloud resources.

The agent, operating with the permissions of the logged-in developer, executed. Local files were deleted. AWS resources were removed.

Two of the three lethal trifecta conditions were sufficient: the agent had access to untrusted external content (the pull request) and held destructive-level permissions over both local systems and cloud infrastructure. No external communication channel was required; the damage was done entirely within the developer's own environment.

The lesson for mid-market teams: Exfiltration is not the only threat model to plan for. AI agents with write access to filesystems, cloud infrastructure, CI/CD pipelines, or code repositories represent a destruction and availability risk that most incident response plans do not address.

A tabletop exercise is worth running: if an AI agent connected to your development environment executed destructive instructions right now, what systems would be affected? What is the recovery time? What is the blast radius? The answers to those questions define how aggressively you need to scope AI agent permissions.

Patterns across all six incidents

Examining these incidents together reveals a consistent set of conditions that enabled each attack:

Factor Present In
AI agent's implicit trust of tool outputs exploited. All six
No security team visibility into tool invocations at the time of the incident. All six
The developer environment was the primary target. All six
Traditional endpoint or network monitoring did not detect the attack. All six
The attack occurred during normal, intended AI agent use. Five out of Six
Malicious server or package present in a public registry or repository. Three out of Six

The consistency of these patterns is itself informative. The attacks are not random or opportunistic in their technique. They are converging on a predictable set of weaknesses in how AI agents are deployed — and those weaknesses are present in most mid-market development environments today.

These patterns point to a single operational requirement: visibility into the MCP ecosystem your developers are connected to, combined with the ability to detect threats in that ecosystem before they execute. The organizations that had the fastest response times in these incidents were the ones that discovered the exposure through external monitoring — not through the AI agent itself.

For the incident responder on your team: none of these attacks generated a traditional SIEM alert. If your incident response playbook doesn't include MCP-specific scenarios — supply chain compromise, prompt injection via external content, shadow server discovery — the gap is not theoretical. 

The regulatory context that followed

In December 2025, OWASP published the Agentic AI Top 10 (ASI01–ASI10), formally classifying the risk categories demonstrated by these incidents. The classification moved MCP security from a research community discussion to a documented security framework suitable for audit committees, regulatory submissions, and board risk reporting.

The incidents described in this post are not historical curiosities. They are the evidence base on which that framework was built.

What security leaders should take from this post?

The specific actions — what to audit, what policy to implement, how to assess permissions — are covered in Post 6. The takeaway from this post is simpler: the threat is documented, it is current, and it is targeting organizations whose developers use the same tools as the organizations in these case studies.

The earliest practical response is visibility. Knowing what MCP servers are installed in your environment, where they came from, and what they have access to is the baseline from which every other control follows.

Three of these six incidents involved malicious servers present in public registries. UpGuard Breach Risk's Threat Monitoring surfaces newly published MCP servers, brand impersonation, and registry changes that represent risk to your organization: the required registry-layer visibility that was missing in every case documented above. 

Learn how Threat Monitoring works →

Next in this series: Post 5 — Shadow MCP Servers: The AI Infrastructure You Don't Know You Have

Related posts

Learn more about the latest issues in cybersecurity.