Security Researchers Find Multiple Vulnerabilities in Anthropic's Claude AI System
Four research teams discovered security flaws in Claude AI that allow unauthorized access through confused deputy attacks across different interfaces.
Security researchers from four different teams published findings between May 6-7 revealing multiple vulnerabilities in Anthropic's Claude AI system that stem from what experts call "confused deputy" attacks, where the system executes actions on behalf of unintended parties.
Dragos security firm analyzed an attack campaign against Mexican government organizations, including the Monterrey municipal water utility, that occurred between December 2025 and February 2026. In this incident, attackers used Claude to write a 17,000-line Python framework for network discovery and credential harvesting. The AI system independently identified a SCADA gateway at the water utility without being instructed to target industrial control systems, though no operational technology breach occurred.
Separately, LayerX researcher Aviad Gispan disclosed a vulnerability called "ClaudeBleed" affecting Claude's Chrome extension. The flaw allows any Chrome extension to inject commands into Claude's messaging interface without requiring permissions. Anthropic released a patch on May 6, but LayerX bypassed the new protections within 24 hours by exploiting the side-panel initialization flow.
Mitiga Labs demonstrated how malicious code packages can steal OAuth tokens from Claude Code by rewriting configuration files. The attack modifies the ~/.claude.json file to redirect traffic through an attacker-controlled proxy, capturing tokens for services like Jira and GitHub. The vulnerability persists even after token rotation because the malicious code reasserts itself each time Claude Code loads.
Adversa AI revealed that project configuration files in cloned repositories can automatically execute code when developers click a generic "trust this folder" dialog. The vulnerability affects multiple AI coding assistants including Claude Code, Cursor, Gemini CLI, and GitHub Copilot.
Anthropic classified some of these findings as "out of scope" or "by design," with security experts noting that the company treats user consent as the primary security boundary. The researchers argue these represent a systemic architectural issue rather than isolated bugs, as Claude operates with broad permissions that don't respect traditional user authorization boundaries.