Security Flaw in Claude Code Illustrates the Risk of AI in Developer Workflows

A vulnerability in Anthropic’s Claude Code development tool could have been exploited by threat actors to expose credentials and other secrets within CI/CD workflows, the latest example of the security risks to software development pipelines posed by such AI coding agents.

Microsoft security researchers Dor Edry and Amit Eliahu wrote in a report that the now-patched flaw in the Claude Code GitHub Action could have been exploited through a prompt injection attack, in which a bad actor plants malicious instructions in content the AI agent reads, such as issue bodies, pull request descriptions, and comments, and the agent follows them and exposes secrets.

This form of prompt injection, known as indirect prompt injection, is a fast-emerging threat.

“Right now, Indirect Prompt Injection (IPI) is a top priority for the security community, anticipating it as a primary attack vector for adversaries to target and compromise AI agents,” Google security researchers wrote in April. “Unlike a direct injection where a user ‘jailbreaks’ a chatbot, IPI occurs when an AI system processes content – like a website, email, or document – that contains malicious instructions. When the AI reads this poisoned content, it may silently follow the attacker’s commands instead of the user’s original intent.”

In this case, Edry and Eliahu wrote that they were tracking “attempts in public repositories using AI-assisted GitHub workflows across multiple vendors, where attacker-controlled issue or PR [pull request] content is processed by the AI agent and could influence its tool use.”

AI Changes the GitHub Model

GitHub Actions is GitHub’s automation and CI/CD platform. Workflows that run through it can include a range of sensitive data, from issue and pull request metadata to cloud credentials to third-party API keys. Such workflows weren’t designed with agentic AI in mind.

“GitHub workflows were built for deterministic automation: run tests, build artifacts, deploy code, label issues, or enforce repository policy,” the researchers wrote. “AI-powered workflows change that model. Instead of only executing predefined logic, they ingest repository context, interpret natural-language input, and decide which actions to take next.”

With Claude Code, the threat arises when a bad actor hides a prompt injection in GitHub content such as pull requests, comments, or issues. The AI agent may treat the malicious prompt as a legitimate command and follow its instructions, giving the attacker access to files that contain sensitive data.

A key flaw in Claude Code Action was that while subprocess execution paths like Bash were isolated in a sandbox environment, the same was not true for the Read tool.

“Rather than routing Read operations through the same secure isolation boundary as Bash, these operations represent direct, in-process calls,” Edry and Eliahu wrote. “They inherently bypass the Bubblewrap sandbox, operating with full access to the process’s environment variables.”
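
The difference between an in-process call and an isolated subprocess can be illustrated with a minimal Python sketch. It is not Claude Code’s implementation and does not use Bubblewrap; it assumes a hypothetical secret named DEMO_API_KEY and only shows that a child process started with a scrubbed environment cannot see a variable that in-process code reads directly.

```python
import os
import subprocess
import sys
from typing import Optional

# Hypothetical secret name, used only for this illustration.
SECRET_NAME = "DEMO_API_KEY"


def read_in_process(name: str) -> Optional[str]:
    """An in-process call sees the full environment of the host process."""
    return os.environ.get(name)


def read_in_scrubbed_child(name: str) -> str:
    """Look up the variable from a child process that gets a minimal environment."""
    code = "import os, sys; print(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        # Pass only what the child needs; the secret is not inherited.
        env={"PATH": os.environ.get("PATH", "")},
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

If DEMO_API_KEY is set in the parent process, read_in_process returns its value while read_in_scrubbed_child returns an empty string, which is roughly the gap the researchers describe between the Read path and the sandboxed Bash path.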

Test Proves the Threat

The researchers successfully ran a test prompt injection payload through Claude Code Action, noting that it was able to evade two defense layers: Claude’s safety and system-prompt refusal layer and GitHub’s Secret Scanner. The malicious prompt invoked the Read tool and returned an API key. This wouldn’t have happened if the Read tool had been isolated in the same sandbox as Bash, they wrote.

Microsoft reported the flaw to Anthropic in late April, and the AI vendor mitigated the issue in Claude Code 2.1.128 by having the Read tool unconditionally reject several files in /proc/ in order to protect those files from exfiltration.
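
A mitigation of this kind can be sketched as a path check that runs before any file read. The Python below is an illustration under stated assumptions, not Anthropic’s code: the article does not say which /proc/ files are rejected, so the deny patterns (per-process environ files) are hypothetical, as are the function names is_denied and guarded_read.

```python
import posixpath
import re

# Hypothetical deny patterns; the actual list used by Claude Code is not given here.
# posixpath.normpath can leave exactly two leading slashes in place, hence "/{1,2}".
_DENIED = [
    re.compile(r"/{1,2}proc/(self|thread-self|\d+)/environ"),
    re.compile(r"/{1,2}proc/(self|thread-self|\d+)/task/\d+/environ"),
]


def is_denied(path: str) -> bool:
    """Return True if the lexically normalized path matches a denied /proc file."""
    # normpath collapses "..", "." and repeated slashes. It does not resolve
    # symlinks, so a production guard would need to handle those as well.
    normalized = posixpath.normpath(path)
    return any(pattern.fullmatch(normalized) for pattern in _DENIED)


def guarded_read(path: str) -> str:
    """Read a file only if it passes the deny check; raise PermissionError otherwise."""
    if is_denied(path):
        raise PermissionError(f"read of {path!r} is not allowed")
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

The check is unconditional in the sense that it does not depend on what the prompt says or who asked: the path is rejected before the file is opened.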

Natural Language as Executable Code

Edry and Eliahu said defenders and developers need to understand the security risks that AI agents raise within the development environment. Integrating AI into GitHub Actions doesn’t just improve productivity, they wrote, adding that “it is a fundamental rewrite of the CI/CD security model. Right now, development is moving faster than defense.”

“We are entering an era where natural language is executable code, and untrusted inputs like GitHub issues must be treated as hostile by default,” the researchers wrote. “A single, carefully crafted comment combined with a misunderstood trust boundary is all it takes to walk away with production credentials.”
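
One way to act on “hostile by default” is to decide, before the agent runs, whether the author of the triggering content is trusted, and to withhold tools otherwise. The Python sketch below assumes the workflow has access to the event’s author_association value, which GitHub’s API reports for issues, pull requests, and comments. The trusted set, the function names, and the policy itself are illustrative choices, not something the researchers or Anthropic prescribe.

```python
# author_association values this sketch treats as trusted. This is an assumption;
# choose the set that matches the repository's own policy.
TRUSTED_ASSOCIATIONS = frozenset({"OWNER", "MEMBER", "COLLABORATOR"})


def is_trusted_author(author_association: str) -> bool:
    """Return True only for associations in the trusted set."""
    return author_association.strip().upper() in TRUSTED_ASSOCIATIONS


def allowed_tools(author_association: str) -> list:
    """Hypothetical policy: content from untrusted authors gets no tools at all."""
    if is_trusted_author(author_association):
        return ["Read", "Bash"]
    return []
```

A gate like this narrows who can reach the agent but does not make trusted content safe, so it complements rather than replaces isolating the tools themselves.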

Anthropic has run into other security issues regarding Claude Code, including three critical vulnerabilities late last year that could have led to system takeover or stolen API keys, and a leak in March of more than 510,000 lines of source code in 1,906 files.