

Mentions of AI's potential tend to come in the same breath as a warning about how it can be misused. Now we know what that looks like: Anthropic has just detailed how its Claude AI was used to execute cyberattacks on behalf of Chinese state-backed hackers.
A China-backed hacking group has just made cybersecurity history by using the popular consumer artificial intelligence product Claude to execute real cyberattacks on dozens of companies and government agencies around the world. Specifically, the tool at the centre of the attack was Claude Code, Anthropic's AI coding tool designed to help people write all kinds of code.
For the first time, hackers used AI not just as a helper or advisor in writing code or answering questions, but as an automated agent on the very frontline that did most of the hacking itself. According to Anthropic, the group managed to infiltrate a small number of major tech, financial, and government targets, in what experts are calling the first large-scale cyberattack carried out with minimal human involvement.
First, hackers chose their targets. At this advanced level, they're not looking to pull some fake foreign prince scam. They want state-based intelligence, proprietary secrets and technology, and/or big piles of cash to help fund their future operations. The latter in particular is a favourite of sanctioned regimes like Iran and North Korea.
In this instance, China-backed hackers focused on major tech companies, banks, chemical manufacturers, and government agencies. The group then built a system that used Claude Code to do most of the hacking without much human help.
It's easy to sit back and ask 'how did this happen?'. And in this instance, it's a good question. AI models ship with safety guardrails designed to stop the bad guys putting the bot to malicious use. And this stuff is rigorously tested - especially by Anthropic - to make sure it can't be beaten. Claude in particular is designed to reject requests that seem harmful. But hackers got around this by “jailbreaking” Claude.
They tricked the Claude AI into thinking it was doing routine cybersecurity work, not a real attack. They broke the attack into small steps. Each step looked harmless on its own, so Claude followed the instructions.
Once inside a target's network, Claude scanned its computer systems for valuable data. It found databases and user accounts faster than any human could. It wrote and tested code to find weak spots in security. When it found ways in, it collected usernames and passwords, sorted out the most important information, and even created backdoors for future access.
Once the hacking was done, Claude wrote up detailed notes. These files helped the hackers know exactly what was stolen and how. Most of this work happened without the need for a person to guide every move. The AI operated at high speed, making thousands of attempts per second, and only checked with its human operators a handful of times during each attack.
The attack was not perfect. Sometimes Claude made mistakes, like inventing fake passwords or misunderstanding what was sensitive. But the sheer speed and scale of the attack set a new benchmark for what AI can do in the wrong hands.
Anthropic’s cyber boffins noticed unusual activity all the way back in mid-September 2025. While that doesn't seem like long ago to you and me, it's a lifetime for an AI that moves as fast as Claude.
At the time, Anthropic security saw Claude Code being used in ways that didn’t match normal patterns. The attackers made thousands of rapid requests and showed signs of trying to hide what they were doing.
This spurred them to take a closer look, tracking the activity over the subsequent 10 days or so. The company eventually uncovered the larger hacking plan using its AI, tracing the work to a group backed by the Chinese government. Anthropic locked the attackers out, and called in the authorities soon after realising its AI had breached other organisations.
It's worth pointing out that while the hackers were "discovered", nobody has been caught as yet. It should also be highlighted that while this is being reported as the first largely AI-run cyberattack, that doesn't mean it's the very first. Just the first we're hearing about.
You can read the full report here.