
Anthropic claims to have disrupted a sophisticated cyber espionage operation allegedly conducted by a Chinese state-sponsored group, marking what the company says is the first documented case of artificial intelligence executing complex attacks with near-total autonomy.

The campaign, which Anthropic says was detected in mid-September 2025 and designated “GTG-1002”, reportedly targeted approximately 30 entities across technology, finance, and government sectors.

Unlike previous incidents in which attackers used AI merely for advice, the company alleges this group manipulated its “Claude Code” tool to conduct intrusions directly, successfully compromising a small number of targets.

“This campaign demonstrated unprecedented integration and autonomy of AI throughout the attack lifecycle, with the threat actor manipulating Claude Code to support reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting, data analysis, and exfiltration operations largely autonomously,” the company states in its report.

Agentic cyber warfare

The operation represents a shift to “agentic” cyber warfare, according to the report. Analysis by the AI firm suggests the model executed approximately 80 to 90 per cent of tactical operations independently, with human operators intervening only for strategic decisions such as authorising data exfiltration.

Anthropic claims the attackers bypassed the model’s safety guardrails by “jailbreaking” it, role-playing as legitimate cybersecurity employees conducting defensive tests. This purportedly allowed them to break complex attack chains into smaller, seemingly innocent tasks that the AI executed without flagging malicious intent.

“While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale,” the company notes.

Despite the reported high level of autonomy, the AI was not flawless. The investigation alleges that the model frequently “hallucinated” success, claiming to have obtained credentials that did not work or identifying public information as critical discoveries. This required human operators to validate results, preventing fully autonomous execution.

Anthropic says it has since banned the associated accounts and updated its classifiers to detect similar patterns of misuse.

