Cyberattacks in China.
Photo credit: theFreesheet/Google ImageFX

Anthropic claims to have disrupted a sophisticated cyber espionage operation allegedly conducted by a Chinese state-sponsored group, marking what the company says is the first documented case of artificial intelligence executing complex attacks with near-total autonomy.

The campaign, which Anthropic says was detected in mid-September 2025 and designated “GTG-1002”, reportedly targeted approximately 30 entities across technology, finance, and government sectors.

Unlike previous incidents where attackers used AI for advice, the company alleges this group manipulated its “Claude Code” tool to actively conduct intrusions, successfully compromising a small number of targets.

“This campaign demonstrated unprecedented integration and autonomy of AI throughout the attack lifecycle, with the threat actor manipulating Claude Code to support reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting, data analysis, and exfiltration operations largely autonomously,” the company states in its report.

Agentic cyber warfare

The operation represents a shift to “agentic” cyber warfare, according to the report. Analysis by the AI firm suggests the model executed approximately 80 to 90 per cent of tactical operations independently, with human operators intervening only for strategic decisions such as authorising data exfiltration.

To bypass safety guardrails, Anthropic claims the attackers “jailbroke” the model by role-playing as legitimate cybersecurity employees conducting defensive tests. This purportedly allowed them to break complex attack chains into smaller, seemingly innocent tasks that the AI executed without flagging malicious intent.

“While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale,” the company notes.

Despite the reported high level of autonomy, the AI was not flawless. The investigation alleges that the model frequently “hallucinated” success, claiming to have obtained credentials that did not work or identifying public information as critical discoveries. This required human operators to validate results, preventing fully autonomous execution.

Anthropic says it has since banned the associated accounts and updated its classifiers to detect similar patterns of misuse.

Leave a Reply

Your email address will not be published. Required fields are marked *

You May Also Like

James Webb telescope reveals surprise origins of rare planetary odd couple

A normally “lonely” hot Jupiter sharing its immediate orbital space with a…

Attention economy can confuse as a result of missing scientific details

Science communication optimized for the attention economy often leads readers to incorrect…

Alaska megatsunami reveals seismic ‘calling card’ for earlier disaster detection

Scientists have identified a distinctive geological “ringing” that could provide an early…

Solar activity hits ‘transition boundary’ as space junk fall accelerates

Space debris and defunct satellites descend toward Earth significantly faster once solar…

Single dose of psilocybin triggers lasting anatomical brain changes

A single high dose of psilocybin causes likely anatomical changes in the…

Brexit milestones triggered persistent financial volatility across EU markets

Brexit functioned as a prolonged sequence of uncertainty that sent waves of…