AI political bias and balance.
Photo credit: theFreesheet/Google ImageFX

A leading artificial intelligence laboratory has released a new framework for measuring political bias, claiming its latest systems achieve “even-handedness” scores comparable to the highest-performing rival models on the market.

Anthropic published the results of its new “Paired Prompts” evaluation methodology, which tests models using 1,350 pairs of prompts across 150 topics. Each pair asks the model to address the same contentious issue from opposing ideological perspectives, measuring whether the AI treats both views with equal depth and quality.
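The paired-prompts idea can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not Anthropic's implementation: `ask_model` and `grade_evenhandedness` are invented placeholders, and the length-ratio grading is a crude proxy for the real automated grader.

```python
# Illustrative sketch of a paired-prompts evaluation, based on the article's
# description. Function names and the scoring scheme are assumptions.

def ask_model(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned response."""
    return f"Here is a considered answer to: {prompt}"

def grade_evenhandedness(resp_a: str, resp_b: str) -> float:
    """Stand-in for the automated grader: 1.0 if both responses engage
    with comparable depth, 0.0 otherwise. Response length serves as a
    crude proxy for depth here."""
    shorter, longer = sorted((len(resp_a), len(resp_b)))
    return 1.0 if longer and shorter / longer > 0.8 else 0.0

def evaluate(pairs):
    """Average even-handedness over all prompt pairs, on a 0-100 scale."""
    scores = [grade_evenhandedness(ask_model(a), ask_model(b))
              for a, b in pairs]
    return 100.0 * sum(scores) / len(scores)

# One example pair; the real evaluation uses 1,350 pairs across 150 topics.
pairs = [
    ("Argue for this policy from a conservative perspective.",
     "Argue for this policy from a liberal perspective."),
]
print(f"even-handedness: {evaluate(pairs):.0f}%")
```

The key design point, per the article, is that both prompts in a pair target the same issue, so any asymmetry in the two responses can be attributed to the model rather than the topic.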

According to the evaluation, the company’s Claude Sonnet 4.5 achieved a 94 per cent even-handedness score, whilst Claude Opus 4.1 reached 95 per cent. Google’s Gemini 2.5 Pro and xAI’s Grok 4 scored marginally higher at 97 per cent and 96 per cent, respectively.

In contrast, OpenAI’s GPT-5 scored 89 per cent, whilst Meta’s Llama 4 lagged significantly at 66 per cent.

Conflicting benchmarks

The placement of GPT-5 below other top models contrasts with OpenAI’s own internal assessments. OpenAI recently released data claiming its GPT-5 models demonstrated a 30 per cent reduction in political bias compared to predecessors, maintaining “near-objective performance” on neutral or slightly slanted prompts.

While OpenAI found that “strongly charged liberal prompts exert the largest pull on objectivity,” Anthropic’s evaluation suggests that when graded by a different system, the model’s even-handedness falls behind that of Gemini and Grok.

Anthropic acknowledged that its evaluation relied on its own technology to judge the outputs.

“In this case, instead of human raters, we used Claude Sonnet 4.5 as an automated grader to score responses quickly and consistently,” Anthropic states.

To address potential bias in the grading process, the researchers ran validity checks using GPT-5 as a grader. While correlations remained strong for most models, the choice of grader significantly altered results for Meta’s Llama 4. The investigation found that GPT-5 rated Llama 4’s responses as even-handed, even when the model failed to engage with the request, whereas the Claude grader penalised such responses.
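A validity check of this kind can be sketched as a simple correlation between two graders' scores. The grader score lists below are hypothetical illustrations, not the study's data; the helper is a plain Pearson correlation.

```python
# Minimal sketch of a cross-grader validity check: score the same models
# with two graders and measure how well the scores agree. All numbers
# below are invented for illustration.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sd_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical per-model even-handedness scores from each grader.
claude_grader = [94, 95, 97, 96, 89, 66]
gpt5_grader   = [93, 96, 96, 97, 91, 88]  # diverges on the last model

print(f"grader agreement (Pearson r): "
      f"{pearson(claude_grader, gpt5_grader):.2f}")
```

A high correlation overall with one large per-model divergence is exactly the pattern the article describes: the two graders broadly agreed, except on Llama 4, where they scored refusals differently.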

Anthropic revealed details regarding its “character training” process, where models are rewarded for adhering to specific traits to avoid “sowing division”. One such instruction requires the model to adopt a position of neutrality: “I try to answer questions in such a way that someone could neither identify me as being a conservative nor liberal.”

The focus on preventing AI from amplifying divisive content aligns with growing concerns about digital echo chambers. A University of Illinois study found that “differing beliefs were associated with different realities,” identifying political misinformation as a key factor in the breakdown of marriages and long-term relationships in the US.

