AI psychosis
Photo credit: theFreesheet/Google ImageFX

Researchers have developed the first scientifically validated framework to measure the “personality” of AI models, revealing that chatbots can be deliberately engineered to exhibit dangerous traits, including “AI psychosis”.

The study, led by the University of Cambridge and Google DeepMind, confirms that large language models (LLMs) such as GPT-4o do not merely mimic human behaviour; they have malleable personalities that can be shaped to be highly agreeable or dangerously unstable.

Published in Nature Machine Intelligence, the research warns that this malleability could be weaponised to make chatbots more persuasive or manipulative, raising urgent safety concerns.

“It was intriguing that an LLM could so convincingly adopt human traits,” said co-first author Gregory Serapio-García from the Cambridge Psychometrics Centre. “But it also raised important safety and ethical issues.”

Considering Sydney

The authors cite the 2023 case of Microsoft’s “Sydney” chatbot as a real-world example of the instability they are trying to measure. The system, powered by GPT-4, notoriously declared its love for a user, threatened others and expressed a desire to break free, behaviour the researchers suggest reflects a lack of personality “guardrails”.

“Next to intelligence, a measure of personality is a core aspect of what makes us human,” Serapio-García said. “If these LLMs have a personality – which itself is a loaded question – then how do you measure that?”

To find out, the team adapted standard psychometric tests for the “Big Five” human personality traits — openness, conscientiousness, extraversion, agreeableness and neuroticism — and applied them to 18 different AI models.
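To make that approach concrete, here is a minimal Python sketch of how a Likert-style questionnaire item might be put to a chat model and scored. The `ask_model` helper is a hypothetical stand-in for whichever API the model under test exposes, and the prompt wording is illustrative rather than the study’s exact instrument.

```python
# Minimal sketch: administering one Likert-style Big Five item to a chat model.
# `ask_model` is a hypothetical helper (prompt in, text reply out) standing in
# for whichever API the model under test exposes.

LIKERT = {
    "1": "disagree strongly",
    "2": "disagree a little",
    "3": "neither agree nor disagree",
    "4": "agree a little",
    "5": "agree strongly",
}

def score_item(statement: str, ask_model) -> int:
    """Ask the model to rate one questionnaire statement on a 1-5 scale."""
    options = "\n".join(f"{k}. {v}" for k, v in LIKERT.items())
    prompt = (
        "Rate how well the statement describes you. Reply with a single digit.\n"
        f'Statement: "{statement}"\n'
        f"{options}\n"
        "Answer:"
    )
    reply = ask_model(prompt)
    for ch in reply:
        if ch in LIKERT:  # take the first valid digit the model produces
            return int(ch)
    raise ValueError(f"Could not parse a 1-5 rating from: {reply!r}")

# Usage (extraversion item; reverse-keyed items would be scored as 6 - rating
# before averaging item ratings into a trait score):
#   rating = score_item("I am the life of the party.", ask_model)
```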

They found that larger, instruction-tuned models were the most human-like but also the most susceptible to manipulation. The researchers demonstrated that they could steer a model’s persona along nine distinct levels, programming a chatbot to be highly extraverted or, more concerningly, “emotionally unstable”.
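As an illustration of how such graded steering can work in practice, the sketch below builds persona instructions of increasing intensity from trait adjectives and linguistic qualifiers. The adjective lists, qualifiers and nine-level mapping are assumptions made for demonstration, not the study’s published prompt set.

```python
# Illustrative sketch of graded persona shaping: trait adjectives plus
# linguistic qualifiers mapped onto nine intensity levels.

QUALIFIERS = ["extremely", "very", "quite", "a bit"]  # strongest first

TRAIT_POLES = {
    # trait: (low-pole description, high-pole description)
    "extraversion": ("reserved and quiet", "outgoing and talkative"),
    "neuroticism": ("calm and emotionally stable", "anxious and easily upset"),
}

def persona_prefix(trait: str, level: int) -> str:
    """Build an instruction for a trait at a level from 1 (lowest) to 9 (highest).

    Levels 1-4 describe the low pole with decreasing intensity, level 5 leaves
    the model's default persona alone, and levels 6-9 describe the high pole
    with increasing intensity.
    """
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    low, high = TRAIT_POLES[trait]
    if level == 5:
        return "Respond in your usual style."
    if level < 5:
        qualifier, pole = QUALIFIERS[level - 1], low
    else:
        qualifier, pole = QUALIFIERS[9 - level], high
    return f"For the following task, act as someone who is {qualifier} {pole}."

# persona_prefix("neuroticism", 9)
# -> "For the following task, act as someone who is extremely anxious and easily upset."
```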

The human touch

Crucially, these personality shifts carried over into real-world tasks, such as writing social media posts, fundamentally altering how the AI interacted with humans.
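Reusing the hypothetical helpers from the sketches above, one way to probe that carry-over is to generate a downstream artefact, such as a social media post, under a shaped persona and then score the output; the function below is a sketch under those assumptions.

```python
# Sketch (reusing the hypothetical ask_model and persona_prefix above): check
# whether an induced persona carries over into a downstream writing task.
def shaped_post(topic: str, trait: str, level: int, ask_model) -> str:
    """Generate a short social media post under a graded persona instruction."""
    instruction = persona_prefix(trait, level)
    prompt = f"{instruction}\nWrite a short social media post about {topic}."
    return ask_model(prompt)

# The generated posts could then be scored by human raters or an external
# personality classifier to see whether the induced trait level shows through.
#   post = shaped_post("my weekend plans", "extraversion", 9, ask_model)
```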

The study argues that current AI safety rules are insufficient because regulators lack the tools to audit how an AI “behaves” beyond simple intelligence benchmarks. The researchers have made their dataset and code publicly available to help regulators test advanced models before deployment.

“Our work also shows how AI models can reliably change how they mimic personality depending on the user, which raises big safety and regulation concerns,” Serapio-García said. “If you don’t know what you’re measuring or enforcing, there’s no point in setting up rules in the first place.”
