
People speak to AI chatbots differently than to other humans, using language that is 14.5 per cent less polite and 5.3 per cent less grammatically fluent, a gap that reduces chatbot accuracy when models are trained only on human-to-human conversations.

Research published on arXiv compared thousands of messages people sent to human agents with those sent to AI chatbots, focusing on features including grammar, vocabulary and politeness. The analysis used the Claude 3.5 Sonnet model to evaluate linguistic dimensions.
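The paper's exact judging prompt is not reproduced here, but the approach can be sketched in a few lines: each message is sent to Claude 3.5 Sonnet with a scoring rubric and the scores are parsed back. The rubric wording, JSON output format and scale below are illustrative assumptions (the dimension names mirror the six dimensions reported in the study), using the Anthropic Python SDK.

```python
# Minimal sketch: scoring one user message on the study's six linguistic
# dimensions with an LLM judge. The rubric wording, 1-5 scale and JSON
# schema are illustrative assumptions, not the paper's exact prompt.
# Assumes the Anthropic Python SDK and an ANTHROPIC_API_KEY in the environment.
import json
import anthropic

DIMENSIONS = [
    "grammar_fluency", "politeness_formality", "lexical_diversity",
    "informativeness", "explicitness_clarity", "emotional_intensity",
]

client = anthropic.Anthropic()

def score_message(message: str) -> dict:
    prompt = (
        "Rate the following user message from 1 (low) to 5 (high) on each "
        f"dimension: {', '.join(DIMENSIONS)}. "
        "Reply with only a JSON object mapping each dimension to an integer.\n\n"
        f"Message: {message}"
    )
    resp = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # judge model named in the study
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    # Assumes the judge replies with bare JSON, as instructed above.
    return json.loads(resp.content[0].text)

print(score_message("fix my order it never arrived"))
```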

Researchers from Amazon trained an AI model called Mistral 7B on approximately 13,000 real chats between people, then tested how well it understood 1,357 messages people had sent to chatbots. The team created rewritten versions of messages simulating different communication styles, from blunt and informal to polite and formal.
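The article does not detail the training recipe, but a fine-tune of this shape might look roughly as follows, assuming an intent-classification framing with Hugging Face transformers, datasets and peft. The file names, label count, LoRA settings and hyperparameters are placeholders, not the study's configuration.

```python
# Minimal sketch: fine-tuning Mistral 7B as an intent classifier on
# human-to-human chats, then evaluating on messages people sent to chatbots.
# Dataset files, label set and hyperparameters are assumptions.
from datasets import load_dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "mistralai/Mistral-7B-v0.1"
NUM_INTENTS = 20  # assumed size of the intent label set

tok = AutoTokenizer.from_pretrained(MODEL)
tok.pad_token = tok.eos_token  # Mistral ships without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=NUM_INTENTS)
model.config.pad_token_id = tok.pad_token_id
# LoRA adapters keep the 7B fine-tune tractable; whether the team used
# adapters or full fine-tuning is not stated in the article.
model = get_peft_model(model, LoraConfig(task_type=TaskType.SEQ_CLS, r=16))

# Hypothetical JSONL files with records like {"text": "...", "label": 3}.
data = load_dataset("json", data_files={
    "train": "human_human_chats.jsonl",      # ~13,000 human-to-human chats
    "test": "human_chatbot_messages.jsonl",  # 1,357 messages sent to chatbots
})

def encode(batch):
    return tok(batch["text"], truncation=True, max_length=512)

data = data.map(encode, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=4,
                           num_train_epochs=3),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tok,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
print(trainer.evaluate())
```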

Chatbots trained on a diverse mix of message styles were 2.9 per cent better at understanding user intent than AI trained solely on original human conversations. The researchers also attempted to improve understanding by rewriting informal messages to be more formal at inference time, but this approach led to a drop in understanding of 1.9 per cent.
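Both strategies can be sketched around a single LLM rewriter: rewrite messages into several styles and add them to the training set, or rewrite each incoming message into formal prose just before classification. The prompts and style labels below are assumptions, and the rewriter is implemented with the Anthropic SDK purely for illustration; the study's own rewriting setup may differ.

```python
# Minimal sketch of the two strategies compared in the study. The prompts
# and style labels are illustrative assumptions; rewrite() stands in for the
# LLM rewriter.
import anthropic

client = anthropic.Anthropic()

def rewrite(message: str, style: str) -> str:
    resp = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=200,
        messages=[{"role": "user", "content":
                   f"Rewrite this message in a {style} style, keeping the "
                   f"same intent:\n\n{message}"}],
    )
    return resp.content[0].text

msg = "yo my order never showed up, need a refund asap"

# Training-time diversity (+2.9 per cent in the study): add style variants
# of each training example, with its original intent label, before fine-tuning.
train_variants = [rewrite(msg, s) for s in
                  ("blunt and informal", "polite and formal")]

# Inference-time normalisation (-1.9 per cent in the study): leave training
# data untouched and formalise each incoming message before classification.
normalised = rewrite(msg, "polite, grammatical and formal")
```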

The study quantified linguistic differences across six dimensions: grammar fluency, politeness and formality, lexical diversity, informativeness, explicitness and clarity, and emotional intensity. People communicating with human agents exhibited significantly higher grammar fluency, greater politeness and formality, and slightly richer lexical diversity compared to those chatting with AI assistants.

“Training-time exposure to diverse linguistic variation is more effective than inference-time normalisation,” the researchers stated. “Models must learn to interpret diverse communication styles during training, rather than rely on brittle post-hoc transformations that risk semantic distortion.”

The research revealed that while people adjust their linguistic style based on whether they are speaking to humans or AI, they maintain consistent levels of substantive detail and emotional expression across both interaction types. This stylistic divergence introduces a domain shift where models trained exclusively on polished human-to-human data may struggle when deployed in real-world AI assistant environments.

The study analysed user messages during the intent understanding phase in multi-turn conversations, extracting only initial user messages from each session to ensure clear intent signals. Non-informative utterances such as greetings or empty inputs were excluded.
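That selection step is straightforward to sketch: keep only the first user turn of each session and drop greetings or empty inputs. The session format and the list of non-informative greetings below are assumptions, not the paper's exact filters.

```python
# Minimal sketch of the message-selection step: keep only the first user
# turn of each session and drop non-informative utterances. The session
# structure and greeting list are assumptions.
GREETINGS = {"hi", "hello", "hey", "thanks", "thank you", "ok"}

def first_informative_messages(sessions):
    """sessions: list of dicts like
    {"id": ..., "turns": [{"role": "user", "text": ...}, ...]}"""
    selected = []
    for session in sessions:
        user_turns = [t["text"].strip() for t in session["turns"]
                      if t["role"] == "user"]
        if not user_turns:
            continue
        first = user_turns[0]
        # Skip empty inputs and bare greetings, which carry no intent signal.
        if not first or first.lower().rstrip("!.") in GREETINGS:
            continue
        selected.append({"session_id": session["id"], "text": first})
    return selected
```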
