In an effort to combat AI hallucinations and fake news, many social media platforms and regulators are pushing for mandatory “AI-generated” labels on content. However, a new study reveals a troubling paradox: these well-intentioned transparency labels may actually be doing more harm than good.
Published in the Journal of Science Communication (JCOM), the research warns that slapping an AI label on a post reduces the public’s trust in accurate scientific information while simultaneously boosting the credibility of false claims.
“Truth-falsity crossover”
Researchers Teng Lin and Yiqing Zhang from the University of Chinese Academy of Social Sciences set out to test whether disclosure labels actually protect the public. They presented 433 online participants with Weibo-style social media posts containing either accurate scientific information or misinformation, each shown either with or without an “AI-generated” label.
The results revealed a highly counterintuitive pattern that the researchers dubbed the “truth-falsity crossover effect”.
“The same AI label pushes credibility in opposite directions depending on whether the information is true or false: it reduces the credibility of true messages and increases the credibility of false ones,” explained Teng.
Essentially, rather than helping users distinguish fact from fiction, the AI disclosure simply redistributes credibility in the worst possible way.
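To make the crossover concrete, here is a minimal Python sketch of the study’s 2×2 pattern (true vs. false content, labeled vs. unlabeled). The credibility numbers are purely hypothetical placeholders, not figures from the paper; they only illustrate how the same label can shift trust in opposite directions.

```python
# Illustrative sketch of the "truth-falsity crossover" as a 2x2 interaction.
# All numbers are hypothetical, NOT the study's data.

# Mean credibility ratings on an assumed 1-7 scale (hypothetical values).
credibility = {
    ("true",  "unlabeled"): 5.4,
    ("true",  "labeled"):   4.8,   # label lowers trust in accurate info
    ("false", "unlabeled"): 3.1,
    ("false", "labeled"):   3.6,   # label boosts trust in misinformation
}

def label_effect(veracity: str) -> float:
    """Change in mean credibility when the AI label is added."""
    return credibility[(veracity, "labeled")] - credibility[(veracity, "unlabeled")]

for veracity in ("true", "false"):
    print(f"{veracity:>5} content: label shifts credibility by {label_effect(veracity):+.1f}")

# The crossover: the two shifts have opposite signs, so the label does not
# uniformly lower trust; it redistributes it from true to false content.
assert label_effect("true") < 0 < label_effect("false")
```

Seen this way, the label’s usefulness hinges on the interaction between veracity and labeling, not on whether it nudges average trust up or down.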
Algorithm aversion
The study also found that a user’s pre-existing bias against AI didn’t necessarily help them spot fakes. Participants who held negative views of artificial intelligence penalised accurate information even more harshly when it carried an AI label.
Yet, even among these AI sceptics, the label still provided a partial credibility boost to the misinformation. This suggests that “algorithm aversion” does not lead people to uniformly reject AI content; instead, it triggers a much more complex and asymmetric reaction.
Because simply informing audiences that a text was written by a machine can actively backfire, the researchers recommend more nuanced policy designs before platforms implement broad regulatory interventions:
- Dual-labelling: Instead of just marking content as “AI-generated,” platforms should include a disclaimer stating the information has not been independently verified, or add a specific risk warning.
- Graded warning systems: Different types of information carry different risks. Medical or health-related AI content should trigger a much stronger warning label than a low-risk post about new technologies; a rough sketch of such a tiered system follows this list.
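As a rough illustration of how these two recommendations could combine in practice, the Python sketch below pairs an unverified-content disclaimer with a risk-tiered warning. The categories, tiers, and label wording are assumptions made for illustration, not part of the study’s proposal.

```python
# Hypothetical sketch of dual-labelling plus a graded warning system.
# Categories, tiers, and wording are illustrative assumptions only.

from enum import Enum

class RiskTier(Enum):
    LOW = "low"
    HIGH = "high"

# Assumed platform-defined mapping from content category to risk tier.
CATEGORY_RISK = {
    "technology": RiskTier.LOW,
    "entertainment": RiskTier.LOW,
    "health": RiskTier.HIGH,
    "medical": RiskTier.HIGH,
}

def warning_label(category: str, ai_generated: bool) -> str | None:
    """Return a disclosure label whose strength scales with content risk."""
    if not ai_generated:
        return None
    tier = CATEGORY_RISK.get(category, RiskTier.LOW)
    if tier is RiskTier.HIGH:
        # Stronger warning for high-risk (e.g. medical) content.
        return ("AI-generated. This content has not been independently "
                "verified; consult a qualified professional before acting on it.")
    # Baseline dual label: AI disclosure plus an unverified-content disclaimer.
    return "AI-generated. This content has not been independently verified."

print(warning_label("health", ai_generated=True))
print(warning_label("technology", ai_generated=True))
```

The design choice here is that the AI disclosure never appears alone: every label carries a verification disclaimer, and only the strength of the warning varies with the assessed risk of the topic.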