
Your holiday videos could soon sound as immersive as a cinema blockbuster thanks to a new artificial intelligence tool that generates realistic 3D audio from standard recordings.

Researchers at the University of Electro-Communications in Tokyo have developed a system that uses visual cues to “spatialise” flat, single-channel sound, allowing listeners to hear where each sound is coming from.

Most smartphones and cameras record audio using a single microphone. This results in “monaural” sound that lacks depth and direction, so even if a car speeds past on the left side of the screen, the audio feels like it is coming from the centre.

“Even when a video clearly shows where sounds are coming from, the audio itself often feels flat and unrealistic,” the researchers explain.

Seeing the sound

To fix this, the new AI analyses the video to identify where sound-producing objects — such as a musician or a vehicle — are located on the screen.

It then modifies the audio to match these visual positions, mimicking how humans naturally hear with two ears (binaural audio) to create a sense of space and distance.
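To give a feel for what matching audio to a visual position means, here is a minimal sketch of constant-power stereo panning in Python. It is not the researchers' system, which is a trained AI model that has to find the sound source in the video by itself. The function name pan_mono_to_stereo and the x_position parameter (0.0 for the left edge of the frame, 1.0 for the right edge) are assumptions made purely for illustration, and real binaural rendering also involves timing and frequency cues that this toy example leaves out.

```python
import math


def pan_mono_to_stereo(samples, x_position):
    """Split a mono signal into (left, right) pairs with constant-power panning.

    samples:    iterable of mono sample values (floats).
    x_position: hypothetical horizontal position of the sound source in the
                frame, from 0.0 (far left) to 1.0 (far right).
    """
    if not 0.0 <= x_position <= 1.0:
        raise ValueError("x_position must be between 0.0 and 1.0")

    # Map the position onto a quarter circle so that
    # left_gain**2 + right_gain**2 == 1 and overall loudness stays steady.
    angle = x_position * math.pi / 2
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)

    return [(s * left_gain, s * right_gain) for s in samples]


# Illustration: a short 440 Hz tone from a source near the right of the frame.
sample_rate = 16000
mono = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(160)]
stereo = pan_mono_to_stereo(mono, x_position=0.9)
```

With x_position set to 0.9, the right channel is louder than the left, so the tone seems to come from the right. A learned system would in effect estimate that position from the video frames rather than being told it.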

“The key idea… is that people naturally use vision to interpret sound,” the team notes. “If a musician is standing on the right side of the image, listeners intuitively expect the sound to be heard from the right.”

Real-world application

While previous attempts to fake 3D audio often failed under real-world conditions, this new method was trained specifically on real-world recordings.

Tests on a newly created dataset of synchronised video and audio showed that the system could successfully trick listeners into perceiving direction, whereas earlier methods collapsed into flat sound.

The technology could be used to upgrade online videos, enhance virtual reality experiences, or breathe new life into old archival footage without the need for expensive, specialist recording equipment.
