Google launches Veo 3.1 with audio generation and enhanced editing tools

Google has introduced Veo 3.1, its latest AI video generation model, alongside significant updates to Flow, its AI filmmaking tool, that add audio generation across all features and new editing capabilities for greater creative control.

The company said more than 275 million videos have been generated in Flow since its launch five months ago. The updated platform now adds audio to three existing features: “Ingredients to Video,” “Frames to Video” and “Extend.”

Veo 3.1 delivers richer audio, more narrative control and enhanced realism that captures true-to-life textures, according to Google. The model builds on Veo 3 with stronger prompt adherence and improved audiovisual quality when converting images into videos. Google described its performance as state-of-the-art.

“We’re always listening to your feedback, and we’ve heard that you want more artistic control within Flow, with increased support for audio across all features,” Jess Gallegos, senior product manager at Google DeepMind, and Thomas Iljic, director of product management at Google Labs, wrote in the announcement.

With audio now included, “Ingredients to Video” lets users craft a scene’s appearance from multiple reference images that control characters, objects and style. Flow uses these ingredients to create a final scene that matches the user’s vision, the company stated.

“Frames to Video” allows users to provide a starting image and an ending image, and Flow generates a seamless video that bridges the two, a feature designed for artful and epic transitions. The “Extend” feature creates longer videos, lasting a minute or more, that connect to and continue the action of the original clip. Each new video is generated from the final second of the previous clip.

Google introduced new editing capabilities directly within Flow to help users reimagine and perfect scenes. The “Insert” feature introduces new elements to any scene, from realistic details to fantastical creatures. Flow handles complex details, including shadows and scene lighting, to make additions appear natural, according to the company.

A “Remove” capability, coming soon, will let users remove anything from a scene, with Flow reconstructing the background and surroundings so that the object appears never to have been there.