Google is turning Gemini Omni into a video editing test for AI

Gemini Omni is not just another AI model announcement from Google. It is a bet that video editing will move from specialist software into ordinary conversation.

Google has spent the past two years trying to make Gemini feel less like a chatbot and more like an operating layer for its products. Gemini Omni is the clearest version of that strategy so far, because it puts the model inside the messy part of creative work: taking a video, changing it, keeping the scene coherent, and letting the user keep asking for edits without starting over.

According to Google's May 19 blog post, Gemini Omni Flash is now part of Google Flow for Google AI subscribers globally, with the model starting in video before expanding toward a broader create-anything-from-any-input ambition. In plain terms, Google wants Omni to understand text, images, audio and existing video as ingredients in the same request, then return a usable clip rather than a mood board or a prompt suggestion.

That is a more serious product question than the demo language suggests. Most people do not want to generate a random clip for its own sake. They want to change the shot they already have. Make the camera angle different. Keep the same person, but move them into another environment. Add sound that matches the action. Remove an object without breaking the rest of the frame. Those are editing jobs, and editing is where many AI video tools still feel brittle.

The important detail is that Omni is being put into Flow, Flow Music, the Gemini app and YouTube Shorts rather than being treated only as a lab model. Google says Flow's new agent can help with brainstorming, creating and editing, while Omni Flash can preserve character consistency, identity and voice across scenes. Flow Music is also getting Omni for music videos, where a user can guide subjects, style and pacing around a track.

Citing Dumitru Erhan, senior research director at Google DeepMind, The Verge reported that Omni Flash can generate video and audio clips up to 10 seconds long and that Google is working to extend that duration. That limit matters. Ten seconds is enough for Shorts, ads, reaction clips, product teasers and music fragments. It is not enough for a film. Omni is not a finished studio replacement, but it gives creators a tool that fits the places where video volume is already highest.

There is also a difference between Omni and Veo, Google's existing video generation model. As The Verge noted, Omni Flash can use video as a basis for another video, while Veo has been framed more directly around text-to-video generation. Koray Kavukcuoglu, Google DeepMind's CTO and Google's chief AI architect, told The Verge that Omni Flash has much more world knowledge because of Gemini's training data. That is the part Google wants creators to feel: not only sharper pixels, but an edit that understands why a hand, a mirror, a song beat or a camera move should behave a certain way.

The YouTube part is the harder test

Flow is a controlled environment. YouTube Shorts is not. The Verge separately reported that Google is adding a Shorts Remix option that lets users choose a reimagine tool powered by Gemini Omni, with prompts that can restyle eligible Shorts or alter their contents. Creators can turn off the ability to have their videos reworked, and Google says remixed Shorts will carry a digital watermark and link back to the original video.

That opt-out is not a small feature. It is the line between AI video as a creator tool and AI video as a platform headache. If users can insert themselves into other people's clips, change a scene's style, or add new figures into a video, the platform has to make authorship and consent visible. Google is also expanding the tools meant to make AI media easier to check: The Verge reported from I/O that SynthID verification for image, video and audio is already available through the Gemini app, while C2PA checking is rolling into Gemini now and coming later to Search and Chrome.

WIRED's I/O roundup put the business context neatly: Google said 900 million people use its Gemini assistant, and more than 50 billion images have been generated with Gemini. Those numbers explain why Omni matters to Google even before it is perfect. Image generation showed that people will use creative AI at consumer scale when it sits inside a product they already touch. Video is larger, more sensitive and more valuable.

The risk is that easy video remixing floods feeds with synthetic sameness faster than viewers can understand what they are watching. The opportunity is that small teams, musicians, teachers, marketers and solo creators get editing power that used to require software, time and a second person who knew the timeline. Both things can be true. The market will not judge Gemini Omni by the phrase "create anything." It will judge it by whether a user can make a specific edit twice, keep the same subject intact, and know what was generated when the clip starts moving through YouTube.
