Jun 14, 2026 · 2:02 PM

Google is turning Gemini Omni into a video editing test for AI

Google's Gemini Omni brings conversational video editing into Flow, Flow Music, Gemini and YouTube Shorts. The real test is whether Google can make AI video useful without making authorship and trust harder to see.

Julian Lim

Gemini Omni is not just another AI model announcement from Google. It is a bet that video editing will move from specialist software into ordinary conversation.

Google has spent the past two years trying to make Gemini feel less like a chatbot and more like an operating layer for its products. Gemini Omni is the clearest version of that strategy so far, because it puts the model inside the messy part of creative work: taking a video, changing it, keeping the scene coherent, and letting the user keep asking for edits without starting over.

According to Google's May 19 blog post, Gemini Omni Flash is now part of Google Flow for Google AI subscribers globally, with the model starting in video before expanding toward a broader create-anything-from-any-input ambition. In plain terms, Google wants Omni to understand text, images, audio and existing video as ingredients in the same request, then return a usable clip rather than a mood board or a prompt suggestion.

That is a more serious product question than the demo language suggests. Most people do not want to generate a random clip for its own sake. They want to change the shot they already have. Make the camera angle different. Keep the same person, but move them into another environment. Add sound that matches the action. Remove an object without breaking the rest of the frame. Those are editing jobs, and editing is where many AI video tools still feel brittle.

The important detail is that Omni is being put into Flow, Flow Music, the Gemini app and YouTube Shorts rather than being treated only as a lab model. Google says Flow's new agent can help with brainstorming, creating and editing, while Omni Flash can preserve character consistency, identity and voice across scenes. Flow Music is also getting Omni for music videos, where a user can guide subjects, style and pacing around a track.

The Verge reported that Omni Flash can generate video and audio clips up to 10 seconds long, citing Dumitru Erhan, senior research director at Google DeepMind, and that Google is working to make that duration longer. That limit matters. Ten seconds is enough for Shorts, ads, reaction clips, product teasers and music fragments. It is not enough for a film. Omni Flash is not a finished studio replacement, but it gives creators a tool that fits the places where video volume is already highest.

There is also a difference between Omni and Veo, Google's existing video generation model. As The Verge noted, Omni Flash can use video as a basis for another video, while Veo has been framed more directly around text-to-video generation. Koray Kavukcuoglu, Google DeepMind's CTO and chief AI architect at Google, told The Verge that Omni Flash has much more world knowledge because of Gemini's training data. That is the part Google wants creators to feel: not only sharper pixels, but an edit that understands why a hand, a mirror, a song beat or a camera move should behave a certain way.

The YouTube part is the harder test

Flow is a controlled environment. YouTube Shorts is not. The Verge separately reported that Google is adding a Shorts Remix option that lets users choose a reimagine tool powered by Gemini Omni, with prompts that can restyle eligible Shorts or alter their contents. Creators can turn off the ability to have their videos reworked, and Google says remixed Shorts will carry a digital watermark and link back to the original video.

That opt-out is not a small feature. It is the line between AI video as a creator tool and AI video as a platform headache. If users can insert themselves into other people's clips, change a scene's style, or add new figures into a video, the platform has to make authorship and consent visible. Google DeepMind says content created or edited with Omni in Gemini, Flow or YouTube includes SynthID watermarking and C2PA Content Credentials, with verification through the Gemini app and support coming to Chrome and Search.

WIRED's I/O roundup put the business context neatly: Google said 900 million people use its Gemini assistant, and more than 50 billion images have been generated with Gemini. Those numbers explain why Omni matters to Google even before it is perfect. Image generation showed that people will use creative AI at consumer scale when it sits inside a product they already touch. Video is larger, more sensitive and more valuable.

The risk is that easy video remixing floods feeds with synthetic sameness faster than viewers can understand what they are watching. The opportunity is that small teams, musicians, teachers, marketers and solo creators get editing power that used to require software, time and a second person who knew the timeline. Both things can be true. The market will not judge Gemini Omni by the phrase "create anything." It will judge it by whether a user can make a specific edit twice, keep the same subject intact, and know what was generated when the clip starts moving through YouTube.

Julian Lim is an entrepreneur, technology writer and researcher. He started JL Data Analysis after graduating from NUS in Intelligent Systems. Julian writes about technology innovations and entrepreneurship for Business Times and Asia Pacific Magazine, and occasionally contributes to Startup Fortune.