
Gemini Omni: Google's New AI Video Model, Demos & How to Use It

20 May 2026 | en | YANAI Insights | 4 min read

Gemini Omni is Google's new AI model that turns almost any input into video. See every official demo, what Gemini Omni Flash does, whether it's free, and when the API ships.

Google just put Gemini Omni on stage at I/O, and the one-line version is simple: it takes almost any input you give it — images, audio, video, text, even a rough drawing — and turns it into high-quality video.

That sounds like every other video model pitch from the last two years. It isn't, and the demos are the reason. Google dropped five posts in a row. We pulled the clips so you can judge them yourself instead of reading a recap of a recap.

Meet Gemini Omni

The reveal. One model, any input, video out.

"Meet Gemini Omni — our new AI model that can create anything from any input, starting with video." — @Google

It understands physics, not just pixels

This is the part worth slowing down on. Most video models learn what scenes look like. Google's claim is that Gemini Omni reasons about how the world works — it pairs an intuitive sense of physics with Gemini's real-world knowledge. So a poured liquid settles, weight lands where weight should land, and the output behaves instead of just rendering.

Photorealism is table stakes now. Behaving like the real world is the new bar. — @Google

Any input in, video out

Feed it images, audio, video and text together. Or hand it a drawing and let it match your vision. The "omni" in the name is the actual point: the input side is wide open, not a single prompt box.

Combine images, audio, video and text — or sketch it. — @Google

Editing is now a conversation

For most people this is the one that lands. You edit your own footage by talking to it. Reframe the action, swap the point of view, push the lighting more cinematic — across multiple turns. Each instruction builds on the last, so characters stay consistent, the physics holds, and the scene remembers what came before. The timeline full of keyframes turns into a back-and-forth.

Multi-turn editing where the scene keeps its memory. — @Google

Where and when you can use it

The shipping tier is called Gemini Omni Flash, and the rollout is staged:

  • Today — Google AI Plus, Pro and Ultra subscribers globally, in the Gemini app and Flow by Google.

  • This week, free — YouTube Shorts and the YouTube Create app.

  • Coming weeks — developers and enterprise customers via API.

So creators get hands-on first, and the API — the part that matters if you're building on top of it — lands a bit later. (Google's rollout post)

Quick take

The physics-and-reasoning angle is the bet here. Plenty of models can make a pretty five-second clip. Far fewer keep a character consistent while you renegotiate the shot four times in a row. If the consistency holds outside a launch reel, the editing workflow is the real shift, not the generation.

It also points at where agent work is going. The interesting unit stops being a single prompt and becomes a multi-turn session that remembers state — exactly the shape of work people already run inside Kollab: give a model context, iterate over several turns, keep the thread coherent. A model that natively does that for video makes those workflows a lot more concrete.
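To make "a session that remembers state" concrete, here is a minimal Python sketch of the idea. Everything in it is hypothetical: `EditSession`, `instruct`, `context` and the clip name are made up for illustration and are not part of any Google or Kollab API (the Gemini Omni API has not shipped yet). It only shows the shape of the workflow: the source clip and every earlier instruction travel along with each new turn.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical multi-turn editing session; not a Google or Kollab API."""

    source_clip: str
    turns: list[str] = field(default_factory=list)

    def instruct(self, instruction: str) -> None:
        # Record the instruction; earlier turns are kept, never overwritten.
        self.turns.append(instruction)

    def context(self) -> str:
        # What a model would need to see on the next turn:
        # the original clip plus every instruction so far, in order.
        lines = [f"clip: {self.source_clip}"]
        lines += [f"turn {i}: {text}" for i, text in enumerate(self.turns, start=1)]
        return "\n".join(lines)


session = EditSession(source_clip="skate_park.mp4")
session.instruct("reframe the action to a low angle")
session.instruct("make the lighting more cinematic")
print(session.context())
```

The second instruction does not replace the first. Both stay in the context, which is what would let a model keep the low angle while it changes the lighting.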

Don't wait for the API — create with the latest video AI now
Kollab already puts the newest video models in one workspace, with the same multi-turn, context-keeping flow described above. No setup.
Start creating in Kollab →

FAQ

What is Gemini Omni?

Google's new AI model that generates high-quality video from any input — images, audio, video, text, or a drawing — and edits existing video through conversation. It was announced at Google I/O.

Is Gemini Omni free?

The Gemini Omni Flash tier is free in YouTube Shorts and the YouTube Create app starting this week. Full access ships first to Google AI Plus, Pro and Ultra subscribers in the Gemini app and Flow by Google.

When is the Gemini Omni API available?

Google says API access for developers and enterprise customers will arrive in the coming weeks, after the consumer rollout.

What makes Gemini Omni different from other video models?

It combines an intuitive understanding of physics with Gemini's real-world reasoning, and supports multi-turn conversational editing where characters and scene state stay consistent across instructions.


Source: official posts from @Google at #GoogleIO — reveal, physics, any input, conversational editing, rollout. Clips embedded for commentary and reference; all rights belong to Google.
