May 20, 2026

Google I/O 2026: The Price Pivot — When Smart Enough Gets Cheap Enough

Gemini 3.5 Flash delivers frontier performance at half the price of rivals, but costs 5x more than its own predecessor. DeepSeek V4-Flash costs 10x less. When models cross the 'good enough' line, price wins. Plus: Omni Flash's anything-in-anything-out, and the CLI retirement footnote.

Google I/O 2026 had two main acts and one footnote. The acts: Gemini 3.5 Flash, which beats Gemini 3.1 Pro on nearly all benchmarks while costing half of rival frontier models¹, and Gemini Omni Flash, the first model that generates video from any mix of text, images, audio, and video input². The footnote: Google is killing Gemini CLI with one month’s notice³.

But here’s the thing nobody at Shoreline Amphitheatre wanted to say out loud: DeepSeek V4-Flash does most of what 3.5 Flash does, at one-tenth the price, with MIT-licensed weights you can run yourself. When capability crosses a threshold, the fight stops being about who’s smarter. It becomes about who’s cheaper. That’s the real story of I/O 2026.

3.5 Flash: Good Enough to Win, Expensive Enough to Lose

The benchmark story is clean. Gemini 3.5 Flash outperforms Gemini 3.1 Pro across almost all benchmarks¹. Pichai said it plainly: “3.5 Flash is better than 3.1 Pro, which was just four months ago, and it’s at the almost, I would say, 90% of the performance of frontier models, 4x faster… about 1/3 to one half the cost” of those rivals⁴.

But cost compared to what? Let’s look at the full picture:

Table 1: The Real Price Landscape ⁵⁶

Model	Input $/M	Output $/M	License	Notes
DeepSeek V4-Flash	$0.14	$0.28	MIT open	13B active, SWE-bench ~79%
DeepSeek V4-Pro (promo)	$0.435	$0.87	MIT open	Promo ends May 31
DeepSeek V4-Pro (list)	$1.74	$3.48	MIT open	SWE-bench 80.6%
Gemini 2.5 Flash	$0.30	$2.50	Closed	Previous gen
Gemini 3.5 Flash	$1.50	$9.00	Closed	New release
GPT-5.5 (est.)	~$2.50	~$15.00	Closed	Frontier competitor
Claude Opus 4.7	~$3.00	~$15.00	Closed	Frontier competitor

3.5 Flash is cheaper than GPT-5.5 and Claude Opus 4.7 on paper. But DeepSeek V4-Flash — an MIT-licensed model you can download and run on your own hardware — costs 10x less on input and 32x less on output. V4-Pro’s SWE-bench score (80.6%) is functionally tied with Claude Opus 4.7 (80.8%) at one-seventh the output price⁷.

And here’s the uncomfortable comparison Google didn’t highlight:

Table 2: 3.5 Flash vs Its Own Predecessor ⁵⁶

Model	Input $/M	Output $/M
Gemini 2.5 Flash	$0.30	$2.50
Gemini 3.5 Flash	$1.50	$9.00

That’s a 5x increase on input, 3.6x on output. Google isn’t giving you a better deal on Flash — they’re charging a premium because it now performs at Pro-level. The “half the price” claim is only true if you compare against frontier competitors, not against Google’s own budget line.

The Crossover Point

The argument that matters: for most production workloads, the question isn’t “can the model do this?” anymore. It’s “how much does it cost to do it well enough?”

When V4-Flash at $0.14/$0.28 handles classification, summarization, and straightforward coding — the 80% of volume tasks — spending 10x more on 3.5 Flash only makes sense for the remaining 20% that genuinely need frontier intelligence. Google knows this. That’s why they positioned 3.5 Flash against competitors’ flagships, not their own Flash line. They’re selling the idea that you should replace Claude Opus or GPT-5.5 with 3.5 Flash, not that you should upgrade from 2.5 Flash.

The internal scale number backs up the demand story, if not the pricing logic: Google went from 500 billion tokens per day in March to over 3 trillion by I/O¹. That’s a feedback loop — scale improves the model, which attracts usage, which generates data.

Omni Flash: The Anything-In-Anything-Out Moment

If 3.5 Flash is about making frontier intelligence cheaper, Omni Flash is about making AI genuinely cross-modal. Not in the “we added image understanding” sense — in the “any input type can produce any output type” sense.

Omni Flash accepts text, images, audio, and video in any combination, and produces video output². Future releases add image and text generation. Hassabis described the long-term vision: “generate any type of output from any kind of input”⁸.

Table 3: What Omni Can Do Now vs What’s Coming ²⁹

Input types	Output (today)	Output (planned)
Text, images, audio, video	10-second video clips	+ Images, text
Camera roll uploads	Character-consistent avatars	Multi-scene narratives
Drawings / sketches	Conversational editing	Full post-production

The technical differentiator isn’t video generation — Veo and others already do that. It’s that Omni reasons across modalities simultaneously. It understands physics (gravity, fluid dynamics, kinetic energy)⁹, maintains character identity and voice across scenes, and lets you edit uploaded footage through natural language prompts.

This is what makes it interesting. You don’t need a timeline editor. You say “change the background to a beach,” and the model preserves the original motion while applying the change. That’s a different product category from text-to-video generators.

All generated content carries SynthID watermarks⁹. Speech and audio editing are withheld for further testing⁹. Reasonable precautions for a model this capable.

Omni Pro Is the One to Watch

Google confirmed Omni Pro exists but won’t launch it until it represents “a step change above Flash”¹⁰. That’s the right call. 10-second clips and conversational editing are impressive for consumers, but professional workflows need higher resolution, longer duration, and more precise control. If Omni Pro delivers, it could do to video production what Stable Diffusion did to image creation.

Omni Flash is available now to Google AI Plus/Pro/Ultra subscribers, with YouTube Shorts integration next week and APIs “in the coming weeks”¹¹.

The Footnote: Gemini CLI Dies

The smallest announcement drew the most developer heat. Google is retiring Gemini CLI on June 18, 2026³. Enterprise customers keep access; everyone else has 30 days to move to Antigravity CLI.

The Hacker News thread was blunt: “You can’t build a workflow around something that gets renamed or killed every 6 months”¹². Google’s explanation — that single-agent terminal tools can’t support multi-agent orchestration — is technically sound. Antigravity CLI is written in Go, supports async processing, and shares a harness with the Antigravity 2.0 desktop app³. But the signal it sends — “we’ll deprecate your tools with 30 days’ notice” — is hard to unsend when OpenAI and Anthropic are signing multi-year contracts.

References

Conclusion

The real story of I/O 2026 isn’t on the keynote stage. It’s in the pricing table. 3.5 Flash costs 5x more than 2.5 Flash — because it can, because it’s genuinely better, because Google has customers who’ll pay. But V4-Flash costs 10x less than 3.5 Flash with MIT-licensed weights you can self-host. When that gap exists, “frontier performance at half the price” stops being a headline and starts being a negotiation.

Omni Flash is different — it’s genuinely new. The ability to take any input modality and produce coherent video output, with physics understanding and conversational editing, creates a product category that didn’t exist before. Omni Pro, when it arrives, could reshape video production the way Stable Diffusion reshaped image creation.

The CLI retirement is a footnote, but a revealing one. Google is selling velocity. OpenAI and Anthropic are selling stability. The market will decide what it values more.

Google Official Blog — Sundar Pichai I/O 2026 Keynote — 3.5 Flash benchmarks, pricing, internal token scale, $1B savings claim https://blog.google/innovation-and-ai/sundar-pichai-io-2026/ ↩ ↩² ↩³
Google I/O 2026 Collection — Gemini Omni: “create anything from any input, starting with video” https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-collection/ ↩ ↩² ↩³
Google Developers Blog — Transitioning Gemini CLI to Antigravity CLI — June 18 deadline, Go rewrite, shared agent harness https://developers.googleblog.com/en/search/?query=Gemini+CLI ↩ ↩² ↩³
VentureBeat — Google says Gemini 3.5 Flash can slash enterprise AI costs by more than $1 billion a year — Pichai quote on 90% frontier performance at 1/3 to 1/2 cost https://venturebeat.com/technology/google-says-gemini-3-5-flash-can-slash-enterprise-ai-costs-by-more-than-1-billion-a-year ↩
Gemini Developer API Pricing — Official pricing for all models: 3.5 Flash ($1.50/$9.00), 2.5 Flash ($0.30/$2.50), 3.1 Pro ($2.00/$12.00) https://ai.google.dev/gemini-api/docs/pricing ↩ ↩²
BenchLM.ai — Gemini API Pricing (April 2026) — Detailed breakdown including 3 Flash Preview ($0.50/$3.00), batch rates https://benchlm.ai/blog/posts/gemini-api-pricing ↩ ↩²
Codersera — DeepSeek V4 Pro vs Flash: Benchmarks & Pricing 2026 — SWE-bench Verified 80.6%, Terminal-Bench 67.9%, pricing at $0.14/$0.28 (Flash) and $1.74/$3.48 (Pro list) https://codersera.com/blog/deepseek-v4-pro-vs-flash/ ↩
LatestLY — Gemini Omni Launched — Hassabis on any-input-any-output long-term vision https://www.latestly.com/technology/gemini-omni-launched-google-unveils-new-ai-video-generator-at-io-2026-7437717.html ↩
Times of India — Google takes next big step towards AGI — Omni physics simulation, SynthID, safety withholding https://timesofindia.indiatimes.com/technology/tech-news/google-takes-next-big-step-towards-agi-launches-gemini-omni-what-is-it-how-it-works-and-more/articleshow/131210372.cms ↩ ↩² ↩³ ↩⁴
Firstpost — Google I/O 2026: Gemini Omni Flash arrives — Hassabis demo, Omni Pro “step change” comment https://www.firstpost.com/tech/google-i-o-2026-gemini-omni-flash-arrives-with-more-accurate-ai-video-generation-than-veo-14012977.html ↩
Cybernews — Google unveils Gemini Omni at I/O 2026 — Rollout timeline https://cybernews.com/ai-news/google-io-2026-gemini-omni-antigravity-agentic-ai/ ↩
Hacker News — Gemini CLI will stop working from June 18, 2026 — Developer reactions https://news.ycombinator.com/item?id=48196867 ↩