Gemini 3.5 Pro GA Window Opens With 2M Token Context

Google's Gemini 3.5 Pro entered its expected general availability window on June 23, 2026, with analyst tracking placing the public launch between June 23 and June 30. The timeline is laid out by Build Fast with AI.

As of June 23, the model is still in limited Vertex AI enterprise preview, with no public GA announcement made. The headline specification is a 2-million-token context window, the largest in any production frontier model to date.

The Deadline Google Set for Itself

Gemini 3.5 Pro was announced at Google I/O on May 19, where Sundar Pichai told developers to give us until next month, a line that drew audible groans from an audience expecting an immediate launch, per Build Fast with AI.

That phrasing set a hard expectation. A slip past June 30 would mark the second consecutive Google I/O commitment the company failed to deliver on schedule, which is why the next week carries more weight than a normal product window.

The competitive timing, by contrast, is excellent. The model is arriving just as Claude Fable 5 moved behind a paywall and before OpenAI's GPT-5.6 has shipped.

What the 2M Context Window Unlocks

The 2-million-token window is double Gemini 3.5 Flash's 1 million and is the real product story. DEV Community describes it as the defining feature of the post-Fable-5 frontier.

At that size, a single call can process an entire large codebase, years of corporate communications, or several long filings at once for comparative work. These are workloads that simply did not fit in earlier context limits.

The model also ships with Deep Think, a reasoning mode gated to the $250 monthly Ultra tier, alongside multimodal support across text and images. Deep Think reflects the wider industry push toward step-wise AI reasoning, where models are rewarded for sound intermediate steps rather than only final answers.

The Pricing Is the Catch

Capability comes at a steep rate. Estimated pricing is $15 per million input tokens and $60 per million output tokens, roughly ten times the cost of Gemini 3.5 Flash, according to AI Weekly.

That gap forces the same tiering decision now common across frontier models. Long-context and hard-reasoning tasks justify Pro, while routine work belongs on Flash or another cheaper engine.

For Google, the bet is that owning the largest context window justifies the premium for the workloads that need it. The supporting infrastructure is not in doubt, given Google's compute buildout and its stated goal of multiplying capacity while holding energy use roughly flat.

How It Stacks Against Fable 5 and GPT-5.6

Three frontier models are converging in the same ten-day window, and each has a different edge. Independent analysis at Andrew.ooo lays out the split.

Claude Fable 5 leads on coding benchmarks. Gemini 3.5 Pro is expected to lead on long-context retrieval thanks to the 2-million-token window. GPT-5.6 is expected to close the coding gap and improve reasoning consistency when it ships.

For buyers weighing all three, the comparison sits next to the broader leaderboard picture on the Claude product page and across AI Weekly's Google coverage. The honest advice from analysts is that if a decision can wait until early July, third-party benchmarks for all three will have settled by then.

What to Watch Next

The immediate signal is binary. Either Google ships Gemini 3.5 Pro to general availability before June 30, or it misses a public deadline it set on stage, and the second outcome would dent credibility more than the model's specs would repair.

The second signal is real-world long-context performance. A 2-million-token window is only valuable if retrieval quality holds across the full window rather than degrading in the middle, and that is the test independent benchmarks will run first once the model is broadly available.

What Changed

Gemini 3.5 Pro crossed from announcement into its delivery window. Google now has a one-week runway to ship the model publicly or miss a self-imposed deadline for the second time.

Why It Matters

The timing is favorable, with Fable 5 newly behind a paywall and GPT-5.6 not yet shipped. A 2-million-token window opens long-context workloads that did not fit before, but the premium pricing limits who will use it by default.

Suggested Actions

Teams with long-context needs such as whole-codebase analysis should pilot Gemini 3.5 Pro against current models before committing, and budget owners should weigh the roughly ten-times Flash pricing before routing routine work to the Pro tier.

Tools Mentioned

Horizontal Suites

Claude – AI assistant for analysis, writing, coding, and enterprise workflows

Claude is built for teams that need AI assistant for analysis, writing, coding, and enterprise workflows. It helps reduce manual work, improve consistency, and turn a fragmented workflow into something more repeatable for operators and stakeholders.