
Google's Gemma 4 Lineup Is Expanding, Possibly With a 120B Model

June 3, 2026


More Gemma 4 models are on the way. Google has signaled that the family will expand beyond the current 12B instruction-tuned model, and reports point to a 120B-parameter variant as one possibility.

For context on what that size difference means: the current Gemma 4-12B runs on a high-end consumer GPU with 16-24GB of VRAM (the GPU's dedicated memory, which has to hold the model's weights during inference). A 120B model has roughly 10 times as many parameters and typically requires 60-80GB of VRAM even when aggressively quantized, that is, compressed to fewer bits per weight at some cost to accuracy. Lighter quantization needs proportionally more. That puts it squarely in multi-GPU or server-grade hardware territory for most users.
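The arithmetic behind those VRAM figures can be sketched in a few lines. This is a rough back-of-the-envelope estimate, not a measurement: it counts only the memory needed for the weights (parameters times bytes per weight) and ignores the KV cache, activations, and runtime overhead, which add more on top. The function name `estimate_weight_memory_gb` is hypothetical, and the bit widths shown are common choices assumed for illustration, not confirmed Gemma release formats.

```python
def estimate_weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width.
    KV cache, activations, and framework overhead are NOT included.
    """
    bytes_per_weight = bits_per_weight / 8
    # (params_billion * 1e9 params) * bytes_per_weight / 1e9 bytes per GB
    return params_billion * bytes_per_weight


if __name__ == "__main__":
    # Illustrative sizes from the article; bit widths are assumed examples.
    for name, params_billion in [("12B", 12), ("120B", 120)]:
        for bits in (16, 8, 4):
            gb = estimate_weight_memory_gb(params_billion, bits)
            print(f"{name} at {bits}-bit: ~{gb:.0f} GB for weights")
```

Under these assumptions, a 12B model needs about 12-24GB for weights at 8-bit to 16-bit precision, while a 120B model needs about 60GB at 4-bit and 240GB at 16-bit, which is consistent with the 60-80GB range applying only to heavily quantized builds.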

The 12B has been competitive on coding and instruction-following tasks, though independent benchmarks published the same day show it losing to Qwen3.5-9B on 5 of 8 tests despite having more parameters. A 120B variant would give the Gemma family a higher ceiling for complex reasoning tasks where raw model capacity makes a real difference.

Google hasn't confirmed a release date or final specs. The Gemma line has expanded faster than most observers predicted, so additional sizes in the near term are plausible.
