More Gemma 4 models are on the way. Google has signaled that the family will expand beyond the current 12B instruction-tuned model, with reports pointing toward a 120B-parameter variant as one possibility.
For context on what that size difference means: the current Gemma 4-12B runs on a high-end consumer GPU with 16-24GB of VRAM (the dedicated memory on a graphics card, where model weights sit during AI workloads). A 120B model is 10 times larger. Its weights alone would take about 240GB at 16-bit precision, and roughly 60-80GB once quantized to around 4-5 bits per weight, that is, compressed to fit into less memory at some cost to accuracy. That puts it squarely in multi-GPU or server-grade hardware territory for most users.
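The memory figures above follow from simple arithmetic: parameter count times bits per weight, divided by 8 to get bytes. The sketch below is a rough illustration under that assumption only. The function name `weight_memory_gb` is made up for this example, and the estimate covers weights alone, not the KV cache, activations, or runtime overhead, so real requirements run somewhat higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper for illustration; ignores KV cache, activations,
    and framework overhead, which add to the real footprint.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare the current 12B model with a possible 120B variant
# at common precisions (16-bit, 8-bit, 4-bit).
for name, params in [("12B", 12), ("120B", 120)]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: ~{weight_memory_gb(params, bits):.0f} GB")
```

By this estimate a 12B model needs about 24GB at 16-bit and 12GB at 8-bit, which lines up with the 16-24GB consumer range, while a 120B model only drops to about 60GB at 4-bit.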
The 12B has been competitive on coding and instruction-following tasks, though independent benchmarks published the same day show it losing to Qwen3.5-9B on 5 of 8 tests despite having more parameters. A 120B variant would give the Gemma family a higher ceiling on complex reasoning tasks, where raw model capacity makes a real difference.
Google hasn't confirmed a release date or final specs. The Gemma line has expanded faster than most observers predicted, so additional sizes in the near term are plausible.