Russia's Sber Open-Sources 702B Parameter Model Under MIT License

Sber, Russia's largest bank and a major force in the country's AI development, just open-sourced two new language models under the MIT license: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B.

Both models were pretrained from scratch on Sber's own hardware, not fine-tuned from Meta's Llama or another existing base. That alone makes them unusual in the open-weight space, where most releases are derivatives of a handful of foundation models.

The Two Models

GigaChat-3.1-Ultra-702B is a mixture-of-experts (MoE) model - an architecture that activates only a fraction of its total parameters for each token it processes, making it cheaper to run than its raw size suggests. At 702 billion total parameters, it sits in the same weight class as DeepSeek-V3 (671 billion total parameters) and the largest Llama models.
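
To make the "only a fraction of its parameters" idea concrete, here is a minimal toy sketch of top-k expert routing in plain Python. It is not Sber's implementation: the eight scalar "experts", the router scores, and k=2 are all made up for illustration, and real MoE layers route every token through learned neural sub-networks.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    # Renormalize the scores of the selected experts so their weights sum to 1.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Run only the k selected experts and mix their outputs."""
    out = 0.0
    for idx, weight in route_top_k(router_logits, k):
        out += weight * experts[idx](x)
    return out

# Hypothetical toy setup: expert i simply multiplies its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token; a real router computes these
# from the token's hidden state.
router_logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

print(route_top_k(router_logits, k=2))   # two of the eight experts are chosen
print(moe_forward(1.0, experts, router_logits, k=2))
```

In this toy, six of the eight experts are never evaluated for the token, which is where the compute saving comes from. All eight still have to exist in memory, because a different token may be routed to any of them.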

GigaChat-3.1-Lightning-10B-A1.8B targets the opposite end of the scale. It's a small MoE model with 10 billion total parameters but only 1.8 billion active at inference time. That's small enough to run on consumer GPUs and potentially even higher-end laptops.
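
A rough back-of-the-envelope calculation shows why the two models land on such different hardware. The sketch below estimates weight storage only, from the total parameter counts given above. The precisions listed and the 80 GB accelerator size are assumptions for illustration, not figures from Sber, and real deployments also need room for the KV cache and activations.

```python
import math

# Bytes needed to store one parameter at a few common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(total_params_billions, dtype):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Uses TOTAL parameters: an MoE model computes with only the active
    ones per token, but every expert's weights must still be stored.
    """
    return total_params_billions * BYTES_PER_PARAM[dtype]

def min_gpus(total_params_billions, dtype, gpu_memory_gb):
    """Lower bound on GPUs needed just to hold the weights."""
    return math.ceil(weight_memory_gb(total_params_billions, dtype) / gpu_memory_gb)

for name, params_b in [("Ultra", 702), ("Lightning", 10)]:
    for dtype in BYTES_PER_PARAM:
        print(f"{name:9s} {dtype}: ~{weight_memory_gb(params_b, dtype):.0f} GB of weights")

# Assumed 80 GB accelerators, chosen only as an example size.
print("Ultra, bf16, 80 GB GPUs: at least", min_gpus(702, "bf16", 80))
```

Under these assumptions the 10-billion-parameter model needs roughly 5 GB of weights at 4-bit precision and 20 GB at bf16, while the 702-billion-parameter model needs about 1.4 TB at bf16.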

MIT License Is the Real Story

The MIT license matters more than the parameter count. It's among the most permissive mainstream open-source licenses available - you can use these models commercially, modify them, and redistribute them, with little more required than keeping the copyright and license notice. Many open-weight releases come with tighter terms. Meta's Llama license, for instance, comes with an acceptable use policy and requires a separate license from Meta for services with more than 700 million monthly active users.

Sber's stated motivation: more open-weight models are better for the community. There's a geopolitical angle too - Russian AI labs have been building independently from the US and China research corridors, and releasing under MIT is a way to gain international adoption and community contributions despite ongoing sanctions-related friction.

The Lightning model is the one most people will actually touch. An MoE with 1.8B active parameters on consumer hardware fills a gap for lightweight local tasks - quick text processing, simple chat, summarization - where you don't want to pay for API calls or send data to a cloud provider. The Ultra model is more of an infrastructure play, requiring multi-GPU server setups that put it out of reach for individuals but within range for companies and research labs.

Both models are available on Hugging Face under the ai-sage organization.
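
The exact repository names are not given here, so rather than guess them, one option is to list what the organization has published. The sketch below uses the huggingface_hub client's `HfApi.list_models` with its `author` filter. The helper names and the assumption that the relevant repositories contain "gigachat" in their name are illustrative, not confirmed by Sber.

```python
def is_gigachat_repo(repo_id):
    """Heuristic filter (an assumption): keep repos whose name mentions GigaChat."""
    name = repo_id.split("/")[-1]
    return "gigachat" in name.lower()

def list_gigachat_repos(author="ai-sage"):
    """Return the organization's repo ids that pass the filter.

    Needs network access and `pip install huggingface_hub`; the import is
    done lazily so the pure helper above works without the package.
    """
    from huggingface_hub import HfApi

    api = HfApi()
    return [m.id for m in api.list_models(author=author) if is_gigachat_repo(m.id)]

# Usage (performs a network request):
#   for repo_id in list_gigachat_repos():
#       print(repo_id)
```

Once the repository id is known, the usual next step would be to pass it to a loader such as the transformers `from_pretrained` methods, subject to whatever instructions the model card gives.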