128 billion parameters. That's what Mistral just made freely available with the release of Mistral-Medium-3.5-128B on Hugging Face, an open-weight model anyone can download and run. Parameters are the adjustable internal settings that determine how a model responds: more parameters generally mean a more capable model, at the cost of more computing power to run it.
Calling a 128B model "Medium" is a deliberate positioning choice. Mistral's Medium tier sits between its lightweight API models and the full Mistral Large, but the 128B scale is competitive with models many providers treat as flagship-level. Releasing it as open-weight - meaning the model files are public, downloadable, and legally adaptable - sets it apart from similar-scale proprietary offerings.
For context, models in the 7B to 70B parameter range handle most professional workloads in the self-hosting community today. Running a 128B model locally requires serious hardware, typically multiple high-end GPUs or a dedicated server. Most individual users will access it through cloud providers rather than on their own machines.
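To make "serious hardware" concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone occupy at different numeric precisions. It assumes every parameter is stored at the same precision and ignores the KV cache, activations, and framework overhead, so real requirements will be higher. The 80 GB per-GPU figure is an illustrative assumption, not a recommendation from Mistral, and the function names are made up for this example.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus(num_params: float, bytes_per_param: float,
             gpu_memory_gb: float = 80.0) -> int:
    """Smallest GPU count whose combined memory fits the weights alone.

    gpu_memory_gb defaults to 80, an assumed accelerator size chosen
    only for illustration. Runtime overhead is not accounted for.
    """
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    for label, params in [("7B", 7e9), ("70B", 70e9), ("128B", 128e9)]:
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(params, nbytes)
            gpus = min_gpus(params, nbytes)
            print(f"{label:>5} @ {precision:<9}: {gb:6.1f} GB weights, "
                  f">= {gpus} x 80 GB GPU(s)")
```

Under these assumptions, a 128B model needs about 256 GB for 16-bit weights, which is why a single consumer or even data-center GPU is not enough without aggressive quantization.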
The practical audience is developers and enterprise teams who need a capable model they control directly: no per-token fees of the kind charged for the APIs behind ChatGPT or Claude, full data privacy, and the ability to fine-tune (adapt the model on your own data) without platform restrictions.
Mistral hasn't published benchmark comparisons yet. The weights are live on Hugging Face now.