Meta just launched Muse Spark, its first multimodal reasoning model, built by a new internal group called Meta Superintelligence Labs (MSL). The model is live now at meta.ai and in the Meta AI app, with a private API preview opening to select developers.
The "superintelligence" framing is deliberately bold. Meta describes Muse Spark as "the first step on our scaling ladder" toward what they're calling personal superintelligence - AI that reasons through complex problems on your behalf rather than just answering questions.
What "Contemplating Mode" Actually Means
The headline feature is something Meta calls Contemplating mode. When you ask Muse Spark a difficult question, instead of generating one answer straight through, it spins up multiple AI agents that reason in parallel, then combines their thinking. Think of it as asking three smart people to work on the same problem independently, then synthesizing the best answer.
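To make the "several people working independently" analogy concrete, here is a minimal sketch of the general pattern: run several agents in parallel, then merge their answers. Meta has not said how Contemplating mode combines results, so the majority vote below is an assumed stand-in, and `run_agent` and `contemplate` are hypothetical names that return canned answers instead of calling a model.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_agent(agent_id: int, question: str) -> str:
    """Hypothetical stand-in for one reasoning agent.

    A real agent would call a model. This stub returns canned answers so
    the example runs offline; every third agent deliberately disagrees.
    """
    canned = {0: "42", 1: "42", 2: "41"}
    return canned[agent_id % 3]


def contemplate(question: str, n_agents: int = 3) -> str:
    """Run n_agents in parallel and return the most common answer."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        answers = list(pool.map(lambda i: run_agent(i, question), range(n_agents)))
    # Majority vote is an assumed aggregation rule, not Meta's documented method.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


if __name__ == "__main__":
    print(contemplate("What is 6 * 7?"))
```

With three agents, two of the canned answers agree, so the vote returns their shared answer and the outlier is discarded. A production system would more likely synthesize the reasoning itself than count final answers.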
This approach is showing up in benchmark numbers: Muse Spark hits 58% on Humanity's Last Exam (HLE) in Contemplating mode. HLE is a benchmark of extremely hard questions across math, science, and the humanities, designed to stump frontier AI models. For context, early GPT-4 class models scored in the low single digits. A 38% score on FrontierScience Research rounds out the hard-reasoning picture.
Meta also claims the model matches the performance of its Llama 4 Maverick model using "over an order of magnitude less compute" - meaning it gets similar results while being more than 10x more efficient at inference (the process of actually generating a response). If that holds in practice, it has real cost implications for developers building on the API.
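For a rough sense of scale, the arithmetic behind "an order of magnitude" can be sketched as follows. The baseline of 1,000 compute units is a made-up number for illustration, not a figure from Meta, and `compute_after_reduction` is a hypothetical helper.

```python
def compute_after_reduction(baseline_units: float, reduction_factor: float = 10.0) -> float:
    """Compute needed after cutting the baseline by reduction_factor."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return baseline_units / reduction_factor


if __name__ == "__main__":
    baseline = 1000.0  # hypothetical compute units, not a Meta figure
    reduced = compute_after_reduction(baseline)
    savings_pct = 100 * (1 - reduced / baseline)
    print(f"{reduced:.0f} units needed, {savings_pct:.0f}% less compute")
```

A 10x reduction leaves one tenth of the original compute, which is a 90% cut. Since Meta says "over" an order of magnitude, that is the floor of the claim.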
What It Can Do With a Camera
Muse Spark is multimodal, meaning it processes both text and images. Meta's example use cases: point your camera at a broken appliance to get annotated repair steps, analyze a food photo for nutritional content, or get a visual breakdown of which muscles an exercise activates. The health angle got specific attention - Meta says over 1,000 physicians contributed to training the model's medical reasoning.
There's also a tool-use layer where the model can identify objects in your environment and create interactive experiences from them, including building simple minigames from a single image. That last part reads more like a demo feature than a workhorse capability, but the underlying visual context-to-response loop is genuinely useful for the practical cases.
The Architecture Behind It
Meta's post details three levers they're pulling to improve the model: better pretraining (a rebuilt data and architecture stack), reinforcement learning that shows log-linear improvement (meaning each doubling of training compute produces predictable, consistent gains), and a test-time reasoning technique called "thought compression." Thought compression reduces the number of tokens - the chunks of text an AI processes internally - needed during complex multi-step reasoning, making Contemplating mode faster and cheaper to run.
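To illustrate what "log-linear" means here, the sketch below models a score that grows with the logarithm of compute. The functional form follows from the description above, but the intercept and per-doubling gain are hypothetical values chosen for illustration, not numbers from Meta's post.

```python
import math


def log_linear_score(compute: float, intercept: float, gain_per_doubling: float) -> float:
    """Score under an assumed log-linear law: intercept + gain * log2(compute)."""
    if compute <= 0:
        raise ValueError("compute must be positive")
    return intercept + gain_per_doubling * math.log2(compute)


if __name__ == "__main__":
    # Hypothetical coefficients for illustration only, not Meta's numbers.
    for compute in (1, 2, 4, 8):
        score = log_linear_score(compute, intercept=20.0, gain_per_doubling=3.0)
        print(f"compute x{compute}: score {score:.1f}")
```

Each doubling of compute adds the same fixed gain to the score (3.0 points with these made-up coefficients), which is what makes the curve predictable. It also means every additional point costs exponentially more compute.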
The existence of a dedicated superintelligence lab alongside Meta's open Llama team signals a structural split: open-source models on one track, a closed capability-focused push on another. Muse Spark is not open-source.
For daily AI users, the most interesting combination here is visual reasoning plus Contemplating mode for hard questions - the kind of task where ChatGPT or Claude tends to produce one confident-sounding answer that may or may not be right. Whether Meta's parallel-agent approach actually produces more reliable answers in practice will become clear as broader access rolls out.
The gated API preview is the piece to watch. If Meta opens Muse Spark to developers at competitive pricing, it becomes a serious third option for anyone building AI-powered products who currently defaults to OpenAI or Anthropic.