Meta Muse Spark: why Meta betrayed open-source with the first closed model from the Superintelligence Lab
🔎 The day Meta stopped giving
Nine months. That's how long it took Meta's Superintelligence Lab, led by Alexandr Wang, to deliver Muse Spark. The model is highly specialized in medicine, but it marks a historic turning point: for the first time, Meta is refusing to release a model's weights.
On April 8, 2026, the New York Times revealed the existence of Avocado, Muse Spark's internal codename. The news hit like a slap in the face for a community that had grown accustomed to relying on Llama as the open foundation of the entire ecosystem.
Zuckerberg no longer talks about open-source models. He talks about a "ladder to personal superintelligence." And the first step of that ladder is locked.
The key points
- Muse Spark is the first closed model in Meta's history, developed by the Superintelligence Lab (MSL) in 9 months.
- The model excels in medicine but shows a significant lag in coding compared to market leaders.
- Alexandr Wang justifies the closure by citing incomplete "safety checks", according to Implicator.
- This shift comes after a $14.3B investment in Scale AI, of which Wang is the founder and CEO.
- Chinese models, notably Qwen, now dominate 69% of the open-source ecosystem that Meta had built with Llama, according to Skila.
- Muse Spark is available in private preview only, a strategy modeled on OpenAI and Anthropic.
Recommended tools
| Model | Primary usage | Agentic score (June 2025) | Ideal for |
|---|---|---|---|
| GPT-5.5 | General-purpose agent | 98.2 | Complex multi-step tasks |
| Claude Opus 4.7 (Adaptive) | Long reasoning | 94.3 | In-depth analysis, writing |
| Gemini 3 Pro Deep Think | Research & reasoning | 95.4 | Document synthesis |
| DeepSeek V4 Pro | Code & reasoning | 88 (general) | Code, open alternative |
| Muse Spark (MSL) | Medical, diagnostics | Unranked | Specialized medical domain |
Alexandr Wang, the man who turned Meta around
From Scale AI to the Superintelligence Lab
Alexandr Wang's arrival at the head of the MSL is a strong signal of this shift. The founder of Scale AI, Wang built his fortune on data labeling for AI. When Meta injected $14.3B into Scale AI in 2025, it was not a simple investment; it was a merging of destinies.
Wang brings to Meta an industrial methodology for data quality. But above all, he brings a culture of secrecy. Scale AI has always operated with strict confidentiality agreements with its clients in the defense and healthcare sectors.
Appointing Wang to head the Superintelligence Lab meant accepting in advance that the models emerging from it would not be public. The conflict of interest is structural: Scale AI's business model relies on proprietary data, and it is difficult to publicly release a model trained on that data.
The 9 months that changed everything
According to the New York Times, the development of Muse Spark was accelerated. Nine months of design based on a proprietary architecture, with priority access to Scale AI's data pipelines.
The speed is impressive. But it raises a question: can a model developed this quickly, with so little transparency, seriously compete with models that benefit from years of open research? This is the entire paradox of Muse Spark.
Muse Spark: what benchmarks really reveal
Medical dominance, a lag in code
Muse Spark doesn't shine everywhere. Leaked benchmarks show a highly unbalanced profile.
In medicine, the model outperforms GPT-5.5 on several clinical test sets. This is its area of expertise, and it makes sense: Scale AI has the world's largest labeled medical datasets, the result of contracts with hospitals and pharmaceutical labs.
On the other hand, coding is Muse Spark's Achilles' heel. On code generation benchmarks, the model sits below GPT-5.3 Codex and Claude Sonnet 4.6. A lag explained by the choice to prioritize medical data in training.
This imbalance is not insignificant. In a market where the best LLMs for coding dominate enterprise use cases, a model that is weak in code is an incomplete model.
Comparison table against the competition
| Criterion | Muse Spark (MSL) | GPT-5.5 (OpenAI) | Claude Opus 4.7 (Anthropic) | DeepSeek V4 Pro |
|---|---|---|---|---|
| Medical | Excellent | Very good | Good | Average |
| Coding | Weak | Excellent | Excellent | Excellent |
| Openness | Closed | Closed | Closed | Open |
| Availability | Private preview | Public (API) | Public (API) | Public (API + local) |
| Agentic score | Not published | 98.2 | 94.3 | 88 (general) |
The verdict is clear: on paper, Muse Spark does not compete with the leaders. Its only differentiation is medical, and that is precisely what makes it strategically vulnerable.
Wang's justification: conveniently useful "safety checks"
What Implicator says
Implicator reports Wang's statements: Muse Spark is "not ready" for open-source due to incomplete safety checks. An argument that could have been credible... if it hadn't been made by the CEO of Scale AI.
The problem with this argument is twofold. First, Meta has always released Llama with restrictive licenses for high-risk uses, without ever closing the weights. Second, academic research on meta-learning, such as the 2020 study "Yet Meta Learning Can Adapt Fast, It Can Also Break Easily", shows that models based on meta-learning are precisely those that require the most external validation.
Closing a model for safety reasons means preventing the community from verifying that very safety. It's a vicious circle.
The real motive: protecting Scale AI's data
Safety is a smokescreen. The real reason for closing Muse Spark is that its medical performance relies on Scale AI's proprietary data. Publishing the weights would indirectly expose the nature and quality of this data.
This is a fundamental paradigm shift. Until now, Meta's value resided in the model's architecture and weights. With Muse Spark, the value migrates to the training data. And this data belongs to Scale AI, not Meta.
Zuckerberg's "ladder": personal superintelligence as an excuse
What this ladder means
Zuckerberg has repeatedly mentioned a "ladder": a succession of models leading to personal superintelligence. Muse Spark would be the first rung. Yet this metaphor serves as a retroactive justification for a change in strategy.
A ladder implies progressive steps. But nothing justifies these steps being closed. Llama 3 and Llama 4 were also steps, and they were open. The ladder doesn't explain the pivot; it masks it.
The link with meta-learning
Meta's research work on meta-learning sheds light on this strategy. The study "Meta Prompting for AI Systems" shows how a system can learn to learn, that is, optimize its own adaptation process. This is exactly what Zuckerberg calls the ladder.
The Meta Omnium benchmark proposes an evaluation framework for this learning-to-learn. Muse Spark seems to fit into this lineage: a model that would quickly adapt to new medical domains.
But the foundational study "Yet Meta Learning Can Adapt Fast, It Can Also Break Easily" reminds us of an uncomfortable truth: meta-learning models are fragile. They adapt quickly, but they also break quickly. It's hard to sell superintelligence when research says your approach is intrinsically unstable.
The open-source ecosystem: the void left by Meta
Qwen takes the throne
The figure is staggering: according to Skila, Chinese models, led by Qwen, now dominate 69% of the open-source ecosystem. A total reversal.
Two years ago, Llama accounted for over 80% of downloads on Hugging Face. The Llama family was the de facto standard for all open-source development. Today, that leadership has been reduced to nothing.
Alibaba's Qwen has filled the void with a simple strategy: open everything, systematically, without restrictions. Every new Qwen is released with its weights, its documentation, its training datasets. Total transparency as a competitive advantage.
The concrete consequences for developers
For developers who had built their stack on Llama, the signal is clear: Meta is no longer a reliable partner for open-source. Alternatives are flourishing.
Those who want to run LLMs locally are turning to Qwen or DeepSeek. Those who want to build open-source AI agents with Ollama are abandoning Llama in favor of newer and better-maintained models. The ecosystem is reorganizing without Meta.
DeepSeek V4 has accelerated this movement by proving that an open model could rival closed ones on code and reasoning. The paradox is cruel: it is a Chinese startup that now embodies what Meta claimed to defend.
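For readers who want to see what running an open model locally looks like in practice, here is a minimal sketch that queries a model through Ollama's local HTTP API. It assumes Ollama is installed and listening on its default port (11434), and that a Qwen model has already been pulled. The model tag `qwen2.5:7b` and the helper names `build_generate_request` and `generate` are illustrative choices, not something the sources above prescribe.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: unchanged defaults).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (without sending) a non-streaming request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the full completion.
    return body["response"]


# Example call (requires a running Ollama server and a pulled model;
# the tag below is an assumption, swap in whichever model you use):
# print(generate("qwen2.5:7b", "Explain open weights in one sentence."))
```

Because the model name is just a string parameter, switching from one open model family to another is a one-line change, which is part of why the ecosystem can reorganize so quickly.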
The private preview: a strategy copied from OpenAI
Why this approach
Muse Spark is not available to the public. It is in a "private preview," with invitation-only access reserved for selected partners. The Next Web points out that this strategy is an exact copy of what OpenAI did with GPT-4 in 2023.
The private preview serves three purposes. First, to create scarcity and media buzz. Next, to test the model under controlled conditions with partners who sign NDAs. Finally, to avoid direct comparison with public models on the same benchmarks.
It is this last motivation that is the most revealing. If Muse Spark were truly superior, Meta would have every interest in publishing it to prove its dominance. The private preview suggests that the results are not as convincing as the official discourse.
The business model behind the preview
The private preview is not free. Selected partners pay for access, which generates revenue even before the commercial launch. It is a model that Scale AI knows well: selling access to premium data and models before they are available to the public.
Wang is directly importing Scale AI's business model into Meta. The difference is that Scale AI's clients accepted the secrecy because they were buying a custom service. Meta's clients expected open-source.
Can we speak of "betrayal"?
What the community is saying
The word "betrayal" is strong, but it is being used. Skila puts it directly in its headline. The feeling within the community is that of a broken moral contract.
Zuckerberg repeated dozens of times that open-source was a strategic imperative, not an act of charity. He argued that openness prevented any single player from dominating AI. Today, Meta is joining exactly the camp it claimed to be fighting.
The necessary nuance
However, a nuance is needed. Meta was never an open-source company in the pure sense of the term. Llama was under a custom license, not under an Apache or MIT license. Meta controlled the use cases, prohibited certain applications, and could revoke the license if the conditions were not met.
Llama was "open weights", not open-source. The distinction is important: Meta gave access to the weights, but not to the training data, not to the infrastructure, not to the complete methodology. Muse Spark is simply taking things a step further in a logic that already existed.
Implications for the AI ecosystem
A dangerous precedent
The real danger of Muse Spark is not the model itself. It is the precedent it creates. If Meta, the self-proclaimed champion of open-source, closes its models without consequence, then no company has any reason left to stay open.
Research on meta-learning like Auto-Meta, which automates the search for meta-learners, shows that open research advances faster than closed research. Closing models slows down innovation for the entire ecosystem.
The impact on French and European models
For French LLMs seeking to differentiate themselves through openness, the signal is mixed. On the one hand, Meta's withdrawal leaves a gap to fill. On the other hand, if even Meta abandons open-source, investors might see openness as a competitive disadvantage.
The Meta ControlNet model, which uses meta-learning for visual task adaptation, nevertheless shows that open innovation can create specific advantages. European research has cards to play, but it needs certainty regarding the business model of openness.
Can Meta catch up with OpenAI and Anthropic with a closed approach?
The numbers don't lie
In June 2025, the landscape of the best LLMs is dominated by OpenAI and Anthropic. GPT-5.5 scores 98.2 on agentic tasks and Claude Opus 4.7 scores 94.3. Even on general-purpose tasks, Google's Gemini 3.1 Pro reaches 92.
Muse Spark isn't even in these rankings. Its score hasn't been published, which speaks volumes. When a model is good, benchmarks are published. When it's average, it's called a "private preview."
Why closed won't save Muse Spark
Wang's reasoning seems to be: if we close the model, it can't be directly compared, so we can build a narrative of superiority. This is a strategic error.
OpenAI and Anthropic have spent years building trust in their closed models. They have millions of users, deep enterprise integrations, plugin ecosystems. Meta is arriving late, with a lesser model, in a market that others have already locked down.
The best LLMs for AI agents are already established. Muse Spark has no agent ecosystem, no marketplace, no developer community. And because it's closed, it can't build one organically.
The only credible path: reopening
If Meta really wants to compete with OpenAI, the only credible strategy is to return to openness. Not necessarily on Muse Spark — Scale AI data probably prevents that. But on the next model in the ladder.
AI history shows that open models eventually catch up with closed ones. Llama 3 caught up with GPT-4 on many benchmarks. DeepSeek caught up with Claude on code. Openness is an accelerator, not a brake.
❌ Common mistakes
Mistake 1: Confusing open weights and open-source
Meta was never open-source in the strict sense. Llama was open weights: the weights were downloadable, but everything else (data, training code, infrastructure) remained proprietary. Muse Spark merely makes explicit what was already implicit. The lesson: do not idealize Meta's past regarding openness.
Mistake 2: Thinking that Muse Spark is a technical failure
The model excels in medicine. It is a high-value-added field where data is rare and expensive. The mistake would be to judge Muse Spark solely on its weaknesses in code. Its medical specialization could make it a very profitable niche tool, especially through hospital partnerships.
Mistake 3: Underestimating the impact of Scale AI in this decision
Many commentators blame Zuckerberg for the pivot. But the decision comes structurally from the integration of Scale AI into Meta's value chain. As long as Wang controls the training data, he controls the degree of openness of the models. The mistake is to separate Muse Spark from the $14.3B investment in Scale AI.
❓ Frequently Asked Questions
Is Muse Spark based on Llama?
No. Muse Spark (codenamed Avocado) is a distinct architecture, developed internally by the Superintelligence Lab. It shares neither weights nor architecture with the Llama family, which explains why Meta can close it without calling into question the existing Llama series.
Can Muse Spark be used today?
No, unless you are part of the private preview. Access is invite-only, reserved for partners selected by MSL. No public availability date has been announced. For immediate alternatives, the best free LLMs remain accessible.
Why the medical field specifically?
Scale AI has the world's largest labeled medical datasets, thanks to contracts with the healthcare sector. This is the natural competitive advantage of the MSL + Scale AI combination. The medical field is also an area where legal liability is high, which provides additional justification (after "safety checks") for keeping the model closed.
Has Qwen really become the open-source leader?
Yes. With a 69% share of the open-source ecosystem according to Skila, Alibaba's Qwen family has surpassed Llama. Those who want to install a local LLM now predominantly choose Qwen or DeepSeek rather than Llama.
Is Zuckerberg's ladder credible?
On paper, the idea of a progression towards personal superintelligence makes sense. However, research on meta-learning, such as the Meta Omnium study, shows that learning-to-learn is still an experimental research field. Promising a ladder when the first rung is closed and not publicly comparable looks more like storytelling than a scientific roadmap.
✅ Conclusion
Muse Spark is not a bad model — it's a model that belongs to the wrong company. In nine months, Alexandr Wang has imported Scale AI's culture of secrecy into what was the last bastion of openness in AI. The result is a medically powerful but strategically isolated model, which sacrifices the ecosystem Meta had built with Llama on the altar of a closed bet it is not sure to win. Meanwhile, Qwen, DeepSeek, and open research continue to move forward. To keep up with the evolution of this rapidly reshaping landscape, check out our monthly comparison of the best LLMs.