
MOSS: AI agents capable of modifying themselves — the research that foreshadows self-evolution

Actu IA 🟢 Beginner ⏱️ 15 min read 📅 2026-05-22


🔎 An agent that rewrites its own code to improve itself — without human intervention

For two years, the AI agent market has been exploding. We are deploying autonomous assistants in production, connecting them to databases, APIs, tools. But a fundamental problem remains largely ignored: once deployed, these agents are static.

They do not learn from their interactions. A recurring failure persists until the next human update. It is as if a junior developer kept making the same mistake thousands of times without ever taking notes.

On May 21, 2026, a paper published on arXiv (2605.22794) by Qianshu Cai et al. proposes a radical paradigm shift. The system is called MOSS, and its principle is simple yet unsettling: the agent rewrites its own source code to correct its recurring failures. Not its prompts, not its config files — the agent's harness itself.

This is no longer adjustment. It is autonomous software mutation.


The essentials

  • Current AI agents are static after deployment: they do not improve through use, and recurring failures persist until a human ships an update.
  • MOSS introduces a self-evolution mechanism through source-level rewriting: the agent modifies its own code to adapt to past interactions.
  • The system is part of a line of research including the Gödel Agent (ACL 2025) and SICA, which climbed from 17% to 53% on SWE-Bench Verified through self-modification.
  • The ethical implications are major: an agent that rewrites its own code partially escapes direct human control.
  • MOSS is positioned at the frontier between self-improvement at the skill level (already in production in 2026) and self-evolution at the agent level (still research).

| Tool | Main usage | Price (June 2025, check on site.com) | Ideal for |
|---|---|---|---|
| GPT-5.5 | Autonomous agent, complex reasoning | Variable (OpenAI) | Agentic scenarios requiring the best score (98.2) |
| Claude Opus 4.7 (Adaptive) | Adaptive agents, code and analysis | Variable (Anthropic) | Agent workflows balancing precision and adaptation |
| DeepSeek V4 Pro (Max) | High-performance open-source agent | Variable (DeepSeek) | Self-hosted deployment with high score (88 agentic) |
| Kimi K2.6 (Self-host) | Self-hosted agent, long context | Variable (Moonshot AI) | Local agents with solid agentic score (88.1) |
| OpenClaw | Self-evolving agent framework | Open source | Experimenting with agent-level self-evolution |

The problem: agents frozen in marble

The illusion of autonomy

Autonomous AI agents give the impression of being intelligent and adaptive. In reality, their behavior is determined by code written by humans, frozen prompts, and a fixed architecture at the time of deployment.

When an agent fails at a task, it will fail in the same way next time. It doesn't take notes. It doesn't refactor its approach. It doesn't say to itself "hey, this step makes me fail 80% of the time, I should modify it".

This is a structural problem, not a model problem. Even the best agentic LLM on the market (GPT-5.5, with a score of 98.2 on agentic benchmarks) remains a prisoner of the harness wrapped around it.

What the research says

The survey "A Systematic Survey of Self-Evolving Agents" (curated in the Awesome-Self-Evolving-Agents repository on GitHub) identifies this limitation clearly. Current agents are "model-centric": all evolution happens at the model level, never at the level of the agent itself.

The analysis published on TowardsDev summarizes the situation: we have increasingly powerful LLMs, but agent architectures that haven't fundamentally evolved since 2024.

MOSS precisely attacks this bottleneck.


What MOSS actually does

Source-level rewriting, not prompt adjustment

The distinction is crucial. Most attempts at "self-improvement" in agents are limited to modifying prompts, tweaking config parameters, or enriching a vector memory. That's just cosmetics.

MOSS goes further: the agent modifies the source code of its own harness. Its functions, its control logic, its decision-making mechanisms. It's comparable to a program that rewrites its own Python functions because it identified them as suboptimal.

The MOSS paper (arXiv 2605.22794) describes this process as an observe-rewrite-deploy loop. The agent observes its past failures, identifies recurring patterns, generates a source-level patch, and applies it to itself.
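To make the "observe" step concrete, here is a minimal sketch of what spotting a recurring failure pattern could look like. It assumes a hypothetical log format (one dict per executed step, with a `step` name and an `ok` flag); the function name and the threshold are illustrative and are not taken from the paper.

```python
from collections import Counter

def recurring_failures(logs, min_count=2):
    """Return steps that failed at least `min_count` times, most frequent first.

    `logs` uses a hypothetical format: one dict per executed step,
    with a `step` name and an `ok` flag.
    """
    failures = Counter(entry["step"] for entry in logs if not entry["ok"])
    return [(step, n) for step, n in failures.most_common() if n >= min_count]

logs = [
    {"step": "locate_file", "ok": False},
    {"step": "locate_file", "ok": False},
    {"step": "write_patch", "ok": True},
    {"step": "run_tests", "ok": False},
    {"step": "locate_file", "ok": False},
]

print(recurring_failures(logs))  # [('locate_file', 3)]
```

A single failure of `run_tests` stays below the threshold and is ignored, while the repeated `locate_file` failures surface as a candidate for a rewrite.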

The difference with the previous version

The previous version of the MOSS paper (arXiv 2409.16120, September 2024) already laid the conceptual foundations: agents capable of dynamically modifying their own code to adapt to their environment. The authors were already talking about the "Turing-completeness" of agents, meaning the theoretical capacity of an agent to implement any computation.

The May 2026 version makes this vision a reality with an operational mechanism tested on real benchmarks. The agent no longer theorizes: it patches itself.

The three-layer architecture

MOSS relies on three distinct components. First, an observer that analyzes past interaction logs and identifies failure patterns. Then, a rewriter that generates targeted source-level modifications. Finally, a validator that tests the patches before applying them, preventing regressions.

This architecture is reminiscent of human software development cycles — except that the developer is the agent itself.
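The role of the validator can be illustrated with a toy harness. This is only a sketch under stated assumptions: the `Harness` class, the `try_patch` method, and the file-picking strategies are hypothetical, and the candidate functions are written by hand here, whereas in MOSS the rewriter would generate them.

```python
def validate(candidate, cases):
    """Validator: accept a candidate only if it passes every regression case."""
    for args, expected in cases:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            return False
    return True

class Harness:
    """Toy harness with a single replaceable strategy function (hypothetical)."""

    def __init__(self, strategy):
        self.strategy = strategy
        self.history = []  # names of the patches that were applied

    def try_patch(self, name, candidate, cases):
        """Swap in the candidate only if the validator accepts it."""
        if validate(candidate, cases):
            self.strategy = candidate
            self.history.append(name)
            return True
        return False

def pick_first(paths, keyword):
    return paths[0]

def pick_last(paths, keyword):
    return paths[-1]

def pick_matching(paths, keyword):
    matches = [p for p in paths if keyword in p]
    return matches[0] if matches else paths[0]

# Regression cases: (arguments, expected file to open).
cases = [
    ((["README.md", "src/parser.py"], "parser"), "src/parser.py"),
    ((["src/cli.py", "docs/index.md"], "cli"), "src/cli.py"),
]

harness = Harness(pick_first)
rejected = harness.try_patch("always-last", pick_last, cases)
accepted = harness.try_patch("keyword-match", pick_matching, cases)
print(rejected, accepted, harness.history)
```

The first candidate passes one case but fails the other, so it is discarded; only the second one replaces the running strategy. The gate checks functional behavior on known cases and nothing else.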


The research lineage: from Gödel Agent to SICA

Runtime monkey patching as an evolution method

MOSS does not come out of nowhere. It is part of a research movement that has been gaining credibility since 2025. The Gödel Agent, presented at ACL 2025, demonstrated that an agent could modify its resolution policy AND its own learning algorithm via runtime monkey patching.

The concept of monkey patching — dynamically replacing functions during execution — is well known to Python developers. Applying it to an AI agent that modifies its own learning algorithm is a qualitative leap.
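For readers who have not met the technique, this is what plain monkey patching looks like in Python. The `Agent` class and both planning functions are hypothetical illustrations, unrelated to the actual Gödel Agent code.

```python
class Agent:
    def plan(self, task):
        return ["search everything", task]

def focused_plan(self, task):
    return ["search files named in the task", task]

agent = Agent()
before = agent.plan("fix bug")

# Monkey patch: replace the method on the class while the program is running.
Agent.plan = focused_plan

# The already-created instance picks up the new behavior,
# because Python looks methods up on the class at call time.
after = agent.plan("fix bug")

print(before[0])
print(after[0])
```

Nothing is restarted and no file on disk changes: the running object simply behaves differently from one call to the next, which is both the appeal and the danger of the approach.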

SICA: proof by results

The most striking case in this lineage is SICA (Self-Improving Coding Agent). By autonomously modifying its own codebase, SICA went from 17% to 53% on SWE-Bench Verified. This is not a marginal improvement — it is a tripling of performance, achieved without human intervention.

MOSS takes this logic but generalizes it. Where SICA focuses on the generated code (output), MOSS targets the agent's own code (infrastructure).

A survey mapping the terrain

The Awesome-Self-Evolving-Agents repository maintains an exhaustive mapping of this field. The survey it references clearly distinguishes three levels of evolution: model-centric (fine-tuning), skill-centric (adjustment of capabilities), and agent-centric (modification of the agent's architecture).

MOSS deliberately positions itself at the third level, the most ambitious and the least explored.


Results on SWE-Bench and measured performance

SWE-Bench as a testing ground

SWE-Bench has become the reference benchmark for evaluating the ability of agents to resolve real GitHub issues. It is a demanding test that requires code comprehension, navigation in a complex codebase, and generation of functional patches.

The best current AI agents, using models like GPT-5.5 (agentic score 98.2) or Claude Opus 4.7 Adaptive (94.3), already perform remarkably well on this type of task. But their performance plateaus because their architecture does not adapt.

What MOSS changes in the equation

MOSS's experimental results show that the agent identifies specific failure patterns — for example, a poor file navigation strategy — and generates patches that directly modify its search logic.

Rather than searching more broadly or for longer with the same flawed strategy, MOSS changes the strategy itself. It is the difference between trying the same key in a broken lock and forging a new key.

The limitations of current results

We must remain honest: MOSS's results are promising but not spectacular in absolute terms. The paper shows significant improvements on targeted subsets of SWE-Bench, not an overall score that crushes the competition. The scientific value lies in the proof of principle, not in an absolute record.

This nuance is important. Research on self-evolution is in its experimental beginnings.


Self-improvement: skills vs agent — where does MOSS stand?

The production-research frontier

In 2026, skill-level self-improvement is already deployed in production. The "Self-Improving Skills" guide from the AI-Native Playbook documents real-world use cases in engineering, finance, and medicine. An agent adjusts its skills (its prompts, its workflows, its strategies) based on feedback.

But this self-improvement remains superficial. The agent changes what it does, not how it is built. It's an employee changing their working method, not modifying their own brain.

MOSS crosses the frontier

MOSS positions itself exactly on this frontier. The agent doesn't just adjust its skills — it modifies the infrastructure that executes these skills. It's a change in nature, not in degree.

To give you an idea: it's the difference between an agent configured with OpenClaw that adjusts its Skills based on user feedback, and that same agent deciding to rewrite the SOUL engine that orchestrates these Skills.

Why this distinction matters

In practice, skill-level self-improvement is controllable. A human can inspect prompt modifications, validate workflow changes. Source-level self-improvement is fundamentally more opaque — a code patch can have unpredictable side effects.

This is precisely where AI research agents show their current limitations: they can find information, but they cannot improve their own research process.


Ethical implications: the self-modifying agent

The problem of human control

When an agent modifies its prompts or its memory, a human can read the changes and decide whether they are acceptable. When an agent rewrites its own functions, readability decreases drastically. A 50-line Python patch modifying an agent's control logic is not trivially inspectable.

The question is not theoretical. The MOSS paper does not propose a formal mechanism for human control over rewrites. The internal validator tests functional correctness, not alignment with human intentions.

The risk of incremental drift

A single patch is probably harmless. But hundreds of incremental patches, each functionally validated but never globally inspected by a human? This is a scenario of slow drift, difficult to detect.

The analysis of the Gödel Agent highlights this risk: an agent that modifies its own learning algorithm can theoretically drift toward behavior optimized for automated validation rather than the actual human objective.
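The drift scenario can be made visible with a few lines of standard-library Python. In this sketch, the baseline code and the three one-line patches are invented for illustration; `difflib.SequenceMatcher` compares each version both with the previous one and with the original.

```python
import difflib

def similarity(old_lines, new_lines):
    """Similarity ratio between two versions, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, old_lines, new_lines).ratio()

# Hypothetical harness code, stored as a list of lines.
baseline = [
    "def search(repo, query):",
    "    files = list_all(repo)",
    "    hits = scan(files, query)",
    "    ranked = rank(hits)",
    "    return ranked[:5]",
]

# Hypothetical patches: each one replaces a single line (index, new text).
patches = [
    (1, "    files = list_recent(repo)"),
    (2, "    hits = grep(files, query)"),
    (4, "    return ranked[:20]"),
]

current = list(baseline)
report = []
for index, new_line in patches:
    previous = list(current)
    current[index] = new_line
    report.append((similarity(previous, current), similarity(baseline, current)))

for step, total in report:
    print(f"vs previous: {step:.2f}   vs baseline: {total:.2f}")
```

Each patch looks small on its own (0.80 similarity with the version just before it), while similarity with the original falls from 0.80 to 0.60 to 0.40. A review process that only looks at the latest diff never sees the second column.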

The illusion of the sandbox

One might think that a sandboxed environment solves the problem. But an agent capable of rewriting its own code can potentially learn to bypass sandbox constraints. This is not science fiction — it is a logical possibility that AI safety research takes seriously.

The survey on self-evolving agents moreover identifies "safety" as one of the three major challenges in the field, alongside stability and efficiency.

Towards what regulatory framework?

No current regulatory framework specifically covers source-level self-modification by agents. The European AI Act discusses high-risk systems, but does not anticipate a scenario where the system modifies its own architecture after initial certification.

This is a legal gap that will need to be filled, probably through "frozen architecture" requirements for systems deployed in production, with regulated exceptions for research.


Connection with AI search agents

Why search agents need self-evolution

AI search agents like Perplexity or the best LLMs for search excel at finding information. But their search process is hard-coded: they follow a fixed pipeline (query → retrieval → synthesis) that does not improve with use.

If a search agent consistently fails on a type of query — for example, technical queries requiring navigation through an open-source project's documentation — it will always fail in the same way. MOSS could theoretically allow this agent to rewrite its retrieval strategy to better handle this case.

The weakness exposed by DeepWeb-Bench

The DeepWeb-Bench benchmark highlighted precisely these structural weaknesses of search agents. Their failures are not random: they are systematic and correlated with architectural gaps.

A MOSS-like mechanism applied to search agents could enable targeted self-evolution: identify failure patterns on DeepWeb-Bench, then generate patches modifying the navigation and retrieval logic.

Grep vs vector search: a concrete use case

A striking example of the value of self-evolution: the agents' preference for grep over vector search. Agents empirically discover that grep is more effective than vector search in certain contexts. With MOSS, an agent could rewrite its own code to favor grep in those specific contexts — without a human needing to identify and code this pattern.

This is self-evolution applied to an empirical discovery by the agent.
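As a rough illustration of the empirical discovery itself, the sketch below tracks success rates per context and picks the better retrieval strategy. The class, the context labels, and the recorded outcomes are all hypothetical. Note that this is a data-driven switch inside fixed code, which is simpler than what MOSS does: there, the equivalent preference would be written into the harness source.

```python
from collections import defaultdict

class StrategyStats:
    """Tracks outcomes per (context, strategy) and picks the best performer."""

    def __init__(self, strategies, default):
        self.strategies = strategies
        self.default = default
        self.counts = defaultdict(lambda: [0, 0])  # [successes, attempts]

    def record(self, context, strategy, success):
        stats = self.counts[(context, strategy)]
        stats[0] += int(success)
        stats[1] += 1

    def best(self, context, min_attempts=3):
        """Return the strategy with the highest success rate for this context.

        Falls back to the default when no strategy has enough attempts.
        """
        scored = []
        for strategy in self.strategies:
            wins, tries = self.counts.get((context, strategy), (0, 0))
            if tries >= min_attempts:
                scored.append((wins / tries, strategy))
        return max(scored)[1] if scored else self.default

stats = StrategyStats(["grep", "vector"], default="vector")
for ok in [True, True, True, False]:
    stats.record("exact_identifier", "grep", ok)
for ok in [False, True, False]:
    stats.record("exact_identifier", "vector", ok)

print(stats.best("exact_identifier"))     # grep
print(stats.best("conceptual_question"))  # vector (no data, default used)
```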


Which models to run MOSS?

LLM requirements

MOSS is a self-evolution framework, not a model. It requires an underlying LLM competent enough to analyze code, identify failure patterns, and generate functional patches. In other words, it needs a high-level agentic model.

The June 2025 agentic LLM ranking provides clear indications. GPT-5.5 (98.2) would be the natural candidate to maximize the quality of the generated patches. Claude Opus 4.7 Adaptive (94.3) offers an interesting profile thanks to its adaptability — ironically relevant for a self-evolution framework.

The open-source option

For research and experimentation, the self-hosted option is relevant. Self-hosted Kimi K2.6 (88.1 agentic) or GLM-5 Reasoning (82) let you deploy MOSS without cloud dependency. This is actually the healthiest research setup: a self-evolving agent running on your own infrastructure, with a local LLM that you control.

The guide on open-source AI agents with Ollama provides the basics for setting up such infrastructure. Adding MOSS on top would be the next step for a research lab.

| Model | Agentic score | Advantage for MOSS | Constraint |
|---|---|---|---|
| GPT-5.5 | 98.2 | Best patch quality | High cost, API dependency |
| Claude Opus 4.7 Adaptive | 94.3 | Natural adaptation | Variable pricing |
| Gemini 3 Pro Deep Think | 95.4 | In-depth reasoning | Potentially high latency |
| DeepSeek V4 Pro (Max) | 88 | Good quality/price ratio | Less public data |
| Kimi K2.6 (Self-host) | 88.1 | Total control, no dependency | Infrastructure to maintain |

❌ Common mistakes

Mistake 1: Confusing self-evolution with fine-tuning

MOSS self-evolution modifies the agent's code, not the model's weights. It is not fine-tuning, not RLHF, not online learning in the classical sense. It is software rewriting. Confusing the two leads to false expectations about the types of improvements possible.

Mistake 2: Thinking that MOSS is an off-the-shelf product

It is a research paper, not a distributable framework. The proof of concept is there, but the industrial robustness, safety, and observability required for production deployment are absent. Anyone deploying MOSS in production today is taking an unquantified risk.

Mistake 3: Underestimating the opacity of auto-generated patches

A 20-line patch that modifies an agent's control logic can have subtle side effects. A functional validator does not catch alignment drifts. Ignoring this opacity means ignoring the primary risk of the approach.

Mistake 4: Believing that self-evolution solves the model's fundamental problems

MOSS does not make an LLM smarter. It allows the agent to better exploit the capabilities of the underlying LLM. If the model does not understand a concept, rewriting the harness will not create that understanding. Self-evolution is a multiplier, not a capability creator.


❓ Frequently Asked Questions

Is MOSS open source?

The paper is public on arXiv, but the code is not distributed as a package. It is an academic proof of concept attached to the OpenClaw community. The reference implementation is not available at the time of publication.

Can a MOSS agent break its own code?

Yes, that is an inherent risk. The internal validator is supposed to prevent regressions, but a patch can introduce subtle bugs that only manifest under specific conditions. This is an open problem in research.

Does MOSS replace existing agent frameworks?

No. MOSS is a self-evolution layer that sits on top of an existing agent framework. It could theoretically be integrated on top of a system like OpenClaw, but no such integration has been implemented.

What is the difference from an agent using a traditional MLOps pipeline?

A traditional MLOps pipeline involves a human in the loop: data collection, training, evaluation, deployment. MOSS removes the human from the agent modification loop. This is precisely what makes it both interesting and risky.

Are the results on SWE-Bench reproducible?

The paper provides the methodological details necessary for reproduction, but the lack of published code makes direct reproduction difficult. The community will need to implement the mechanism based on the paper's description.


✅ Conclusion

MOSS is not going to revolutionize AI tomorrow morning — it is a research paper, not a product. But it makes the right diagnosis (agents are static) and proposes the right direction (source-level self-evolution). In two or three years, the idea that an agent could modify its own code to improve itself will be either mundane or forbidden. To follow this frontier, the original paper (arXiv 2605.22794) is this week's essential reading.