Vercel eve: the open source framework that wants to do for AI agents what Next.js did for the web
🔎 Why a framework for AI agents changes the game in June 2026
Building an AI agent in production today is a hack job. You take an LLM, you add tool calling, you manage state manually, you cobble together some checkpointing, and you pray the execution survives a crash.
Vercel just declared the end of this era. On June 17, 2026, at Ship 26 in London, the company unveiled eve: an open source framework in TypeScript designed as the Next.js of AI agents.
The idea is radically simple: an agent is neither an API nor a graph, it's a folder of files. The framework compiles this folder into a durable production service, with sandbox, approvals, and scaling included.
The validation isn't theoretical. Vercel uses eve internally to run v0 and its own fleet of agents, according to TechTimes. This isn't a lab POC, it's code serving millions of requests.
The essentials
- Filesystem-first: an agent = a folder (instructions.md, tools/, skills/, channels/, subagents/), directly inspired by Next.js's file-system routing.
- Durable execution: automatic checkpointing at each step, the agent picks up where it left off even after a crash.
- Sandboxed compute: code generated by the agent is executed as untrusted, in an isolated environment.
- Open source: Apache-2.0 license, available on npm via npm install eve@latest, scaffolding with npx eve init.
- Native human-in-the-loop: approvals are built into the framework, not bolted on as an afterthought.
Recommended tools
| Tool | Main usage | Price (June 2026, check on vercel.com) | Ideal for |
|---|---|---|---|
| eve | Production AI agents framework | Open source (Apache-2.0) | TypeScript devs who want to deploy agents |
| v0 | Agent-based UI generation | Freemium | Rapid interface prototyping |
| Hostinger | Web hosting for deployment | Starting from €2.99/month | Deploying apps around your agents |
One agent = one folder: the filesystem-first paradigm
This is eve's stroke of genius. Forget graphical builders, endless YAML files, and impossible-to-debug state graphs.
An eve agent is a directory on your disk. Inside, conventional files describe everything the agent knows how to do.
instructions.md contains the system prompt. tools/ hosts typed TypeScript functions. skills/ groups reusable behaviors. channels/ defines the interfaces (Slack, web, API). subagents/ allows delegation to other agents.
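To make "typed TypeScript functions" concrete, here is a minimal sketch of what a file in tools/ could look like. The file name, the WordCountInput and WordCountOutput types, and the wordCount function are all hypothetical; the article does not show eve's actual tool signature, so treat this as an illustration of the idea rather than the framework's API.

```typescript
// Hypothetical file: tools/word_count.ts
// Assumption: a tool is a plain function with typed input and output.
// eve's real tool signature may differ; check the package documentation.

interface WordCountInput {
  text: string;
}

interface WordCountOutput {
  words: number;
  characters: number;
}

function wordCount(input: WordCountInput): WordCountOutput {
  const trimmed = input.text.trim();
  // Split on runs of whitespace; an empty string contains zero words.
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  return { words, characters: input.text.length };
}
```

The point of the typing is that both the model-facing schema and the calling code can be checked against the same interface.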
As Towards AI points out, this approach is unprecedented for an agent framework. An agent is neither an API you call nor a graph you build: it's a folder.
The parallel with Next.js is fully embraced by Vercel. In 2016, Next.js imposed file-system routing for the web. In 2026, eve imposes file-system agents for AI. Same logic, same ambition: eliminate manual plumbing.
For developers who have already used similar approaches with open source agents run locally, the concept will be familiar. But eve adds the production layer that was missing.
The architecture in detail
Each file has a specific role. instructions.md is the brain. tools/ are the hands. channels/ are the ears and mouth. subagents/ are the colleagues.
The framework compiles everything, plugs in durable workflows, and connects the channels. You don't write glue code. You declare capabilities, eve takes care of the rest.
This architecture makes inspection and extension trivial. You open a folder, you read files, you understand what the agent does. No more black boxes.
Durable execution: the checkpointing that changes everything
An agent that runs for 30 minutes and then crashes at step 28 has wasted 30 minutes. It's the nightmare of every developer who has put agents into production.
eve solves this with what the official documentation on GitHub calls checkpointing at every step boundary. Every time the agent completes a step, its entire state is serialized and saved.
If the process crashes, if the server restarts, if the cloud bill explodes: the agent resumes exactly where it left off. No retry logic to write. No manual state management.
The code runs in a managed step. The tools, the sandbox and the subagents appear synchronous in your code, even though the underlying session is durable. It's a powerful abstraction that makes you write simple code while eve handles the complexity underneath.
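The idea of checkpointing at every step boundary can be sketched in a few lines. This is not eve's implementation: Checkpoint, CheckpointStore, and runDurably are hypothetical names, the store is an in-memory stand-in for durable storage, and the steps are synchronous for brevity.

```typescript
// Minimal sketch of checkpointing at step boundaries.
// All names here are hypothetical and are not eve's API.

interface Checkpoint<S> {
  nextStep: number; // index of the first step that has not completed yet
  state: S;
}

type Step<S> = (state: S) => S;

// Stand-in for durable storage; a real system would write to disk or a database.
class CheckpointStore<S> {
  private saved = new Map<string, string>();

  save(runId: string, checkpoint: Checkpoint<S>): void {
    // Serialize so the stored copy is independent of in-memory objects.
    this.saved.set(runId, JSON.stringify(checkpoint));
  }

  load(runId: string): Checkpoint<S> | undefined {
    const raw = this.saved.get(runId);
    return raw === undefined ? undefined : (JSON.parse(raw) as Checkpoint<S>);
  }
}

function runDurably<S>(
  runId: string,
  steps: Step<S>[],
  initial: S,
  store: CheckpointStore<S>,
): S {
  // Resume from the last checkpoint if one exists, otherwise start fresh.
  const checkpoint = store.load(runId) ?? { nextStep: 0, state: initial };
  let state = checkpoint.state;
  steps.slice(checkpoint.nextStep).forEach((step, offset) => {
    state = step(state); // may throw, which simulates a crash mid-run
    // Persist only after the step has completed: this is the step boundary.
    store.save(runId, { nextStep: checkpoint.nextStep + offset + 1, state });
  });
  return state;
}
```

If a step throws, calling runDurably again with the same runId skips the steps that already completed and retries only the one that failed. The state has to be JSON-serializable for this sketch to work.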
For code agents that manipulate entire projects, as Qwen3-Coder-Next does with its 80B MoE parameters, this durability is non-negotiable. A code agent that loses its context mid-generation is a useless agent.
Sandboxed compute: generated code is treated as hostile
This is perhaps eve's most important architectural decision. Any code generated by the agent is executed in a sandboxed compute environment, as if it were untrusted.
Because it is. An LLM can generate malicious code, whether intentionally or not. An agent with unfettered access to your filesystem is a production security vulnerability.
eve isolates the execution of generated code in a secure environment. The agent can write, execute, and iterate on code, but this code never directly touches your infrastructure.
This approach stands in stark contrast to most of the best autonomous AI agents on the market, which often execute code in partially isolated environments, or sometimes with no isolation at all.
The sandbox is not an optional add-on. It is a core component of the framework, enabled by default. Vercel learned from its own production mistakes and baked this lesson into eve's DNA.
Human-in-the-loop: approvals are not an afterthought
Many frameworks add human-in-the-loop as a plugin. eve integrates it into the execution model itself.
An agent can request approval before executing a sensitive action. The request is routed via the appropriate channel (Slack, email, web interface). The human approves or rejects. The agent resumes its execution.
This mechanism is particularly critical for agents that manipulate code in production. The switch of a coding agent like Claude Code to monthly credits shows that the industry is becoming aware of the need for fine-grained control over what agents actually do.
With eve, approvals are declarative. You define in the agent's configuration which actions require human validation. The framework handles the routing, the blocking, and the automatic resumption.
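A declarative approval policy can be pictured as a list of tool names plus a gate in front of every call. The sketch below is an assumption-laden illustration, not eve's configuration format: the tool names, the requiresApproval set, and executeWithApproval are hypothetical, and the human decision is modeled as a synchronous callback, whereas a real framework would suspend the run and resume it once someone answers.

```typescript
// Hypothetical sketch of a declarative approval policy; not eve's configuration format.

type Decision = "approved" | "rejected";

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

type CallOutcome =
  | { status: "executed"; result: string }
  | { status: "rejected" };

// Declarative part: the tools that need a human decision before they run.
const requiresApproval: ReadonlySet<string> = new Set(["send_email", "delete_record"]);

function executeWithApproval(
  call: ToolCall,
  run: (call: ToolCall) => string,
  askHuman: (call: ToolCall) => Decision,
): CallOutcome {
  // Only sensitive tools are routed to a human; everything else runs directly.
  if (requiresApproval.has(call.tool) && askHuman(call) === "rejected") {
    return { status: "rejected" };
  }
  return { status: "executed", result: run(call) };
}
```

Keeping the policy as data, separate from the tool code, is what lets a reviewer audit which actions are gated without reading every tool.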
Subagents, channels and skills: composition at the service of complexity
A single agent has limits. eve makes it possible to compose complex agents from specialized subagents.
subagents/ is a folder like any other. You place child agents there that inherit the same durable execution model. An orchestrator agent can delegate a search task to a subagent, a code task to another, and aggregate the results.
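The delegate-then-aggregate pattern described here can be sketched as follows. Subagent, Delegation, and orchestrate are hypothetical names chosen for illustration; in eve the subagents would be folders with their own instructions and tools, not plain functions, and each call would itself be a durable step.

```typescript
// Hypothetical sketch of delegation to subagents; names are illustrative, not eve's API.

type Subagent = (task: string) => string;

interface Delegation {
  agent: string; // key of the subagent to call
  task: string;  // what the orchestrator asks it to do
}

function orchestrate(
  subagents: Record<string, Subagent>,
  plan: Delegation[],
): Record<string, string> {
  const results: Record<string, string> = {};
  for (const { agent, task } of plan) {
    const subagent = subagents[agent];
    if (subagent === undefined) {
      throw new Error(`Unknown subagent: ${agent}`);
    }
    // Aggregate each subagent's answer under its own name.
    results[agent] = subagent(task);
  }
  return results;
}
```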
channels/ define how the agent communicates with the outside world. Web, Slack, REST API, email: each channel is a configuration file. Same principle as routing in Next.js.
skills/ are reusable behaviors that an agent can invoke. Think of Next.js middleware, but for agent capabilities.
This modular architecture changes the game for search agents. An agent built with eve could use Crawl4AI, the number 1 open source crawler on GitHub, as a tool and delegate the analysis to a specialized subagent, all while benefiting from durable execution.
The parallel with OpenSeeker-v2, which breaks the monopoly of industrial search agents, is enlightening: the market is moving toward composable and open source agents, not proprietary monoliths.
eve vs LangGraph vs Microsoft Agent Framework: where does it stand?
The agent framework landscape is consolidating. Here is how eve positions itself.
| Criterion | eve (Vercel) | LangGraph | Microsoft Agent Framework |
|---|---|---|---|
| Paradigm | Filesystem-first (folder) | State graph | Declarative/orchestration |
| Language | Native TypeScript | Python (partial TypeScript) | C# / Python / TypeScript |
| Durability | Native automatic checkpointing | Manual via checkpointer | Via external state store |
| Sandbox | Integrated by default | Optional | Non-native |
| Human-in-the-loop | Native in the execution model | Via interrupt/guard | Via extension |
| Open source | Apache-2.0 | MIT | MIT |
| Philosophy | One agent = one folder | One agent = one graph | One agent = one component |
LangGraph requires you to think in terms of nodes and edges. This is powerful for complex workflows, but the learning curve is steep. Microsoft Agent Framework takes a more enterprise-oriented approach, with all the verbosity that comes with that world.
eve chooses a different path: declarative simplicity via files. You don't build a graph, you describe an agent. The framework takes care of the execution.
This approach also echoes the Hermes Agent philosophy, where instructions, skills, and tools are organized in files. If you use agent-optimized LLMs like those from the OpenClaw/Hermes family, the transition to eve will be natural.
The choice of LLM behind eve
eve does not impose a model. You plug in whatever LLM you want. But in practice, production agents require solid agentic models.
A GPT-5.5 (agentic score 98.2) or a Claude Opus 4.7 Adaptive (94.3) will deliver optimal results. For code, GPT-5.3 Codex (80) or Claude Sonnet 4.6 (81.4) offer an excellent quality/cost ratio. Self-hosted models like Kimi K2.6 (88.1) or GLM-5 Reasoning (82) are viable if you control the entire stack.
The framework is agnostic, but the quality of the agent will depend directly on the LLM you choose to drive it.
npx eve init: start an agent in 30 seconds
eve's scaffolding follows the Vercel tradition: one command, a complete project.
npx eve init generates an agent folder with the conventional structure: instructions.md, tools/, channels/, skills/. You edit the files, add your tools in TypeScript, configure your channels, and launch.
The coverage on WebDeveloper.com confirms that the package is available on npm (npm install eve@latest) and that the documentation is included directly in the package (node_modules/eve/docs).
You define your agents in agent.ts and instructions.md. The first points to the LLM and the execution configuration. The second contains the system prompt. The rest is filesystem convention.
It's radically simpler than configuring a LangGraph graph with its state channels, its reducers, and its custom checkpointer. And it's designed for immediate deployment on Vercel infrastructure.
For devs coming from the world of open source agents with local Ollama, the transition is natural: you keep local control, you add the production layer.
Production security: what eve doesn't solve
eve sandboxes the code generated by the agent. This is essential but insufficient for a complete security posture.
An agent that has access to poorly configured tools can cause damage without executing any code. A tool that sends emails, deletes files, modifies a database: the sandbox doesn't protect against that.
The security of an eve agent depends on what you put in tools/. Each function is an attack surface. The framework provides the infrastructure, not the business logic validation.
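Because the sandbox does not cover tool side effects, each tool has to validate its own inputs. Below is a minimal sketch for an email tool; the SendEmailInput shape, the example.com allowlist, and the length limit are assumptions made up for the example, not something eve provides.

```typescript
// Hypothetical guard for an email tool; the allowlist and limits are assumptions.

interface SendEmailInput {
  to: string;
  subject: string;
  body: string;
}

const ALLOWED_DOMAINS: ReadonlySet<string> = new Set(["example.com"]);
const MAX_BODY_LENGTH = 2000;

// Returns a list of problems; an empty list means the call may proceed.
function validateSendEmail(input: SendEmailInput): string[] {
  const problems: string[] = [];
  const at = input.to.indexOf("@");
  // Require exactly one "@" with a non-empty local part.
  const wellFormed = at > 0 && at === input.to.lastIndexOf("@");
  const domain = wellFormed ? input.to.slice(at + 1).toLowerCase() : "";
  if (!ALLOWED_DOMAINS.has(domain)) {
    problems.push(`recipient domain not allowed: ${domain || "(invalid address)"}`);
  }
  if (input.subject.trim() === "") {
    problems.push("subject is empty");
  }
  if (input.body.length > MAX_BODY_LENGTH) {
    problems.push(`body exceeds ${MAX_BODY_LENGTH} characters`);
  }
  return problems;
}
```

A tool that refuses anything outside a narrow allowlist limits what a hijacked prompt can make the agent do, independently of the model's behavior.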
The threat model for agents in production is still immature. Attacks like agentjacking, where a fake bug report is enough to compromise an agent, show that the attack surface goes beyond simply the executed code.
eve gives you the building blocks to build securely (sandbox, approvals, per-tool least privilege). But the responsibility for the configuration remains yours.
Evals and tracing: observing what your agent is actually doing
An agent in production without observability is a black hole. You don't know what it's doing, why it made a certain choice, or where it failed.
eve integrates tracing and evals directly into the framework. Every step of the execution is traced. You can see which tool was called, with what parameters, what result was returned, and how long each step took.
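The kind of record described here (tool, parameters, result, duration) can be produced by a thin wrapper around each tool. This is a sketch of the general technique, not eve's trace format: TraceEntry and traced are hypothetical names, and the wrapped tools are synchronous for brevity.

```typescript
// Hypothetical tracing wrapper; eve's real trace format is not shown in the article.

interface TraceEntry {
  tool: string;
  params: unknown;
  result?: unknown;
  error?: string;
  durationMs: number;
}

function traced<P, R>(
  tool: string,
  fn: (params: P) => R,
  trace: TraceEntry[],
): (params: P) => R {
  return (params: P): R => {
    const start = Date.now();
    let result: R;
    try {
      result = fn(params);
    } catch (err) {
      // Record the failure, then rethrow so the caller still sees it.
      const error = err instanceof Error ? err.message : String(err);
      trace.push({ tool, params, error, durationMs: Date.now() - start });
      throw err;
    }
    trace.push({ tool, params, result, durationMs: Date.now() - start });
    return result;
  };
}
```

Wrapping at the tool boundary means the tool code itself stays free of logging concerns.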
Evals allow you to define success criteria and automatically measure your agent's performance on test sets. This is essential for iterating.
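In its simplest form, an eval is a set of inputs, each with a success criterion, and a pass count. The sketch below assumes the agent can be called as a function from string to string; EvalCase, EvalReport, and runEvals are hypothetical names and do not reflect eve's eval API.

```typescript
// Hypothetical eval harness; names are illustrative, not eve's API.

interface EvalCase {
  name: string;
  input: string;
  passes: (output: string) => boolean; // success criterion for this case
}

interface EvalReport {
  passed: number;
  total: number;
  failures: string[]; // names of the cases that did not meet their criterion
}

function runEvals(agent: (input: string) => string, cases: EvalCase[]): EvalReport {
  const failures = cases
    .filter((c) => !c.passes(agent(c.input)))
    .map((c) => c.name);
  return { passed: cases.length - failures.length, total: cases.length, failures };
}
```

Running the same cases after every prompt or tool change turns "it feels better" into a number that can be compared across iterations.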
According to SiliconANGLE, eve combines all the elements to assemble, evaluate, and run agents. Evaluation is not an external tool; it's part of the development cycle.
For teams deploying agents at scale, this native integration alone is worth switching to eve. No more cobbled-together observability pipelines with custom logging.
Who should adopt eve (and who shouldn't)
eve is made for TypeScript developers who want to put agents into production without tinkering with the infrastructure. If you're comfortable with Next.js, you'll be comfortable with eve.
This is particularly relevant if you're building agents that need to run for a long time (durable execution), that execute generated code (sandbox), or that require human validations (approvals).
On the other hand, if your agents are simple RAG chains with two or three LLM calls, eve is probably overkill. A Python script with LangChain will do the job.
If you're in pure Python, with no desire to switch to TypeScript, eve is not for you. LangGraph remains the best choice in this ecosystem.
If you need complex state graphs with deep conditional branching, eve's file approach might frustrate you. In this case, LangGraph offers more granular control.
For teams that want an open source coding agent like OpenCode with its 172K GitHub stars, eve offers the framework to take it from prototype to production.
❌ Common mistakes
Mistake 1: Confusing eve and Vercel Agent
eve is an open source framework for building agents. Vercel Agent is a distinct autonomous code review product, as clarified by Agent Swarm. These are two different things. Do not mix them up in your architecture.
Mistake 2: Putting all logic in instructions.md
The system prompt is important, but if your agent needs complex procedural logic, it belongs in tools and skills, not in a 5000-word prompt. Current LLMs, even a GPT-5.5 at 98.2, are not reliable at executing deterministic algorithms via prompt alone.
Mistake 3: Ignoring approvals configuration
The sandbox protects against untrusted code execution. Approvals protect against untrusted actions. If you don't configure approvals on your sensitive tools, your agent could send emails, delete data, or make payments without validation. This is a design flaw, not a framework flaw.
Mistake 4: Choosing your LLM without testing on your use case
A high agentic score doesn't guarantee performance on your specific case. GPT-5.5 dominates general benchmarks, but Claude Sonnet 4.6 might be better at certain coding tasks. Test before locking in your choice.
❓ Frequently Asked Questions
Is eve really open source?
Yes. eve is released under the Apache-2.0 license, according to the sources from the Ship 26 presentation. The code is on GitHub in the vercel/eve repo, and the package is available on npm.
Does eve replace LangGraph?
No. eve and LangGraph target different needs. eve is filesystem-first and optimized for declarative simplicity in TypeScript. LangGraph is graph-first and offers more control over complex workflows, especially in Python. The choice depends on your stack and the complexity of your workflows.
Can eve be used without Vercel infrastructure?
Yes. eve is an open source framework. You can run it anywhere. Integration with Vercel infrastructure is an asset for deployment, but not a requirement. You can run it on your own server, including on Hostinger for modest projects.
Which LLMs are compatible with eve?
eve is agnostic. Any LLM with a compatible API can be plugged in. The top-performing models on agentic benchmarks are GPT-5.5 (98.2), Gemini 3 Pro Deep Think (95.4) and Claude Opus 4.7 Adaptive (94.3). For code specifically, GPT-5.3 Codex and Claude Sonnet 4.6 are excellent choices.
Does durability work with subagents?
Yes. Each subagent benefits from the same durable execution model as the parent agent. Checkpointing applies at all levels of the agent hierarchy. If a subagent crashes, it resumes individually without affecting the others.
✅ Conclusion
eve is the first credible attempt to create an AI agent framework that is as easy to adopt as Next.js was for the web. The filesystem-first paradigm, native checkpointing, built-in sandbox, and declarative approvals address the real problems of production. If you are building agents in TypeScript, eve deserves your immediate attention.