📑 Table of contents

Google launches the Interactions API in general availability: the new default interface for building Gemini agents (and generateContent retires)

Agents IA 🟢 Beginner ⏱️ 11 min read 📅 2026-06-24

Google launches the Interactions API in general availability: the new default interface for building Gemini agents (and generateContent retires)

🔎 On June 22, 2026, Google made official what everyone had seen coming for six months

The Interactions API is moving to general availability (GA). It becomes the default interface for interacting with Gemini models and agents in Google AI Studio, the official documentation, and all new agent features. The old generateContent API is not disappearing immediately, but it is relegated to legacy status for simple use cases.

This is an architectural shift. Google is no longer just offering a language model — it is offering a complete runtime for autonomous agents, with server-side state management, a Linux sandbox, native audio streaming, and integrated tool use. All of this behind a single endpoint.

The June 8, 2026 migration deadline has already passed. Developers who have not migrated their agent workflows to the Interactions API find themselves locked out of frontier features.


The key points

  • The Interactions API is officially GA as of June 22, 2026, and replaces generateContent as the default interface across the entire Gemini ecosystem.
  • It unifies model and agent interactions: a single entry point for chat, tool use, background execution, and multimodal streaming.
  • Managed Agents offer hosted Linux sandboxes for code execution, with support for background execution and typed execution steps.
  • The gemini-3.1-flash-tts-preview model supports native speech streaming (TTS) via the Interactions API as of June 17, 2026.
  • The agent framework war is intensifying: Google is locking down its ecosystem around a single API, facing OpenClaw, Microsoft MAF, and other initiatives.

Tool Main usage Price (June 2026, check on ai.google.dev) Ideal for
Interactions API Unified interface for Gemini models + agents Free (Free tier), pay-as-you-go (Flex/Priority) Developers building autonomous agents with Gemini
Google AI Studio IDE and playground for Gemini Free Rapid prototyping and testing of the Interactions API
Guide de migration Migration from generateContent Free Developers needing to convert their existing integrations
Référence API Complete technical documentation Free Detailed implementation of endpoints and schemas

What the Interactions API actually changes

A single endpoint replaces several scattered calling patterns. The Interactions API natively handles multi-turn multimodal conversations, tool use, and server-side state persistence — without the developer having to manually manage message history.

According to the official Google blog post, this API has "quickly become developers' preferred way to build applications with Gemini." The press release confirms that the GA brings a stable, sustainable, versioned schema.

The old generateContent API remains available for simple use cases (text generation, summarization, translation). But any advanced functionality — long-running agents, background execution, sandbox — is now exclusive to the Interactions API.

This is a clear architectural choice: Google is separating the "prompt → response" use case from the "stateful autonomous agent" use case. Two interfaces, two paradigms.

Server-side state management

This is the most underestimated change. With generateContent, the developer had to store and send back the entire conversation history with every call. For an agent running over 50 turns with tool use, this becomes unmanageable.

The Interactions API handles this state server-side. The client sends a message, the server maintains the context. This drastically reduces network payload sizes and simplifies the client code.

For architectures where the agent interacts with a browserless CRM — like the Salesforce Headless 360 — this centralized state management eliminates a major point of friction.


Managed Agents: Linux sandboxes directly in the API

The GA introduces Managed Agents as a stable release. These are Gemini agents that run in remote Linux sandboxes hosted by Google. The agent can execute code, manipulate files, install dependencies — all in an isolated manner.

According to the detailed analysis by Mer.vin, these sandboxes support background execution and typed execution steps. The developer can define a structured output schema for each step of the agent.

In practice, an agent can launch a Python script in the sandbox, retrieve the result, decide on the next step, all without the client needing to poll or manage a complex lifecycle. The API handles the agentic loop.

Flex vs Priority: two service tiers

Google introduces two service tiers with the GA. The Flex tier offers pay-as-you-go pricing with variable latencies. The Priority tier guarantees low latencies and priority throughput — essential for production agents that need to respond in real time.

The choice of tier is made at the interaction configuration level, not at the project level. The same project can mix both depending on the use case.


Native Streaming TTS with gemini-3.1-flash-tts-preview

On June 17, 2026, two days before GA, Google added streaming speech via the gemini-3.1-flash-tts-preview model. According to the Gemini API release notes, this streaming works both via streamGenerateContent and via the stream: true parameter in the Interactions API.

The obvious use case: voice agents. A Gemini agent can now reason, decide to use a tool, execute code in a sandbox, and then stream its voice response in real time — all within a single Interactions API session.

This is a structural advantage over architectures that require an LLM for reasoning + a separate TTS model + an orchestrator between the two. Google integrates all three layers into a single pipeline.

For developers comparing the multimodal capabilities of current models, the Gemini vs ChatGPT vs Claude comparison remains a reference, but the Interactions API adds an infrastructural dimension that competitors do not offer natively.


Migrating from generateContent: what changed on June 8

The June 8, 2026 deadline was the cutoff date to migrate agent workflows to the Interactions API. The official migration guide details the steps.

The request schema changes. Instead of sending a contents object with the full history, you create an interaction with a session identifier. Subsequent messages reference this session. The endpoint also changes — you no longer call models/gemini-X:generateContent but the dedicated Interactions endpoint.

For simple stateless cases, generateContent continues to work. But if your code uses tool use, function calling, or multi-turn conversations, migration is ultimately not optional. The new frontier features — Gemini Deep Research in preview, MCP support, collaborative planning — are Interactions-only.

Gemini Deep Research deserves a mention. Available in preview via the Interactions API, it supports collaborative planning, visualization, and the MCP (Model Context Protocol). It is an autonomous research agent that plans, executes, and iterates — exactly the type of workflow that generateContent could not support.


The agent framework wars: why Google is locking down its ecosystem

The timing is no coincidence. The Interactions API is going GA just as the agent framework market explodes. Every player is trying to impose its own standard.

Google is betting on a proprietary but unified API, integrated into the cloud, with native sandboxes. This is consistent with their strategy: the model is the runtime. No need for a third-party framework — the Gemini API is the framework.

This approach contrasts with open-source initiatives. Projects like OpenClaw, which can be found in our ranking of the best autonomous AI agents, offer an abstraction layer above several models. The Interactions API, on the other hand, is optimized exclusively for Gemini.

For developers, the calculation is simple: if you are 100% Gemini, the Interactions API offers you deeper integration, lower latency, and more native features. If you are multi-model, you will go through an orchestration framework and only use the Interactions API as one backend among others.

The underlying question: will developers accept locking themselves into the Google ecosystem in exchange for superior integration? Industry history suggests that convenience often wins over openness — at least in the short term.

Local agents: an alternative that remains credible

It should be noted that everything described here concerns Gemini agents hosted by Google. For developers who want to keep total control — sensitive data, compliance, long-term costs — open-source AI agents with Ollama locally remain a serious alternative. You won't have managed Linux sandboxes, but you will keep control over the entire pipeline.

Similarly, the choice of the underlying model remains strategic. The Interactions API provides access to Gemini models, but an agent built with Claude Opus 4.7 or GPT-5.5 via other APIs can be just as performant, or even more so, depending on the agentic benchmark (98.2 for GPT-5.5 vs 87.3 for Gemini 3.1 Pro).


Interactions API vs generateContent : comparison table

Criteria generateContent Interactions API (GA)
State management Client-side (history returned on each call) Server-side (persistent session)
Autonomous agents Not supported Managed Agents with Linux sandbox
Background execution No Yes (background execution)
Streaming TTS Via streamGenerateContent only Native (stream: true) + integrated TTS
Tool use / Function calling Supported but without agentic loop Full agentic loop with typed steps
Gemini Deep Research Not available Preview via Interactions
Service tier Standard only Flex + Priority
Status Maintained for simple use cases Default interface, recommended

❌ Common mistakes

Mistake 1: Continuing to use generateContent for agent workflows

This is the most frequent mistake post-June 8. generateContent still works for simple calls, but if your agent does multi-step tool use or background execution, you are in a dead end. The new frontier features will never be backported. The solution: follow the migration guide and refactor to a session-based architecture.

Mistake 2: Ignoring the choice of service tier

By default, developers stay on the standard tier. With the Interactions API in production, the Flex tier is more cost-effective for batch processing, but the Priority tier is essential for real-time user interactions. Configuring the wrong tier means either paying too much or delivering a degraded experience. The solution: analyze your call patterns and configure the tier per interaction, not per project.

Mistake 3: Underestimating session management

The Interactions API manages state on the server side, which does not mean "I no longer have to think about state." Sessions have a lifespan, a storage cost, and their cleanup must be managed. A developer who creates sessions without a TTL strategy will see their costs explode. The solution: implement a rotation and cleanup policy for inactive sessions from the very beginning.

Mistake 4: Choosing Gemini solely for the API while ignoring the model's score

The Interactions API is excellent as infrastructure. But the underlying model matters. Gemini 3.1 Pro scores 87.3 on the agentic benchmark, well below GPT-5.5 (98.2) or Claude Opus 4.7 (94.3). If reasoning quality is the primary criterion, the most elegant API does not compensate for a less performant model. The solution: evaluate the model and the infrastructure, not one without the other. The Gemini vs ChatGPT vs Claude comparison helps to see things clearly.


❓ Frequently Asked Questions

Will generateContent be deprecated?

No. Google is maintaining generateContent for simple use cases (single generation, no state, no agent). However, it will no longer receive new agent features. All of Google's R&D is focused on the Interactions API.

Is the Interactions API free?

No. It follows the same pricing model as the Gemini API, with two new tiers: Flex (pay-as-you-go, variable latency) and Priority (guaranteed latency, extra cost). A Free tier exists for prototyping, just like the free AI APIs available from other providers.

Can the Interactions API be used with models other than Gemini?

No. It is a proprietary API optimized for Gemini models. For a multi-model architecture, you need to use an orchestration framework like those listed in our article on the best autonomous AI agents.

Do Managed Agents replace tools like OpenClaw?

Not exactly. Managed Agents are integrated into the Interactions API and optimized for the Gemini ecosystem. OpenClaw and similar tools offer a multi-model abstraction. The choice depends on your lock-in strategy: depth of integration vs. model freedom. For the best LLMs for AI agents, the choice of runtime is just as important as the choice of model.

Does TTS streaming work with all Gemini models?

No. Only the gemini-3.1-flash-tts-preview model supports speech streaming as of June 2026, as indicated in the release notes. Other models continue to operate in text.


✅ Conclusion

The Interactions API in GA is not just an API update — it's Google transforming Gemini from a model into an agent platform. A single endpoint, managed state, Linux sandboxes, streaming TTS: the message is clear, if you are building agents with Gemini, you no longer need a third-party framework.

The question of the underlying model remains. The infrastructure is excellent, but when facing GPT-5.5 or Claude Opus 4.7 on complex reasoning tasks, Gemini 3.1 Pro is not always the best choice. It's up to you to decide if the infrastructural integration makes up for the difference in raw performance — the detailed model comparison should help you decide.