Devin Auto-Triage: Cognition turns its AI agent into a 24/7 on-call engineer
🔎 From coding agent to on-call agent
For months, AI coding agents have operated on a simple model: you open a chat, describe a task, the agent executes, and you verify. This "on-demand" mode has its limits. Bugs don't wait for an engineer to open a session. Production incidents don't trigger at a convenient time.
On May 18, 2026, Cognition announced Devin Auto-Triage, a feature that fundamentally changes the nature of its agent. Devin no longer just executes tasks assigned to it. It becomes a persistent on-call engineer, permanently connected to your alert channels, reacting to incidents in real time without initial human intervention.
This is the shift from "on-demand" mode to "always-on" mode. And according to Frontier News, this shift signals a true maturation of AI agent infrastructure. Teams are no longer wondering if AI can help with code, but how to integrate it safely into production workflows.
The key points
- Devin Auto-Triage is a persistent agent that monitors incoming bugs, alerts, and incidents via Slack, Datadog, Sentry, PagerDuty, and Raindrop.
- It automatically investigates each report, connects related incidents, deduplicates the noise, and routes to the right owner.
- It can open a pull request to fix the issue without waiting for Monday morning's standup.
- Modal (YC) is already using it in production to triage incidents for its inference team, with concrete results.
- The setup takes a few minutes: invite Devin to a Slack channel and activate the "Triage bug reports on Slack" template.
Recommended tools
| Tool | Main usage | Price (May 2026, check on devin.ai) | Ideal for |
|---|---|---|---|
| Devin Auto-Triage | Automated 24/7 incident triage | On quote (Devin Automations plan) | Engineering teams with high incident volume |
| Bundle Incident Copilot | AI-assisted context during incidents | On quote (check on bundle.app) | Teams that want a human-in-the-loop copilot |
| Devin Docs | Auto-Triage documentation and setup | Free | Any engineer setting up Devin |
What Devin Auto-Triage does exactly
Devin Auto-Triage continuously monitors a Slack channel (or other sources like Datadog, Sentry, PagerDuty, Raindrop). When a bug or alert arrives, it doesn't just log it.
It investigates. Specifically, this means it searches the codebase, checks the logs, cross-references information from your observability stack, and produces an initial analysis of the problem.
Then, it connects related reports. If three users report the same issue using different words, Devin identifies that it's the same incident. It deduplicates the noise, which prevents engineers from wasting time on duplicates.
It routes to the right owner based on the context of the affected code. And if the diagnosis is clear and the fix is obvious, it can open a pull request directly. All of this without any human having to click on anything.
This sequence — monitoring, investigation, deduplication, routing, potential fix — is exactly what an experienced on-call engineer would do. Except Devin does it in a few seconds, at any hour, without fatigue.
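To make the deduplication and routing steps concrete, here is a minimal sketch in Python. It is not Cognition's implementation, which has not been published: the token-overlap (Jaccard) similarity is a crude stand-in for whatever matching Devin actually performs, and the `OWNERS` map, team names, and `threshold` value are hypothetical.

```python
from __future__ import annotations

import re

# Words too common to help tell two reports apart (illustrative list).
STOPWORDS = {"the", "a", "an", "is", "on", "in", "to", "for", "of", "and", "when", "with"}


def tokens(text: str) -> set[str]:
    """Lowercase word set with common stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9_]+", text.lower()) if w not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tokens two reports have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_reports(reports: list[str], threshold: float = 0.4) -> list[list[str]]:
    """Greedily attach each report to the first group whose first report is similar enough."""
    groups: list[list[str]] = []
    for report in reports:
        for group in groups:
            if jaccard(tokens(report), tokens(group[0])) >= threshold:
                group.append(report)
                break
        else:
            # No existing group matched: this report starts a new incident.
            groups.append([report])
    return groups


# Hypothetical ownership map: path prefix -> owning team.
OWNERS = {"services/checkout/": "payments-team", "services/auth/": "identity-team"}


def route(path: str, default: str = "on-call") -> str:
    """Pick the owner with the longest matching path prefix, else fall back to a default."""
    matches = [prefix for prefix in OWNERS if path.startswith(prefix)]
    return OWNERS[max(matches, key=len)] if matches else default


reports = [
    "Checkout page returns 500 error on payment submit",
    "500 error on checkout payment submit",
    "Login button does nothing on mobile",
]
groups = group_reports(reports)  # the two checkout reports end up in one group
owner = route("services/checkout/cart.py")  # "payments-team"
```

Word overlap only catches duplicates that share vocabulary. Reports describing the same failure in entirely different words would need semantic matching, which is where a language model earns its place in the loop.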
How to configure it in practice
The official Devin documentation describes a surprisingly simple process for a tool of this magnitude.
Two steps are enough. First, you invite Devin into the Slack channel dedicated to bug reports or incidents. Next, you create an automation in the Devin interface using the "Triage bug reports on Slack" template.
That's it. No complex CI/CD pipeline, no webhooks to manually configure for each alert source. Devin connects to your existing tools and gets to work.
Of course, the quality of the triage depends on the quality of the context Devin has access to. A well-structured codebase, actionable logs, a consistent observability stack — all of this significantly improves the results. But the starting threshold is low, and this is a deliberate choice on Cognition's part to encourage adoption.
For teams that want to go further in customizing autonomous AI agents, our guide on how to create an AI agent details the underlying architectures.
Modal in production: the key testimony
The most concrete use case comes from Modal, the YC startup specializing in serverless compute infrastructure. Hari Subbaraj, from the Modal team, has spoken publicly about it since May 19, 2026.
His feedback is unequivocal: "Devin Automations feels like a step forward from other auto-triage tools we have tried. It monitors our channel, works with our codebase and observability stack, and comes back with useful investigation."
What sets this testimony apart from the usual corporate endorsements is its precision. Subbaraj doesn't talk about "digital transformation" or "revolution." He says that Devin works with their codebase and their observability stack, and that it comes back with a useful investigation. It is pragmatic and verifiable.
Modal uses Devin Automations to automatically triage the inference team's incidents. This is a particularly demanding context: compute infrastructure issues are often complex, involve interactions between multiple system layers, and require a fine understanding of the code to be diagnosed correctly. The fact that Devin is effective in this context speaks volumes about its investigative capabilities.
From on-demand mode to always-on mode: why it's a paradigm shift
The distinction between these two modes is fundamental to understanding the evolution of the AI agent market.
An on-demand agent is a tool that you activate when you need it. You have a bug, you open Devin, you give it the context, it helps you. It's useful, but it depends on you. If you're sleeping, if you're in a meeting, if you didn't see the alert — the agent does nothing.
An always-on agent is a system that works independently of you. It has its own activity loop. It monitors, reacts, acts. You don't need to be there for it to produce value.
According to NoonVibe, Devin Auto-Triage "closes the gap completely" between an incident occurring and an engineer starting to work on it. The bug no longer waits for Monday morning's standup. It is handled immediately.
This evolution is reminiscent of the shift from manual backups to automatic backups, or from passive monitoring to proactive alerts. Each time, the paradigm shift does not come from a new fundamental technology, but from a change in when the technology intervenes.
To understand the full range of agents operating in this autonomous mode, check out our comparison of the best autonomous AI agents.
Technical architecture: persistent memory and integrations
What makes Auto-Triage possible isn't a new language model. It's the architecture around the model.
Digg reports that Devin Auto-Triage works like a "first-responder with long-term memory." This is the crucial detail. An agent without persistent memory starts from scratch with every interaction. It cannot connect an incident from today with a similar incident from last week.
With persistent memory, Devin accumulates context about your codebase, your past incidents, your failure patterns. This memory is what allows it to intelligently deduplicate (it has already seen this type of bug) and route appropriately (it knows which engineer recently worked on this part of the code).
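As an illustration of why memory matters for deduplication, the sketch below stores incident fingerprints in SQLite so that a recurrence can be recognized a week later. Everything here is an assumption made for the example: the `fingerprint` scheme, the `IncidentMemory` class, and the table layout are hypothetical and say nothing about how Devin stores its context.

```python
from __future__ import annotations

import hashlib
import re
import sqlite3


def fingerprint(error_type: str, top_frame: str) -> str:
    """Stable key for an incident: error type plus top stack frame, with digits masked."""
    normalized = re.sub(r"\d+", "N", f"{error_type}|{top_frame}".lower())
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]


class IncidentMemory:
    """Hypothetical incident store; pass a file path instead of ':memory:' to persist."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS incidents ("
            "fingerprint TEXT PRIMARY KEY, first_seen TEXT, owner TEXT, seen_count INTEGER)"
        )

    def record(self, fp: str, seen_at: str, owner: str) -> bool:
        """Store the incident; return True if this fingerprint was already known."""
        row = self.db.execute(
            "SELECT seen_count FROM incidents WHERE fingerprint = ?", (fp,)
        ).fetchone()
        if row:
            self.db.execute(
                "UPDATE incidents SET seen_count = seen_count + 1 WHERE fingerprint = ?", (fp,)
            )
            self.db.commit()
            return True
        self.db.execute("INSERT INTO incidents VALUES (?, ?, ?, 1)", (fp, seen_at, owner))
        self.db.commit()
        return False

    def lookup(self, fp: str) -> tuple | None:
        """Return (first_seen, owner, seen_count) for a fingerprint, or None if unknown."""
        return self.db.execute(
            "SELECT first_seen, owner, seen_count FROM incidents WHERE fingerprint = ?", (fp,)
        ).fetchone()


memory = IncidentMemory()
last_week = fingerprint("TimeoutError", "worker.py:88 in run_batch")
today = fingerprint("TimeoutError", "worker.py:91 in run_batch")  # same bug, shifted line
memory.record(last_week, "2026-05-12", "inference-team")
is_duplicate = memory.record(today, "2026-05-19", "inference-team")  # True
```

Masking digits makes the fingerprint survive small line-number shifts between deployments, at the cost of occasionally merging two distinct errors raised from the same function.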
Integrations also play a central role. Slack for receiving reports, Datadog and Sentry for metrics and stack traces, PagerDuty for escalations, Raindrop for alerts. Devin doesn't replace these tools — it connects to them and orchestrates them.
This architecture of a connected agent with persistent memory linked to multiple tools is exactly what we describe in our article on what OpenClaw is, another example of this new generation of agents.
Comparison with trending PostMortem workflows
Auto-Triage is part of a broader movement: AI-driven incident management automation. But it stands out from approaches that have trended recently, notably PostMortem workflows.
An article published on tianpan.co on May 14, 2026 proposes a rigorous framework for writing AI quality regression postmortems. The idea is excellent: when an AI model degrades without a visible crash (less relevant responses, latency that subtly increases), you need a root-cause vocabulary, severity scales, and a follow-up cadence.
But this is after-the-fact analysis, while Auto-Triage acts in real time. The two are complementary, not competing. Devin handles the incident as it happens. The postmortem framework helps explain why the incident happened and how to prevent it from recurring.
On the other hand, Bundle offers an "Incident Copilot" that provides AI-assisted context during incidents. The approach is different: the human remains in control, the AI assists them. It's the classic "copilot" model, applied to incidents.
Devin Auto-Triage goes further by removing the human from the initial loop. The agent acts alone, then presents its conclusions (and potentially its PR) for validation. It's an "agent" model rather than a "copilot" one, and the difference is significant in terms of response time.
Parallelism: one agent per ticket
An often underestimated aspect of Cognition's approach is parallelism. In the blog post about automating failure triages, Cognition explains that Devin works in the cloud and that multiple agents can triage tickets in parallel — one agent per ticket.
This is a major difference from a human engineer who handles incidents sequentially. If five bugs are reported at the same time, Devin launches five simultaneous investigations. Each instance has access to the codebase, logs, and context. None of them block the others.
Hilsil, mentioned in that blog post, has deployed this approach in its workflows. The result is a triage capacity that automatically scales to the volume of incidents, without requiring manual scaling of the on-call team.
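The one-agent-per-ticket idea can be sketched with Python's standard thread pool. This is only an analogy running on a single machine, not Cognition's cloud orchestration: `investigate` is a hypothetical placeholder for the real work of reading code and logs, and the ticket fields are invented for the example.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def investigate(ticket: dict) -> dict:
    """Placeholder investigation: a real agent would search the codebase and logs here."""
    suspected = "database" if "timeout" in ticket["text"].lower() else "unknown"
    return {"id": ticket["id"], "suspected_area": suspected}


def triage_in_parallel(tickets: list[dict]) -> list[dict]:
    """Run one investigation per ticket concurrently; results come back in input order."""
    if not tickets:
        return []
    # One worker per ticket, so no investigation waits behind another.
    with ThreadPoolExecutor(max_workers=len(tickets)) as pool:
        return list(pool.map(investigate, tickets))


tickets = [
    {"id": 1, "text": "Timeout when loading dashboard"},
    {"id": 2, "text": "Avatar upload fails"},
    {"id": 3, "text": "DB timeout in nightly job"},
]
results = triage_in_parallel(tickets)
```

`pool.map` preserves input order, which keeps each result aligned with its ticket even when investigations finish at different times.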
For teams that want to understand which LLM models enable this type of agentic parallelism, our guide to the best LLMs for AI agents details the benchmarks and optimal configurations.
Safeguards: security and controls
An agent that acts without human permission, that opens PRs, that accesses the codebase and observability tools — the question of security is inevitable.
Cognition has integrated safeguards into Auto-Triage, but the precise details of sandboxing and permissions are crucial. Our guide on how to secure your AI agent details the essential controls: defined scope of action, mandatory human validation for high-risk actions, a complete audit trail.
In the case of Auto-Triage, the PR opened by Devin is not merged automatically. It requires a human review. This is a minimal but important safeguard: the agent proposes, the human disposes. The time saving is not in eliminating the review, but in eliminating the initial investigation — often the most time-consuming part of triage.
Persistent memory also raises questions. What exactly does Devin store? For how long? Who has access to it? These questions should be put to Cognition before any enterprise deployment, especially in regulated contexts.
Choosing the underlying model
Auto-Triage relies on Devin's cloud infrastructure, which uses the most performant models on the market for agentic tasks. In June 2025, the agentic LLM rankings placed OpenAI's GPT-5.5 at the top with a score of 98.2, followed by Google's Gemini 3 Pro Deep Think at 95.4 and Anthropic's Claude Opus 4.7 at 94.3.
Cognition has not publicly detailed which model specifically powers Auto-Triage. However, the investigation quality reported by Modal suggests the use of a top-tier model, likely in the top 5 of this ranking. Triage tasks require reasoning, an understanding of complex code, and the ability to cross-reference heterogeneous information sources — areas where the most performant models excel.
For teams that prefer to keep full control by running their agents locally, the approach with Ollama offers an alternative, but with significant trade-offs in cloud integrations and persistent memory.
What it changes for engineering teams
The concrete impact can be measured across several dimensions.
Incident response time. An incident that would have waited 2 hours (at night, on the weekend) is investigated in seconds. Even if the final fix requires a human, the diagnosis is already done when the engineer logs in.
On-call cognitive load. The on-call engineer no longer receives dozens of raw alerts. They receive already sorted, deduplicated investigations with suggested routing. This makes a considerable difference in fatigue and quality of work life.
Triage quality. An agent that has access to the entire incident history and the complete codebase should make fewer routing errors than a human discovering the context for the first time. Deduplication is also more systematic: a human might miss a duplicate that an agent with the full incident history is less likely to overlook.
Scaling. As mentioned, parallelism makes it possible to absorb incident spikes without mobilizing the entire team. A single Devin can handle what would have required three engineers in parallel.
For teams that want to take the leap and build their first autonomous agent, our tutorial to create your first autonomous AI agent offers a practical starting point.
Current limitations
Despite the leap forward, Auto-Triage has limitations that Cognition does not hide.
Dependence on available context. If your logs are incomplete, if your codebase is poorly documented, if your observability tools are not connected, Devin's investigation will be proportionally worse. It's the classic garbage in, garbage out problem, applied to agentic workflows.
Complex cases requiring human judgment. An incident that simultaneously affects infrastructure, business logic, and user data often requires decisions that only a human can make (business prioritization, customer communication, rollback vs fix forward decision). Devin can investigate, but judgment remains human.
Cost. A persistent agent running 24/7 and consuming high-end model tokens is not free. The ROI depends directly on the volume of incidents and the human cost they represent. For a small team with few incidents, the equation may not be positive.
Organizational adoption friction. Giving an AI agent the ability to open PRs without prior validation requires a cultural change. Some teams will naturally accept it, others will need trust-building periods.
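The cost limitation above comes down to simple break-even arithmetic, sketched below. Every figure is a hypothetical placeholder, since pricing is not public: replace the minutes saved, the hourly cost, and the monthly agent cost with your own numbers.

```python
import math


def monthly_net_saving(
    incidents_per_month: int,
    minutes_saved_per_incident: float,
    engineer_hourly_cost: float,
    agent_monthly_cost: float,
) -> float:
    """Engineer time saved, converted to money, minus what the agent costs per month."""
    hours_saved = incidents_per_month * minutes_saved_per_incident / 60
    return hours_saved * engineer_hourly_cost - agent_monthly_cost


def break_even_incidents(
    minutes_saved_per_incident: float,
    engineer_hourly_cost: float,
    agent_monthly_cost: float,
) -> int:
    """Smallest monthly incident count at which the agent pays for itself."""
    saving_per_incident = minutes_saved_per_incident / 60 * engineer_hourly_cost
    if saving_per_incident <= 0:
        raise ValueError("each incident must save a positive amount")
    return math.ceil(agent_monthly_cost / saving_per_incident)


# Hypothetical inputs: 30 minutes saved per incident, 100 per engineer hour,
# 2000 per month for the agent.
threshold = break_even_incidents(30, 100, 2000)   # 40 incidents per month
busy_team = monthly_net_saving(100, 30, 100, 2000)  # 3000.0
small_team = monthly_net_saving(10, 30, 100, 2000)  # -1500.0
```

With these placeholder figures, a team handling 10 incidents a month loses money while one handling 100 comes out ahead, which is the small-team caveat expressed in numbers.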
❌ Common mistakes
Mistake 1: Deploying Auto-Triage without cleaning up your alert channels
If your Slack bugs channel is a mix of real incidents, informal discussions, and off-topic notifications, Devin will sort through everything — including the noise. The quality of the triage depends directly on the cleanliness of the input channel. The solution: create a channel strictly dedicated to bug reports and incidents, with clear posting rules.
Mistake 2: Assuming Devin replaces the on-call engineer
Auto-Triage is a first-responder, not a replacement. It investigates and suggests. Humans are still necessary for validation, making decisions on complex fixes, and communicating with stakeholders. Deploying it as a complete substitute is a risk.
Mistake 3: Ignoring the audit trail
When an agent automatically opens a PR, you must be able to trace every step of its reasoning. Ignoring the audit trail means depriving yourself of the ability to understand why Devin made a certain decision — and to correct it if necessary. Configure your logs from the start.
Mistake 4: Failing to define a clear scope of action
Without a defined scope, Devin could theoretically investigate parts of the codebase that shouldn't be touched automatically (critical systems, sensitive data). Clearly define which repos, which environments, and which types of actions are within Auto-Triage's scope.
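Mistakes 3 and 4 can be addressed together with a small guard that checks every proposed action against an explicit scope and records the decision. The sketch below is a generic pattern, not a Devin configuration format: `Scope`, `Guard`, the repo names, and the action names are all hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scope:
    """Explicit allowlists: anything not named here is out of scope."""
    repos: frozenset
    environments: frozenset
    actions: frozenset


@dataclass
class Guard:
    scope: Scope
    audit_log: list = field(default_factory=list)

    def allow(self, repo: str, environment: str, action: str, reason: str) -> bool:
        """Check a proposed action against the scope and log the decision either way."""
        allowed = (
            repo in self.scope.repos
            and environment in self.scope.environments
            and action in self.scope.actions
        )
        # Every decision is recorded, including refusals, so it can be reviewed later.
        self.audit_log.append(
            json.dumps(
                {
                    "repo": repo,
                    "environment": environment,
                    "action": action,
                    "reason": reason,
                    "allowed": allowed,
                },
                sort_keys=True,
            )
        )
        return allowed


guard = Guard(
    Scope(
        repos=frozenset({"web-app"}),
        environments=frozenset({"staging"}),
        actions=frozenset({"investigate", "comment"}),
    )
)
guard.allow("web-app", "staging", "investigate", "new Sentry alert")        # True
guard.allow("billing-core", "production", "open_pr", "suspected null ref")  # False
```

In this sketch, leaving `open_pr` out of `actions` gives an investigate-and-report-only setup, and adding it later is a one-line change once the team trusts the results.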
❓ Frequently Asked Questions
Does Devin Auto-Triage replace PagerDuty?
No. Auto-Triage integrates with PagerDuty, it does not replace it. PagerDuty handles escalation and human notification. Devin handles preliminary investigation and deduplication. The two are complementary.
How much does Devin Auto-Triage cost?
Pricing is not public as of May 2026. Auto-Triage is part of the Devin Automations plan, sold on a quote basis. Contact Cognition directly for a price tailored to your incident volume (check on devin.ai).
Can Devin's actions be limited to read-only?
Yes. The "Triage bug reports on Slack" template can be configured so that the agent investigates and reports without opening a PR. This is a recommended configuration for the first few weeks of deployment, while building trust.
Does Auto-Triage work with private codebases?
Yes, Devin already has access to your codebase as part of its normal operation. Auto-Triage uses this existing access for investigation. No additional configuration is needed on this side.
What is the delay between an incident and Devin's investigation?
According to feedback from Modal and the official documentation, the investigation starts immediately after the incident is detected in the monitored channel. The delay is on the order of a few seconds, not minutes.
✅ Conclusion
Devin Auto-Triage marks the moment when AI coding agents go from tools you consult to systems that act on their own initiative. It's a subtle but profound paradigm shift — from on-demand mode to always-on mode. If your team spends more than two hours a day on incident triage, now is the right time to test an autonomous AI agent in this specific workflow.