OpenAI Codex Computer Use arrives on Windows: the AI agent controls your PC remotely
🔎 An AI agent that takes control of your Windows PC is now a reality
OpenAI has just crossed a threshold that will reshuffle the deck for desktop automation. On May 29, 2026, the v26.527 update of Codex brought Computer Use to Windows 11. Until now, the feature was confined to macOS. The agent can now see your screen, click, navigate, and type in any Windows application, completely autonomously.
Why now? Because Windows still represents over 75% of the desktop market according to StatCounter data (2025). Ignoring this platform meant missing out on the vast majority of professional users. OpenAI is making up for this delay with calculated timing: the competition (Claude Code from Anthropic, Grok Build from xAI) is gaining momentum, and the need for agents capable of acting directly on the operating system is becoming the new frontier of agentic AI.
This isn't a simple assistant that suggests commands. Codex acts. It opens VS Code, runs tests, fixes bugs, validates results — without you having to touch the keyboard.
The essentials
- Codex Computer Use is now available on Windows 11 (v26.527, May 29, 2026), after an initial launch on macOS.
- The agent sees the screen, clicks, navigates, and types in any Windows application autonomously.
- Remote control via the ChatGPT mobile app allows you to launch and supervise tasks from your smartphone.
- This deployment covers more than 75% of the desktop market, a major strategic stake in the competition with Claude Code and Grok Build.
- The security implications are considerable: an AI that controls your PC implies a new attack surface.
Recommended tools
| Tool | Main usage | Price (June 2025, check on openai.com) | Ideal for |
|---|---|---|---|
| OpenAI Codex | Coding agent with Computer Use | Included in ChatGPT Pro/Team plans | Developers who want an autonomous agent on their PC |
| Hostinger | Hosting to deploy generated projects | Starting from 2.99€/month | Developers and freelancers who deploy quickly |
| ChatGPT Mobile (iOS/Android) | Remote control of Codex tasks | Included in the ChatGPT subscription | Coding from your phone while the agent works |
What Computer Use actually does on Windows
Computer Use is not about generating code in an editor. It's an agent that interacts with your operating system just like a human would. It takes screenshots, analyzes the interface, identifies clickable elements, and executes actions.
Concretely on Windows 11, Codex can open the PowerShell terminal, execute commands, launch Visual Studio, navigate the file explorer, open a browser to test a web app, and verify the results visually. All of this without human intervention between each step.
According to The Decoder, the agent is capable of hunting down bugs and testing applications autonomously. It doesn't stop at the first error: it reads the error message, identifies the source, modifies the code, reruns, and iterates until resolved.
This is a fundamental difference from classic Copilot, which simply completes text in an editor. Here, we are in pure agentic territory: an objective is given, the agent breaks down the steps, executes them, and adapts in real time.
How it works technically
The flow is as follows: you describe a task in natural language. Codex breaks it down into subtasks. For each subtask, it captures the current state of the screen, decides on the action (click, typing, navigation), executes it, and observes the result. The loop repeats until completion.
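The loop described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not OpenAI's implementation: `FakeDesktop`, `decide`, and `run_task` are hypothetical names, and the "screen" is a plain string rather than a screenshot.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    kind: str    # "click" or "type" in this toy model
    target: str  # hypothetical label of the UI element or text to type


@dataclass
class FakeDesktop:
    """Stand-in for a real screen: a text state plus a history of actions."""
    state: str = "editor closed"
    history: List[Action] = field(default_factory=list)

    def capture(self) -> str:
        return self.state

    def execute(self, action: Action) -> None:
        self.history.append(action)
        if action.kind == "click" and action.target == "VS Code":
            self.state = "editor open"
        elif action.kind == "type" and action.target == "pytest":
            self.state = "tests passed"


def decide(subtask: str, screen: str) -> Optional[Action]:
    """Toy policy: return the next action, or None once the subtask is done."""
    if subtask == "open editor":
        return Action("click", "VS Code") if screen == "editor closed" else None
    if subtask == "run tests":
        return Action("type", "pytest") if screen != "tests passed" else None
    return None


def run_task(subtasks: List[str], desktop: FakeDesktop, max_steps: int = 10) -> int:
    """Observe, decide, act, and repeat until every subtask reports completion."""
    steps = 0
    for subtask in subtasks:
        while steps < max_steps:
            screen = desktop.capture()        # observe the current state
            action = decide(subtask, screen)  # choose the next action
            if action is None:                # subtask finished
                break
            desktop.execute(action)           # act, then loop to observe again
            steps += 1
    return steps
```

Calling `run_task(["open editor", "run tests"], FakeDesktop())` walks through both subtasks in two actions. The `max_steps` cap is a design choice worth keeping in any real loop, so that a confused agent cannot iterate forever.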
The GPT-5.3 Codex (agentic score of 80) and GPT-5.4 Pro (91.8) models are the engines behind this capability. The first is optimized for pure code, the second brings the advanced reasoning necessary to navigate complex interfaces.
From macOS to Windows: why this delay, why now
Computer Use was first deployed on macOS, a more homogeneous platform that is easier to target for a visual agent. macOS apps share more uniform interface conventions, which simplifies the agent's learning process.
Windows is another world. The ecosystem is heterogeneous: legacy Win32 applications, UWP apps, modern applications with Fluent Design, and third-party software with chaotic interfaces. An agent must be able to find its way through all of them.
NewsBytes reports that OpenAI had to adapt its visual recognition system to handle this diversity. Windows interface elements do not behave like those on macOS: menus, modal windows, and system notifications have distinct visual patterns.
The timing is not coincidental. Anthropic is pushing Claude Code (based on Claude Opus 4.7, score 94.3) as the reference agent for developers. xAI has launched Grok Build with Grok 4.1 (score 79) as an alternative. OpenAI could not afford to leave Windows without an agent when it is the primary playground for the majority of developers.
Controlling Codex from your phone: the mobile stroke of genius
This is perhaps the most underestimated detail of this update. Integration with the ChatGPT mobile app allows you to control Codex tasks remotely from your smartphone. You give it an objective, you walk away from your PC, and the agent keeps working.
You can check the progress from your phone, validate steps if necessary, or simply let the agent finish. It's a form of asynchronous computing applied to software development.
This feature opens up concrete scenarios: launching a test suite before heading home, asking the agent to fix a critical bug you spotted while checking your phone on the subway, or triggering a deployment from a café terrace.
To understand the full scope of this mechanism, our dedicated article on Codex in ChatGPT Mobile details how the mobile-agent pipeline works in practice, from sending the request to execution on the remote machine.
Codex vs Claude Code vs Grok Build: the war of coding agents
The market for coding agents with system control is exploding. Here is how Codex Computer Use positions itself against its direct competitors.
| Agent | Engine Model | Agentic Score | Computer Use Windows | Computer Use macOS | Native CLI |
|---|---|---|---|---|---|
| OpenAI Codex | GPT-5.4 Pro | 91.8 | Yes (v26.527) | Yes | Yes |
| Claude Code | Claude Opus 4.7 | 94.3 | Partial | Yes | Yes |
| Grok Build | Grok 4.1 | 79 | Not announced | No | Yes |
Anthropic's Claude Code remains the leader in pure reasoning score (94.3 for Claude Opus 4.7 compared to 91.8 for GPT-5.4 Pro). But Codex has a tangible advantage: the native availability of Computer Use on the two main desktop OSes, coupled with mobile control.
Grok Build, launched by xAI, positions itself as the open-source-friendly alternative but has not yet deployed Computer Use. Its agentic score of 79 with Grok 4.1 places it behind its two direct competitors on complex reasoning tasks.
A crucial point: these agents are not just autocomplete. The study Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs (2021) already demonstrated that Codex could repair programs automatically. Five years later, this capability extends from the editor to the entire operating system.
The evaluation of OpenAI Codex for HPC kernel generation, documented in a 2023 study, also showed the limitations of the time on parallel programming models. The current agentic scores of GPT-5.4 Pro (91.8) point to the progress made in three years.
Security: when an AI takes control of your PC
It's the elephant in the room. An agent that sees your screen, clicks everywhere, types commands, accesses your files — it looks suspiciously like a Trojan horse, except that you are the one installing it voluntarily.
The attack surface is real. If a malicious prompt manages to hijack Codex, the agent has the same privileges as your user session. It can read sensitive files, modify system configurations, execute destructive commands.
OpenAI has integrated safeguards: a validation system before sensitive actions (file deletion, admin commands), complete logs of every action, and the ability to kill the process at any time from mobile or desktop.
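To make these three safeguards concrete, here is a minimal sketch of how an approval gate, an action log, and a stop switch can fit together. Everything in it is an assumption for illustration: `GuardedExecutor`, the `SENSITIVE_KINDS` categories, and the log format are hypothetical and do not describe how Codex is actually built.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed categories of actions that require explicit user approval.
SENSITIVE_KINDS = {"delete_file", "admin_command"}


@dataclass
class GuardedExecutor:
    approve: Callable[[str, str], bool]  # callback asking the user to confirm
    log: List[str] = field(default_factory=list)
    stopped: bool = False

    def stop(self) -> None:
        """Kill switch: every later action is refused."""
        self.stopped = True
        self.log.append("STOP requested")

    def run(self, kind: str, detail: str) -> bool:
        """Log every request and return True only if the action may proceed."""
        if self.stopped:
            self.log.append(f"SKIPPED {kind}: {detail}")
            return False
        if kind in SENSITIVE_KINDS and not self.approve(kind, detail):
            self.log.append(f"DENIED {kind}: {detail}")
            return False
        self.log.append(f"EXECUTED {kind}: {detail}")
        return True
```

With an `approve` callback that always refuses, a plain click goes through while a file deletion is denied, and both outcomes end up in the log. Logging refusals as well as executions is deliberate: the log is only useful for review if it also shows what the agent tried to do.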
But as noted by Kingy.ai, the question is not so much whether the current safeguards are sufficient, but how the ecosystem will evolve when millions of users trust an agent to act on their machine.
The OpenAI o1 system card already documented concerns related to the uncontrolled behaviors of reasoning models. With Computer Use, these concerns move from theoretical to practical: the agent is no longer in a sandbox, it is in your actual work environment.
Security best practices
Never give Codex unnecessary administrator access. Use a limited user account for automated tasks. Check the action logs after each session. And above all: do not let an agent run unsupervised on a machine containing unencrypted sensitive data.
Desktop automation: beyond code
Computer Use on Windows is not limited to software development. It is a visual automation framework that applies to any desktop workflow.
An accountant can ask Codex to open Excel, import a CSV, apply consolidation formulas, and generate a PDF report. A designer can ask the agent to open Figma, export assets, and organize them in a structured folder. A technical support agent can have Codex open a ticketing tool, sort requests, and apply template responses.
The difference with classic macros or RPA (Robotic Process Automation) is fundamental. Macros are deterministic: they replay a recorded sequence. If the interface changes by a single pixel, they break. Codex adapts visually: it understands what it sees, not just where to click.
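The contrast can be shown with a deliberately simplified model. In the sketch below, a layout is just a dictionary mapping element labels to positions, which is an assumption made for illustration: a real visual agent works from pixels, not from a ready-made dictionary. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[int, int]
Layout = Dict[str, Point]  # element label -> position on screen


def replay_macro(recorded: Point, layout: Layout) -> Optional[str]:
    """Deterministic replay: click the recorded coordinates, whatever is there."""
    for label, position in layout.items():
        if position == recorded:
            return label
    return None  # nothing at the recorded position: the macro breaks


def click_by_label(label: str, layout: Layout) -> Optional[Point]:
    """Adaptive lookup: find the element wherever it currently sits."""
    return layout.get(label)


old_layout: Layout = {"Export": (100, 40), "Save": (160, 40)}
new_layout: Layout = {"Export": (100, 72), "Save": (160, 72)}  # toolbar moved down
recorded_click = old_layout["Export"]
```

Against `old_layout`, replaying `recorded_click` still hits "Export". Against `new_layout`, the same coordinates hit nothing, while `click_by_label("Export", new_layout)` finds the button at its new position.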
This is the same principle of dexterous manipulation that researchers are exploring in robotics. The study Learning Dexterous In-Hand Manipulation showed that a system can learn to manipulate physical objects with dexterity. Computer Use transposes this paradigm to the manipulation of digital interfaces.
Creating a custom AI agent with Codex: the possibilities
For developers who want to go beyond standard usage, Codex Computer Use opens the way to creating specialized AI agents. The concept is simple: you define a scope of action, a set of authorized tasks, and the agent operates within this framework on your Windows machine.
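As an illustration of what such a framework could look like, here is a minimal sketch of a scope check. The `AgentScope` class, its fields, and the example paths are all hypothetical; nothing here reflects a configuration format published by OpenAI.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath
from typing import FrozenSet


@dataclass(frozen=True)
class AgentScope:
    allowed_apps: FrozenSet[str]
    allowed_tasks: FrozenSet[str]
    workspace: PureWindowsPath  # the only folder tree the agent may touch

    def permits(self, app: str, task: str, path: str) -> bool:
        """Allow a request only if app, task, and path are all in scope."""
        target = PureWindowsPath(path)
        # Pure paths are compared lexically: ".." segments are not resolved,
        # so a real check would normalize the path first.
        inside = target == self.workspace or self.workspace in target.parents
        return app in self.allowed_apps and task in self.allowed_tasks and inside


scope = AgentScope(
    allowed_apps=frozenset({"powershell", "code"}),
    allowed_tasks=frozenset({"run_tests", "edit_file"}),
    workspace=PureWindowsPath("C:/dev/shop"),
)
```

With this scope, editing `C:/dev/shop/src/app.py` from the editor is permitted, while the same edit on a file under `C:/Users` is refused, as is any request from an application or task outside the lists.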
If you want to understand the fundamentals of this approach, our guide on how to create an AI agent covers the design principles: objective definition, observation-action loop, error handling, and recall mechanisms.
The key lies in memory. An agent without memory repeats the same mistakes, doesn't remember your preferences, and doesn't capitalize on previous sessions. This is a critical point for Computer Use: if Codex forgets that you prefer PowerShell over CMD, or that your project uses pnpm instead of npm, each session starts from scratch.
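The simplest form of persistent memory is a preferences file that survives between sessions. The sketch below assumes a JSON file and a hypothetical `PreferenceMemory` class; it is one possible approach, not a description of how Codex stores anything.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


class PreferenceMemory:
    """Key-value preferences persisted to a JSON file between sessions."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Reload whatever a previous session saved, if anything.
        if path.exists():
            self.data = json.loads(path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    def remember(self, key: str, value: Any) -> None:
        self.data[key] = value
        self.path.write_text(json.dumps(self.data, indent=2), encoding="utf-8")

    def recall(self, key: str, default: Any = None) -> Any:
        return self.data.get(key, default)


def demo_two_sessions() -> str:
    """Simulate two sessions sharing one memory file in a temporary folder."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "agent_prefs.json"
        first = PreferenceMemory(store)
        first.remember("shell", "PowerShell")
        first.remember("package_manager", "pnpm")
        second = PreferenceMemory(store)  # a new session starts here
        return second.recall("package_manager", "npm")
```

In `demo_two_sessions`, the second session recalls `pnpm` without being told again, instead of falling back to the `npm` default.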
Our article on AI memory details the architectures that allow an agent to build persistent memory — essential when the agent interacts with your actual work environment and not a sandbox.
Extensibility: plugins and integrations
A standalone desktop agent becomes truly powerful when specific capabilities can be grafted onto it. This is where the plugin ecosystem comes into play.
The principle is analogous to what the Hermes Agent ecosystem does with its extensions. As we detailed for Hermes Agent plugins, a base agent is a general-purpose engine. Plugins add business skills: connection to specific APIs, integration with internal tools, particular communication protocols.
With Codex on Windows, developers can create plugins that specialize the agent for their tech stack, internal tools, or business processes. A plugin could, for example, allow Codex to interact directly with Jenkins to trigger CI/CD pipelines, or with Docker Desktop to manage containers.
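A plugin mechanism of this kind usually comes down to a registry that maps capability names to handlers. The sketch below is purely illustrative: `PluginRegistry`, the `ci.trigger` name, and the handler are hypothetical, and the handler only describes the call it would make instead of contacting a real CI server.

```python
from typing import Callable, Dict

PluginHandler = Callable[[str], str]


class PluginRegistry:
    """Maps capability names to the functions that implement them."""

    def __init__(self) -> None:
        self._handlers: Dict[str, PluginHandler] = {}

    def register(self, name: str) -> Callable[[PluginHandler], PluginHandler]:
        def decorator(handler: PluginHandler) -> PluginHandler:
            if name in self._handlers:
                raise ValueError(f"plugin {name!r} is already registered")
            self._handlers[name] = handler
            return handler
        return decorator

    def dispatch(self, name: str, argument: str) -> str:
        if name not in self._handlers:
            raise KeyError(f"no plugin registered for {name!r}")
        return self._handlers[name](argument)


registry = PluginRegistry()


@registry.register("ci.trigger")
def trigger_pipeline(job: str) -> str:
    # A real plugin would call the CI server here; this one only reports intent.
    return f"would trigger CI job '{job}'"
```

The agent core only needs to know capability names. `registry.dispatch("ci.trigger", "nightly-build")` routes the request to the handler, and an unknown name fails loudly instead of being silently ignored.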
The example of TagLib-Wasm, a TypeScript library for music tagging compiled to WebAssembly, illustrates the trend: specialized and high-performance tools that can be exposed as agent capabilities. A Codex plugin could use TagLib-Wasm to automate the tagging of audio files directly from the Windows Explorer.
What this changes for the developer on a daily basis
The concrete impact is immediate for certain categories of tasks. Regression testing, for example. Instead of writing fragile Selenium scripts, you ask Codex: "Launch the app, navigate to the checkout page, fill out the form with this data, verify that the confirmation message appears." The agent does this visually, like a human.
Debugging also changes in nature. You no longer search for the error in the logs. You tell Codex: "This function crashes when passing an empty array." The agent opens the code, identifies the problem, applies the fix, runs the tests to verify, and notifies you when it's resolved.
Refactoring legacy code, often a nightmare, becomes manageable. Codex can navigate a codebase it doesn't know, understand the structure, identify the patterns to modernize, and apply the changes gradually.
The study Vision Based Game Development Using Human Computer Interaction (2010) was already exploring the idea of vision-based interfaces for development. Sixteen years later, OpenAI brings this vision to life with an agent that literally reads the screen to interact with any software.
❌ Common mistakes
Mistake 1: Letting Codex run unsupervised on a production machine
This is the most dangerous mistake. Computer Use has the rights of your session. On a production machine, an agent that modifies a config file or kills a process by mistake can cause a major incident. The solution: use Codex in an isolated development or staging environment, never live on a production server.
Mistake 2: Giving vague objectives to the agent
"Improve the code" or "Make the app work" are not actionable objectives. Codex will interpret these instructions in an unpredictable and potentially destructive way. The solution: be ultra-specific. "In the file src/auth/login.ts, the validateToken function does not handle expired tokens. Add an expiration check and return a 401 error with the appropriate message."
Mistake 3: Ignoring action logs
Every click, every keystroke, every command executed by Codex is logged. Not checking these logs is like not reading a diff before merging. The agent can take shortcuts or make choices that you would not have manually validated. The solution: take 2 minutes after each session to scan the logs.
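Those two minutes go faster with a small filter that surfaces the entries worth reading first. The sketch below assumes a log exported as one JSON object per line with `kind` and `detail` fields; that format, the `flag_risky` function, and the list of risky commands are assumptions, since the article does not specify how Codex stores its logs.

```python
import json
from typing import Dict, List

# Assumed shortlist of commands that deserve a second look.
RISKY_COMMANDS = {"rm", "del", "remove-item", "format", "shutdown"}


def flag_risky(log_lines: List[str]) -> List[Dict[str, str]]:
    """Return the command entries whose first word is on the risky list."""
    flagged = []
    for line in log_lines:
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        words = entry.get("detail", "").lower().split()
        if entry.get("kind") == "command" and words and words[0] in RISKY_COMMANDS:
            flagged.append(entry)
    return flagged


sample_log = [
    '{"kind": "click", "detail": "Run Tests button"}',
    '{"kind": "command", "detail": "pnpm test"}',
    '{"kind": "command", "detail": "Remove-Item -Recurse dist"}',
]
```

On `sample_log`, only the `Remove-Item` entry is flagged. Matching on the first word is crude and will miss risky commands hidden inside scripts or pipelines, so a filter like this complements a manual scan without replacing it.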
Mistake 4: Confusing Computer Use with a simple assistant
Computer Use is not Copilot in VS Code. It is an agent with real execution rights on your system. Treating it as a mere suggestion tool radically underestimates what it can do — and the damage it can cause if it goes off the rails.
❓ Frequently Asked Questions
Does Codex Computer Use work on Windows 10?
No. OpenAI is exclusively targeting Windows 11 for this first version. The screenshot and accessibility APIs used by Computer Use rely on features specific to Windows 11. No announcement regarding Windows 10 has been made.
Which GPT model drives Computer Use on Windows?
OpenAI primarily uses GPT-5.4 Pro (agentic score 91.8) for Computer Use tasks requiring complex reasoning, and GPT-5.3 Codex (score 80) for more direct code tasks. The switch between the two happens automatically based on the complexity of the task.
Can Codex access my saved passwords?
Technically yes, if they are visible on screen or in an open password manager. This is why it is crucial not to launch Computer Use when sensitive information is displayed. Use a password manager that does not automatically unlock on screen.
How to stop Codex if it goes off the rails?
From the desktop: an emergency stop button is available in the Codex notification bar. From mobile: swipe down in the ChatGPT conversation to access the stop button. The agent stops at the end of the current action, not in the middle of typing.
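Stopping at the end of the current action is the behavior you get from cooperative cancellation, where the stop flag is checked only between actions. The sketch below illustrates that general pattern with Python's `threading.Event`; `run_actions` and `demo` are hypothetical names, and this is not a description of Codex internals.

```python
import threading
from typing import Callable, List


def run_actions(
    actions: List[Callable[[], str]], stop_requested: threading.Event
) -> List[str]:
    """Run actions in order, checking the stop flag only between actions."""
    completed = []
    for action in actions:
        if stop_requested.is_set():
            break                    # stop before starting the next action
        completed.append(action())   # an action in progress runs to completion
    return completed


def demo() -> List[str]:
    stop = threading.Event()

    def type_command() -> str:
        stop.set()  # simulate the user pressing stop in the middle of typing
        return "typed 'pnpm test'"

    def press_enter() -> str:
        return "pressed Enter"

    return run_actions([type_command, press_enter], stop)
```

In `demo`, the stop request arrives while the command is being typed: the typing finishes, but Enter is never pressed. The trade-off is that the interface is never left in a half-completed state, at the cost of a short delay before the agent actually halts.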
Does Codex replace Claude Code?
Not entirely. Claude Code remains superior on pure reasoning (Claude Opus 4.7 at 94.3 vs GPT-5.4 Pro at 91.8) and some developers prefer it for complex code analysis. But Codex has the advantage of native Computer Use on Windows combined with mobile control, making it the logical choice for automating complete desktop workflows.
✅ Conclusion
OpenAI didn't just add a feature to Codex. It turned every Windows 11 PC into an execution environment for AI agents, and piloting it from a smartphone changes the game for asynchronous work. If you develop on Windows and don't try Computer Use this week, you are losing time compared to those who do. Start with our guide to creating an AI agent to understand the basics before letting Codex take control of your machine.