GitHub Copilot switches to token billing: end of the subscription, start of pay-as-you-go billing — your bill is going to explode

AI Tools 🟢 Beginner ⏱️ 15 min read 📅 2026-06-01

🔎 On June 1, 2026, developers discovered their new bill

On April 27, 2026, GitHub published a relatively quiet official announcement: all Copilot plans would switch to a per-token billing system called "AI Credits" on June 1. On May 30, a developer posted on Reddit that their monthly bill was going from $29 to $750. On June 1, the system was in production. The phrase "What a joke" started trending on X, and Hacker News threads exceeded 2000 comments.

This shift is not a simple pricing adjustment. It is the end of a model — the flat subscription for AI coding — and the signal that Microsoft has decided to aggressively monetize agentic usage. The problem? Developers weren't ready. Companies weren't either.

The key points

All GitHub Copilot plans switched to per-token billing (AI Credits) on June 1, 2026. The old "premium requests" no longer exist.
Developers are reporting bills multiplied by 25, going from $29/month to $750/month, primarily due to agentic workflows.
The $0/month Copilot Pro plan includes no credits. The $9/month Business plan includes a limited balance, with a temporary promo at $30 for June-August.
Alternatives such as Cursor (flat subscription) and Claude Code (direct API billing with limits you set yourself) are attracting large numbers of departing developers.
This move confirms a market trend: publishers want to capture the value of agentic usage, even if it means losing users.

Recommended tools

Tool	Main usage	Price (June 2026, check the vendor's site)	Ideal for
Cursor	AI IDE with multiple models	Fixed subscription (~$20/month)	Individual devs wanting predictable costs
Claude Code	Terminal coding agent	Anthropic API billing	Advanced devs who control their token usage
Continue.dev	Open-source IDE extension	Free (you pay for the API)	Teams wanting to control every token
Cline	Autonomous agent in the IDE	Free (you pay for the API)	Custom agentic workflows

What really changed on June 1, 2026

The change is radical: GitHub replaced the "premium requests" system (a fixed number of requests per month) with "AI Credits" billed per token. Every interaction with a model — whether it's an inline code completion, a chat in the sidebar, or an agent workflow in Copilot Workspace — consumes credits proportionally to the number of input and output tokens.
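As a minimal sketch of how per-token billing adds up, the function below prices a single interaction from its input and output token counts. The `ModelRate` type and the values in `example_rate` are placeholders chosen for illustration; they are not prices published by GitHub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """Hypothetical per-token rates, in dollars per million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(rate: ModelRate, input_tokens: int, output_tokens: int) -> float:
    """Cost of one interaction: input and output tokens are priced separately."""
    return (input_tokens * rate.input_per_million
            + output_tokens * rate.output_per_million) / 1_000_000


# Placeholder rate for illustration only, NOT a published GitHub price.
example_rate = ModelRate(input_per_million=2.0, output_per_million=10.0)

# One sidebar chat turn: 3,000 tokens of context in, 500 tokens of answer out.
chat_cost = request_cost(example_rate, input_tokens=3_000, output_tokens=500)
print(f"One chat turn: ${chat_cost:.4f}")
```

The point of the sketch is that cost scales with both sides of the exchange, so a long context is billed even when the answer is short.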

According to the official GitHub documentation, each model has its own rate. Claude Sonnet 4.6, GPT-5.4, and GPT-5.5 do not have the same costs per token. And this is precisely where the trap closes: the most powerful models (and therefore the most used in agentic workflows) are also the most expensive.

The Copilot Pro plan, formerly at $10/month, is now at $0/month. Zero credits included. You only pay for what you consume. The Copilot Business plan at $9/month includes a credit balance, with a summer promotion at $30/month (June-August 2026) that provides more credits. The Enterprise plan at $39/month follows the same logic with a promo at $70/month.

In practice, according to How2Shout, developers discovered on D-day that their credit balance melted away after just a few hours of normal work.

The numbers of the disaster: x25 on some bills

The most discussed case comes from Reddit, picked up by TechCrunch: a developer reports a bill going from $29/month to $750/month. This is not an isolated case.

According to the analysis by Awesome Agents, bills multiplied by 25 mainly affect users of agentic workflows. Why? Because an agent refactoring a file doesn't send a single prompt. It reads the file, analyzes the context, generates a plan, executes several steps, verifies the result, and iterates. Each step consumes tokens. A single agentic task can therefore consume the equivalent of 50 to 100 classic chat requests.

The following table, based on data from SeptimLabs and TokenCost, illustrates the monthly projections according to the usage profile:

Usage profile	Main model	Old cost (subscription)	New estimated cost (AI Credits)	Multiplier
Light dev (chat only)	Claude Sonnet 4.6	$10-19/month	$15-30/month	x1.5 to x2
Standard dev (chat + inline)	GPT-5.4	$19-29/month	$80-200/month	x4 to x7
Agentic dev (Copilot Workspace)	GPT-5.5	$29-39/month	$400-750/month	x14 to x25
10-dev team (mix)	GPT-5.4 / GPT-5.5	$190-390/month	$3,000-8,000/month	x15 to x20

These figures are estimates based on the per-token rates published by GitHub and the usage scenarios described by developers on Reddit and Hacker News. Your actual bill depends on your token volume, which varies enormously depending on the projects and models chosen.
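To make the dependence on token volume concrete, here is a rough projection sketch. The daily token volumes, the single blended rate of $4 per million tokens, and the 21 working days are all assumptions for illustration; real plans price each model, and input versus output tokens, differently.

```python
def monthly_bill(tokens_per_day: int, blended_rate_per_million: float,
                 working_days: int = 21) -> float:
    """Projected monthly spend in dollars for a given daily token volume."""
    return tokens_per_day * working_days * blended_rate_per_million / 1_000_000


# Hypothetical daily token volumes per usage profile (not measured data).
profiles = {
    "chat only": 200_000,
    "chat + inline": 1_500_000,
    "agentic": 8_000_000,
}

bills = {name: monthly_bill(tokens, blended_rate_per_million=4.0)
         for name, tokens in profiles.items()}

for name, amount in bills.items():
    print(f"{name:<14} ${amount:,.2f}/month")
```

Replacing the assumed volumes with figures from your own usage dashboard gives a first-order estimate before the bill arrives.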

Why agentic workflows are the most affected

The switch to token-based billing is not neutral depending on the type of Copilot usage. Inline code completion (the gray suggestion that appears as you type) remains relatively token-efficient. The sidebar chat consumes more. But the real financial black hole is agentic workflows.

An agentic workflow with Copilot Workspace works like this: you describe a task ("refactor this function and add tests"), and the agent orchestrates the steps itself. It reads the relevant files, generates code, executes it, analyzes errors, corrects them, and iterates. Each iteration is a complete request with the entire project context.

For a task that would take a developer 10 minutes using classic chat, the agent can consume 20 to 50 round trips with the model. With GPT-5.5, the highest-performing agentic model according to the ranking (score of 98.2 on the agentic benchmark), the cost per token is also the highest. Result: a single agentic work session can drain a credit balance that would have lasted a month with classic usage.
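The loop described above can be sketched as follows, assuming each round trip resends the starting context plus everything generated so far. The context size, step count, and output size are illustrative guesses, not measurements of Copilot Workspace.

```python
def agent_session_tokens(base_context: int, steps: int,
                         output_per_step: int) -> tuple[int, int]:
    """Total (input, output) tokens when every step resends the full context
    plus the output of all earlier steps."""
    total_in = 0
    total_out = 0
    context = base_context
    for _ in range(steps):
        total_in += context          # the whole context is sent again
        total_out += output_per_step
        context += output_per_step   # earlier output becomes part of the context
    return total_in, total_out


# Assumed numbers: 8,000 tokens of project context, 20 round trips,
# 500 tokens generated per round trip.
agent = agent_session_tokens(base_context=8_000, steps=20, output_per_step=500)

# A single classic chat turn, for comparison (3,000 in + 500 out).
chat_tokens = 3_000 + 500
ratio = sum(agent) / chat_tokens
print(f"agent session: {sum(agent):,} tokens, {ratio:.0f}x one chat turn")
```

With these assumed numbers the session uses 265,000 tokens, about 76 times the 3,500-token chat turn, because input tokens grow with every iteration rather than staying constant.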

This is exactly the scenario described by developers reporting $750 bills. It is not abuse. It is the normal use of a tool sold as agentic.

Rate breakdown by model: what you are actually paying

GitHub does not publish a single "per token" price. Each model has its own rate, and the differences are significant. Based on the official GitHub documentation and the analysis by TokenCost, here are the available models and their positioning:

High-end models (agentic):
- GPT-5.5 (OpenAI): highest agentic score (98.2) and highest cost per token. Reserve it for complex tasks.
- Claude Opus 4.7 Adaptive (Anthropic): agentic score 94.3, high cost but lower than GPT-5.5.
- Gemini 3 Pro Deep Think (Google): agentic score 95.4, good performance/cost ratio according to feedback.

Mid-range models (standard chat and code):
- GPT-5.4 Pro (OpenAI): agentic score 91.8, a good compromise.
- GPT-5.4 (OpenAI): general score 89, agentic 87.6. The workhorse for many teams.
- Claude Sonnet 4.6 (Anthropic): general score 83, agentic 81.4. The cheapest of the models mentioned, often recommended to limit costs.

The immediate takeaway: if you want to keep your bill in check, move your daily tasks to Claude Sonnet 4.6 and reserve GPT-5.5 only for tasks that justify it. Except Copilot doesn't force you to make this choice — by default, it uses the model best suited to the task, which is often the most expensive.
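One way to apply this takeaway is a small routing rule: pick the cheapest model whose agentic score clears the bar for the task. In the sketch below the scores are the ones quoted above, while `cost_rank` is a hypothetical ordering (1 = cheapest) that only encodes what the text states (Sonnet cheapest, GPT-5.5 most expensive, Opus below GPT-5.5); the positions in between are assumptions. Gemini 3 Pro Deep Think is left out because its relative cost is not stated.

```python
# Agentic scores come from the list above; cost_rank is a hypothetical
# ordering (1 = cheapest), not a published price list.
MODELS = [
    {"name": "Claude Sonnet 4.6", "agentic_score": 81.4, "cost_rank": 1},
    {"name": "GPT-5.4", "agentic_score": 87.6, "cost_rank": 2},
    {"name": "GPT-5.4 Pro", "agentic_score": 91.8, "cost_rank": 3},
    {"name": "Claude Opus 4.7 Adaptive", "agentic_score": 94.3, "cost_rank": 4},
    {"name": "GPT-5.5", "agentic_score": 98.2, "cost_rank": 5},
]


def cheapest_model(min_agentic_score: float) -> str:
    """Return the cheapest model whose agentic score meets the threshold."""
    candidates = [m for m in MODELS if m["agentic_score"] >= min_agentic_score]
    if not candidates:
        raise ValueError("no model meets the requested score")
    return min(candidates, key=lambda m: m["cost_rank"])["name"]


print(cheapest_model(80))   # everyday chat and completions
print(cheapest_model(95))   # demanding multi-file agentic work
```

The thresholds themselves are a team decision; the sketch only shows that the choice can be made explicit instead of left to the default.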

The backlash: "What a joke" and developers' anger

The community reaction was immediate and fierce. As early as May 30, the Reddit post revealing the $750 bill was massively shared. On June 1, BuildFastWithAI reported that "What a joke" was trending on X in reference to Copilot.

On Hacker News, criticisms focused on three points:

The lack of transparency. Developers feel they were not sufficiently warned about the change. The April 27 announcement did not provide concrete examples of bills. It took the first real-world reports to understand the magnitude of the increase.

The absence of safeguards. No spending cap is enabled by default. A developer who starts an agentic workflow before leaving for the weekend could come back to an astronomical bill on Monday. This is a product design issue, not just a pricing issue.

The feeling of bait-and-switch. GitHub sold Copilot for years as a predictable, fixed subscription. Companies budgeted on this basis. Changing the model overnight, without a graceful transition period, is perceived as a betrayal of trust.

A senior developer commented on X: "I defended Copilot to my CTO for two years. This morning, I had to explain to him why our bill was multiplied by 15. I can't defend it anymore."

Why Microsoft made this choice

The decision is not improvised. It reflects an economic reality that Microsoft has internalized: the fixed subscription model for coding AI is no longer viable in the agentic era.

With chat and inline completion, consumption per user was relatively predictable. A dev made 50 to 100 requests per day, server costs were under control, and the margin on the subscription was comfortable. With agentic workflows, a single task can consume hundreds of thousands of tokens in a few minutes. Compute costs exploded while the subscription price stayed fixed: Microsoft was subsidizing agentic usage through subscriptions.

The Gartner MQ 2026, which positions OpenAI Codex, Cursor and GitHub Copilot as leaders among enterprise coding agents, notes that monetizing agentic usage is the central challenge for every vendor. Microsoft chose the most brutal solution: transfer the cost to the customer, token by token.

It is also a strategic signal. By making agentic expensive on Copilot, Microsoft might be looking to steer users towards Azure AI, where pay-as-you-go billing is already the norm and where margins are better for Microsoft. Copilot becomes an acquisition channel towards Azure, not a standalone product.

What this says about the AI coding tools market in 2026

Copilot's shift is not an isolated incident. It reveals a deep fracture in the AI coding tools market in 2026.

On one side, the vendors who charge based on usage (GitHub Copilot, direct APIs). On the other, those who maintain a fixed subscription (Cursor, Windsurf). This divergence is not trivial. It reflects two opposing visions of the value of AI coding.

The "usage-based" vision considers AI to be a compute service like the cloud. You consume tokens, you pay for tokens. It's logical and transparent at the micro level, but unpredictable at the macro level. One company reportedly ran up a Claude bill of 500 million dollars in a single month through mismanagement: an extreme case, but one that shows the risk of usage-based billing without governance.

The "subscription" vision considers AI coding to be a productivity tool like an IDE. You pay for access, not for each action. It's predictable, budgetable, but the vendor takes the risk that power users cost more than they bring in.

In 2026, the reality is that a fixed subscription only survives if the vendor controls its compute costs. Cursor achieves this by optimizing context (not sending the entire project with every request) and limiting agentic behavior to controlled scenarios. GitHub chose not to optimize, but to bill.

The alternatives: Cursor, Claude Code, Continue.dev and Cline

The backlash immediately benefited competitors. According to HiTechies, migrations to Cursor and Claude Code accelerated as soon as the April announcement was made. BitDoze lists five concrete alternatives for teams looking to control their budget.

Cursor: the winning flat-rate subscription

Cursor is the big winner of this crisis. Its model is simple: a fixed monthly subscription (~$20/month) that gives access to GPT-5.4, Claude Sonnet 4.6, and other models. No per-token billing, no end-of-month surprises. Agentic is included in the plan, with reasonable limits that prevent runaway costs.

For teams, this is the most predictable model. You know that 10 developers will cost $200/month, not $2,000 or $20,000. This is exactly what engineering directors want to hear right now. This is also why Cursor ranks among the best AI tools for code alongside Copilot and Cline.
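The predictability argument can be illustrated with a short comparison. The flat price uses the ~$20/month quoted above; the per-developer usage-based figures are invented for illustration and only mimic a team where a few heavy agentic users dominate the bill.

```python
def flat_team_cost(seats: int, price_per_seat: float) -> float:
    """Flat subscription: cost depends only on the number of seats."""
    return seats * price_per_seat


# Hypothetical usage-based monthly spend per developer, in dollars.
usage_per_dev = [25, 30, 40, 60, 90, 150, 300, 450, 600, 750]

flat = flat_team_cost(seats=10, price_per_seat=20.0)
usage = sum(usage_per_dev)

# Share of the usage-based bill caused by the three heaviest users.
top3_share = sum(sorted(usage_per_dev)[-3:]) / usage

print(f"flat: ${flat:,.0f}/month, usage-based: ${usage:,.0f}/month")
print(f"top 3 users account for {top3_share:.0%} of the usage-based bill")
```

Under a flat plan the total is known in advance; under usage-based billing it is driven by the heaviest users, which is what makes budgeting difficult.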

Claude Code: total control, but you have to manage it

Anthropic's Claude Code is a command-line coding agent. It doesn't charge a subscription: you pay for tokens directly via the Anthropic API. It's paradoxical — it's also usage-based billing — but with a crucial difference: you choose the model, you see tokens in real time, and you can set hard limits.

A developer who masters Claude Code and sets a daily API budget of $5 will likely spend less than with Copilot, because they control every interaction. But it's a tool for experienced developers, not for teams who want a "transparent" copilot.
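A daily budget like the $5 mentioned above can be approximated with a small client-side guard. The `DailyBudget` class below is a hypothetical helper, not a feature of Claude Code or of the Anthropic API: it assumes your own wrapper estimates the cost of each request and asks the guard before sending it.

```python
import datetime
from typing import Optional


class DailyBudget:
    """Tracks spend per calendar day and refuses requests once the cap is hit."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self._day: Optional[datetime.date] = None
        self._spent = 0.0

    def try_spend(self, cost_usd: float,
                  today: Optional[datetime.date] = None) -> bool:
        """Record the cost and return True, or return False if it would
        push today's total over the limit."""
        today = today or datetime.date.today()
        if today != self._day:       # new day: reset the counter
            self._day = today
            self._spent = 0.0
        if self._spent + cost_usd > self.limit_usd:
            return False
        self._spent += cost_usd
        return True


budget = DailyBudget(limit_usd=5.0)
day = datetime.date(2026, 6, 1)
print(budget.try_spend(3.0, day))   # accepted
print(budget.try_spend(1.5, day))   # accepted, 4.50 spent
print(budget.try_spend(1.0, day))   # refused, would exceed 5.00
```

Passing the date explicitly keeps the example deterministic; in real use the default of today's date applies.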

Continue.dev and Cline: open-source tools that put the API first

Continue.dev and Cline take a different approach: they are open-source tools (IDE extension for Continue, autonomous agent for Cline) that connect you directly to the API of your choice. You pay for the API, not the tool.

The advantage is total transparency. You see exactly how many tokens each request consumes, which model is used, and what it costs. The downside is that there is no "all-inclusive" subscription: you have to manage your API keys, your provider-by-provider bills, and your own governance.

For technical teams that want fine-grained control and aren't afraid of complexity, this is the healthiest financial solution.

What teams must do immediately

If you are on Copilot and haven't checked your June bill yet, do it now. Then, take these steps.

Activate spend alerts. GitHub lets you configure notifications when you reach a certain credit threshold. If your company hasn't enabled this option, treat it as urgent.
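If you also track spend on your side, the alert logic is simple to reproduce. The sketch below does not call any GitHub API; the function name and the 50%, 80% and 100% thresholds are assumptions chosen for illustration.

```python
def crossed_thresholds(spent: float, budget: float,
                       thresholds=(0.5, 0.8, 1.0)) -> list[float]:
    """Return the fractions of the budget that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spent / budget
    return [t for t in thresholds if ratio >= t]


# Example: $42 spent against a $50 monthly budget.
alerts = crossed_thresholds(spent=42.0, budget=50.0)
print(alerts)
```

Running a check like this daily against exported billing data gives an early warning even if the built-in notifications are misconfigured.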

Switch your daily tasks to a cheaper model. Claude Sonnet 4.6 costs significantly less per token than GPT-5.5. For inline code completions and simple chat, the quality difference does not justify the extra cost.

Reserve agentic for tasks that justify it. Using an agent workflow to rename a variable is a waste of tokens. Reserve Copilot Workspace for complex refactorings, code migrations, and multi-file tasks.

Evaluate Cursor or Continue.dev in parallel. Don't migrate blindly, but test. Run a two-week POC with Cursor for a sub-team and compare the bill and productivity. The data will speak for itself.

Negotiate with Microsoft. If you are on Enterprise, your account has a sales rep. Summer promotions ($70/month instead of $39 for Enterprise) are a sign that GitHub feels the winds changing. Negotiate caps, extra credits, or a hybrid model.

❌ Common mistakes

Mistake 1: Thinking the $0 Pro plan is free

The $0/month Copilot Pro plan does not include any AI credits. You can install the extension, but every interaction will be billed per token. It's a "pay-as-you-go" model disguised as free. A developer who enables it without understanding the mechanism can end up with a $50 to $100 bill in the very first month without any prior alert.

Mistake 2: Letting the agent run unsupervised

Copilot Workspace agentic workflows can consume hundreds of thousands of tokens in a single session. Starting an agent before leaving the office is like leaving a taxi with the meter running all night. The solution: monitor consumption in real time and set limits per task.

Mistake 3: Not checking which model is used by default

Copilot automatically selects the "most suitable" model for your task. Often, this is GPT-5.5 — the most expensive one. If you don't manually switch the model to Claude Sonnet 4.6 or GPT-5.4 for simple tasks, you are systematically overpaying. Check your workspace settings.

Mistake 4: Migrating to an alternative without testing

Anger leads to knee-jerk reactions. But switching from Copilot to Claude Code or Cursor without a trial period is replacing a billing problem with an adoption problem. Test on a pilot project, measure actual productivity, and compare total costs (tool + adaptation time + API if applicable).
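The total-cost comparison can be written down explicitly. Every number in the sketch below (monthly spend, adaptation hours, hourly rate, six-month horizon) is a hypothetical input to replace with figures from your own pilot.

```python
def total_cost(tool_per_month: float, api_per_month: float,
               adaptation_hours: float, hourly_rate: float,
               months: int) -> float:
    """Recurring tool and API spend over the period, plus one-off adaptation time."""
    recurring = (tool_per_month + api_per_month) * months
    one_off = adaptation_hours * hourly_rate
    return recurring + one_off


# Hypothetical per-developer scenario over six months.
stay = total_cost(tool_per_month=0, api_per_month=400,
                  adaptation_hours=0, hourly_rate=80, months=6)
switch = total_cost(tool_per_month=20, api_per_month=0,
                    adaptation_hours=10, hourly_rate=80, months=6)

print(f"stay: ${stay:,.0f}  switch: ${switch:,.0f}")
```

The one-off adaptation cost matters most over short horizons, which is why the pilot should measure it rather than assume it.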

❓ Frequently Asked Questions

Is my old Copilot subscription cancelled?

No, it is converted. Your fixed subscription now entitles you to an AI credits balance (depending on the plan), and any consumption beyond that is billed per token. "Premium requests" no longer exist since June 1, 2026.

Can I cap my Copilot bill?

GitHub does not offer a native hard cap, but you can configure spend alerts in your organization's billing settings. For a true cap, you need to go through the APIs directly or use a third-party tool that manages limits.

Isn't Claude Code also pay-as-you-go?

Yes, but the difference is control. With Claude Code, you see tokens in real time, you choose the model for each request, and you can enforce limits per session. With Copilot, the agentic part consumes in the background without fine-grained visibility until the bill arrives.

Is the summer promotion enough to absorb the increase?

It's a band-aid on a hemorrhage. The promo temporarily increases your credits balance, but the underlying model remains the same. If your team does intensive agentic work, even the promo won't be enough to cover the real costs. Use it as a transition period to evaluate alternatives.

Will Cursor also switch to per-token billing?

Nothing indicates this as of June 2026. Cursor's fixed subscription model is a major competitive advantage in the current context. But if agentic usage explodes among their users as well, financial pressure could push them to revise their model. For now, it's the safest refuge.

✅ Conclusion

GitHub Copilot's switch to token billing is a turning point for the AI coding market: it marks the end of the illusion that flat-rate subscriptions could survive the agentic era. If you are still on Copilot, compare the alternatives now — your July bill depends on what you decide this week.

#github-copilot #ai-cost #token-billing #ai-credits #developer-bill
