ElevenLabs surpasses $500 million in ARR: voice AI has become a major business
🔎 A 2022 startup now generates over $500M per year
In April 2026, ElevenLabs surpassed $500 million in annual recurring revenue (ARR). The figure is striking, and it has been confirmed by multiple publications, including TechCrunch and Pulse2. Founded in 2022, the company reached $100M in ARR in 20 months, then roughly $330-350M by the end of 2025. The first four months of 2026 added a further $150-170M.
This is not a startup burning through cash. It is a revenue machine that is accelerating.
Around the same time, ElevenLabs expanded its Series D beyond $550M, led by Sequoia Capital, with the arrival of BlackRock, NVIDIA (via NVentures), Santander, Wellington, D.E. Shaw and Schroders. Celebrities like Jamie Foxx and Eva Longoria also joined the round. The valuation: $11 billion, more than triple the $3.3B from January 2025.
Voice AI is no longer a demo gimmick. It is an industry.
The essentials
- ElevenLabs surpasses $500M in ARR in April 2026, after finishing 2025 at ~$330-350M and reaching ~$450M by the end of Q1 2026.
- The Series D exceeds $550M at an $11B valuation, with Sequoia Capital at the helm and BlackRock, NVIDIA, Santander among the new investors.
- Growth is driven by enterprise deployments (Fortune 500), not by general public users.
- Against OpenAI Realtime API, ElevenLabs remains the leader in synthesis quality and voice cloning, while OpenAI excels in multimodal latency.
- The trajectory ($100M in 20 months, then 5x in 30 months) suggests that audio-first AI is becoming an interaction channel as strategic as text chat.
Recommended tools
| Tool | Main use | Price | Ideal for |
|---|---|---|---|
| ElevenLabs | Enterprise voice synthesis and cloning | Starting at $5/month (May 2026, check on elevenlabs.io) | Large-scale audio products |
| OpenAI Realtime API | Low-latency real-time voice interactions | Pricing per audio token (May 2026, check on openai.com) | Conversational voice assistants |
| Play.ht | Conversational voice synthesis | Starting at $31/month (May 2026, check on play.ht) | Podcasts and long-form narration |
The ARR trajectory: from $100M to $500M in 30 months
ElevenLabs' growth curve is atypical, even for the AI sector.
| Metric | Value | Source |
|---|---|---|
| $100M ARR reached | 20 months after founding (2022) | Chief AI Officer |
| End of 2025 ARR | ~$330-350M | TechCrunch, Pulse2 |
| Net new ARR Q1 2026 | +$100M | TechCrunch |
| End of Q1 2026 ARR | ~$450M | TechCrunch |
| April-May 2026 ARR | >$500M | Pulse2, Economic Times |
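Going from ~$330M to more than $500M in four months is growth of roughly 50% over the period, or about 37% on a compounded quarterly basis (closer to 43% and 31% if the end-of-2025 base was $350M). For a company already generating hundreds of millions, this is exceptional.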
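The growth arithmetic can be checked with a short sketch. It uses only the ARR figures from the table above; the function names are illustrative, and the $500M end point is a floor, since the reported figure is ">$500M".

```python
def period_growth(start_arr: float, end_arr: float) -> float:
    """Simple growth over the whole period, as a fraction of the starting ARR."""
    return end_arr / start_arr - 1.0


def quarterly_equivalent(start_arr: float, end_arr: float, months: float) -> float:
    """Compounded growth rescaled to a 3-month quarter."""
    return (end_arr / start_arr) ** (3.0 / months) - 1.0


# ARR in $M. The end-of-2025 figure is reported as a ~330-350 range,
# so both ends of the range are evaluated.
for start in (330.0, 350.0):
    total = period_growth(start, 500.0)
    per_quarter = quarterly_equivalent(start, 500.0, months=4)
    print(f"from ${start:.0f}M: {total:.1%} over 4 months, {per_quarter:.1%} per quarter")
```

The gap between the four-month figure and the quarterly equivalent matters when comparing ElevenLabs with companies that report quarter-over-quarter growth.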
CEO Mati Staniszewski told Bloomberg in January 2026 that the $100M in net new ARR in the first quarter of 2026 came primarily from enterprise contracts. These aren't individual $5 subscriptions making up these numbers. These are six- or seven-figure deployments at Fortune 500 companies.
Analytics India Mag confirms that this acceleration coincides with massive adoption of voice AI in call centers, automated customer support, and audio content localization.
The lesson: when an AI startup finds its enterprise product-market fit, the curve bends violently. ElevenLabs is no longer in the discovery phase. It is in the industrialization phase.
The Series D: $550M+ at an $11B valuation
The funding round deserves a closer look, as its structure says a lot about ElevenLabs' maturity.
The round's figures
The Series D was initially announced at $500M in February 2026, led by Sequoia Capital, with Lightspeed Venture Partners among the participants, according to Reuters. The valuation was then set at $11B.
In May 2026, the round was expanded beyond $550M with the arrival of new investors, reports Tech.eu.
Who are the new investors?
The list, revealed by TechCrunch, is divided into three categories:
Institutional: BlackRock, Wellington, D.E. Shaw, Schroders. These names don't invest in speculative bets. Their presence signals that voice AI is considered an infrastructure asset.
Strategic: NVentures (NVIDIA's VC arm) and Santander. NVIDIA is investing because voice inference consumes GPUs. Santander is investing because the bank is likely a current or prospective client — the banking sector is a major consumer of voice AI for customer support.
Celebrities: Jamie Foxx and Eva Longoria. Their participation is less strategic than symbolic, but it reinforces the narrative of voice as a mass medium.
From $3.3B to $11B in 14 months
The valuation more than tripled between January 2025 ($3.3B during the $180M Series C) and February 2026 ($11B), again according to Reuters. This tripling was accompanied by a ~3.5x multiplication in ARR (from ~$100M at the end of 2024 to ~$350M at the end of 2025).
The valuation-to-ARR multiple therefore remained remarkably stable at a little over 30x ($3.3B on ~$100M, then $11B on ~$350M). This is not a speculative bubble. It is a valuation consistent with real growth.
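As a quick check of that multiple, here is a minimal sketch using the valuation and ARR figures quoted in this section. The third row is an assumption for illustration: it applies the same $11B valuation to the later $500M ARR figure.

```python
def valuation_multiple(valuation_musd: float, arr_musd: float) -> float:
    """Valuation divided by ARR, both expressed in $M."""
    return valuation_musd / arr_musd


# (valuation in $M, ARR in $M), from the figures cited in the article.
snapshots = {
    "Jan 2025, Series C": (3_300.0, 100.0),
    "Feb 2026, Series D": (11_000.0, 350.0),
    # Illustration only: same valuation, measured against the >$500M ARR.
    "Spring 2026, same valuation": (11_000.0, 500.0),
}

for label, (valuation, arr) in snapshots.items():
    print(f"{label}: {valuation_multiple(valuation, arr):.1f}x ARR")
```

Measured against current ARR rather than end-of-2025 ARR, the same valuation corresponds to a lower multiple, which is worth keeping in mind when quoting the 30x figure.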
The enterprise pivot: why it works now
ElevenLabs hasn't always been an enterprise platform. The startup first made a name for itself with the general public through voice cloning on social media.
The transition to enterprise was decisive. Chief AI Officer analyzes this shift: ElevenLabs developed dedicated offerings for Fortune 500 companies with security, compliance, and latency guarantees that public APIs could not provide.
The enterprise use cases that pay
Enterprise voice AI deployments focus on three areas:
Automated customer support. Replacing or augmenting call centers with voice agents that understand context, handle accents, and don't lose patience. This is the use case generating the largest contracts.
Content localization. Multinational companies dub their audio content (training, marketing, internal) into dozens of languages with a consistent voice. The best AI for voice cloning becomes a strategic distribution tool.
Narration and production. Publishers, production houses, podcast platforms — all need audio volumes that humans alone can no longer produce.
This enterprise pivot explains why the best AI chatbots for business now integrate native voice capabilities. Voice is no longer an add-on. It is a full-fledged interaction channel.
Those who have already automated their business without coding thanks to AI intuitively understand this shift: textual automation was the first step, voice automation is the next.
ElevenLabs vs OpenAI Realtime: two visions of voice AI
The comparison is inevitable. OpenAI launched its Realtime Audio API, and the two approaches clash on different criteria.
Synthesis quality: ElevenLabs remains the leader
According to the comparison by Toolhalla, ElevenLabs produces the most natural voices on the market in 2026, particularly in English. Voice cloning requires only a 30-second audio sample. Play.ht comes in second for conversational speech, but the gap with ElevenLabs remains significant in tonal fidelity.
Dasha confirms this positioning: ElevenLabs remains the leader in voice synthesis quality and cloning, while OpenAI excels in low-latency multimodality for real-time interactions.
Latency and multimodality: OpenAI's advantage
OpenAI's Realtime API, detailed by TokenMix, is designed for bidirectional voice interactions with minimal latency. The model processes input audio and generates output audio in a unified pipeline. This is ideal for conversational assistants where the fluidity of exchange is paramount.
Inworld AI's benchmark places OpenAI ahead in raw latency, and ElevenLabs ahead in perceived quality.
What to choose in practice?
| Criterion | ElevenLabs | OpenAI Realtime API |
|---|---|---|
| Voice quality (naturalness) | Absolute leader | Good, inferior |
| Voice cloning | 30 seconds, high fidelity | Limited, less faithful |
| Conversational latency | Good | Better |
| Multimodality (text + audio + visual) | Audio-focused | Natively multimodal |
| Price for high volumes | Competitive in enterprise | More expensive per token |
| Ideal for | Audio production, localization, narration | Interactive voice assistants |
The two are not in direct competition. They serve different use cases. ElevenLabs dominates vocal content production. OpenAI dominates real-time voice interaction.
For businesses that want to automate their business in 7 days with AI, the choice depends on the channel: for large-scale audio production, ElevenLabs; for a conversational voice agent, OpenAI Realtime.
What a $500M ARR Means for the Voice AI Market
When a single voice AI company more than triples its valuation in just over a year to reach $11B, the entire sector benefits. This is what Ringly notes in its 2026 voice AI statistics report: ElevenLabs is validating the market.
The Signal to Investors
The presence of BlackRock, Wellington, and D.E. Shaw in the Series D sends a clear signal to traditional funds. Voice AI is no longer a tech niche. It is an asset class that institutional wealth managers are ready to integrate into their portfolios.
Santander, as a strategic investor, indicates that banks will deploy voice AI at scale. The financial sector is traditionally slow to adopt new technologies. When a major European bank invests directly in a voice AI provider, it means internal POCs have validated the ROI.
The Signal to Competitors
ElevenLabs' figures create a benchmark. $500M in pure voice AI ARR (without a generalist LLM, without cloud computing) proves that there is an autonomous voice market. Play.ht, Murf, and other speech synthesis players can now point to ElevenLabs to legitimize their own projections.
For entrepreneurs exploring the 5 profitable business models built around AI, voice AI now offers a concrete precedent of scaling. The voice SaaS model works. Proof: $500M in recurring revenue.
The Signal to Developers
The arrival of NVentures (NVIDIA) in the round means that GPU infrastructure is going to be optimized for voice AI. Developers can expect better APIs, reduced inference costs, and more performant models specifically for voice processing.
The Hidden Challenges Behind $500M ARR
Such impressive figures mask real risks. No exponential growth is without flaws.
Dependence on Large Accounts
If Fortune 500 deployments represent the bulk of the net new ARR, revenue concentration is a risk. A single enterprise contract can represent tens of millions. The loss of two or three major clients could drop the ARR by 10-15%.
The diversification of the client portfolio will be the real test of ElevenLabs' maturity in 2027.
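The concentration risk can be made concrete with a small sketch. The contract sizes below are hypothetical and chosen only to illustrate the 10-15% range mentioned above; ElevenLabs' actual contract values are not public in the sources cited here.

```python
def arr_drop_fraction(total_arr: float, lost_contracts: list[float]) -> float:
    """Share of total ARR lost if the given contracts churn (same units as total_arr)."""
    return sum(lost_contracts) / total_arr


TOTAL_ARR = 500.0  # $M, the ARR figure reported in the article

# Hypothetical annual contract values in $M, for illustration only.
two_clients = [30.0, 25.0]
three_clients = [30.0, 25.0, 20.0]

print(f"two clients lost: {arr_drop_fraction(TOTAL_ARR, two_clients):.0%} of ARR")
print(f"three clients lost: {arr_drop_fraction(TOTAL_ARR, three_clients):.0%} of ARR")
```

Under these assumptions, a 10-15% drop implies individual contracts in the $20-30M range, which is consistent with the "tens of millions" order of magnitude.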
Competition from Multimodal LLMs
Models like GPT-5.5 (OpenAI), which leads the agentic ranking with a score of 98.2 according to benchmarks, and Gemini 3.1 Pro (Google), ranked first among general-purpose LLMs with 92, natively integrate audio capabilities. As these models improve in speech synthesis, the need for a dedicated tool like ElevenLabs could decrease for basic use cases.
ElevenLabs must maintain its qualitative lead. The gap with OpenAI TTS exists today, but it is shrinking.
Regulation and Voice Deepfakes
30-second voice cloning raises ethical and legal questions. The more ElevenLabs grows, the more it becomes a regulatory target. The company has put safeguards in place, but legislative pressure will increase, particularly in Europe with the AI Act.
Infrastructure and Deployment: Behind the Scenes
$500M in voice AI ARR also means massive consumption of computing resources.
The Role of NVIDIA
NVentures' investment is not altruistic. High-quality speech synthesis, especially in real-time, requires powerful GPUs. Every enterprise voice call that goes through ElevenLabs consumes compute cycles on servers equipped with NVIDIA GPUs.
As the ARR grows, the infrastructure bill increases proportionally. The partnership with NVIDIA likely allows ElevenLabs to negotiate preferential rates on chips and get early access to architectures optimized for voice inference.
Hosting and Scalability
For businesses deploying voice agents via ElevenLabs, the underlying infrastructure must be rock-solid. This is an often-underestimated point: voice quality is not enough if network latency adds an extra 500ms. The choice of a reliable host becomes critical, hence the importance of solutions like Hostinger for ancillary components (landing pages, dashboards, middleware APIs).
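To illustrate why an extra 500ms matters, here is a rough latency-budget sketch. Every per-stage figure and the 800ms target are assumptions made up for the example, not measurements from ElevenLabs or any benchmark; only the 500ms network overhead comes from the paragraph above.

```python
def total_latency_ms(stages: dict[str, int]) -> int:
    """Sum of per-stage latencies for one voice exchange, in milliseconds."""
    return sum(stages.values())


def within_budget(stages: dict[str, int], budget_ms: int) -> bool:
    """True if the end-to-end latency fits the target budget."""
    return total_latency_ms(stages) <= budget_ms


BUDGET_MS = 800  # assumed target for a responsive exchange, not a sourced figure

# Hypothetical per-stage latencies in milliseconds, for illustration only.
pipeline = {"speech_to_text": 150, "language_model": 250, "text_to_speech": 200}

# Same pipeline with the extra 500ms of network latency mentioned above.
pipeline_slow_network = {**pipeline, "network_overhead": 500}

print(total_latency_ms(pipeline), within_budget(pipeline, BUDGET_MS))
print(total_latency_ms(pipeline_slow_network), within_budget(pipeline_slow_network, BUDGET_MS))
```

The point of the exercise is that network overhead is additive: a pipeline that comfortably fits its budget on its own can miss it once hosting and middleware latency are included.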
❌ Common Mistakes
Mistake 1: Confusing ARR with Total Revenue
A $500M ARR does not mean ElevenLabs has collected $500M in cash. ARR annualizes recurring revenue: it is monthly recurring revenue (MRR) multiplied by 12. If the company added $150M in ARR in four months, its MRR grew by about $12.5M ($150M / 12) over that period. The actual cash received depends on billing terms (annual vs. monthly).
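The difference between ARR, MRR and cash can be sketched as follows. The billing mix (share of contracts prepaid annually) is a hypothetical parameter; the article's sources do not disclose ElevenLabs' billing terms.

```python
def mrr_from_arr(arr: float) -> float:
    """Monthly recurring revenue implied by an ARR figure."""
    return arr / 12.0


def first_month_cash(new_arr: float, prepaid_share: float) -> float:
    """Cash collected in the first month on newly signed ARR.

    prepaid_share is the fraction of contracts billed annually upfront
    (hypothetical); the remainder is assumed to be billed monthly.
    """
    prepaid = prepaid_share * new_arr
    monthly = (1.0 - prepaid_share) * mrr_from_arr(new_arr)
    return prepaid + monthly


NEW_ARR = 150.0  # $M of ARR added in four months, per the article

print(f"implied MRR increase: ${mrr_from_arr(NEW_ARR):.1f}M")
for share in (0.0, 0.5, 1.0):
    cash = first_month_cash(NEW_ARR, share)
    print(f"prepaid share {share:.0%}: first-month cash ${cash:.2f}M")
```

The same $150M of new ARR can translate into anything from $12.5M to $150M of first-month cash depending on the billing mix, which is why ARR alone says little about cash position.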
Mistake 2: Thinking Voice AI Replaces All Channels
Voice AI is complementary, not substitutive. The best AI chatbots for business combine text and voice. An agent that forces voice interaction when the user wants to type is bad UX. Voice excels when hands are busy (car, workshop) or when tone and emotion matter (customer support, training).
Mistake 3: Comparing ElevenLabs' Valuation to OpenAI's
OpenAI is valued at over $300B. But OpenAI is a generalist LLM player with AGI ambitions. ElevenLabs is specialized in voice. The multiples are not comparable. For a specialized company with $500M in ARR that grew roughly 50% in four months, an $11B valuation (about 22x current ARR, or about 30x end-of-2025 ARR) is rational. Comparing it to OpenAI's multiple makes no sense.
❓ Frequently Asked Questions
How Many Employees Does ElevenLabs Have?
The exact number is not made public in the sources from May 2026. However, with $500M in ARR and more than $730M raised across the Series C ($180M) and Series D ($550M+) alone, the revenue-per-employee ratio is likely among the highest in the AI sector, typical of API-first companies that scale with relatively few staff.
Is ElevenLabs Profitable?
No source from May 2026 mentions profitability. With $500M in ARR and high GPU infrastructure costs, profitability depends on gross margins. Voice AI companies generally have lower margins than traditional SaaS because of computing costs. But with $550M in fresh cash, ElevenLabs doesn't need to be profitable immediately.
What AI Model Does ElevenLabs Use Internally?
ElevenLabs develops its own proprietary speech synthesis models. The company does not use GPT-5.5 or Claude Opus 4.7 to generate voice — these generalist LLM models are not optimized for audio synthesis. However, ElevenLabs can rely on LLMs for the understanding/language part in its conversational products.
Will Voice AI Replace Humans in Call Centers?
Partially. ElevenLabs' enterprise deployments aim to augment, not eliminate. Voice agents handle repetitive queries (FAQs, order tracking, appointment scheduling). Complex cases are still escalated to humans. The ROI comes from reducing the volume of human calls, not from their total elimination.
✅ Conclusion
ElevenLabs proved in 30 months what many doubted: voice AI can generate hundreds of millions in recurring revenue without relying on a generalist LLM. The jump from roughly $330M to more than $500M in ARR in four months, the arrival of BlackRock and NVIDIA on the cap table, and an $11B valuation confirm that audio has become a pillar of AI infrastructure, on par with text and image.