Alibaba Zhenwu M890: the AI chip that wants to dethrone Nvidia in China
🔎 3× in one generation: the signal of a technological break
On May 20, 2026, at the Alibaba Cloud Summit, T-Head, Alibaba's semiconductor subsidiary, unveiled the Zhenwu M890, a GPU-class AI accelerator that claims three times the performance of the previous generation, which was itself already considered comparable to Nvidia's H20.
The timing is not insignificant. The US sanctions of 2022, strengthened in 2023 and 2024, cut China off from Nvidia's H100 and B200 chips. Beijing responded with a massive investment plan in domestic semiconductors. The M890 is the most tangible result of this policy.
But beyond the patriotic symbol, the figures are concrete: 144 GB of HBM3 memory, 800 GB/s of interconnect bandwidth, and a supernode server integrating 128 chips. This is an architecture designed for scale, not for a laboratory demonstration.
Alibaba took advantage of the announcement to simultaneously launch the Qwen 3.7-Max model, signaling that hardware and software are advancing in tandem. The Chinese AI ecosystem no longer waits for Nvidia to exist.
The essentials
- The Zhenwu M890 is an AI acceleration chip designed by T-Head (Alibaba), unveiled on May 20, 2026, at the Alibaba Cloud Summit.
- It claims 3× the performance of the previous-generation Zhenwu 810E, itself considered comparable to Nvidia's H20, with 144 GB of HBM3 and 800 GB/s of interconnect bandwidth.
- The Panjiu AL128 server system integrates 128 accelerators per rack, designed for large-scale deployment of autonomous agents.
- Alibaba plans to IPO its chip design business to capitalize on the demand for alternatives to Nvidia.
- This announcement comes in the context of US sanctions that prevent Chinese access to Nvidia's H100 and B200.
Recommended tools
| Tool | Main usage | Price (May 2026, check the provider's site) | Ideal for |
|---|---|---|---|
| Alibaba Cloud | Inference and training on Zhenwu M890 | On quote (enterprise) | Chinese companies seeking hardware independence |
| Hostinger | Web hosting to deploy AI interfaces | ~€2.99/month | Developers and startups |
| Qwen 3.7-Max | Alibaba's flagship LLM optimized for the M890 | Via Alibaba Cloud API | High-performance inference on Chinese infrastructure |
Zhenwu M890 Technical Specifications — Substance, Not Marketing
Memory and Bandwidth: A Quantified Leap
The Zhenwu M890 features 144 GB of HBM3 memory, 50% more than the 96 GB of the Zhenwu 810E. The interconnect bandwidth increases from 700 GB/s to 800 GB/s, a 14% gain on this specific point according to the detailed specifications reported by Wccftech.
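These figures have a direct impact on the training of large language models. Memory determines the size of the model that can be loaded onto a single chip. With 144 GB, the M890 can host the weights of a 70-billion-parameter model at 16-bit precision (about 140 GB) without resorting to sharding, though with little room left for activations or long contexts. At 32-bit precision the same model would need about 280 GB and would not fit.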
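As a rough check on the single-chip memory claim, the sketch below estimates weight storage only. It assumes the advertised 144 GB means decimal gigabytes and ignores activations, KV cache, and optimizer state. The function name and the precision table are illustrative and do not come from Alibaba's documentation.

```python
# Weights-only footprint of a dense model at different numeric precisions.
# Assumption: 1 GB = 10**9 bytes. Real deployments also need room for
# activations and the KV cache, which this sketch deliberately ignores.

HBM_CAPACITY_GB = 144  # Zhenwu M890 figure quoted in the article

BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weights_gb(num_params: int, precision: str) -> float:
    """Return the size of the model weights in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 10**9


if __name__ == "__main__":
    params_70b = 70 * 10**9
    for precision in BYTES_PER_PARAM:
        size = weights_gb(params_70b, precision)
        verdict = "fits" if size <= HBM_CAPACITY_GB else "does not fit"
        print(f"{precision}: {size:.0f} GB -> {verdict} in {HBM_CAPACITY_GB} GB")
```

Under these assumptions, a 70B model needs about 280 GB at 32-bit, 140 GB at 16-bit, and 70 GB at 8-bit, so it fits on one chip only at 16-bit or below, and at 16-bit only about 4 GB of headroom remains.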
Interconnect bandwidth, on the other hand, dictates the speed at which chips communicate within a cluster. At 800 GB/s, the M890 approaches Nvidia's standards while remaining below the 900 GB/s NVLink of the H100. The M890's real competitor is the H20 rather than the H100, but the H20 keeps the same 900 GB/s NVLink, so the gap holds there too.
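To give the bandwidth figures some intuition, here is a minimal sketch of the idealized time needed to move a payload between chips at peak link speed. It assumes no latency, protocol overhead, or contention, so the results are lower bounds. The 140 GB payload is a hypothetical example (the 16-bit weights of a 70B-parameter model), not a workload described by Alibaba.

```python
# Idealized time to move a payload across a link at peak bandwidth.
# Assumption: no latency, protocol overhead, or contention, so these
# numbers are lower bounds rather than expected real-world timings.


def transfer_seconds(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Return payload size divided by peak link bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    payload_gb = 140  # hypothetical payload, see the note above
    links = [("Zhenwu 810E", 700), ("Zhenwu M890", 800), ("NVLink", 900)]
    for name, bandwidth in links:
        ms = transfer_seconds(payload_gb, bandwidth) * 1000
        print(f"{name}: {ms:.1f} ms at {bandwidth} GB/s")
```

For this payload the idealized times come out at roughly 200 ms, 175 ms, and 156 ms, which shows how modest the differences between the three links are compared with a claimed 3× overall gain.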
The Panjiu AL128: 128 Chips in a Single Rack
Alibaba unveiled the Panjiu AL128 supernode server, which integrates 128 Zhenwu M890 accelerators per rack. This is a densified cluster architecture, designed for the large-scale deployment of autonomous digital agents according to Interesting Engineering.
The approach is reminiscent of Nvidia with its DGX systems, but with a major difference: each chip in the rack is interconnected via a proprietary protocol optimized for Chinese workloads. Alibaba has not published the exact details of this interconnect, but the promise is clear: eliminate the network bottlenecks that penalize heterogeneous clusters.
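The rack-level figures can be turned into an order-of-magnitude capacity estimate. The sketch below multiplies the quoted per-chip memory by the quoted chip count and derives a weights-only upper bound on model size. It assumes memory pools perfectly across all 128 chips, which real systems do not achieve, and the helper names are illustrative.

```python
# Aggregate HBM in one Panjiu AL128 rack, from the figures quoted above.
# Assumption: memory pools perfectly across chips, which ignores
# replication, fragmentation, and per-chip runtime overhead.

CHIPS_PER_RACK = 128
HBM_PER_CHIP_GB = 144


def rack_memory_gb(chips: int = CHIPS_PER_RACK,
                   gb_per_chip: int = HBM_PER_CHIP_GB) -> int:
    """Total HBM in the rack, in decimal gigabytes."""
    return chips * gb_per_chip


def max_params_billions(total_gb: float, bytes_per_param: int) -> float:
    """Weights-only upper bound on model size, in billions of parameters."""
    return total_gb / bytes_per_param


if __name__ == "__main__":
    total = rack_memory_gb()
    print(f"Rack HBM: {total} GB")
    print(f"16-bit upper bound: {max_params_billions(total, 2):.0f}B parameters")
```

The result is about 18.4 TB of HBM per rack. The parameter bound it implies is a theoretical ceiling only, since a serving system also has to hold activations and per-request context.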
Comparison with the Previous Generation and the H20
| Specification | Zhenwu 810E | Zhenwu M890 | H20 (Nvidia) |
|---|---|---|---|
| Memory | 96 GB HBM | 144 GB HBM3 | 96 GB HBM3 |
| Interconnect Bandwidth | 700 GB/s | 800 GB/s | 900 GB/s (NVLink) |
| Relative Performance | Baseline (1×) | 3× (claimed) | Baseline |
| Availability in China | Yes | Yes | Limited by US licenses |
| Release Year | 2025 | May 2026 | 2024 (with restrictions) |
The table reveals a nuanced reality. The M890 does not surpass the H20 on all hardware criteria — bandwidth remains lower. But the "3× performance" claimed by Alibaba likely incorporates overall software and architectural optimizations, not just raw specs. This is a point to verify once independent benchmarks become available.
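The gap between the raw specifications and the headline claim is easy to quantify. This short sketch uses only the memory and interconnect values from the comparison table and computes the M890's ratio against each reference chip. The dictionary layout and function name are illustrative.

```python
# Ratios computed from the comparison table (values as quoted there).
SPECS = {
    "Zhenwu 810E": {"memory_gb": 96, "interconnect_gb_s": 700},
    "Zhenwu M890": {"memory_gb": 144, "interconnect_gb_s": 800},
    "Nvidia H20": {"memory_gb": 96, "interconnect_gb_s": 900},
}


def ratio(chip: str, baseline: str, field: str) -> float:
    """Return how many times larger `field` is on `chip` than on `baseline`."""
    return SPECS[chip][field] / SPECS[baseline][field]


if __name__ == "__main__":
    for baseline in ("Zhenwu 810E", "Nvidia H20"):
        for field in ("memory_gb", "interconnect_gb_s"):
            value = ratio("Zhenwu M890", baseline, field)
            print(f"M890 vs {baseline}, {field}: {value:.2f}x")
```

The ratios are 1.5× for memory against both chips, about 1.14× for interconnect against the 810E, and about 0.89× against the H20. None approaches 3×, so the headline figure has to come from compute, architecture, or software, none of which the table captures.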
The AI chip war in China — Why the M890 exists
US sanctions: the involuntary accelerator of Chinese innovation
It all starts in October 2022, when the US Department of Commerce restricts the export of advanced chips to China. Nvidia's H100 is banned from export. Nvidia responds with a downgraded version, the H20, specifically designed to comply with US caps while remaining sellable in China.
But the sanctions toughen. In 2024, even downgraded chips are subject to new restrictions. The result is paradoxical: Chinese companies that bought Nvidia hardware without thinking now find themselves forced to develop domestic alternatives.
The Zhenwu M890 is the direct child of this pressure. Without the sanctions, it probably wouldn't exist, or not at this scale. T-Head would have continued producing specialized chips for the Alibaba cloud without seeking to compete head-on with Nvidia.
The complete Chinese ecosystem: chips, models, capital
The M890 is not an isolated project. It is part of a Chinese AI ecosystem that has structured itself at breakneck speed. Moonshot AI just raised $2 billion for its Kimi K2.6 model, which scores 88.1 in agentic and 84 in general on reference benchmarks.
DeepSeek, with its V4 Pro, shows 88 in general. Z.AI offers the GLM-5 at 82 in agentic. The Chinese open-weight model ecosystem is now the most dynamic in the world, partly thanks to chips like the M890 that make training and inference possible without American dependency.
Alibaba has also announced its intention to IPO its chip design business, according to Bloomberg. The goal is to raise capital to accelerate the roadmap of the Zhenwu series, with new chips planned for the third and fourth quarters of 2027-2028 according to TrendForce.
The US regulatory response
Washington is not standing still in the face of this momentum. The White House wants to verify AI models before their release, a strategic U-turn that shows American nervousness in the face of Chinese progress. Control is no longer just over hardware, but also over the models themselves.
This regulatory evolution reinforces Chinese determination to build a complete and autonomous stack: from chip to model, from cloud infrastructure to the final application. The M890 is an essential link in this chain.
Qwen 3.7-Max and the M890 — The chip-model duo that changes the game
Integrated vertical optimization
The simultaneous announcement of the Zhenwu M890 and the Qwen 3.7-Max model is no scheduling coincidence. Alibaba is applying the vertical integration strategy that Nvidia imposed with CUDA: hardware and software are co-designed to maximize performance.
Qwen 3.7-Max completes Alibaba's Qwen family, which has established itself as one of the most credible alternatives to American models. The specific optimization for the M890 means that inference can be significantly faster than on generic hardware.
Where does Qwen stand against the competition?
In the current LLM landscape, American models still dominate the rankings. OpenAI's GPT-5.5 leads in agentic with 98.2, followed by Google's Gemini 3 Pro Deep Think at 95.4 and Anthropic's Claude Opus 4.7 at 94.3.
But Chinese models are climbing rapidly. Kimi K2.6 reaches 88.1 in agentic, DeepSeek V4 Pro (Max) scores 88 in general. The question is no longer whether Chinese models can compete, but when they will surpass American models on specific metrics.
The M890 could accelerate this shift by providing a training and inference infrastructure that is no longer penalized by hardware limitations.
The M890 vs Nvidia's H20 — What are these 3× claims really worth?
Deconstructing the performance claim
Alibaba claims 3× the performance of its previous generation and, by extension, of Nvidia's H20, to which that generation was considered comparable. This claim, reported by Wccftech and TNW, deserves careful examination.
First point: the baseline for comparison. The H20 is a chip throttled by US export constraints. It is not representative of the full potential of Nvidia's Hopper architecture. Comparing the M890 to the H100 or the B200 would be more relevant for assessing the actual technological level, but these chips are not available in China.
Second point: the metric. "3× the performance" is vague. Does it refer to batch inference speed? Training speed? Throughput in tokens per second? Alibaba has not specified, making the comparison difficult to verify independently.
Third point: testing conditions. Internal benchmarks are rarely reproducible. Until results appear in independent benchmark suites such as MLPerf, the 3× figure remains a marketing claim, not an established scientific fact.
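The metric question raised in the second point can be made concrete. The sketch below shows how the same pair of benchmark runs can yield different speedup figures depending on the metric chosen. All numbers are made-up placeholders, not measurements of the M890 or the H20, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRun:
    """One benchmark run; the values used below are placeholders."""
    tokens_generated: int
    wall_seconds: float
    requests: int

    @property
    def tokens_per_second(self) -> float:
        return self.tokens_generated / self.wall_seconds

    @property
    def seconds_per_request(self) -> float:
        return self.wall_seconds / self.requests


def speedup(new: float, baseline: float, higher_is_better: bool = True) -> float:
    """Ratio above 1.0 means the new system is better on this metric."""
    return new / baseline if higher_is_better else baseline / new


if __name__ == "__main__":
    # Placeholder numbers chosen only to illustrate the point.
    baseline = BenchmarkRun(tokens_generated=100_000, wall_seconds=50.0, requests=100)
    candidate = BenchmarkRun(tokens_generated=480_000, wall_seconds=80.0, requests=400)

    by_throughput = speedup(candidate.tokens_per_second, baseline.tokens_per_second)
    by_latency = speedup(candidate.seconds_per_request,
                         baseline.seconds_per_request,
                         higher_is_better=False)
    print(f"throughput speedup: {by_throughput:.2f}x")
    print(f"per-request time speedup: {by_latency:.2f}x")
```

With these placeholder values, the throughput metric gives 3× while the per-request time gives 2.5×. This is why a headline multiplier means little without the metric, the workload, and the batch configuration stated next to it.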
What is credible in the announcement
Despite these reservations, several elements make the announcement credible. The jump from 96 GB to 144 GB of memory is real and measurable. The increase in bandwidth from 700 to 800 GB/s is documented. The 128-chip-per-rack architecture is a concrete response to the scalability problem.
The Zhenwu 810E was already considered comparable to the H20 by industry observers according to TrendForce. A 3× gain in a single generation is ambitious but not unrealistic if T-Head has resolved the main architectural bottlenecks.
The software factor: Alibaba's hidden advantage
Nvidia dominates thanks to CUDA, its software ecosystem that makes its chips easy to program. T-Head does not have an equally mature equivalent. But Alibaba circumvents this problem by directly optimizing its Qwen models for the M890, reducing the reliance on a universal abstraction framework.
This "vertical" approach — co-optimized chip + model — is more limited in scope but more efficient in pure performance. It is the same logic that drove the success of Apple Silicon chips: hardware-software integration beats the generic approach when the ecosystem is large enough.
The Zhenwu roadmap — What comes after the M890
Chips planned for 2027-2028
T-Head didn't just unveil the M890. During the Alibaba Cloud Summit, the company presented a complete roadmap for the Zhenwu series, with new chips planned for the third and fourth quarters of 2027-2028 according to TrendForce.
This unusual transparency for a Chinese chipmaker sends a signal: Alibaba no longer considers the M890 a prototype, but the first step in a sustainable product line. The strategy resembles Huawei's approach with the Ascend series, which went from experimental chips to products deployed on a national scale in just a few years.
The chip division's IPO: why now?
Alibaba plans to take its chip design business public, Bloomberg reports. The timing is calculated: investor appetite for Chinese alternatives to Nvidia is running high, and the M890 gives investors a concrete story on which to base a valuation.
A successful IPO would give T-Head the capital needed to accelerate its roadmap and recruit international talent. It's also a way to make the division more autonomous from its parent company, a positive signal for investors who fear political interference in Chinese tech companies.
Geopolitical implications — Beyond technology
The US-China tech decoupling accelerates
The Zhenwu M890 is a symptom of the technological decoupling between the United States and China. Each round of sanctions pushes China to develop alternatives, which in turn justifies new US sanctions. The cycle is self-reinforcing.
Ultimately, the global AI semiconductor market could split into two parallel ecosystems: a Nvidia/CUDA ecosystem for US-allied countries, and a Zhenwu/Qwen ecosystem for China and its partners. European, Indian, and Middle Eastern companies will then have to pick a side, or maintain a costly dual compatibility.
The impact on the competitiveness of Chinese models
Access to high-performance chips is a determining factor in training increasingly large models. With the M890, Chinese companies are no longer condemned to using underpowered hardware. This could result in an acceleration of the pace of new model releases.
The current LLM rankings show that Chinese models are already competitive in general (DeepSeek V4 Pro Max at 88, Kimi K2.6 at 84) and in agentic (Kimi K2.6 at 88.1, GLM-5 at 82). With the M890 as an accelerator, the next generation of Chinese models could close the gap with GPT-5.5 (98.2 in agentic) and Gemini 3.1 Pro (92 in general).
The response from other Chinese players
Alibaba is not the only one developing domestic AI chips. Huawei with the Ascend 910C, Biren Technology, and Cambricon all have products in development or deployment. But the M890 stands out due to its direct integration with Alibaba's cloud ecosystem, the largest in Asia.
Competition among Chinese chipmakers is healthy: it prevents dependence on a single player and accelerates innovation. But it also fragments the software ecosystem, which could slow down adoption compared to Nvidia's unified stack.
Autonomous agents: the real target of the M890
Why Alibaba is betting on agents
Interesting Engineering points out that the M890 is specifically designed to power autonomous digital agents at scale. This is not insignificant: the AI agent market is considered the next frontier of growth, and agentic workloads have different characteristics than classic inference.
An AI agent must make multiple model calls in sequence, with low latency and significant memory to maintain context over long interactions. The M890, with its 144 GB HBM3 and its 128-chip cluster architecture, is optimized for this usage profile.
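The memory pressure of long agent contexts can be estimated with the standard key-value cache formula for transformer inference: two tensors (keys and values) per layer, each sized by the number of KV heads, the head dimension, and the sequence length. The model dimensions below are hypothetical values typical of a 70B-class model with grouped-query attention. They are not the published architecture of Qwen 3.7-Max.

```python
# KV-cache size for one sequence during transformer inference.
# Assumption: the model dimensions used in the demo are illustrative
# (70B-class, grouped-query attention), not those of any Alibaba model.


def kv_cache_bytes(seq_len: int, num_layers: int, num_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for `seq_len` tokens."""
    keys_and_values = 2
    per_token = (keys_and_values * num_layers * num_kv_heads
                 * head_dim * bytes_per_value)
    return per_token * seq_len


if __name__ == "__main__":
    dims = dict(num_layers=80, num_kv_heads=8, head_dim=128)  # hypothetical
    per_token_kib = kv_cache_bytes(1, **dims) / 1024
    long_context_gb = kv_cache_bytes(131_072, **dims) / 10**9
    print(f"per token: {per_token_kib:.0f} KiB")
    print(f"131,072-token context: {long_context_gb:.1f} GB")
```

Under these assumed dimensions, a single 131,072-token context needs about 43 GB of cache at 16-bit precision. That illustrates why per-chip memory and multi-chip sharding matter more for agentic workloads than for short, stateless requests.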
The Panjiu AL128 architecture designed for multi-agent
The Panjiu AL128 server is not just a simple rack of chips. It is a system designed for the simultaneous deployment of hundreds of agents, each with its own context and its own reasoning chain. The 800 GB/s interconnect ensures that agents can communicate with each other without excessive latency.
This agentic orientation positions Alibaba differently from Nvidia, whose chips are primarily optimized for training, with inference being a secondary use case. The M890 reverses this priority: agentic inference is the design target, training comes second.
❌ Common mistakes
Mistake 1: Confusing the M890 with an H100 killer
The M890 is compared to the H20, not the H100. The H20 is a downgraded chip designed to comply with US export caps. Saying that the M890 "beats Nvidia" is misleading: it beats a throttled version of an older architecture. The honest comparison is M890 vs H20, not M890 vs H100 or B200.
Mistake 2: Taking the "3×" literally
Alibaba claims 3× the performance of the H20, but without specifying the metric, the workload, or the testing conditions. This figure is a starting point for discussion, not an established fact. Wait for independent benchmarks before drawing conclusions.
Mistake 3: Ignoring the software factor
A chip without a software ecosystem is useless. The M890 benefits from direct optimization with Qwen, but CUDA remains the industry standard. Raw performance does not always make up for the software deficit, especially for teams that do not use Alibaba models.
Mistake 4: Underestimating the roadmap
Viewing the M890 as a one-shot would be a mistake. The roadmap announced by T-Head covers 2027-2028 with new chips. It is a long-term program, not a one-off demonstration.
❓ Frequently Asked Questions
Is the Zhenwu M890 available for individual developers to purchase?
No. The M890 is deployed via Alibaba's cloud infrastructure (Alibaba Cloud). Developers access its capabilities through the Qwen model APIs, not by purchasing the chip directly.
Can the M890 train models the size of GPT-5.5?
In theory, yes, thanks to the 128-chip Panjiu AL128 architecture. But the total compute power of an M890 cluster likely remains lower than that of the H100/B200 clusters used by OpenAI. The M890 optimizes inference, not frontier-scale training.
What is the difference between the Zhenwu 810E and the M890?
The 810E had 96 GB of HBM and a 700 GB/s interconnect. The M890 moves to 144 GB of HBM3 (+50%) and 800 GB/s (+14%), with a claimed 3× in overall performance. The leap is most notable in memory, which is crucial for large models.
Do independent benchmarks confirm the 3×?
Not yet. The only figures available come from Alibaba. MLPerf results or other third-party benchmarks will likely take several months to appear, as is the case with every new chip.
How does the M890 compare to Huawei Ascend chips?
Both target the same Chinese market, but with different approaches. The Ascend 910C is more integrated into the Chinese government ecosystem, while the M890 benefits from Alibaba's commercial cloud ecosystem. A detailed comparison remains difficult without common benchmarks.
✅ Conclusion
The Zhenwu M890 is not going to "kill" Nvidia — but it proves that China can now build competitive AI chips without access to US hardware. With 144 GB HBM3, a 128-chip cluster, and direct integration with Qwen, Alibaba has set a serious milestone in the race for semiconductor independence. The next step is verifying these 3× claims in real-world conditions.