Tech Growth

M3 Open-Source + 1M Context: MiniMax's 4 Real Advantages in 2026


Listed on HKEX in January 2026, broke $150M ARR in February, filed for STAR Market listing at the end of May, and shipped flagship M3 on June 1 — MiniMax’s 18-month cadence has been unusually intense in the foundation-model world.

But intensity isn’t the same as substance. After sorting through the public materials, stripping out PR-speak, and keeping only the numbers you can independently verify, here are the 4 advantages MiniMax actually stands on as of June 2026. To be clear up front: these aren’t “why I’m bullish” — they’re “facts you can look up yourself.”

For engineers and founders, understanding these 4 points matters more than watching any keynote — they directly shape which model you should be evaluating in the second half of 2026.

1. M3’s Three Firsts: The Only Open-Source Model With All Three

M3's three firsts: coding, context, and multimodality intertwined

MiniMax’s M3, released June 1, isn’t positioned as “competing with X” — it’s positioned as “completing what was missing.” It takes three capabilities that previously only lived in closed overseas frontier models — frontier coding, ultra-long context, and native multimodality — and ships them together for the first time as a fully open-source model.

Three numbers worth remembering:

  • SWE-Bench Pro 59.0%: On this benchmark, which measures whether a model can fix real GitHub issues, M3 beats GPT-5.5 and Gemini 3.1 Pro and approaches Claude Opus 4.7.
  • 1M Token Context: Up to 1 million tokens, and usable at that length in engineering practice, not “the spec says 1M but quality falls off a cliff at 200K.”
  • BrowseComp 83.5: Outperforms Opus 4.7 (79.3) on long-horizon agent benchmarks — currently the strongest open-source option on the agent axis.

Why does “open-source” matter more than you think? It restructures the adoption cost model. Closed APIs charge per token with no ceiling; open-source models can be self-hosted, data stays on-prem, and you can fine-tune for your domain. In effect, the cost ceiling moves from “unbounded” to “controllable.”
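To make the cost argument concrete, here is a minimal break-even sketch. Every number in it (the per-million-token price, the monthly cluster cost, the cluster capacity) is a hypothetical placeholder, not a MiniMax or third-party rate; the point is only the shape of the two curves, linear for per-token billing and stepped for self-hosting.

```python
import math


def api_cost(tokens_m: float, price_per_m: float) -> float:
    """Pay-per-token billing: cost grows linearly with usage and has no ceiling."""
    return tokens_m * price_per_m


def self_hosted_cost(tokens_m: float, fixed_monthly: float, capacity_m: float) -> float:
    """Self-hosting: a fixed monthly cost per cluster, stepping up only when capacity is exceeded."""
    clusters = max(1, math.ceil(tokens_m / capacity_m))
    return clusters * fixed_monthly


def break_even_tokens_m(price_per_m: float, fixed_monthly: float) -> float:
    """Monthly volume (in millions of tokens) above which one cluster is cheaper than the API."""
    return fixed_monthly / price_per_m


# Hypothetical inputs, chosen only for illustration.
PRICE_PER_M = 2.0          # currency units per million tokens via a closed API
FIXED_MONTHLY = 10_000.0   # currency units per month for one self-hosted cluster
CAPACITY_M = 20_000.0      # millions of tokens one cluster can serve per month

if __name__ == "__main__":
    print("break-even (M tokens/month):", break_even_tokens_m(PRICE_PER_M, FIXED_MONTHLY))
    for volume in (1_000, 5_000, 8_000, 25_000):
        print(volume, api_cost(volume, PRICE_PER_M),
              self_hosted_cost(volume, FIXED_MONTHLY, CAPACITY_M))
```

Under these made-up inputs the API is cheaper below 5,000M tokens a month and self-hosting is cheaper above it. A real evaluation would also need to price in engineering time, utilization, and hardware depreciation, which this sketch ignores.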

For enterprise IT decision-makers, M3 is the first open-source model in 2026 that genuinely slots in as “Opus 4.7-grade at a tolerable cost.” That position used to be what the Llama family was trying to claim — and failing.

2. MSA Sparse Attention: 1M Context That Actually Scales

MSA sparse attention: only key nodes connect in a long light stream

Extending context to 1M isn’t the hard part. The hard part is making 1M context fast and affordable. Traditional Transformer attention is O(n²) — double the tokens, quadruple the compute. That’s why most “1M” models are “you can fit it but no one will actually use it.”
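The quadratic claim is easy to check with a back-of-the-envelope count. The sketch below counts only the multiply-adds needed to fill the n × n score matrix for a single attention head; the head dimension of 128 is an assumed, typical value rather than an M3 specification, and it cancels out of the ratio anyway.

```python
def score_matrix_ops(n_tokens: int, head_dim: int = 128) -> int:
    """Multiply-adds needed to fill the n x n score matrix Q @ K^T for one attention head."""
    return n_tokens * n_tokens * head_dim


def cost_ratio(n_small: int, n_large: int) -> float:
    """How many times more score-matrix work the longer context needs."""
    return score_matrix_ops(n_large) / score_matrix_ops(n_small)


if __name__ == "__main__":
    print(cost_ratio(1_000, 2_000))          # doubling the tokens
    print(cost_ratio(200_000, 1_000_000))    # going from 200K to 1M tokens
```

Doubling the context quadruples this term, and going from 200K to 1M tokens (5x the length) multiplies it by 25. That is the cost curve a sparse attention design has to bend.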

MiniMax’s answer: they wrote their own sparse attention architecture, MSA (MiniMax Sparse Attention). Three engineering metrics that matter:

  • Per-token compute at 1/20 of the previous generation: Cost comes back to a manageable range even at full 1M context.
  • Prefilling stage accelerated 9x+: Time-to-first-token on long documents drops dramatically.
  • Decoding stage accelerated 15x+: Generation speed feels close to a 128K-tier model in practice.

There’s a second detail — MiniMax rebuilt the KV read path entirely, running 4x faster than open-source alternatives like Flash-Sparse-Attention and flash-moba. This isn’t “we published a paper” — it’s “we shipped the operator-level changes into production.”
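The article's sources do not describe MSA's internals, so the following is not MSA. It is a toy, pure-Python sketch of one generic sparse-attention pattern (top-k block selection) to show where the savings come from: the query scores a cheap summary of each key block, then runs ordinary softmax attention only over the keys in the best-scoring blocks. The function name, the mean-key summary, and the parameters are all illustrative assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(query, keys, values, block_size, top_k):
    """Attend from one query to only the top_k most relevant key blocks.

    Returns (output_vector, number_of_keys_actually_scored).
    """
    dim = len(keys[0])

    # 1. Split keys into contiguous blocks and summarize each block by its mean key.
    #    A real system would cache these summaries; here they are recomputed for clarity.
    blocks = [(s, min(s + block_size, len(keys))) for s in range(0, len(keys), block_size)]

    def mean_key(start, end):
        return [sum(k[d] for k in keys[start:end]) / (end - start) for d in range(dim)]

    # 2. Rank blocks by query . mean_key and keep the top_k.
    ranked = sorted(blocks, key=lambda b: dot(query, mean_key(*b)), reverse=True)
    selected = sorted(ranked[:top_k])

    # 3. Ordinary scaled softmax attention, restricted to keys in the selected blocks.
    idx = [i for start, end in selected for i in range(start, end)]
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, keys[i]) * scale for i in idx]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(weights)
    out = [sum(w * values[i][d] for w, i in zip(weights, idx)) / total
           for d in range(len(values[0]))]
    return out, len(idx)
```

With 1M keys in blocks of, say, 1,000 and a small top_k, the query scores a thousand summaries plus a few thousand keys instead of a million. The trade-off is that a relevant key inside a poorly summarized block can be missed, which is why real designs put most of their effort into the selection step and the memory access pattern.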

What this means for developers in practice: previously, your agent would “forget” at hour 11 — losing the hardware constraint you set at hour 2 — because the context couldn’t hold it. M3’s 1M context plus MSA raises the agent’s “working memory ceiling” from a few hundred thousand characters to roughly an entire encyclopedia. That’s why MiniMax is comfortable letting M3 autonomously run 12 hours to reproduce an ICLR paper, or 24 hours to optimize a CUDA kernel — it won’t lose its train of thought halfway through.
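The “forgetting at hour 11” failure can be probed before committing to any model. Below is a minimal sketch of a retention check: plant a constraint at the start of a long prompt, pad it with filler, and see whether the answer still reflects the constraint. The two model functions are local stand-ins invented for this example (one sees the whole prompt, one only a trailing window); in a real test you would replace them with a call to whichever model you are evaluating.

```python
from typing import Callable

PREFIX = "CONSTRAINT: "


def build_long_prompt(constraint: str, filler_line: str, n_filler: int, question: str) -> str:
    """Put the constraint first, then n_filler lines of noise, then the question."""
    lines = [PREFIX + constraint]
    lines += [f"{i}: {filler_line}" for i in range(n_filler)]
    lines.append("QUESTION: " + question)
    return "\n".join(lines)


def retains_constraint(model: Callable[[str], str], prompt: str, expected: str) -> bool:
    """True if the model's answer still mentions the expected detail."""
    return expected.lower() in model(prompt).lower()


def full_context_model(prompt: str) -> str:
    """Stand-in for a model that can read the entire prompt."""
    for line in prompt.splitlines():
        if line.startswith(PREFIX):
            return line[len(PREFIX):]
    return "unknown"


def truncating_model(prompt: str, window: int = 2000) -> str:
    """Stand-in for a model whose effective context is only the last `window` characters."""
    return full_context_model(prompt[-window:])


if __name__ == "__main__":
    prompt = build_long_prompt("Target device has 16 GB of RAM",
                               "routine log entry", 1000,
                               "How much RAM does the target device have?")
    print(retains_constraint(full_context_model, prompt, "16 GB"))
    print(retains_constraint(truncating_model, prompt, "16 GB"))
```

Sweeping `n_filler` upward and moving the constraint to different depths gives a rough picture of where recall starts to degrade, which is a more useful number than the advertised context length.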

3. Full-Modality Stack + Globalization: 70% Revenue From Overseas

Full-modality + globalization: text, image, video, and audio orbiting a world map

Many foundation-model companies have only a text leg, with video and audio still on the “future roadmap.” MiniMax is one of the few Chinese companies where every modality lands in the global top tier. The product lineup:

  • Language: The M series (M2.5 → M2.7 → M3), with daily token consumption in February 2026 already 6x that of December 2025.
  • Video: Hailuo 2.3, with continuously improved dynamic expressiveness.
  • Speech: Speech 2.6, end-to-end latency under 250ms — industry-leading.
  • Music: Music 2.6, first-packet latency under 20 seconds, plus 3 open-source Music Skills targeting the Agent ecosystem.

This isn’t a “we have it too” play: the 250ms latency of Speech 2.6, M3’s top score on Claw-Eval, and Music 2.6’s open-source Agent Skills are each concrete, individually verifiable leads.

The business side is equally concrete. 2025 annual report figures:

  • Total annual revenue: $79.04M, +158.9% YoY.
  • Gross margin flipped from -24.7% in 2023 to +25.4% two years later.
  • Over 70% of revenue comes from international markets.
  • Reach: 236M individual users across 200+ countries, plus 214,000 enterprise customers and developers across 100+ countries.
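Two derived numbers follow directly from the figures above and are worth computing yourself rather than taking on faith. The sketch below backs out the implied 2024 revenue from the 2025 total and the year-over-year growth rate, and the size of the gross-margin swing. Both outputs are arithmetic on the article's inputs, not separately reported figures.

```python
def implied_prior_year(current: float, yoy_growth_pct: float) -> float:
    """Back out last year's value from this year's value and the YoY growth rate."""
    return current / (1 + yoy_growth_pct / 100)


def swing_in_points(start_pct: float, end_pct: float) -> float:
    """Change between two percentages, in percentage points."""
    return end_pct - start_pct


REVENUE_2025_MUSD = 79.04   # total annual revenue, from the article
YOY_GROWTH_PCT = 158.9      # year-over-year growth, from the article
MARGIN_2023_PCT = -24.7     # gross margin in 2023, from the article
MARGIN_2025_PCT = 25.4      # gross margin two years later, from the article

if __name__ == "__main__":
    print(round(implied_prior_year(REVENUE_2025_MUSD, YOY_GROWTH_PCT), 2))
    print(round(swing_in_points(MARGIN_2023_PCT, MARGIN_2025_PCT), 1))
```

The implied 2024 revenue comes out at roughly $30.53M, and the gross margin moved by about 50.1 percentage points over two years.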

70% overseas revenue is rare among Chinese AI companies. It signals two things: first, MiniMax isn’t just playing in the Chinese-language sandbox — its English and other-language support is genuinely usable and monetizable. Second, the business model has passed the “funded by VC runway” phase, with gross margin improving for two consecutive years and net loss rate narrowing significantly.

What this means for founders: when evaluating an AI middleware vendor, “can serve global customers” isn’t just a technical capability — it’s a business-continuity question. A company with majority overseas revenue runs a smoother business cycle than one with purely domestic revenue.

4. Platform Upgrade: Token Plan and the Business Flywheel

Platform upgrade: a foundation model layer supporting a multi-tier application ecosystem

On the same day M3 launched, MiniMax rolled out a new subscription plan, Token Plan, pooling all modalities (text / image / speech / music) into a single quota. Plus at ¥49/month (6 billion tokens), Max at ¥119/month (18 billion), Ultra at ¥469/month (55 billion).
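A quick way to compare the three tiers is to normalize them to price per billion tokens. The sketch below uses the prices and quotas exactly as quoted above. Because the quota pools several modalities, treat the result as a rough text-equivalent figure rather than an exact unit cost.

```python
# (price in yuan per month, quota in billions of tokens), as quoted in the article
PLANS = {
    "Plus": (49, 6),
    "Max": (119, 18),
    "Ultra": (469, 55),
}


def yuan_per_billion(price: float, quota_billions: float) -> float:
    """Effective price of one billion tokens if the whole quota is used."""
    return price / quota_billions


def cheapest_plan(plans: dict) -> str:
    """Name of the plan with the lowest effective price per billion tokens."""
    return min(plans, key=lambda name: yuan_per_billion(*plans[name]))


if __name__ == "__main__":
    for name, (price, quota) in PLANS.items():
        print(name, round(yuan_per_billion(price, quota), 2))
    print("lowest unit price:", cheapest_plan(PLANS))
```

On these numbers the tiers work out to about ¥8.17, ¥6.61, and ¥8.53 per billion tokens, so Max has the lowest unit price and Ultra buys capacity rather than a discount. The comparison assumes the full quota is consumed each month.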

Benchmarked against overseas pricing: Plus at ¥49 buys roughly 5x the monthly capacity of Claude Pro at 100. For individual developers, this is the first time a flagship open-source model experience is sold at Claude-tier subscription pricing.

But the more interesting shift is the business model itself. Founder Yan Junjie said it clearly on the 2025 earnings call: MiniMax will upgrade from “foundation model company” to “AI platform company,” with intelligence density × token throughput as the core metric.

This isn’t just a slogan. Three concrete signals:

  1. M3 and MiniMax Code are co-trained: On M3 launch day, the companion agent coding tool MiniMax Code also got updated. Both were trained with the same data and RL pipeline — not “model first, tool bolted on later.”
  2. Notion Custom Agents chose M2.5 as its first open-weight model: This is the international productivity-tool ecosystem voting with its roadmap.
  3. Dual listing path — HKEX + STAR Market: HKEX listing in January, STAR Market filing at the end of May. The capital markets are voting with their wallets.

The key to the platform upgrade isn’t “we have an app too.” It’s the four gears of “model × Agent × developer ecosystem × capital” meshing together. As of June 2026, MiniMax is one of the few Chinese AI companies with all four gears turning at once.

Closing: Put M3 Back Into Your Model-Selection Decisions

After walking through 4 advantages, the most important takeaway isn’t “MiniMax is great” — it’s how you use this information in your next decision. Three concrete actions:

  1. If you’re choosing a backend for an agentic coding model: Add M3 to your eval list. SWE-Bench Pro 59% isn’t a numbers game — it’s “real GitHub issue fix rate,” closer to production than any demo.
  2. If your application has 1M-tier long-context needs (legal due diligence, financial report audits, large-codebase analysis, long-video summarization): M3 + MSA is the most PoC-worthy open-source option of H1 2026.
  3. If you’re assessing a vendor’s sustainability: 70% overseas revenue + two consecutive years of gross margin improvement + dual listing path on HKEX and STAR Market is a relatively complete set of business-continuity evidence.

MiniMax isn’t perfect — M3 still has room to improve in multi-turn dialogue naturalness, vertical-domain fine-tuning maturity, and Traditional Chinese localization. But as of June 2026, it is genuinely one of the few Chinese AI companies delivering verifiable, on-the-numbers results across all three lines of technical leadership, business sustainability, and open-source ecosystem.

The rest is for you to try yourself.

Disclaimer: All model benchmark data, financial figures, and product specifications cited in this article are sourced from MiniMax official announcements, third-party technical reports (SegmentFault, Juejin, Zhihu, financial media), and public press releases, as of the writing date of June 2026. Technology iterates rapidly, and product features and business data may change over time; please refer to the latest official announcements. This article does not constitute investment advice; AI technology adoption should be evaluated based on your own business context.
