Mistral shipped two models under the Devstral 2 name on December 9, 2025, and as of April 2026 practitioners are still untangling what “open source” means in Mistral’s licensing vocabulary. The answer depends on which model you mean. Devstral Small 2 (24B) is the genuinely open variant — Apache 2.0, no restrictions, fits in 14 GB. The flagship 123B carries a revenue clause that most companies above mid-market scale will quietly breach.

What Devstral 2 Actually Is: Two Models, Two Licenses, Two Stories

Mistral’s December 9, 2025 release[1] packaged two distinct models under one product name:

  • Devstral 2 (123B parameters): the flagship, positioned at enterprise-grade agentic coding tasks
  • Devstral Small 2 (24B parameters): the laptop-friendly variant, positioned for local inference

Both share a 256K token context window[2]. The capability gap between them is considerably smaller than the parameter ratio implies — but the licensing difference is substantial enough to determine which one your legal team will approve.

The Modified MIT Trap: Why the 123B Model Isn’t Open Source for Most Companies

Devstral 2 (123B) ships under a “modified MIT” license. The binding clause reads[3]:

“You are not authorized to exercise any rights under this license if the global consolidated monthly revenue of your company (or that of your employer) exceeds $20 million”

That’s $240M in annual revenue as the ceiling. Organizations above that threshold require a separate commercial agreement with Mistral to use the 123B model legally.
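The threshold arithmetic is simple enough to sketch directly; the $20M/month figure comes from the license clause quoted above, and the function name here is illustrative, not part of any official tooling:

```python
MONTHLY_CAP_USD = 20_000_000  # revenue clause in the 123B "modified MIT" license


def requires_commercial_license(global_monthly_revenue_usd: float) -> bool:
    """True if the company (or its parent) exceeds the monthly revenue cap."""
    return global_monthly_revenue_usd > MONTHLY_CAP_USD


# Annual ceiling implied by the clause: $240M/year
print(MONTHLY_CAP_USD * 12)                      # 240000000
print(requires_commercial_license(25_000_000))   # True: needs a Mistral contract
print(requires_commercial_license(5_000_000))    # False: within the cap
```

Note that the clause keys on *global consolidated* revenue, so a small subsidiary inherits its parent company’s number.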

The Open Source Initiative’s definition of open source explicitly prohibits discrimination against persons, groups, or fields of endeavor[3]. A revenue cap violates this directly. Devstral 2 (123B) is source-available with commercial restrictions — it does not qualify as open source under the OSI standard.

Mistral’s marketing describes both models as “open-source and permissively licensed” without clearly distinguishing the revenue gate on the 123B. The distinction matters.

Devstral Small 2: The Actually-Open-Source Coding Agent That Fits in 14 GB

Devstral Small 2 (24B) ships under genuine Apache 2.0[4]. No revenue restrictions. No commercial use limitations. No requirement to negotiate a separate license. You can deploy it, modify it, fine-tune it, and redistribute it without legal exposure — regardless of your company’s revenue.

The capability gap between Small 2 and the 123B is narrower than the marketing suggests. On SWE-bench Verified, Small 2 scores 68.0% versus the 123B’s 72.2%[2] — a 4.2 percentage point difference at one-fifth the parameters.

Benchmarks on Real Agentic Tasks

SWE-bench evaluates a model’s ability to resolve real GitHub issues from popular open-source repositories. It’s a more meaningful proxy for coding agent performance than completion-style benchmarks like HumanEval. Devstral 2’s reported scores[2]:

| Model | SWE-bench Verified | SWE-bench Multilingual | Terminal Bench 2 |
| --- | --- | --- | --- |
| Devstral 2 (123B) | 72.2% | 61.3% | 32.6% |
| Devstral Small 2 (24B) | 68.0% | 55.7% | not reported |

Terminal Bench 2 scores for Small 2 were not reported in the available data as of April 2026. The multilingual gap (5.6 points) is slightly wider than the Verified gap, suggesting the 123B’s advantage is more pronounced on non-English codebases.

Mistral also claims Devstral 2 is up to 7× more cost-efficient than Claude Sonnet at real-world agentic tasks, priced after the free period at $0.40/$2.00 per million tokens (input/output) for the 123B and $0.10/$0.30 for Small 2[5]. The specific Sonnet version and benchmark methodology behind the 7× figure are not clearly documented, so treat it as directional rather than a precise comparison.
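The listed prices make back-of-envelope cost estimates easy. A sketch, using the per-million-token rates above; the 2M-input/200K-output session is a hypothetical workload, not a published benchmark:

```python
# Post-free-period API prices, USD per million tokens (input, output).
PRICES = {
    "Devstral 2 (123B)":      {"input": 0.40, "output": 2.00},
    "Devstral Small 2 (24B)": {"input": 0.10, "output": 0.30},
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one session at the listed per-token rates."""
    p = PRICES[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]


# Hypothetical agentic session: 2M input tokens (repo context), 200K output.
for model in PRICES:
    print(model, round(run_cost(model, 2_000_000, 200_000), 2))
# Devstral 2 (123B) 1.2
# Devstral Small 2 (24B) 0.26
```

At these rates the 123B costs roughly 4–7× more per session than Small 2, depending on the input/output mix.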

Running It Locally: Hardware Requirements and Quantization Tradeoffs

Devstral 2 (123B) requires at least four H100 GPUs for self-hosted deployment[2]. That’s outside the reach of individual practitioners and most small teams without enterprise infrastructure.

Devstral Small 2 operates on a completely different hardware tier[4]:

| Quantization | Size | Hardware target |
| --- | --- | --- |
| Q4_K_M | 14.33 GB | Single RTX 4090 or Apple Silicon Mac (32 GB) |
| Q6_K_L | 19.67 GB | 16 GB RAM + 12 GB VRAM (28 GB combined) |
| Q8_0 | 25.06 GB | Prosumer or higher-end consumer hardware |

Q4_K_M is the practical entry point. It fits on widely available consumer hardware while preserving most of the model’s capability. Q6_K_L is worth considering if you have split CPU/GPU RAM available and want higher precision. Q8_0 is full-quality but requires dedicated higher-end hardware.
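A quantized model needs more than its file size at runtime: KV cache and activations claim additional memory, growing with context length. A rough fit check, using the file sizes from the table; the 20% working-memory margin is an assumption for illustration, not a published figure:

```python
# Quantized file sizes in GB, from the table above.
QUANTS = {"Q4_K_M": 14.33, "Q6_K_L": 19.67, "Q8_0": 25.06}


def fits(quant: str, available_gb: float, margin: float = 0.20) -> bool:
    """Crude check: file size plus a working-memory margin for KV cache
    and activations must fit in available VRAM or unified memory.
    The 20% margin is a ballpark and grows with context length."""
    return QUANTS[quant] * (1 + margin) <= available_gb


print(fits("Q4_K_M", 24.0))  # True: single RTX 4090 (24 GB VRAM)
print(fits("Q8_0", 24.0))    # False: needs more than a 4090 offers
```

Long contexts shift the picture: filling a meaningful fraction of the 256K window inflates the KV cache well past a fixed 20% margin, so budget accordingly.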

Mistral Vibe CLI: The Terminal Agent Bundled with the Release

Mistral released Mistral Vibe CLI alongside Devstral 2[1] — a terminal agent that automates software engineering tasks end-to-end using Devstral as the backend. It ships under Apache 2.0.

The CLI enters a space alongside other terminal coding agents, though as of April 2026 it’s a new release with limited independent evaluation. It’s worth monitoring as the toolchain matures, particularly for workflows that favor a CLI interface over IDE integrations.

Who Should Use Which Model (and Under What Terms)

| Scenario | Recommended model | Reason |
| --- | --- | --- |
| Individual developer, local inference | Small 2 (Q4_K_M) | Apache 2.0, consumer GPU, no legal exposure |
| Startup below $20M/month revenue | Either | 123B is in-scope; Small 2 eliminates cap risk entirely |
| Company above $20M/month revenue | Small 2 or commercial license | 123B modified MIT requires separate Mistral contract |
| Subsidiary of a large parent company | Small 2 | Parent’s global revenue determines eligibility |
| API usage, cost-sensitive, eligible | 123B via Mistral API | $0.40/$2.00 per million tokens |

The benchmark difference — 4.2 points on SWE-bench Verified — is real but unlikely to be meaningful for most production coding workloads. For teams that can use either model legally, Small 2 is the variant with fewer infrastructure requirements, no licensing ambiguity, and hardware requirements that match what practitioners actually own.
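The decision logic above reduces to two questions: do you need local inference, and does the revenue gate apply? A sketch encoding the table’s outcomes; the function and its return strings are illustrative only:

```python
MONTHLY_CAP_USD = 20_000_000  # threshold from the 123B license clause


def pick_model(global_monthly_revenue_usd: float, local_only: bool) -> str:
    """Illustrative encoding of the scenario table. Revenue means the
    parent company's global consolidated monthly revenue."""
    if local_only:
        # Hardware and license both point the same way for local use.
        return "Devstral Small 2 (Q4_K_M)"
    if global_monthly_revenue_usd > MONTHLY_CAP_USD:
        return "Small 2, or 123B under a separate commercial Mistral contract"
    return "Either; Small 2 eliminates cap risk entirely"


print(pick_model(5_000_000, local_only=True))    # Devstral Small 2 (Q4_K_M)
print(pick_model(50_000_000, local_only=False))  # needs the commercial path
```

The only branch that forces a conversation with Mistral is self-hosting the 123B above the cap; every other path has a clean answer.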

FAQ

Does the $20M/month revenue cap apply if I use Devstral 2 (123B) through Mistral’s managed API rather than self-hosting?

The modified MIT license governs use of the model weights directly. API access is governed by Mistral’s separate API terms of service, which may differ. If you’re calling Mistral’s managed API rather than hosting the weights yourself, verify the commercial API terms for your revenue tier — the weight-level license may not apply, but Mistral’s API terms could impose equivalent restrictions. When in doubt, using Small 2 via the API eliminates the question entirely.

What is Terminal Bench 2, and is the 32.6% score good?

Terminal Bench 2 evaluates agents operating through a terminal on longer-horizon, multi-step tasks — closer to real-world agentic workflows than single-turn benchmarks. The 123B’s 32.6%[2] reflects the genuine difficulty of these tasks; it is not a ceiling-scraping score, but agentic task benchmarks are generally harder than code completion benchmarks and scores across the field are lower. No comparative scores for other models at the same tier were included in the sourced data.

Can I fine-tune Devstral Small 2 and ship the fine-tuned model in a product?

Apache 2.0 permits modification, fine-tuning, and redistribution of derivatives without additional licensing requirements. You are not obligated to open-source fine-tuned weights or notify Mistral. The 123B under modified MIT applies the same $20M/month threshold to any derived works.


Footnotes

  1. Mistral AI. “Introducing: Devstral 2 and Mistral Vibe CLI.” https://mistral.ai/news/devstral-2-vibe-cli

  2. Hugging Face. “mistralai/Devstral-2-123B-Instruct-2512 — Model Card.” https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512

  3. Implicator.ai. “Mistral’s ‘Open Source’ Trick: Build a Great Model, Gate It Behind Revenue Caps, Call It Freedom.” https://www.implicator.ai/mistrals-open-source-trick-build-a-great-model-gate-it-behind-revenue-caps-call-it-freedom/

  4. Hugging Face (bartowski). “Devstral-Small-2-24B-Instruct-2512 GGUF — quantization specs and Apache 2.0 license.” https://huggingface.co/bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF

  5. Simon Willison’s Weblog. “Devstral 2.” https://simonwillison.net/2025/Dec/9/devstral-2/
