The Open Question

On April 24, DeepSeek released V4, its most capable open weight model to date, with performance on agentic benchmarks sitting alongside GPT-5.5 and Claude Opus 4.7 at a fraction of the API price. On April 28, Elon Musk and Sam Altman sat across from each other in federal court in Oakland, fighting over whether OpenAI betrayed the open, nonprofit mission it was founded on in 2015. And today, April 29, while I was drafting this article, Mistral released Medium 3.5: frontier-class, multimodal, agentic, open weight, available to download and self-host.

The open vs. closed debate in AI is usually framed as a philosophical argument. It is also, increasingly, an economic and geopolitical one. And the pace at which it is moving makes it worth getting the terms right.

First, a distinction that matters

Open source and open weight are not the same thing. Most AI coverage uses them interchangeably. That conflation is how most of this conversation goes wrong.

Open weight means the model's parameters are publicly downloadable. You can run them locally, fine-tune them on your own data, and deploy them without paying per token. The training data, full architecture details, and source code may not be disclosed. Most models described as "open" (Llama, Mistral, DeepSeek) are technically open weight.

Open source means the full stack is available: weights, training data, methodology, and code. Rare in practice at frontier scale. Essentially nonexistent today among models that compete with closed frontier performance.

Closed means API-only. You send a request, you receive a response, you pay per token. You have no access to the underlying model. Examples: OpenAI's GPT series, Anthropic's Claude, Google's Gemini, xAI's Grok.

The distinction matters for practical reasons. Large institutions can negotiate on-premise deployments with closed providers. For everyone else operating in regulated environments (fintechs, regional banks, payment processors), open weight isn't a philosophical preference. It's often the only architecturally viable path.

What each major player is actually doing

The landscape is less binary than the headlines suggest.

OpenAI is fully closed at the frontier. It is currently in federal court, with Musk's lawsuit centered on whether the company betrayed its founding mission to develop AI for the benefit of humanity. Last year, OpenAI released open weight models for the first time since GPT-2, not out of principle, as Sam Altman himself admitted, but because DeepSeek forced its hand. The direction is right. The distance from the 2015 founding vision still matters.

Anthropic runs a fully closed model at the frontier. Its contribution to the open ecosystem is at the infrastructure layer. MCP (Model Context Protocol), donated to the Linux Foundation's Agentic AI Foundation in December 2025, is now the industry standard for connecting AI systems to tools and data, with over 97 million monthly SDK downloads. The bet: open infrastructure standards matter even when model weights stay proprietary.

Google hedges. Gemini at the frontier is closed. Gemma, its smaller model family, is open weight. It is the only major lab currently playing both sides simultaneously without announcing a strategic pivot.

Meta is the most consequential shift to watch. The company that penned a memo titled "Open Source AI is the Path Forward" in 2024 is quietly developing Avocado, a proprietary frontier model, after Llama 4 failed to captivate developers. Open weight versions of Avocado are planned, eventually, but with key features excluded for safety and competitive reasons. Whether Llama continues as a serious open weight effort or becomes a secondary release track is the question the next twelve months will answer.

Mistral has been the most consistent: open weight models under permissive licenses, a real commercial business, European data sovereignty positioning. Medium 3.5, released April 29 (frontier-class, multimodal, agentic, Modified MIT license), is the clearest statement of that strategy to date. The timing is its own argument.

DeepSeek is open weight as geopolitical strategy. US export controls restrict Chinese labs' access to Nvidia's most advanced chips. Open weight adoption at scale, globally and across industries, compensates. DeepSeek V4 Pro, released April 24 under the MIT license, posts benchmark scores alongside closed frontier models. It runs on Huawei's domestic Ascend chips. That last detail is worth more attention than it gets.

[Chart: API cost per 1M tokens, April 2026, comparing closed and open weight AI models. API pricing shown. Self-hosted open weight models carry no per-token cost, infrastructure only. Sources: official provider pricing pages, April 2026. Total cost depends on your input/output ratio.]

The chart above shows API pricing: what you pay per token when accessing each model through its official API. For open weight models, the per-token fee drops to zero when self-hosted. The cost shifts from token fees to infrastructure: GPUs, cloud compute, engineering time to manage deployment. For high-volume production workloads, that trade-off frequently favors self-hosting. For smaller teams without infrastructure expertise, the API is often the right starting point.
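To make that trade-off concrete, here is a minimal break-even sketch. Every number in it is a hypothetical placeholder, not a figure from the chart or from any provider's pricing page, and the function names are my own. It assumes self-hosting can be approximated as a flat monthly infrastructure cost, which ignores utilization, scaling steps, and engineering time beyond what you fold into that number.

```python
def blended_price(input_price: float, output_price: float, input_share: float) -> float:
    """Blended API price per 1M tokens for a given input/output mix.

    input_share is the fraction of total tokens that are input tokens.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_price * input_share + output_price * (1.0 - input_share)


def monthly_api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """API bill for a month at a given blended price per 1M tokens."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(monthly_infra_cost: float, price_per_million: float) -> float:
    """Monthly token volume at which the API bill equals the flat infra cost."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return monthly_infra_cost / price_per_million * 1_000_000


# Hypothetical inputs, for illustration only.
price = blended_price(input_price=1.0, output_price=4.0, input_share=0.75)
threshold = break_even_tokens(monthly_infra_cost=14_000.0, price_per_million=price)
```

With these made-up inputs the blended price is 1.75 per 1M tokens and the break-even sits at 8 billion tokens per month. Below that volume the API is cheaper under these assumptions; above it, the flat infrastructure cost wins.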

Mistral Medium 3.5 had been live for a few hours when these numbers were locked.

Why open source matters, and why it was OpenAI's founding premise

The Musk and Altman trial is not really about 2015 emails. It is a public argument about who controls the infrastructure of general intelligence, and whether that control should be concentrated or distributed.

Musk argues that Altman steered OpenAI away from its original mission to develop advanced AI for the benefit of humanity. Whatever you think of his motives, the founding premise was clear: the benefits of this technology should flow broadly, not be captured by a single closed corporation. The name was not accidental.

The argument for openness is not ideological. It is structural. Open models let smaller teams compete. They let regulated industries self-host for compliance. They prevent single-vendor dependency at infrastructure scale. They distribute the gains of the technology rather than concentrating them.

Open source has been one key channel through which China aims to compete with the US, by rapidly scaling adoption. That geopolitical dimension makes the question bigger than any one company's strategy.

The Linux Foundation sees this clearly. The Agentic AI Foundation, launched in December 2025 with Anthropic, OpenAI, Block, Google, Microsoft and others as founding members, exists to ensure that as AI moves from chatbots to autonomous agents, it does so on open standards that avoid fragmentation and vendor lock-in. MCP, AGENTS.md, and Goose, three of the foundational protocols for agentic AI interoperability, are now governed by a neutral, community-driven foundation. The model layer may be contested. The connective tissue between models does not have to be.

The model is only part of the equation

The model itself is rarely the binding constraint. What actually determines whether an AI system works in a production business context:

The orchestration layer is how you manage context, chain calls, handle failures, and route between models depending on task complexity. A well-designed routing system that sends simple queries to a cheaper model and reserves the frontier for genuinely hard problems will outperform any single-model strategy on both cost and quality.
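As a rough illustration of that routing idea, the sketch below scores a prompt with a toy heuristic and picks a tier. The keyword list, the scoring weights, the threshold, and the tier names and prices are all hypothetical. A production router would more plausibly use a trained classifier or the cheap model's own confidence signal.

```python
from dataclasses import dataclass

# Hypothetical markers of "hard" tasks; tune or replace for a real workload.
HARD_TASK_KEYWORDS = ("analyze", "reconcile", "prove", "multi-step")


@dataclass(frozen=True)
class ModelTier:
    name: str
    price_per_million_tokens: float


def estimate_complexity(prompt: str) -> float:
    """Toy score in [0, 1]: longer prompts and hard-task keywords score higher."""
    length_score = min(len(prompt.split()) / 200, 1.0) * 0.5
    lowered = prompt.lower()
    keyword_score = 0.5 if any(k in lowered for k in HARD_TASK_KEYWORDS) else 0.0
    return length_score + keyword_score


def route(prompt: str, cheap: ModelTier, frontier: ModelTier,
          threshold: float = 0.5) -> ModelTier:
    """Send the prompt to the frontier tier only when it looks hard enough."""
    return frontier if estimate_complexity(prompt) >= threshold else cheap


# Hypothetical tiers, for illustration only.
cheap_tier = ModelTier("small-open-weight", 0.5)
frontier_tier = ModelTier("closed-frontier", 10.0)
```

Here `route("What is my balance?", cheap_tier, frontier_tier)` returns the cheap tier, while a prompt asking to "reconcile" two ledgers goes to the frontier tier. The design point is that the routing decision lives in your code, not in the model.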

The connector infrastructure is how the model integrates with your existing systems, data sources, and workflows. This is where MCP matters in practice. A closed model with excellent tooling frequently outperforms an open weight model that requires your team to build the surrounding infrastructure from scratch.

The compliance layer includes audit trails, access controls, data residency, PII handling. In financial services this is not optional. A closed API model that sends customer data to a third-party server may be architecturally incompatible with your regulatory obligations regardless of its benchmark scores. This is where self-hosted open weight models have a structural advantage that no amount of enterprise SLA language fully resolves.
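One way to picture the data residency constraint is a guard that decides where a request may go before any model is called. The sketch below is deliberately naive: the two regular expressions are simplistic illustrations, not compliance-grade PII detection, and the deployment labels are hypothetical.

```python
import re

# Naive illustrative patterns; real PII detection needs far more than this.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def contains_pii(text: str) -> bool:
    """Return True if the text appears to contain an email or card-like number."""
    return bool(EMAIL_PATTERN.search(text) or CARD_PATTERN.search(text))


def choose_deployment(text: str) -> str:
    """Keep anything that looks like customer data on self-hosted infrastructure."""
    return "self_hosted" if contains_pii(text) else "external_api"
```

A request such as "Card 4111 1111 1111 1111 was charged twice" would stay on the self-hosted path, while "Summarize the Q3 release notes" could go to an external API. In a regulated deployment this decision would also be written to the audit trail.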

The evaluation infrastructure is how you know when the model is wrong before it costs you. Klarna cut from 5,500 to fewer than 3,000 employees in 2024 on an AI-first thesis, then started hiring again after customer service quality deteriorated enough that the CEO publicly admitted they had gone too far. The lesson is not that AI has limits. The lesson is that deploying AI in contexts where human judgment is non-negotiable (a customer disputing a fraudulent charge on a financial product, for example) without the evaluation infrastructure to catch failures before they reach production is how the bill arrives quietly, in the metrics that matter most.
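A minimal version of "catching failures before they reach production" is a release gate: run a fixed set of labeled cases through the candidate model and block the rollout if the pass rate falls below a threshold. The sketch below assumes exact-match grading, a hypothetical 95% bar, and a model exposed as a plain function from prompt to answer. Real evaluation suites typically need richer graders and human review for judgment-heavy cases.

```python
from typing import Callable, Sequence, Tuple

Case = Tuple[str, str]  # (prompt, expected answer)


def pass_rate(model: Callable[[str], str], cases: Sequence[Case]) -> float:
    """Fraction of cases where the model's answer matches the expected one."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    passed = sum(
        1
        for prompt, expected in cases
        if model(prompt).strip().lower() == expected.strip().lower()
    )
    return passed / len(cases)


def release_gate(model: Callable[[str], str], cases: Sequence[Case],
                 min_pass_rate: float = 0.95) -> bool:
    """Return True only if the model clears the required pass rate."""
    return pass_rate(model, cases) >= min_pass_rate
```

If a stub model gets three of four cases right, `pass_rate` returns 0.75 and `release_gate` returns False at the default threshold, so the deployment would not proceed.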

The real trade-offs, without the ideology

Open weight models are self-hostable, which is critical for data residency and compliance in regulated environments. They can be fine-tuned on proprietary data without exposing it to a third party. They carry no vendor dependency or API risk, and cost structures become predictable at scale. The trade-off is real: infrastructure expertise and engineering investment are required to deploy and maintain them.

Closed models offer frontier performance, continuously updated, with no infrastructure burden. Compliance documentation, security, and uptime SLAs are managed by the provider. The trade-off is also real: per-token cost scales directly with usage, and vendor dependency and data egress remain concerns that enterprise contracts partially but never fully resolve.

Neither column wins categorically. The right answer depends on your data residency constraints, your volume, your team's infrastructure capability, and the specific judgment requirements of your use case.

The question that is actually being asked

Meta's retreat from "open source is the path forward" to a proprietary frontier model is not a betrayal. It is a data point. Frontier performance is expensive. The economics of openness get harder as the compute bill gets larger. Zuckerberg's AI infrastructure commitment runs to $600 billion through 2028. At that scale, giving away your best model is a different calculation than it was in 2024.

What is worth watching: whether the gap between open weight and closed frontier performance keeps compressing the way it has for the past eighteen months, or whether closed labs pull away at the capability frontier in ways open weight models cannot follow.

DeepSeek V4 Pro and Mistral Medium 3.5, both released within days of each other, suggest the compression is continuing. If it does, the open vs. closed debate resolves not through principle but through performance parity. The model becomes a commodity. The scarcity shifts, again, to the orchestration layer, the compliance infrastructure, and the judgment of the people building on top of it.

Which is, at this point, a familiar conclusion.

Sources:
· Google / Gemini 3.1 Pro - https://ai.google.dev/pricing
· OpenAI / GPT-5.4 - https://openai.com/api/pricing
· xAI / Grok 4.2 - https://docs.x.ai/developers/models
· Anthropic / Claude Opus 4.7 - https://claude.com/pricing#api
· DeepSeek / V4 Pro - https://api-docs.deepseek.com/news/news260424
· Mistral / Medium 3.5 - https://mistral.ai/pricing#api

Closed API or self-hosted open weight: which are you running in production, what drove the decision, and would you make the same call today?

Mathieu Flamant

Founder · CTO · mathieuflamant.com