DeepSeek V4 Just Reshuffled Europe's Open-Source AI Procurement Calculus
DeepSeek's V4 preview, released on 25 April 2026, ships a one-million-token context window, V4-Flash output priced at $0.28 per million tokens, and full training on Huawei Ascend silicon. For European financial services teams weighing cost, sovereignty, and regulatory compliance, the launch changes the maths overnight.
DeepSeek has released a preview of its V4 model, and the launch is the most consequential open-source AI event for European enterprise buyers since Mistral's Mixtral drop in late 2023. The Hangzhou-based startup shipped two Mixture-of-Experts variants on 25 April 2026 and priced Pro-tier output at roughly $3.48 per million tokens, pulling frontier-grade reasoning into reach for any European developer, compliance team, or financial institution willing to self-host the weights.
Key Takeaways
DeepSeek V4-Flash output costs $0.28 per million tokens, ten times cheaper than comparable Western frontier models.
Both V4 variants support a one-million-token context window, matching Claude and Gemini architecturally.
V4 was trained end-to-end on Huawei Ascend 950 silicon, the first globally competitive model to achieve this.
Open weights on Hugging Face mean EU enterprises can self-host entirely, avoiding third-country API dependencies.
European regulators and financial supervisors will scrutinise data provenance and safety evaluations closely.
What V4 Actually Ships
V4 arrives in two configurations. DeepSeek-V4-Pro is a 1.6-trillion-parameter MoE model with 49 billion parameters activated per forward pass. DeepSeek-V4-Flash runs 284 billion total parameters with 13 billion active. Both support a one-million-token context window, putting the family in the same architectural league as the most recent Claude and Gemini frontier releases.
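To put that sparsity in context, a quick back-of-the-envelope calculation using the figures above shows how small the active slice is per token:

```python
# Back-of-the-envelope: fraction of parameters active per forward pass
# for each V4 variant, using the figures quoted above.
variants = {
    "V4-Pro":   {"total_b": 1600, "active_b": 49},
    "V4-Flash": {"total_b": 284,  "active_b": 13},
}

for name, p in variants.items():
    ratio = p["active_b"] / p["total_b"]
    print(f"{name}: {p['active_b']}B of {p['total_b']}B active ({ratio:.1%})")

# V4-Pro:   49B of 1600B active (3.1%)
# V4-Flash: 13B of 284B active (4.6%)
```

Only three to five per cent of the network fires on any given token, which is what lets a 1.6-trillion-parameter model serve inference at these prices.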
The model card on Hugging Face confirms the open-weights release. DeepSeek also claims V4 can operate autonomously on tasks such as writing and debugging multi-file codebases, a direct signal that the company is now optimising for agent workflows rather than conversational chat. For European financial services teams building document-processing pipelines, trade-surveillance agents, or regulatory-reporting assistants, that agentic capability at this price point is a genuine inflection.
The pricing headline is stark. V4-Flash output costs $0.28 per million tokens. That is more than a tenfold discount on comparable Western frontier output, and it sits well below what most European cloud resellers currently charge for GPT-class or Claude-class inference.
The Huawei Silicon Angle Is The Real Story For European Procurement
The chip provenance matters as much as the model scores. DeepSeek confirmed that V4 was trained and is being served on Huawei Ascend 950 silicon clusters connected via Huawei's Supernode interconnect, with Cambricon providing supporting accelerators. This is the first time a globally competitive frontier-class model has been trained and served end-to-end on Chinese-designed chips, with no Nvidia dependency in the training run.
For European cloud buyers, that fact cuts in multiple directions:
It demonstrates that frontier-quality inference no longer requires Nvidia H100 or H200 hardware, loosening one supply-chain constraint that has driven up European GPU cluster costs since 2023.
It introduces a distinct hardware provenance question for any EU financial institution governed by DORA, the Digital Operational Resilience Act, which requires firms to document and stress-test third-party technology dependencies.
It widens the gap between Nvidia-anchored stacks used by most European hyperscalers and the Huawei-anchored stack now available as a self-hosted option.
Critically, the open weights run on any sufficiently capable accelerator stack. European operators can deploy V4 on Nvidia H100, H200, or Blackwell systems just as readily. The Huawei training story is a proof point about capability, not a constraint on deployment.
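As an illustration of what self-hosting looks like in practice, here is a minimal loading sketch with Hugging Face transformers. The repo id, and the assumption that V4 follows DeepSeek's usual trust_remote_code packaging, are ours, not confirmed by the model card:

```python
# Minimal self-hosting sketch with Hugging Face transformers.
# The repo id "deepseek-ai/DeepSeek-V4" is illustrative; check the
# actual model card for the published name and licence terms.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V4"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # MoE weights are large; bf16 halves memory
    device_map="auto",            # shard across whatever accelerators are present
    trust_remote_code=True,       # DeepSeek releases typically ship custom model code
)

prompt = "Summarise MiFID II best-execution duties:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same weights load identically whether the boxes underneath are Ascend, H100, or Blackwell; the serving stack, not the training provenance, determines the hardware bill.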
Why This Lands Differently In European Financial Services
DeepSeek's January 2025 release rattled US markets. V4 is unlikely to reprice European equities overnight, because sophisticated investors have already priced in the existence of capable Chinese open models. The shift this time is structural and operational.
Consider the procurement decision facing a mid-sized European bank today. Under the EU AI Act, high-risk AI systems used in credit scoring, fraud detection, or customer-facing advice require documented conformity assessments. Under DORA, the same institution must map concentration risk in its AI supply chain. An open-weights model that can be self-hosted in a Frankfurt or Amsterdam data centre, with no API call leaving the EEA, addresses both concerns simultaneously in a way that a hosted OpenAI or Anthropic endpoint does not.
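Concretely, a self-hosted deployment usually sits behind an OpenAI-compatible server such as vLLM, so application code only ever calls an in-house URL. A minimal sketch, with a placeholder endpoint and model name:

```python
# Sketch: application code talking only to an in-house inference endpoint.
# vLLM and similar servers expose an OpenAI-compatible API, so the client
# library is standard; only the base_url changes. The URL and model name
# below are placeholders for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.internal.example-bank.eu/v1",  # in-house endpoint; traffic never leaves the EEA
    api_key="internal-service-token",                    # issued by the bank's own gateway
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",  # whatever name the self-hosted server registers
    messages=[{"role": "user", "content": "Flag unusual settlement patterns in this trade log: ..."}],
)
print(response.choices[0].message.content)
```

Because the client code is unchanged from a hosted deployment, switching an existing pipeline to the in-house endpoint is a one-line configuration change rather than a rewrite.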
Simon Willison, the AI researcher and creator of Datasette who has published some of the most rigorous independent evaluations of recent frontier models, assessed V4-Pro as "almost on the frontier, at a fraction of the price." That framing is precisely what European technology officers need to take a proposal to a risk committee: near-frontier capability, dramatically lower unit cost, and a self-hosting path that keeps sensitive financial data on-premises.
The European AI Office, which began formal enforcement preparatory work under the AI Act in early 2025, has not yet issued specific guidance on open-weights frontier models from non-EU providers. But Dragos Tudorache, the Romanian MEP who led the Parliament's AI Act negotiations, has repeatedly argued that open-source licensing does not automatically satisfy transparency or safety requirements. That distinction will matter to compliance teams deploying V4 in regulated workflows.
Who Wins And Who Must Move In The European Market
The competitive map shifts immediately for several categories of player:
Mistral AI, headquartered in Paris, is the most directly exposed European open-weights provider. Mistral Large 2 and the Mistral NeMo family have competed partly on price and partly on European data-residency assurances. V4-Flash at $0.28 per million tokens undercuts Mistral's hosted pricing for equivalent capability tiers, and Mistral will need to accelerate its own next-generation release or lean harder into its European regulatory compliance narrative.
European cloud resellers that have built margin on top of OpenAI or Anthropic API access face immediate pressure. If enterprise clients can self-host a one-million-token-context frontier model for a fraction of the hosted API cost, the value proposition of a pure reseller collapses.
Proprietary mid-tier providers charging premium rates for moderate-capability models have the most to lose. The price floor just dropped, and it dropped hard.
ASML, SAP, and large European industrial enterprises running internal AI programmes will add V4-Flash to their evaluation lists purely on unit economics, regardless of geopolitical concerns about provenance.
The following comparison is effectively the table every European CIO will circulate in their next technology steering committee:
DeepSeek V4-Pro: $3.48 per million output tokens, open weights, Huawei Ascend 950 native
DeepSeek V4-Flash: $0.28 per million output tokens, open weights, Huawei Ascend 950 native
Mistral Large 2 (hosted): approximately $6.00 per million output tokens, partial open weights, Nvidia-anchored
OpenAI GPT-class: $15 to $60 per million output tokens, closed weights, Nvidia only
Anthropic Claude: $15 to $75 per million output tokens, closed weights, Nvidia only
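A rough cost illustration makes the spread concrete. Assuming a hypothetical two billion output tokens per month and taking midpoints for the quoted price ranges:

```python
# Monthly output-token cost at an assumed 2 billion output tokens/month,
# using the per-million-token prices listed above (midpoints for ranges).
prices_per_million = {
    "DeepSeek V4-Pro":   3.48,
    "DeepSeek V4-Flash": 0.28,
    "Mistral Large 2":   6.00,
    "GPT-class (mid)":   37.50,   # midpoint of the $15-$60 range
    "Claude (mid)":      45.00,   # midpoint of the $15-$75 range
}

monthly_tokens = 2_000_000_000  # hypothetical enterprise volume

for model, price in prices_per_million.items():
    cost = monthly_tokens / 1_000_000 * price
    print(f"{model:22s} ${cost:>10,.0f}/month")

# V4-Flash works out to about $560/month against roughly
# $75,000-$90,000/month for closed frontier APIs at the same volume.
```

Even after adding self-hosting infrastructure and operations costs, the gap at that volume is two orders of magnitude.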
The Governance Gap That European Deployers Cannot Ignore
V4 weights ship under DeepSeek's open licence, and security teams across Europe are already raising questions about training data provenance, the absence of third-party safety evaluations conducted under EU-equivalent standards, and the implications of a Chinese-origin model processing European financial data, even in a self-hosted configuration.
The early working assumption among European enterprise security advisers is that V4 will be deployed in air-gapped or on-premises configurations by institutions that want the cost savings but cannot route sensitive client or transaction data through DeepSeek's hosted API. That is a reasonable and achievable deployment pattern. It is also one that requires internal AI infrastructure investment that smaller institutions may not yet have.
The European Banking Authority's draft guidelines on the use of AI in credit and operational risk, published for consultation in Q4 2025, require institutions to maintain explainability and auditability logs for any model used in a material decision process. Self-hosting V4 makes that technically feasible. It does not make it automatic. The compliance work is non-trivial.
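One common pattern for that compliance work is an append-only audit trail wrapped around every inference call. A minimal sketch, with an illustrative schema rather than anything the EBA has prescribed:

```python
# Sketch of an append-only audit trail around each inference call: one
# JSON-lines record per request, hashing payloads rather than storing raw
# client data. The schema is illustrative, not a regulatory template.
import hashlib
import json
import time


def log_inference(log_path, model_version, prompt, output, decision_id):
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "decision_id": decision_id,        # links the call to a business decision
        "model_version": model_version,    # pin the exact weights hash in production
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


log_inference("audit.jsonl", "deepseek-v4-flash@2026-04-25",
              "Assess creditworthiness of applicant 4411 ...",
              "Recommend manual review: income volatility ...",
              decision_id="CR-2026-088131")
```

Self-hosting is what makes this possible at all: a hosted API gives you whatever logs the provider chooses to expose, while an in-house endpoint lets the institution define the record.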
AI Terms in This Article
agentic: AI that can independently take actions and make decisions to complete tasks.
inference: when an AI model processes input and produces output, the actual 'thinking' step.
tokens: small chunks of text (words or word fragments) that AI models process.
parameters: the internal settings an AI model learns during training; more parameters generally mean a more capable model.
API: Application Programming Interface, a way for software to talk to other software.
GPU: Graphics Processing Unit, the powerful chips that AI models run on.