DeepSeek is making its 75 percent API discount permanent

DeepSeek has turned a temporary API discount into a permanent price reset. That matters because startups building with AI now have a much cheaper frontier-style option, and every rival provider has to explain why its bill is still higher.

DeepSeek is no longer treating its 75 percent V4-Pro API discount as a short promotion. The company’s pricing page now says the model will be officially adjusted to one quarter of its original price after the current discount period ends on May 31, 2026 at 15:59 UTC. In plain English, the sale price becomes the real price.

That is a bigger signal than another limited-time developer offer. DeepSeek-V4-Pro is listed at $0.435 per million uncached input tokens and $0.87 per million output tokens, down from crossed-out reference prices of $1.74 and $3.48. Cached input is listed at $0.003625 per million tokens, after the same 75 percent reduction. DeepSeek-V4-Flash is cheaper still, at $0.14 per million input tokens and $0.28 per million output tokens, with cache hits at $0.0028 per million tokens.
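To make those per-million figures concrete, here is a minimal Python sketch that prices a single request at the V4-Pro reference and discounted rates quoted above. The workload size (20,000 input tokens, 2,000 output tokens) and the helper name `request_cost` are assumptions for illustration, and the sketch ignores cache hits entirely.

```python
# Prices in USD per million tokens, as quoted in the article for DeepSeek-V4-Pro.
REFERENCE = {"input": 1.74, "output": 3.48}    # crossed-out reference prices
DISCOUNTED = {"input": 0.435, "output": 0.87}  # prices after the 75 percent cut


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request, assuming no cache hits."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000


# Hypothetical workload: 20,000 input tokens and 2,000 output tokens per request.
before = request_cost(REFERENCE, 20_000, 2_000)
after = request_cost(DISCOUNTED, 20_000, 2_000)
print(f"before: ${before:.5f}  after: ${after:.5f}  ratio: {after / before:.2f}")
```

Because input and output prices both fall to one quarter, the ratio is the same for any mix of input and output tokens.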

According to DeepSeek’s own API pricing page, the company also cut input cache-hit prices across its model lineup to one tenth of launch pricing from April 26. That matters for agent products, coding assistants, customer support systems and document-heavy workflows, where the same instructions, files and reference material get passed through again and again.
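A rough way to see why cache pricing matters for repetitive workloads is to blend the cached and uncached V4-Pro input prices by the share of tokens served from cache. The sketch below uses the two listed prices; the hit rates are hypothetical, and the function name `blended_input_price` is an illustrative assumption, not part of any DeepSeek API.

```python
# USD per million input tokens for DeepSeek-V4-Pro, as quoted in the article.
UNCACHED_INPUT = 0.435
CACHED_INPUT = 0.003625


def blended_input_price(cache_hit_rate):
    """Average input price per million tokens for a given share of cached tokens."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return cache_hit_rate * CACHED_INPUT + (1.0 - cache_hit_rate) * UNCACHED_INPUT


# Hypothetical hit rates; real rates depend on how much of each prompt repeats.
for rate in (0.0, 0.5, 0.9):
    print(f"hit rate {rate:.0%}: ${blended_input_price(rate):.4f} per million input tokens")
```

At a 90 percent hit rate, the blended input price in this sketch falls to roughly a tenth of the uncached price, which is why agents that resend the same instructions and files benefit most.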

For startups, this is not just a cheaper invoice. It changes how products can be built. A founder deciding whether to use an external model, fine-tune an open model, or build a narrow in-house system now has a different spreadsheet in front of them.

AI-native startups have been living with a difficult margin problem. The product looks like software, but the cost structure can behave more like usage-based infrastructure. Every support answer, every generated report, every autonomous coding task and every research workflow creates a real token bill. If the product is priced casually, gross margin can disappear before the company has enough customers to notice.

A 75 percent permanent cut gives those companies room to experiment. It can make low-ticket AI features viable, especially for businesses serving students, small companies, solo operators or international users who cannot absorb enterprise-grade pricing. It can also let startups keep more context inside the model instead of spending engineering time trimming prompts, summarizing aggressively or pushing users into thinner experiences.

That does not mean every team should move everything to DeepSeek tomorrow. Price is only one part of the decision. Reliability, latency, data policy, model behavior, tool calling, regional restrictions and customer trust still matter. Enterprise buyers in the United States and Europe may be cautious about depending heavily on a Chinese AI provider for sensitive workflows. But the price gap is now large enough that many teams will test it anyway.

DeepSeek is trading margin for reach

The most interesting part is the timing. DeepSeek is not cutting prices from a quiet corner of the market. Its V4 preview launched on April 24 with two models, V4-Pro and V4-Flash, both supporting a 1 million token context window. The company describes V4-Pro as a 1.6 trillion parameter mixture-of-experts model with 49 billion active parameters, while V4-Flash is positioned as the faster and more economical option.

This gives DeepSeek a clear two-step product ladder. Flash can handle cheaper everyday workloads. Pro can be used when the job is more complex, such as agentic coding, long-document reasoning or higher-value automation. If developers build routing around that split, the effective cost of an application can fall much further than a single headline price suggests.
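The routing idea can be sketched in a few lines of Python. Everything beyond the quoted prices is an assumption here: the tier labels `flash` and `pro` are local names rather than API model identifiers, the task labels are hypothetical, and the traffic mix (eight simple requests for every two complex ones) is invented for illustration.

```python
# USD per million tokens, as quoted in the article (uncached input, output).
PRICES = {
    "flash": {"input": 0.14, "output": 0.28},
    "pro": {"input": 0.435, "output": 0.87},
}

# Hypothetical task labels that justify the larger model.
COMPLEX_TASKS = {"agentic_coding", "long_document_reasoning"}


def pick_tier(task_type):
    """Route complex task types to Pro and everything else to Flash."""
    return "pro" if task_type in COMPLEX_TASKS else "flash"


def total_cost(requests, force_tier=None):
    """Sum USD cost over (task_type, input_tokens, output_tokens) tuples.

    If force_tier is given, every request is priced on that tier instead
    of being routed, which gives a baseline for comparison.
    """
    total = 0.0
    for task_type, input_tokens, output_tokens in requests:
        price = PRICES[force_tier or pick_tier(task_type)]
        total += (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
    return total


# Hypothetical traffic: 8 simple requests and 2 complex ones, all the same size.
traffic = [("faq_answer", 5_000, 500)] * 8 + [("agentic_coding", 5_000, 500)] * 2
routed = total_cost(traffic)
pro_only = total_cost(traffic, force_tier="pro")
print(f"routed: ${routed:.5f}  pro only: ${pro_only:.5f}  ratio: {routed / pro_only:.2f}")
```

Under these assumed numbers the routed bill is a little under half the Pro-only bill. The real saving depends on how much traffic genuinely needs the larger model and on whether Flash output quality is acceptable for the rest.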

The risk is obvious. DeepSeek is compressing its own economics while the cost of frontier AI remains enormous. Bloomberg reported today that DeepSeek’s management has been telling potential investors it will prioritize breakthrough research over short-term commercialization as it advances a 70 billion yuan, or roughly $10 billion, funding round. Founder Liang Wenfeng has also reportedly pledged to continue open-source model development while pursuing artificial general intelligence.

That combination is powerful, but it is not simple. A company telling investors it will chase AGI, keep models open and price APIs aggressively is making a bold promise. It is saying scale will come before margin. That can work if DeepSeek’s architecture is genuinely more efficient, if funding gives it enough runway, and if developer adoption turns into durable platform usage. It becomes harder if inference demand rises faster than expected or if geopolitical limits make global enterprise growth uneven.

For OpenAI, Anthropic and Google, the immediate pressure is not that every customer will switch. The pressure is that DeepSeek gives buyers a benchmark. A procurement team can now ask why similar workloads cost many times more elsewhere. A startup founder can ask whether premium models should be reserved only for the hardest tasks. A developer can build a fallback stack where price, not brand, decides the route.

This is where the market is heading. Model providers will not compete only on benchmark charts or keynote demos. They will compete on the daily economics of products that have to make money. DeepSeek’s permanent cut makes that argument harder to ignore, and the next move belongs to the companies still charging like AI is scarce.
