The new version of DeepSeek, V4, is out. It is much more powerful and efficient than ChatGPT.
DeepSeek is back with a new AI model, more than a year after it rattled ChatGPT and Gemini. The Chinese company has released the DeepSeek-V4 family, two models designed to handle very long contexts efficiently. Unlike in 2025, DeepSeek is no longer up against only OpenAI, Anthropic, and Google. It also faces other powerful AI systems, such as the new Kimi K2.6.
The company published a document on Hugging Face describing the new models and their capabilities. The DeepSeek-V4 family consists of two models built on a Mixture-of-Experts (MoE) architecture. The first, DeepSeek-V4-Pro, has 1.6 trillion parameters but activates only 49 billion of them per token. The second, DeepSeek-V4-Flash, has 284 billion parameters and activates 13 billion per token.
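The gap between total and active size is easier to judge as a ratio. The short sketch below uses only the counts quoted above; the dictionary layout and the function name are illustrative, not anything DeepSeek publishes.

```python
# Parameter counts as quoted in the article, in billions.
MODELS = {
    "DeepSeek-V4-Pro": {"total_b": 1600, "active_b": 49},
    "DeepSeek-V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a model's parameters that is active at one time."""
    return active_b / total_b


for name, counts in MODELS.items():
    print(f"{name}: {active_fraction(**counts):.1%} of parameters active")
```

On these figures, each model uses only a few percent of its full size at any one time, which is the core of the MoE efficiency argument.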
According to internal testing, DeepSeek-V4-Pro, in its maximum reasoning mode, ranks as the best open-source model in several areas. In general knowledge, it significantly outperforms its predecessors on the SimpleQA-Verified benchmark. In competitive programming, it ranked 23rd among human contestants on Codeforces and, according to the same data, is the first open-source model to match GPT-5.4 on this task.
Against frontier models such as Gemini-3.1-Pro or Claude Opus 4.6, the picture is different. In general knowledge and reasoning, DeepSeek-V4-Pro-Max still trails Gemini-3.1-Pro and GPT-5.4 on some tests, although it outperforms Gemini-3.1-Pro at retrieving information from long contexts. On agent tasks, it performs on par with other open-source models but does not surpass the closed systems from Google, OpenAI, and Anthropic.
One reason for DeepSeek's widespread adoption is its technology. AI companies, including NVIDIA, could not understand how such a capable model required so little computing power. The Chinese company has kept the MoE architecture but enhanced it with new mechanisms that handle attention differently.
In traditional Transformers, computational cost grows rapidly with text length, which makes processing very long texts extremely resource-intensive. By contrast, DeepSeek-V4-Pro requires only 27% of the computation of DeepSeek-V3.2 and needs a KV cache roughly 10% the size.
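A rough count shows why long texts are so costly for standard attention. The sketch below assumes plain causal self-attention, in which each token is compared with itself and every earlier token. The function name is illustrative, and the count ignores every cost other than query-key comparisons.

```python
def causal_pairs(n_tokens: int) -> int:
    """Number of query-key comparisons in causal self-attention over n tokens."""
    # Token i (counting from 1) is compared against i keys,
    # so the total is 1 + 2 + ... + n.
    return n_tokens * (n_tokens + 1) // 2


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens -> {causal_pairs(n):,} comparisons")
```

Ten times more text means roughly a hundred times more comparisons. That quadratic growth is the cost that changes to the attention mechanism aim to curb.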
This is made possible by a hybrid mechanism that combines two techniques: compressed sparse attention and heavily compressed attention. The former compresses blocks of keys and values, then applies sparse attention to select only the most relevant entries. The latter compresses more aggressively, shrinking the key-value cache further.
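The compress-then-select idea can be illustrated with a toy example. The sketch below is not DeepSeek's implementation. It assumes mean pooling as the compression step and a simple top-k pick by dot-product score as the sparse step, and all function and parameter names are hypothetical.

```python
import math

Vector = list[float]


def mean_pool(vectors: list[Vector], block: int) -> list[Vector]:
    """Compress a sequence by averaging each run of `block` vectors (assumed scheme)."""
    pooled = []
    for start in range(0, len(vectors), block):
        chunk = vectors[start:start + block]
        dim = len(chunk[0])
        pooled.append([sum(v[i] for v in chunk) / len(chunk) for i in range(dim)])
    return pooled


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def compressed_sparse_attention(
    query: Vector, keys: list[Vector], values: list[Vector], block: int, top_k: int
) -> Vector:
    """Attend over the top_k best-scoring compressed key/value entries."""
    # Step 1: compress keys and values block by block, shrinking the cache.
    ckeys = mean_pool(keys, block)
    cvalues = mean_pool(values, block)
    # Step 2: score every compressed key and keep only the top_k entries.
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in ckeys]
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Step 3: softmax over the selected entries, then mix their values.
    peak = max(scores[i] for i in chosen)
    weights = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(weights)
    dim = len(cvalues[0])
    return [
        sum(w * cvalues[i][d] for w, i in zip(weights, chosen)) / total
        for d in range(dim)
    ]


keys = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
values = [[10.0, 0.0]] * 4 + [[0.0, 10.0]] * 4
print(compressed_sparse_attention([1.0, 0.0], keys, values, block=4, top_k=1))
```

In this toy, eight cached entries shrink to two, and the query reads only one of them. A larger `block` leaves fewer entries to store, which loosely mirrors the more aggressive compression of the second technique.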
Dickie Wong, executive director of research at Usmart Securities, told the South China Morning Post, "The DeepSeek model is highly efficient, so demand for inference is rapidly increasing. This supports the shares of chip and device manufacturers, as companies still need to invest in GPUs or Huawei's Ascend chips to run these models at scale."
The new model remains open source, with weights available for download from Hugging Face. Compared to its predecessor, DeepSeek-V4-Pro offers stronger agentic capabilities and richer world knowledge, surpassed only by Gemini-3.1-Pro. DeepSeek-V4-Flash performs similarly to its predecessor in reasoning and simple agent tasks, though it responds more quickly.
Those who wish to try it can do so through the website or iOS and Android apps.


