The "small companies" making money with DeepSeek, in pain and in joy
This article is from the WeChat official account China Entrepreneur Magazine (ID: iceo-com-cn). Author: Kong Yuexin. Editor: Ma Jiying. Header image: AI-generated.
Article Summary
The DeepSeek large models have set off an upheaval in the AI industry, driving a surge in traffic and a round of technical upgrades for small and medium-sized enterprises. DeepSeek's highly cost-effective API and open-source strategy lower the barrier to enterprise adoption, stimulate innovation in both consumer and business applications, and intensify competition across the industry. Model optimizations are making on-device AI deployment feasible, and hybrid computing schemes are accelerating the rollout of AI application scenarios. The market expects an application boom in 2025.
• Model frenzy: the full-power (671B) DeepSeek-R1 has set off a wave of enterprise integrations, and the Tencent Yuanbao app topped app-store download charts
• Cost revolution: API prices are roughly 1/30 of GPT-4's, with a theoretical profit margin of 545%, pushing the whole industry toward lower costs and higher efficiency
• Technical breakthrough: MoE architecture optimizations cut compute requirements, and on-device chips can deploy the 671B-parameter model locally
• Ecosystem expansion: more than 200 enterprises have integrated the open-source models, forming a new "cloud + on-device" hybrid-deployment paradigm
• Traffic shock: SiliconFlow's traffic surged 40-fold; overloaded servers reflect exploding market demand
• Application inflection point: large-model costs fall by 90% every 12 months, and 2025 is expected to be the breakout year for AI agents
After AI Infra companies announced support for DeepSeek-R1, many small and medium-sized enterprises came knocking, hoping to obtain products built on the R1 model. Qingcheng Jizhi was among those that encountered this.
"Is your DeepSeek the 'full-power version'?" Tang Xiongchao, CEO of Qingcheng Jizhi, was once asked by a client.
Note: the "full-power version" refers to the top-spec DeepSeek-R1 model, with 671B (671 billion) parameters, more than 20 times the size of the smaller distilled versions (14B/32B). It supports local or API deployment and complex research workloads, offering a higher capability ceiling at the cost of higher hardware requirements.
After fielding too many such inquiries, the Qingcheng Jizhi team decided to solve the problem with engineering: it launched a "full-power version" checker mini-program on its official website, built around a handful of carefully chosen questions that discriminate between versions. If a system answers them correctly, it is basically the full-power version; if it cannot, it probably is not.
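The article does not disclose Qingcheng Jizhi's test set or implementation, but the idea can be sketched in a few lines. Below is a minimal, hypothetical version: send discriminating probe questions to an OpenAI-compatible chat endpoint and check the answers. The endpoint URL, model name, and probe questions are all placeholder assumptions.

```python
# Hypothetical sketch of a "full-power version" probe, not Qingcheng
# Jizhi's actual code. Assumes an OpenAI-compatible /chat/completions
# endpoint at a placeholder URL.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Placeholder probes: questions only the 671B model tends to get right.
PROBES = [
    ("<hard reasoning question 1>", "<expected answer 1>"),
    ("<hard reasoning question 2>", "<expected answer 2>"),
]

def looks_full_power(model: str = "deepseek-r1") -> bool:
    correct = 0
    for question, expected in PROBES:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
        ).choices[0].message.content
        if expected in reply:
            correct += 1
    # Passing every probe suggests the full-power version; failures
    # suggest a distilled or otherwise reduced variant.
    return correct == len(PROBES)
```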
After the mini-program went live, its traffic exceeded Qingcheng Jizhi's expectations.
In fact, Qingcheng Jizhi's experience is just a microcosm of the recent AI industry. "The entire AI industry has had a very full month," said one industry insider. The popularity of DeepSeek has left AI practitioners in a state of "pain and happiness".
On the one hand, the emergence of DeepSeek has awakened ordinary users' awareness of and demand for AI tools, accelerating the popularization of AI. DeepSeek has become the fastest-growing AI application in history: according to the AI Product Rankings, DeepSeek had 157 million active users in February, close to 20% of ChatGPT's 749 million. The influx of users frequently leaves the DeepSeek chatbot showing "server busy".
On the other hand, DeepSeek's rapid iteration and open-source releases have pushed an already hyper-competitive AI industry into a new round of the arms race; from the model layer to the application layer, many companies barely rested over this year's Spring Festival. Companies of all kinds have announced DeepSeek integrations, from B-end players such as cloud providers and chip makers to all manner of consumer-facing application companies. According to statistics from Zhenghe Island, more than 200 companies have completed integration and deployment of DeepSeek's interfaces.
The companies that moved early have been rewarded with a wave of traffic: after integrating DeepSeek, Tencent's Yuanbao app climbed rapidly up the download charts and on March 3 topped the free-app download ranking of Apple's App Store in China. SiliconFlow, an AI infrastructure company that was the fastest on the web to offer DeepSeek-R1 access, saw its traffic grow 40-fold, reaching a remarkable 17.19 million visits in February.
The arrival of DeepSeek-R1 has further raised expectations that AIGC applications will develop faster. When Monica.im released its AI agent product Manus on March 6, it set off yet another frenzy, this time over invitation codes.
Large-model vendors and companies up and down the AI industry chain alike are eagerly searching for the key path to the future AI world.
1. How to access DeepSeek
As early as 2024, when the DeepSeek-V2 model was released, the industry was already paying attention to the company and its open-source models.
Guo Chenhui, technical director of Meitu Design Studio, said that alongside its in-house research, Meitu has long tracked outstanding large models at home and abroad in order to give users a better experience in its AI application scenarios. When DeepSeek-V2 was released, Meitu's AI team took note of the model and explored cooperation with the DeepSeek team; for the sake of stability, however, Meitu at the time mainly called the DeepSeek model API through third-party AI Infra service providers. In September 2024, Meitu Design Studio integrated the V2 model to assist with copywriting expansion, and it upgraded in turn as the V3 and R1 models were released. "When our product and business teams see models that look like a good fit, they run performance evaluations, and suitable ones may be introduced into our own application scenarios," Guo Chenhui said.
DeepSeek officially offers two access routes. The first is to call its API from code once the model is running; the second is to install the app on a phone or open the chat window on the official website and converse with the model directly. Behind the chat window, it is still the API being called.
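The first route looks roughly like the sketch below. DeepSeek's API follows the OpenAI-compatible chat format; the base URL and model names here reflect DeepSeek's public documentation at the time of writing and may change.

```python
# Minimal sketch of calling DeepSeek's official API via the
# OpenAI-compatible interface documented on its open platform.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key="<DEEPSEEK_API_KEY>",  # issued on DeepSeek's open platform
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # the R1 reasoning model; "deepseek-chat" is V3
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
)
print(response.choices[0].message.content)
```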
However, with DeepSeek's traffic so high, and its servers and staff stretched thin, DeepSeek's own API can suffer problems such as timeouts. Guo Chenhui noted that Meitu's products have a large user base, and when a feature is promoted, traffic can jump by tens or even hundreds of times; in such cases, the service guarantees of a public cloud are comparatively stronger.
Beyond that, DeepSeek's models are large, and the "full-power version" in particular has real hardware requirements. Cost-effectiveness also matters: Meitu's business scenarios show pronounced peaks and troughs, and cloud providers can smooth out the differing peak and off-peak API-call patterns across their customers. "If we deployed it ourselves, resource utilization during off-peak periods could be quite low, wasting a lot of resources," Guo Chenhui said.
As a result, Meitu currently accesses the DeepSeek-R1 model mainly by calling cloud vendors' APIs, with private deployment built on top of that.
Like Meitu, Cix Technology (此芯科技), which builds on-device chips, has been keeping an eye on newly released large models, especially those well suited to local, on-device deployment. Zhou Jie, general manager of ecosystem strategy at Cix Technology, said that for notable open-source models, especially SOTA models (state of the art, the best-performing models in a given field or task), the company invests resources in heterogeneous adaptation as early as possible. So after DeepSeek released V2 last year and R1 this year, Cix Technology immediately set about adapting to these models.
In Zhou Jie's view, the DeepSeek-V2 model contains two main innovations. The first is that its MLA (multi-head latent attention) architecture sharply reduces the overhead of the KV cache (an optimization used by Transformer models in autoregressive decoding). Large language models place heavy demands on memory bandwidth and capacity, so anything that shrinks the KV cache is a great help to the computing platform. The second is DeepSeek's MoE (mixture-of-experts) design, which optimizes and reworks the traditional MoE architecture, allowing a larger model to run within limited resources.
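Some back-of-the-envelope arithmetic shows why shrinking the KV cache matters so much. The model shape and latent dimension below are illustrative assumptions, not DeepSeek's exact configuration.

```python
# Illustrative KV-cache arithmetic; all model dimensions are assumptions.
BYTES_FP16 = 2
layers, heads, head_dim = 60, 128, 128   # assumed model shape
seq_len = 32_768                         # one long-context sequence

# Standard multi-head attention caches a K and a V vector per head,
# per layer, per token.
mha_bytes = 2 * layers * heads * head_dim * seq_len * BYTES_FP16

# MLA instead caches one compressed latent vector per layer, per token
# (latent dim assumed 512), from which K/V are reconstructed.
latent_dim = 512
mla_bytes = layers * latent_dim * seq_len * BYTES_FP16

print(f"MHA KV cache: {mha_bytes / 2**30:.1f} GiB")  # ~120 GiB
print(f"MLA KV cache: {mla_bytes / 2**30:.1f} GiB")  # ~1.9 GiB
```

Under these assumed shapes, the cache per sequence shrinks from roughly 120 GiB to under 2 GiB, which is exactly the kind of relief for memory bandwidth and capacity that Zhou Jie describes.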
At the time, Cix Technology quickly adapted the Lite version of the V2 model, a 16B-parameter model. "Although 16B parameters may sound large, in actual operation it activates only 2.4B parameters. We judged this model to be very well suited to running on-device, and Cix's P1 chip provides good support for models at the 2.4B activated-parameter scale," Zhou Jie told China Entrepreneur.
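The gap between 16B total and 2.4B activated parameters is the defining property of MoE. The toy sketch below shows the mechanism: a router picks the top-k experts per token, so compute scales with k rather than with the total expert count. Expert counts and dimensions here are illustrative, far smaller than any real model.

```python
# Toy mixture-of-experts routing; all sizes are illustrative.
import numpy as np

d_model, n_experts, top_k = 64, 16, 2
rng = np.random.default_rng(0)

router_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router_w                 # routing logits, one per expert
    chosen = np.argsort(scores)[-top_k:]  # top-k experts for this token
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                  # softmax over the chosen experts
    # Only top_k of n_experts weight matrices are touched per token,
    # so 2/16 of the expert parameters are "activated" here.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)  # (64,)
```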
As for how Cix Technology "connects" to DeepSeek, Zhou Jie explained: "Applications like DeepSeek need a great deal of cloud computing power. DeepSeek's own data centers, or cloud vendors, expose APIs for terminal applications to call, so when users use the DeepSeek app they are invoking AI capabilities in the cloud. But some terminal scenarios have strict requirements around data privacy and the like, and those call for local computing. Once deployed on-device, users can run DeepSeek and other models even with no network connection."
Having met the basic compute and system requirements for running a large language model, Cix Technology can then work from the actual needs of customer projects, cooperate commercially with model vendors such as DeepSeek, fine-tune and optimize the models, and land them in concrete projects.
After V2 came out, Qingcheng Jizhi also experimented with the model internally, but market demand was low at the time, so it did not push its use. After R1's release this year, the team sensed a major opportunity and decided to integrate DeepSeek and promote it to customers at scale.
Qingcheng Jizhi specializes in system software and provides inference services built on it. Unlike application companies that call DeepSeek's API directly, it therefore provides customers with a dedicated DeepSeek API for their application services. "Our way of accessing DeepSeek is to download its open-source model and use our system software to deploy the service on our computing systems," Tang Xiongchao said.
Put simply, the R1 model is a file of several hundred gigabytes, and it cannot be used directly after downloading. "It's just a file, not a usable service. What we do is run this model and have it expose service interfaces to the outside world. Through the API interface, users can talk to the model," Tang Xiongchao explained.
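To make the "file versus service" distinction concrete, here is a minimal sketch using vLLM, one common open-source serving stack; the article does not say what software Qingcheng Jizhi uses (its system software is its own). vLLM can serve the open-source weights behind an OpenAI-compatible endpoint.

```python
# One way to turn downloaded weights into a service (an assumption,
# not Qingcheng Jizhi's stack). First launch a vLLM server, e.g.:
#
#   vllm serve deepseek-ai/DeepSeek-R1 --tensor-parallel-size 8
#
# (the full 671B model needs a multi-GPU cluster; the flag value is
# illustrative). Then any OpenAI-style client can talk to the model:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
reply = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Hello"}],
)
print(reply.choices[0].message.content)
```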
Drawing on its earlier technical groundwork, Qingcheng Jizhi had a first version running within a day of downloading the model file, then optimized for the R1 model's structure; within just a week it formally announced and launched the "full-power version".
In Tang Xiongchao's view, the technical work went relatively smoothly; after integrating DeepSeek, the bigger challenges came from the business and market side. DeepSeek's traffic has brought the company many prospective clients, but every client's needs differ. "They involve different computing platforms, chip models, server configurations and so on, and we have to optimize specifically for each computing substrate," Tang Xiongchao said.
2. Falling API prices drive the popularization of large models
After the V2 model's release in May 2024, DeepSeek was dubbed the "Pinduoduo of the AI industry" for its extreme cost-effectiveness, which set off a large-model price war among the major domestic vendors.
The price war brought API costs down. Take Meitu's "AI Product Image" as an example. In Guo Chenhui's view, Meitu has strong technical advantages in AI image processing on the one hand, and integrating the DeepSeek model brought positive feedback on user experience and conversion on the other. Moreover, the cost of calling a large-language-model API is very low and complements Meitu's business scenarios well, so Meitu will pay even more attention to large-language-model applications.
On February 9, DeepSeek ended the 45-day promotional pricing period for the V3 model and restored the API to its original prices: 0.5 yuan per million input tokens (cache hit), 2 yuan per million input tokens (cache miss), and 8 yuan per million output tokens. For R1, input costs 1 yuan per million tokens (cache hit) or 4 yuan per million tokens (cache miss), and output costs 16 yuan per million tokens.
By comparison, OpenAI's official site lists GPT-4o at $2.50 per million input tokens and $10 per million output tokens, while the newly released GPT-4.5 runs as high as $75/$150 per million input/output tokens, 15 to 30 times GPT-4o's prices.
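Putting the two price lists in one currency makes the gap concrete. The calculation below uses the prices quoted above and an assumed exchange rate of 7.2 yuan to the dollar; prices and rates change.

```python
# Rough per-million-token cost comparison from the prices quoted above.
# The RMB/USD rate (7.2) is an assumption.
USD_PER_CNY = 1 / 7.2

r1_in_miss = 4 * USD_PER_CNY    # R1 input, cache miss, USD per M tokens
r1_out = 16 * USD_PER_CNY       # R1 output, USD per M tokens
gpt4o_in, gpt4o_out = 2.5, 10.0 # GPT-4o, USD per M tokens

print(f"R1 input (cache miss): ${r1_in_miss:.2f} vs GPT-4o ${gpt4o_in:.2f}"
      f" -> {gpt4o_in / r1_in_miss:.1f}x cheaper")
print(f"R1 output: ${r1_out:.2f} vs GPT-4o ${gpt4o_out:.2f}"
      f" -> {gpt4o_out / r1_out:.1f}x cheaper")
```

At that rate, R1 comes out roughly 4.5 times cheaper than GPT-4o on both input and output, and the gap versus GPT-4.5 is wider still.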
In Guo Chenhui's view, the cost of calling the DeepSeek model is, on one hand, a small share of Meitu's overall AI research spending; on the other, even at full price DeepSeek remains relatively affordable, and Meitu's integration has produced positive user conversion and feedback. Meitu will therefore increase its investment in large language models.
Zhou Jie likewise believes that DeepSeek's API prices, many times lower than OpenAI's, greatly reduce the cost of tokens for enterprises and users. At the on-device level, a 3B model may now match the performance that previously required a model of 7B or more, with correspondingly lower memory and other costs.
"This is a process of software-hardware co-optimization. On the same hardware, you can now achieve the effect of a larger-parameter model; or, for the same model effect, the hardware requirements have come down," Zhou Jie said.
In early March, following the five-day "DeepSeek Open Source Week", the DeepSeek team disclosed for the first time key information such as the models' optimization details and cost-profit margin. By DeepSeek's own calculation, its cost-profit margin could theoretically reach 545%.
The rapid fall in large-model costs and rise in capability have brought fast user growth in both the B2B and B2C markets. Tang Xiongchao revealed that many small and medium-sized enterprises now approach the company proactively, hoping to obtain products based on the R1 model.
3. AI applications are headed for an accelerated explosion
Robin Li, Baidu's founder, chairman and CEO, wrote in "Seizing the breakout year of AI agents and driving the rapid development of new quality productive forces" that the inference cost of large models falls by more than 90% every 12 months, far outpacing Moore's Law. As large-model technology iterates and costs plunge, AI applications will explode.
The AI market is currently in a phase of rapid growth. Tang Xiongchao believes DeepSeek's theoretical profit margin of as much as 545% has very positive significance for the whole industry: it has educated the market about the importance of compute-system software.
"In the past, people did not take software capability seriously. DeepSeek made people realize that spending money on software is not wasting money; it is how you save money," Tang Xiongchao said. In a market that has been educated, the advantages of core system software can be brought into fuller play; in the short term, DeepSeek's open-sourcing also helps all parties cut the commercial cost of product delivery.
As more and more enterprises adopt DeepSeek and feed back into its open-source ecosystem, DeepSeek's own development is accelerating too.
Guo Chenhui sees this as the biggest advantage of DeepSeek's open-source ecosystem: while the companies that integrate it build differentiated products for their own application scenarios, those scenarios in turn push forward DeepSeek and other foundation models. "Different companies deploying the open-source ecosystem in differentiated ways not only accelerates AI innovation but also helps cut large-model costs in vertical niches, giving AI applications far more room for imagination," Guo Chenhui said.
In Zhou Jie's view, beyond the boom in cloud applications, on-device AI applications will also grow explosively in 2025 under DeepSeek's impetus.
"The AI of the future is really hybrid AI. Not everything runs in the cloud, nor everything on-device, because each has its strengths. The device side can only run relatively small-parameter models, and some tasks that demand higher accuracy still need cloud compute; but where data security and privacy matter, you want on-device capability that approaches the effect of larger models. That combination forms a hybrid deployment solution," Zhou Jie said, adding that Cix Technology is exploring such applications with cloud vendors.
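The hybrid idea reduces to a routing decision per request. The sketch below illustrates it; the predicates, endpoints, and model names are illustrative assumptions, not Cix Technology's actual product logic.

```python
# Toy cloud/on-device router; endpoints and model names are assumptions.
from openai import OpenAI

local = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
cloud = OpenAI(base_url="https://api.deepseek.com", api_key="<KEY>")

def answer(prompt: str, private_data: bool, needs_high_accuracy: bool) -> str:
    # Private data never leaves the device; hard tasks go to the cloud.
    if private_data or not needs_high_accuracy:
        client, model = local, "deepseek-r1-distill-14b"  # assumed local model
    else:
        client, model = cloud, "deepseek-reasoner"
    reply = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return reply.choices[0].message.content
```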
"The first year of AI applications" is no longer a new notion, yet AI practitioners and investors are still searching for better landing scenarios. To Zhou Jie, it is only a matter of time. "A new ecosystem necessarily takes time to develop; things won't all improve overnight. It takes continuous iteration of software and hardware. The chip side, the model side and the rest have now laid a solid foundation for large-scale AI applications; what's needed next is more developers building AI applications that meet real-world scenario needs."
This article is from the WeChat official account China Entrepreneur Magazine (ID: iceo-com-cn), written by Kong Yuexin and edited by Ma Jiying.