This week, a small Chinese artificial intelligence lab revealed the technical recipe for its cutting-edge model, turning its reclusive leader into a national hero who has defied US attempts to curb China's high-tech ambitions.
DeepSeek, founded by hedge fund manager Liang Wenfeng, released its R1 model on Monday and explained in a detailed paper how to build a large language model on a bootstrapped budget that can automatically learn and improve itself without supervision.
American companies, including OpenAI and Google DeepMind, have led the development of reasoning models, a relatively new field of AI research that tries to make models match human cognitive abilities. In December, San Francisco-based OpenAI released the full version of its o1 model but kept its methods secret.
DeepSeek's R1 release has triggered a frenzied debate in Silicon Valley over whether better-resourced US AI companies, including Meta and Anthropic, can defend their technical edge.
At the same time, Liang has become a focus of national pride at home. This week, he was the only AI leader chosen to attend a meeting of entrepreneurs with Li Qiang, the country's second most powerful leader. The entrepreneurs were told to "concentrate efforts to break through key core technologies".
In 2021, Liang began buying thousands of Nvidia graphics processing units for his AI side project while running his quantitative trading fund, High-Flyer. Industry insiders regarded it as the eccentric behavior of a billionaire looking for a new hobby.
"When we first met him, he was this very nerdy guy with a terrible hairstyle, talking about building a 10,000-chip cluster to train his own models. We didn't take him seriously," said one of Liang's business partners.
"He couldn't articulate his vision other than saying: I want to build this, and it will be a game changer. We thought that was only possible for giants such as ByteDance and Alibaba," the person added.
Liang's status as an outsider in the AI field proved an unexpected source of strength. At High-Flyer, he built a fortune by using AI and algorithms to identify patterns that could affect stock prices. His team became adept at using Nvidia chips to make money trading stocks. In 2023, he launched DeepSeek, announcing his intention to develop human-level AI.
"Liang built an exceptional infrastructure team that really understands how the chips work," said the founder of a rival LLM company. "He took his best people with him from the hedge fund to DeepSeek."
After Washington banned Nvidia from exporting its most powerful chips to China, local AI companies were forced to find innovative ways to maximize the computing power of a limited number of onshore chips, a problem Liang's team already knew how to solve.
"DeepSeek's engineers know how to unlock the potential of these GPUs, even if they are not state of the art," said one AI researcher.
Industry insiders said that DeepSeek's singular focus on research makes it a dangerous competitor, because it is willing to share its breakthroughs rather than protect them for commercial gain. DeepSeek has not raised money from outside funds, nor has it made significant moves to monetize its models.
"DeepSeek is run like the early days of DeepMind," said one AI investor in Beijing. "It is purely focused on research and engineering."
Liang, who is personally involved in DeepSeek's research, uses proceeds from his hedge fund trading to pay top salaries for the best AI talent. Along with TikTok owner ByteDance, DeepSeek is known for offering the highest pay available to AI engineers in China, with staff based in offices in Hangzhou and Beijing.
"DeepSeek's offices feel like a university campus for serious researchers," said the business partner. "The team believes in Liang's vision: to show the world that they can be creative and build something from scratch."
DeepSeek and High-Flyer did not respond to requests for comment.
Liang has styled DeepSeek as a uniquely "local" company, staffed with PhDs from top Chinese schools such as Peking, Tsinghua and Beihang universities rather than experts from American institutions.
In an interview with domestic media last year, he said that his core team "did not have people who returned from overseas. They are all local... We have to develop the top talent ourselves." DeepSeek's identity as a purely Chinese LLM company has won it praise at home.
DeepSeek claims that it used just 2,048 Nvidia H800 chips and $5.6 million to train a model with 671 billion parameters, a fraction of what OpenAI and Google spent to train models of comparable size.
Ritwik Gupta, an AI policy researcher at the University of California, Berkeley, said DeepSeek's recent model releases showed that "there is no moat when it comes to AI capabilities."
"The first person to train models has to expend a lot of resources to get there," he said. "But the second mover can get there cheaper and more quickly."
Gupta added that China has a larger talent pool of systems engineers than the United States who understand how to make the best use of computing resources to train and run models.
Industry insiders said that although DeepSeek has shown impressive results with limited resources, it remains an open question whether it can stay competitive as the industry evolves.
Returns at High-Flyer, its backer, lagged in 2024, which a person close to Liang blamed on the founder's attention being mainly focused on DeepSeek.
Its American competitors are not standing still. They are building large "clusters" of Nvidia's next-generation Blackwell chips, creating computing power that may once again open up a performance gap with Chinese competitors.
OpenAI said it is establishing a joint venture with Japan's SoftBank, called Stargate, with plans to spend at least $100 billion on AI infrastructure in the United States. Elon Musk's xAI is massively expanding its Colossus supercomputer to contain 1 million GPUs to help train its Grok AI models.
"DeepSeek has one of China's largest advanced computing clusters," said Liang's business partner. "They have enough capacity for now, but not for much longer."
Additional reporting by Wenjie Ding in Beijing