
LatePost Exclusive丨Yuan Jinhui's new company raises 50 million yuan from Sinovation Ventures, Wang Huiwen, and others

"The first time I started a business, I made a technological impact, and this time I want to prove that good technology can make money. ”

Text丨He Qianming

Editor丨Cheng Manqi

Yuan Jinhui, the founder of OneFlow and a co-founder of Light Years Beyond, recently announced that he is starting a new venture and forming a new company, SiliconFlow.

"LatePost" exclusively learned that Silicon-based Flow has completed an angel round of financing of 50 million yuan, led by Sinovation Works, followed by Yaotu Capital, MiraclePlus, Meituan co-founder Wang Huiwen, etc., with a post-investment valuation of hundreds of millions of yuan.

Yuan Jinhui told LatePost that he and the OneFlow core team began preparing the new venture in August 2023. The new company continues OneFlow's direction, working on the "framework" part of the AI Infra (AI infrastructure) layer.

Just as a computer operating system lets ordinary users operate applications directly with a mouse and keyboard, an AI framework lets developers design or use models simply and conveniently, without worrying about how the underlying computing resources are allocated.

The difference is that OneFlow built a general-purpose training framework serving the production of deep learning models, while SiliconFlow is building an inference framework serving the application of large models.

Yuan Jinhui believes that serving the application of large models lends itself to standardized products more readily than serving their production, and that the market is much larger: model production happens in phases and is dominated by a small number of companies, while large-model applications will spread across all walks of life, everywhere, and be needed by enterprises of every industry and size.

From OneFlow to SiliconFlow, from training to inference

Yuan Jinhui earned his bachelor's degree at Xidian University and in 2003 was admitted to Tsinghua University's Department of Computer Science, where he pursued a doctorate under the supervision of Academician Zhang Bo and did interdisciplinary research spanning machine learning, deep learning, and brain science.

After joining Microsoft Research Asia in 2013, Yuan Jinhui built LightLDA, then the fastest topic-model training algorithm and system, and went on to develop an AI training framework, which became the main direction of OneFlow, the company he founded in 2017.

At that time, open-source frameworks such as Google's TensorFlow and Meta's PyTorch already existed. Yuan Jinhui nevertheless saw an entrepreneurial opportunity because he held a then non-mainstream belief: AI models would become large.

His judgment came from his postdoctoral interdisciplinary research: the human brain has on the order of a hundred billion neurons, while the largest neural networks at the time had no more than 100 million parameters (roughly analogous to neurons). Most training frameworks could handle the model scales of the day, but they were not optimized for huge models.

Yuan Jinhui said that from the start of OneFlow they had been developing a training framework for large-parameter models, adopting the distributed parallel training approach now common in large-model frameworks, which lets a single model be trained efficiently on a large number of GPUs at the same time.

In November 2022, when PyTorch released DistributedTensor, a distributed computing extension, PyTorch co-creator Soumith Chintala also mentioned on social media that it was partly inspired by similar features in OneFlow.
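To make the "distributed tensor" idea concrete, here is a minimal sketch written against PyTorch's DistributedTensor prototype mentioned above. The import paths and launch details are assumptions that have shifted between PyTorch versions, and this is illustrative only, not OneFlow's or SiliconFlow's own API:

```python
# Minimal sketch of the distributed-tensor idea, using PyTorch's
# DistributedTensor (DTensor) prototype. Module paths have moved between
# PyTorch versions, so treat the imports as approximate.
# Assumes a `torchrun --nproc_per_node=<num_gpus>` launch, one process per GPU.
import torch
import torch.distributed as dist
from torch.distributed._tensor import DeviceMesh, Shard, distribute_tensor

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank())

# One logical "global" weight matrix, sharded along dim 0 across all GPUs.
mesh = DeviceMesh("cuda", list(range(dist.get_world_size())))
weight = torch.randn(8192, 8192, device="cuda")
sharded_weight = distribute_tensor(weight, mesh, placements=[Shard(0)])

# Each rank now holds only its slice; the framework inserts the collective
# communication needed whenever an operation touches remote shards.
print(dist.get_rank(), sharded_weight.to_local().shape)
```

The point of the abstraction is that the developer writes against one logical tensor while the framework decides placement and communication, which is what makes training a single model across many GPUs tractable.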

An AI practitioner once told "LatePost" that OneFlow is indeed more convenient for training large models. The frameworks used by large technology companies add another layer of encapsulation and optimization on top of PyTorch or TensorFlow, but those optimizations are customized for existing businesses such as recommendation, not for large models: "What a training framework has to do for a model with hundreds of millions of parameters is very different from what it has to do for one with tens or hundreds of billions." OneFlow had been optimized for large-model training.

At the beginning of 2023, large models drew worldwide attention, and OneFlow's technical vision was validated. In March of that year, OneFlow was acquired by Light Years Beyond. Three months later, Light Years Beyond's founder Wang Huiwen withdrew from the business due to illness, the company was acquired by Meituan, and Yuan Jinhui chose to start a business again.

"There were 40 people when I joined Light Years and 35 people when I started my own business. Yuan Jinhui said that after two changes in 2023, only a few members of the OneFlow core team have left, and the key reason is that the new direction of the reasoning system makes the team feel hopeful and opportunity.

Nvidia and Amazon are both building inference frameworks; there is no shortage of big-company rivals

The opportunity of the inference framework lies first in demand and the market. Several large-model practitioners estimate that if large models are widely adopted, the split of compute consumed by training versus inference will reach 2:8, or even 1:9.

At the same time, this is a relatively fragmented market in which suppliers have more say: only a few companies can produce large models themselves, while far more companies build applications on top of models or use large models directly, and all of them need inference optimization.

The high cost of inference is also a pain point in the industry today, and one that customers are more willing to pay to solve. Traditional software is developed once and replicated endlessly, so marginal cost falls sharply; with large models, every call consumes substantial GPU compute for inference. That is why even the paid version of ChatGPT limits how often users can call GPT-4, to at most 50 times within a 3-hour window.

Inference also has more room for cost optimization than training. "When training large models on a large number of chips, the theoretical upper limit of hardware utilization is a bit over 60%. Nvidia, which does both the hardware and the software, can reach over 50%, and companies optimizing on top of that have at most about ten percentage points of headroom. On the inference side there is several times, even ten times, as much room to cut costs, and the market for inference is many, many times larger than for training," Yuan Jinhui said.

Yuan Jinhui also noted that the resource threshold for building an inference framework is lower, which makes it friendlier to startups. Whereas large-model training requires thousands of GPUs as an experimental platform, an inference framework can be developed and optimized with far less compute. Demand for inference is also more homogeneous, making it easier to build a standard product, so a startup can focus on R&D and reduce customized delivery.

Larger market opportunities and a lower resource threshold also mean more competitors, both large and small:

- Large cloud computing companies develop similar products to sell bundled with AI computing resources, such as Amazon Web Services' SageMaker and Alibaba Cloud's PAI;

- Chip companies also develop inference frameworks to pair with their chips, such as Nvidia's TensorRT-LLM;

- Among startups, there are Lepton AI, founded by Jia Yangqing (one of the authors of the well-known AI training framework Caffe); OctoAI, co-founded by Chen Tianqi (author of XGBoost and of the deep learning compiler TVM); Luchen Technology, founded by You Yang, a professor at the National University of Singapore; and Wuwen Xinqiong, founded by Wang Yu, head of the Department of Electronic Engineering at Tsinghua University;

- There are also open-source inference frameworks from academia, such as vLLM, developed at UC Berkeley.

Yuan Jinhui is not worried about SiliconFlow's technical strength in this field. He said the company's framework is entirely self-developed and is not based on Nvidia's TensorRT-LLM or Berkeley's vLLM, currently the two most mainstream inference frameworks.

According to SiliconFlow, when running inference on the 70-billion-parameter Llama 2 with 8 A800 GPUs, its framework surpasses vLLM and TensorRT-LLM in both throughput and latency, with some metrics up to 10 times those of competing products.
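For context on how such throughput comparisons are typically run, below is a minimal offline measurement sketch against vLLM's public Python API. The model name, prompt set, and GPU count are illustrative assumptions, not SiliconFlow's benchmark setup; the figures in the article come from the company's own tests.

```python
# Hypothetical offline benchmark sketch using vLLM's Python API.
import time
from vllm import LLM, SamplingParams

# Load a 70B model with tensor parallelism across 8 GPUs (illustrative setup).
llm = LLM(model="meta-llama/Llama-2-70b-hf", tensor_parallel_size=8)
params = SamplingParams(temperature=0.0, max_tokens=256)
prompts = ["Summarize the history of deep learning frameworks."] * 64

start = time.time()
outputs = llm.generate(prompts, params)
elapsed = time.time() - start

# Throughput here = generated tokens per second across the whole batch;
# latency would be tracked per request in a fuller harness.
generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / elapsed:.1f} generated tokens/s over {elapsed:.1f} s")
```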

Yuan Jinhui said SiliconFlow has a large group of front-line technical and engineering talent who have worked on AI frameworks for a long time, many of whom left for the relevant teams at large companies and then came back: "In a big company with a large core business, the AI framework is not necessarily a high priority; for us, it is the only thing."

"The first time I started a business, I had a technological impact, and this time I want to prove that good technology can make money"

From its founding in 2017 to its acquisition in 2023, OneFlow did not lose its investors' money, but it also never managed, as some of its foreign counterparts did, to turn a profit and grow rapidly on its own; its commercialization was not a success.

OneFlow's situation is not unique. After 2016, a batch of basic-software startups appeared in China. Most tried the open-source model and raised money easily in their early stages, but most grew increasingly strained, and few made it through successfully.

Before starting this new venture, Yuan Jinhui still wanted to build a basic-software company serving enterprise customers. He consulted friends who had built successful businesses, and the feedback was pessimistic: "I haven't figured out why so many outstanding people in China can't make this industry work, and I don't see that you have any more business insight than others in this respect."

Yuan Jinhui believes he now has a clearer idea of how a Chinese company that wants to sell standardized technology as SaaS (software as a service) should commercialize.

When he decided to start the new company, his first thought was to actively pursue overseas markets. Yuan Jinhui believes that overseas markets, especially the United States, are highly digitalized, and enterprises and developers there have developed the habit of paying for software, nurturing companies worth tens of billions of dollars such as Databricks and Snowflake, so commercialization is less difficult.

In the Chinese market, his strategy is to bundle the software with something customers must pay for anyway, such as hardware or cloud computing resources, and sell them together.

With the experience and lessons accumulated at OneFlow, Yuan Jinhui said, SiliconFlow is confident it can find a commercialization path suited to the broader environment.

In the current funding environment, the time left for SiliconFlow to prove its commercialization is even tighter.

An entrepreneur in the field said that when venture capital firms invest in such basic-software companies, they will put in angel-round money on the strength of the founding team's technology, but subsequent rounds pay more attention to actual revenue and customers, and investors grow increasingly cautious.

SiliconFlow's office is in Tsinghua Science Park, where many of China's large-model and AI companies are clustered. One day last year, Yuan Jinhui went downstairs to eat dumplings at a Xijiade restaurant in the park. As soon as he sat down, he heard the next table discussing: "OneFlow's technology is really good... but in the end it still didn't make money and got acquired."

"Sometimes I wonder if we're doing a bad example? The technical judgment is fine, and we work diligently, but we still can't achieve the success that everyone recognizes. Yuan Jinhui said that this time he decided to start another business, not wanting to let the team scatter and separate, but also to try to prove and realize what he thinks is simple and correct: "Good technology can make money." ”

Title image source: "Forrest Gump"