This article was first published on Toutiao (Today's Headlines); if it appears under other accounts, it has been reposted.
With the popularity of ChatGPT and the rise of large AI models, a wave of entrepreneurship has swept across China.
However, once the enthusiasm cools, the teams that truly take root are the enterprises and research institutions that invested early. Established Internet companies such as Baidu, Alibaba, and Tencent, along with universities and research organizations such as Zhipu Huazhang, Zhiyuan, and Fudan, have been preparing for years.
Zhihu's answer-summarization feature is likewise powered by Zhihaitu; these are veteran teams tempered by time.
As a company whose core business is smartphones, the products vivo releases are likewise the result of long-term accumulation.
Mobile applications have become an important front in the development of large models. The best-known example is OpenAI's ChatGPT, which builds on the Transformer architecture Google released in 2017 and has made major contributions to natural language processing.
Domestic large models started slightly later, around 2019. Technology companies such as Huawei, Alibaba, Baidu, SenseTime, and Inspur have successively released large models such as "Wenxin Yiyan", "Tongyi Qianwen", "Hunyuan", and "Pangu".
vivo also began researching large models during this period and has lived through their development and explosion. Parameter scale is a fundamental determinant of a large model's accuracy.
When the number of parameters and the amount of training data reach a certain scale, model accuracy improves markedly. At the same time, large models can automatically learn and discover new, higher-level features and patterns, spanning language understanding, generation, and logical reasoning.
vivo therefore chose to train two models with more than 100 billion parameters to endow them with these capabilities. Computing power is the key factor determining a large model's processing capacity.
Both training and inference for large models demand substantial compute. To economize, a common practice is to compress the model through pruning and distillation, turning a large, complex pre-trained model into a compact one.
vivo chose to train 100-billion-parameter models first, then derive models with 66 billion, 7 billion, and 1 billion parameters to meet the needs of different scenarios. Applications that run large models directly on phones are also advancing rapidly.
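The article does not detail the pruning-and-distillation route, but the core of distillation can be sketched in a few lines: a compact student model is trained to match the temperature-softened output distribution of the large teacher. This is a minimal, generic sketch; the function names and temperature value are illustrative and say nothing about vivo's actual implementation.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # distribution, exposing the teacher's knowledge about near-miss classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    # KL divergence between the teacher's and student's soft targets.
    # Minimizing this trains the compact student to mimic the large teacher.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher's distribution exactly, and grows as the two distributions diverge; in practice it is combined with the ordinary cross-entropy loss on the true labels.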
Google unveiled a mobile-capable PaLM 2 variant at this year's I/O developer conference.
At the same time, smartphone SoC vendors such as Qualcomm and MediaTek have begun to support large-model workloads, laying the hardware foundation for running large models on phones.
Large models can upgrade voice assistants across the board: with a large model behind them, assistants understand semantics more accurately and hold more coherent multi-turn conversations.
In image recognition, large models can likewise deliver better capture and bokeh effects, improving the phone photography experience. They can also be applied to education, helping children with homework and offering multiple ways to solve a problem.
Large models can also be linked with smart-home systems to improve the experience of smart-home devices. Although on-phone large-model applications are not yet widespread, their potential is enormous.
Bringing large-model capabilities into consumers' daily lives is what would realize artificial intelligence in the true sense. vivo started from a 175-billion-parameter model and, through pruning and distillation, derived 66-billion-, 7-billion-, and 1-billion-parameter models to fit the memory and compute limits of phones.
Running large models on phones faces two main problems: memory and power consumption. A phone's memory is limited, while a large model occupies a substantial share of it.
To solve this, vivo combines device and cloud, dispatching computation to the device or to the cloud according to the complexity of the request. Power consumption is the other major constraint on the phone.
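The device-cloud split described here can be sketched as a simple router: cheap requests go to the compact on-device model, expensive ones to the cloud. The threshold, the `Request` fields, and the routing rule below are hypothetical, chosen purely for illustration; a real system would estimate complexity from the prompt and current device load.

```python
from dataclasses import dataclass

# Hypothetical cutoff: prompts longer than this go to the cloud model.
ON_DEVICE_MAX_TOKENS = 128

@dataclass
class Request:
    prompt: str
    needs_long_context: bool = False

def route(request: Request) -> str:
    """Route simple requests on-device and complex ones to the cloud."""
    too_long = len(request.prompt.split()) > ON_DEVICE_MAX_TOKENS
    if too_long or request.needs_long_context:
        return "cloud"   # the large cloud model handles complex queries
    return "device"      # the compact on-device model: lower latency, privacy
```

A short command like "set a timer" would stay on-device, while a long document-summarization request would be sent to the cloud; this is the trade-off the device-cloud combination is meant to manage.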
To reduce power consumption, vivo first designed a large, full-featured model, then pruned and distilled it as needed into models suitable for phones. Large models also give a strong push to the open-source community.
The open-source community already hosts many innovation ecosystems, and the addition of large models will enrich them further.
If vivo supports open-source community building, its 7-billion-parameter model may find richer and more efficient applications.
In general, large-model applications will become part of the Internet's infrastructure. As the electronic device people carry everywhere, the phone matters greatly to how large models are applied.
vivo sits in the first echelon of large-model R&D and application, and the model matrix it unveiled at the Boao Forum shows both strength and ambition. Through the device-cloud combination, large models are poised to become a core phone capability, bringing artificial intelligence fully into daily life.