The rapid growth of the livestreaming economy keeps pushing the format forward. After two years of market maturation, the "virtual anchor" favored by Generation Z consumers has entered a stage of explosive growth.
On the major livestreaming platforms, personalized 3D virtual anchors of many styles and types are taking the place of human anchors in live rooms. Virtual anchors have given the livestreaming business a real boost: they can stay online 24 hours a day with stable, consistent performance; they can change their look at any time to stay fresh; and, with the technology in place, they can be operated without a large team.
Kuaishou StreamLake creates a virtual employee "milk thought" for Mengniu
Casual chat, game interaction, product explanation, emotional companionship, and more: the application scenarios for virtual anchors are many and varied. They serve the culture and entertainment industry and also enable efficient interaction in other industries such as FMCG, education, finance, and communications.
Virtual anchors not only strengthen a brand's image and expression but also serve the brand as digital employees, taking over human work in some areas. As a result, demand is growing for the service capabilities of the "digital employees" that virtual anchors typify.
How can a virtual anchor answer users' questions fluently?
How can the live streaming results of virtual anchors be improved?
How can the design and operating costs of virtual anchors be reduced?
Faced with these questions, the maturing of AIGC technology and the emergence of ChatGPT may offer more ways forward.
From "anthropomorphic" to "human-like": the path of advancement for virtual anchors
The "2022 Virtual Digital Human Comprehensive Assessment Index Report" summarizes the three stages of virtual digital human development:
The first stage: anthropomorphism. Highly realistic 3D animated characters are synthesized by computer, with movements, appearance, and voice that match those of real people, and virtual humans begin to be driven by AI to communicate and respond in real time.
The second stage: human-likeness (sometimes rendered literally as "fandom"). The simulation moves from outward appearance to emotional interaction, with emotion algorithms enabling high-quality emotional interaction with humans.
The third stage: superhumanization. The abilities of virtual humans surpass those of natural persons, the "virtual" takes physical form, and robots carry virtual human consciousness back into the real world.
Image from the "2022 Virtual Digital Human Comprehensive Assessment Index Report"
At present, virtual anchors are driven in two main ways. One is motion capture plus a human performer: the avatar is shaped by capturing the movements and expressions of the behind-the-scenes performer (the "person inside"), and this is the most mainstream form of virtual anchor. The other is AI-driven, which allows uninterrupted 24-hour live broadcasting.
Functionally, a motion-capture virtual anchor can answer questions during interaction, but it still has to be performed by a person and cannot offer round-the-clock service. An AI-driven virtual anchor can broadcast for unlimited time, but its interactive Q&A follows a knowledge-base template that the brand sets in advance, supplemented by some entertainment tied to related campaigns, so its scope of use is relatively limited.
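To make the limitation of template-driven Q&A concrete, here is a minimal sketch of how such a preset knowledge base might behave. Everything in it is an assumption for illustration: the keyword sets, the canned replies, and the names `KNOWLEDGE_BASE`, `tokenize`, and `template_answer` are hypothetical and do not describe any vendor's actual system.

```python
import re

# Hypothetical knowledge-base template: keyword sets mapped to canned replies.
KNOWLEDGE_BASE = [
    ({"price", "cost"}, "The price is shown in the product link in this live room."),
    ({"shipping", "delivery"}, "Orders ship after payment is confirmed."),
]
FALLBACK = "Sorry, I can't answer that yet. Please contact customer service."


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def template_answer(question: str) -> str:
    """Return the canned reply whose keywords overlap most with the question."""
    words = tokenize(question)
    best_reply, best_score = FALLBACK, 0
    for keywords, reply in KNOWLEDGE_BASE:
        score = len(words & keywords)
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply
```

A question such as "How much does it cost?" matches the first entry, while anything outside the preset keywords, for example "How can I get rich?", falls through to the fixed fallback reply. That narrow coverage is the limitation described above.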
How can virtual anchors achieve high-quality emotional interaction with humans while still being online around the clock? AIGC + ChatGPT shows that it may be possible to have both.
AIGC + ChatGPT: how does a "human-like" virtual anchor perform?
In the early hours of March 15, OpenAI released GPT-4, a multimodal pre-trained large model. GPT-4 accepts image and text input and produces text output, and on many professional and academic benchmarks it performs at human level. On the 16th, Microsoft announced an AI service called Copilot, to be embedded in Office applications such as Word, PowerPoint, and Excel for writing text, analyzing data, and generating charts, as well as managing inboxes and drafting replies. On the same day, Baidu officially unveiled Wenxin Yiyan, widely described as a Chinese counterpart to ChatGPT.
Image: screenshot from OpenAI's official website
Internet technology giants in China and abroad have joined the race to build and deploy large AI language models. This will undoubtedly help build a better AI ecosystem and significantly advance AI capabilities, and in the future, digital employees represented by virtual anchors will become an indispensable standard fixture for enterprises.
ChatGPT brings a significant improvement in language ability to virtual anchors:
Embedding ChatGPT is like giving an "anthropomorphic" virtual human a brain. It not only allows more accurate, fluent, and natural expression, but can also provide users with more personalized service through rapid learning and adaptation. With targeted training on relevant data to form a personalized model, virtual humans are expected to be used as AI guides in offline exhibition halls, as online AI anchors, in AIGC rapid short-video systems, and in other scenarios.
AIGC's contribution shows mainly in visuals and sound:
AIGC has already achieved a great deal in audio and video production. On the one hand, functions ranging from voice imitation to voice changing make interaction far more entertaining. On the other hand, customized AIGC tools help creators produce work with more accurate visuals, sound, and motion effects, raising overall quality and greatly improving creative efficiency.
Shiyou Technology digital human "Xiaoqian"
Powered by ChatGPT and AIGC technology, virtual anchors will enter the "human-like" stage. Through continued deep integration with application scenarios across industries, "human-like" virtual anchors will become digital employees and important aids for enterprises seeking to cut costs and raise efficiency: drawing on a systematic enterprise knowledge base, they can provide users with 24/7 service, help enterprises complete repetitive work, and greatly improve the accuracy and service quality of routine tasks.
They are vivid, concrete avatars with flexible, humanlike capacity for emotional interaction. This is a digital revolution in operational tools, and in the near future a low-cost, efficient digital employee will become the norm for enterprises. To miss a breakthrough tool may be to miss an era.
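One way to picture a digital employee that relies on an enterprise knowledge base while speaking through a language model is the sketch below. It assumes a simple design in which matching knowledge-base entries are placed into the prompt before the model is called. The names `ENTERPRISE_KB`, `retrieve`, `build_prompt`, `answer`, and `stub_model` are hypothetical, and `stub_model` only stands in for a real language model; the article does not describe how any actual product wires these pieces together.

```python
from typing import Callable

# Hypothetical enterprise knowledge base: keyword -> approved statement.
ENTERPRISE_KB = {
    "refund": "Refunds are requested from the order page.",
    "warranty": "Warranty terms are listed on the product page.",
}


def retrieve(question: str) -> list[str]:
    """Return every knowledge-base statement whose keyword appears in the question."""
    q = question.lower()
    return [fact for keyword, fact in ENTERPRISE_KB.items() if keyword in q]


def build_prompt(question: str, facts: list[str]) -> str:
    """Combine retrieved facts and the viewer's question into one prompt."""
    context = "\n".join(f"- {fact}" for fact in facts) or "- (no matching entry)"
    return f"Known facts:\n{context}\nViewer question: {question}\nAnswer briefly."


def answer(question: str, generate: Callable[[str], str]) -> str:
    """Ground the language model's reply in the enterprise knowledge base."""
    return generate(build_prompt(question, retrieve(question)))


def stub_model(prompt: str) -> str:
    """Stand-in for a real language model: reports the prompt's line count."""
    return f"[model reply based on {len(prompt.splitlines())} prompt lines]"
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider; in practice `generate` would wrap whichever language model the enterprise has access to.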
Comparison chart: "anthropomorphic" vs. "human-like" virtual anchors
"The final form of driving a virtual digital human is to be AI-driven, that is, to replace the 'person inside' with AI." Ji Zhihui, founder and CEO of virtual content technology service provider Shiyou Technology, believes that virtual anchors can serve as a brand's fixed assets, remain in use over the long term, and accumulate value without limit.
Ji Zhihui revealed that, in addition to connecting to the ChatGPT language model, Shiyou Technology's virtual humans have been specially trained on body movement to improve their fluency and realism in application scenarios. In the future, avatars may be as common as company websites and live rooms, becoming an indispensable window for external publicity.
Shiyou Technology's AI digital human Mulan: the first application of language model technology to a digital human in China
On March 1, 2023, Shiyou (Beijing) Technology Co., Ltd. announced that it had become one of the first ecosystem partners of Baidu's Wenxin Yiyan (English name: ERNIE Bot). Going forward, Shiyou Technology will access the full capabilities of Wenxin Yiyan through Baidu Intelligent Cloud and connect its digital humans to the Wenxin language model, taking a key step toward more intelligent virtual humans. Shiyou Technology has already launched the AI virtual anchor "Mulan", backed by language model technology.
Shiyou Technology AI virtual anchor "Mulan"
"Mulan, how can I get rich?"
"To get rich, you first need planning and patience. You need to set a realistic financial goal and keep working toward it."
The author had an entertaining exchange with Mulan and the audience in her Douyin live room, where the digital human gave smooth, concise, and humorous answers to the varied questions viewers asked. Judging by the live interactive experience, Mulan's intelligence is close to the "human-like" stage, and she can interact with the audience with real emotional quality.
"Mulan" live broadcast screen recording
This is the first deployment of conversational language model technology in a digital human application scenario in China.
In Shiyou Technology's view, digital humans are the UI of AI. From Du Xiaoxiao, who hosted the Baidu Metaverse Song Party in 2022, to Mulan, who can now answer users fluently, today's virtual humans draw on Shiyou Technology's generative AI capabilities, Wenxin's large-scale real-time text generation, and AIGC techniques for converting content among graphics, audio, and video. Together these let a virtual human become a personalized model with intelligent dialogue capabilities, without a large staff to support content production.
By combining Shiyou's digital humans with ChatGPT + AIGC, the barrier between the three-dimensional virtual world and the real world will gradually dissolve, giving way to integration and interaction, and technology will bring the real world a more diverse and imaginative future.
This will also bring disruptive changes to the digital world.
According to survey data, in 2022 nearly 70% of enterprises said their live broadcast frequency had increased further, and 49.0% of enterprise customers said it had increased significantly.
Corporate livestreaming has long since grown from an early necessity for businesses forced online into an important lever that runs through scenarios such as medical care, education, finance, internal training, and external marketing, and that drives the digitalization of enterprises.
It reduces marketing costs and enriches marketing methods; breaks the limits of time and space to establish broad communication with users; builds private-domain traffic and promotes conversion; and collects, retains, and connects user data. As an innovative application of the digital intelligence era, the new generation of technical creation capabilities represented by virtual anchors has overturned the traditional model of enterprise operations and marketing.
A virtual human has long been more than an avatar; it is a digital asset for a business. Whoever takes the lead in brand, technology, operations, and scenarios, and builds competitive barriers, can seize the first-mover advantage in the new wave of AI technology and hold an unassailable position.