On May 14, OpenAI's press conference unveiled its latest flagship model, GPT-4o, demonstrating how rapidly AI capabilities continue to grow. Amid this rapid development of artificial intelligence and the steady emergence of large models, the iFLYTEK Spark (Xinghuo) large model marks the first anniversary of its release. Over the past year, iFLYTEK Spark has brought many surprises and changes to virtual humans.
(Image generated by iFLYTEK Xinghuo)
Virtual humans are essentially digital simulations of people, characterized by three elements: appearance, behavior, and thought. Realizing these features depends on integrating a series of advanced technologies such as image recognition, 3D modeling, motion capture, natural language processing, and computer vision. At present, the Spark large model empowers virtual humans in the following ways.
(1) Lighter avatar customization: built in seconds
Relying on the Spark model, iFLYTEK has launched a "voice and avatar construction in seconds" feature, enabling rapid production of a voice and visual likeness.
The "voice & avatar construction in seconds" feature page in iFLYTEK Zhizuo
The AI algorithm needs less than 10 seconds to extract appearance features, voice characteristics, and other elements, after which the system generates a personal "digital clone" in a very short time. It also supports both self-service and standard training of avatar models to meet the needs of virtual humans in different application scenarios.
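The workflow described above, extracting features from a short sample and then assembling a reusable digital likeness, can be sketched roughly as follows. This is a toy illustration under assumptions: the function names, the four-dimensional embedding size, and the averaging "extractor" are all invented for demonstration and bear no relation to iFLYTEK's actual API.

```python
from dataclasses import dataclass

@dataclass
class DigitalClone:
    appearance_features: list  # embedding extracted from a short video (illustrative)
    voice_features: list       # embedding extracted from a short audio clip (illustrative)

def extract_features(samples: list, dim: int = 4) -> list:
    """Toy stand-in for the feature extractor: average samples into a fixed-size embedding."""
    chunk = max(1, len(samples) // dim)
    return [sum(samples[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]

def build_clone(video_frames: list, audio_samples: list) -> DigitalClone:
    # The real system reportedly needs under 10 seconds of input material for this step.
    return DigitalClone(
        appearance_features=extract_features(video_frames),
        voice_features=extract_features(audio_samples),
    )

clone = build_clone(video_frames=[0.1] * 16, audio_samples=[0.5] * 16)
print(len(clone.appearance_features))  # → 4
```

The point of the sketch is the shape of the pipeline, not the math: a short media sample goes in once, a compact feature bundle comes out, and everything downstream (rendering, speech synthesis) reuses that bundle.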
Virtual anchor "An Xiaojia", generated from a real anchor
Virtual host "Xiaojun", generated from a live streamer
The "digital twin" of Professor Wang Jinhuan of Heilongjiang University of Traditional Chinese Medicine
Scenarios as varied as education and training, media communication, technology services, customer-service guidance, and short-video production each have different content needs, and iFLYTEK can meet them well.
(2) More lifelike behavior: hyper-anthropomorphic voice + AI-generated motion
The Spark voice model released on January 30 achieves hyper-anthropomorphic dialogue: its output approaches the spoken register of everyday human speech, with paralinguistic abilities such as breathing, sighing, varying speech rate, pausing to think, stressing or de-stressing words, and filler words ("um", "ah"). In addition, the large model's emotion recognition exceeds 85% accuracy, letting it express emotions such as happiness, apology, playfulness, and confusion more vividly.
The hyper-anthropomorphic voice is now live on iFLYTEK Zhizuo with five male and female voice personas: "Ling Xiaoqi", "Ling Xiaoshan", "Ling Yuyan", "Ling Yuzhao", and "Ling Feizhe". Whether for casual chat or complex, specialized Q&A consultation, these voices convey personality and emotion more effectively.
Hyper-anthropomorphic voice makes content more realistic
Beyond voice, motion is another key element of virtual-human interaction. Supported by large-model technology, the system can deeply understand the semantics of a text and automatically match and generate corresponding actions, making virtual-human movement more natural, fluid, and lifelike.
Diverse postures and richer scenes
AI-generated actions make interactions more natural
iFLYTEK has now launched a variety of virtual-human avatars that support AI-generated actions and come with matching scene-based video templates, bringing the content closer to a real-world setting.
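As a rough illustration of "understand the text, then match an action": one minimal approach is keyword overlap against a small tagged action library. The actions and trigger words below are invented for demonstration; per the description above, the production system uses the large model itself for semantic matching rather than keywords.

```python
import re

# Illustrative action library: each avatar action is tagged with trigger words.
# These tags are assumptions for demonstration, not iFLYTEK's actual vocabulary.
ACTION_LIBRARY = {
    "wave": {"hello", "welcome", "goodbye"},
    "point": {"look", "here", "this"},
    "nod": {"yes", "agree", "exactly"},
}

def match_action(text: str) -> str:
    """Pick the action whose trigger-word set overlaps the sentence the most."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    best = max(ACTION_LIBRARY, key=lambda a: len(ACTION_LIBRARY[a] & words))
    # Fall back to an idle pose when nothing matches.
    return best if ACTION_LIBRARY[best] & words else "idle"

print(match_action("Hello and welcome, everyone!"))  # → wave
print(match_action("The weather is nice today."))    # → idle
```

A large model replaces the keyword sets with semantic understanding, so "greet the audience" would still map to a wave even though none of the literal trigger words appear.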
(3) Interaction with a "brain" and "awareness": the virtual-human intelligent interactive machine evolves again
Upgraded virtual interaction means that communication between users and virtual humans becomes more natural, efficient, and intelligent.
The virtual-human intelligent interactive machine is a device that integrates advanced speech recognition, natural language processing, and machine learning. Backed by the Spark large model, it keeps upgrading its perception, semantic understanding, and emotional expression, making "face-to-face" communication and Q&A between virtual humans and users more effective and open.
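The machine's flow as described, speech recognition, then semantic understanding, then a spoken reply, can be sketched as three chained stages. Every stage below is a placeholder stub (a tiny canned FAQ stands in for the Spark model), purely to show how the stages connect.

```python
def recognize_speech(audio_transcript: str) -> str:
    # Stub ASR: we assume the audio has already been transcribed to text.
    return audio_transcript

def understand(text: str) -> str:
    # Stub NLU: a canned FAQ stands in for the large model's reasoning.
    faq = {"where is the exit": "The exit is to your left."}
    return faq.get(text.lower().rstrip("?!. "), "Let me check that for you.")

def synthesize(text: str) -> str:
    # Stub TTS: tag the reply instead of producing audio.
    return f"[TTS] {text}"

def interact(audio_transcript: str) -> str:
    """One turn of the kiosk loop: hear -> understand -> speak."""
    return synthesize(understand(recognize_speech(audio_transcript)))

print(interact("Where is the exit?"))  # → [TTS] The exit is to your left.
```

In the deployed machine, each stub is replaced by a real component (microphone-array ASR, the Spark model, neural TTS), but the turn-by-turn loop keeps this same shape.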
The intelligent interactive machine is already widely used in fields such as finance, government affairs, cultural tourism, commerce, and exhibitions. It can be seen at scenic spots like the Old Summer Palace, Mingzhongdu, and Luogang Park, and at major events such as the National Two Sessions, the Beijing Winter Olympics, and the Chengdu Universiade.
The virtual tour guide at the Mingzhongdu Ruins Park guides visitors around the site
The virtual tour guide at the Old Summer Palace Ruins Park popularizes knowledge in an endearing way
iFLYTEK created "Xiaofu", the virtual volunteer of the Chengdu Universiade
The virtual human intelligent interactive machine was unveiled at the 2023 World Artificial Intelligence Conference
"Aijia", the Beijing Winter Olympics virtual volunteer, handled multilingual interactive inquiries
The advanced Spark model brings an all-round improvement to virtual humans: not only in external appearance, language, and motion, but also in upgraded interaction ability and an enhanced sense of "autonomous consciousness", leading virtual humans toward a "new consciousness".
As a representative of new quality productive forces, iFLYTEK has consistently pursued the "AI+" practice, so that virtual humans can become partners to people.