Phantom Future, Global Chasing Light (15)丨Will Androids Dream of Electric Sheep? What Will Digital Humans Evolve Into? An Interview with SenseTime

Author: Red Star News

Do androids dream of electric sheep? That is the title of a science fiction masterpiece by "science fiction genius" Philip K. Dick, and it is also humanity's inquiry into, and imagination of, artificial intelligence.

From October 18 to 22, 2023, the 81st World Science Fiction Convention will be held in Chengdu. On the eve of the convention, Red Star News and Daily Economic News jointly launched "Phantom Future, Global Chasing Light," a large-scale series of media interviews pursuing the light of science, technology, and dreams shared by different civilizations as science fiction turns into reality.

SenseTime, a Chinese AI software company, will also take part in the World Science Fiction Convention. Its SenseTime Ronin application platform is built around digital human video generation and offers a range of AI generation capabilities, including text generation, speech generation, motion generation, image generation, and NeRF. Red Star News recently interviewed Luan Qing, General Manager of the Digital Entertainment Division of SenseTime's Digital Space Business Group, to discuss the present and future of artificial intelligence.

Luan Qing. Photo courtesy of the interviewee

Rising to a philosophical level,

it's hard to say whether robots will become self-aware

Red Star News Reporter: Will AI digital humans dream of electric sheep?

Luan Qing: This question is quite science-fictional. As I understand it, today's large models, and the whole series of AI technologies that simulate the human brain, are generally believed not to have produced self-awareness yet; they are the aggregation and deduction of data rather than any form of self-awareness.

Rising to a philosophical level, what is self-awareness? In essence, it is what the brain's structure deduces after processing information. From that perspective, it is hard to say whether robots will develop self-awareness. The physical structure of artificial intelligence simulates the brain, and its electrical signals may one day operate in a similar way, so we cannot say this will never happen. But for now, AI exists to serve human purposes.

Red Star News Reporter: Digital humans, virtual humans, bionic humans: what are the technologies behind these names?

Luan Qing: Digital human technology covers several aspects. One is human-computer interaction: the digital human speaks, moves, and expresses itself the way a human does, simulating the perception and experience of interacting with a person. This involves two main technologies: producing humanoid video, and using AI to generate a human voice.
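
To make that division of labor concrete, here is a minimal Python sketch of such a two-stage pipeline, one stand-in for the voice model and one for the video model. Every function and identifier here (synthesize_speech, render_talking_head, and so on) is a hypothetical placeholder for illustration, not SenseTime's actual API.

```python
# Minimal sketch of the two technologies described above: one model turns
# text into a human voice, another turns that voice plus a reference
# likeness into humanoid video. All names are hypothetical placeholders.
from dataclasses import dataclass


@dataclass
class DigitalHumanClip:
    audio_path: str  # the AI-generated speech track
    video_path: str  # the lip-synced humanoid video


def synthesize_speech(text: str, voice_id: str) -> str:
    """Stand-in for a real speech-generation model."""
    return f"/tmp/{voice_id}.wav"


def render_talking_head(audio_path: str, avatar_id: str) -> str:
    """Stand-in for a real humanoid-video-generation model."""
    return f"/tmp/{avatar_id}.mp4"


def generate_clip(text: str, voice_id: str, avatar_id: str) -> DigitalHumanClip:
    audio = synthesize_speech(text, voice_id)      # step 1: human voice
    video = render_talking_head(audio, avatar_id)  # step 2: humanoid video
    return DigitalHumanClip(audio_path=audio, video_path=video)


clip = generate_clip("Welcome to the live stream!", "anchor_voice", "anchor_face")
print(clip.video_path)
```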

Beyond human-computer interaction, the other technology is the simulated brain, which will attract more and more attention in the future. Beyond being anthropomorphic, the digital human's brain is very powerful, with more computing power than an ordinary human brain. It can naturally perceive people's feelings, process and compute information, give the best response, and even provide emotional value.

Red Star News Reporter: SenseTime divides digital humans into five levels, L1 to L5, and collectively refers to L4 and L5 digital humans as "AI digital humans." What is the most complex interaction that SenseTime's digital humans can complete today, and what is the technical difficulty behind it?

Luan Qing: At present, the most common use of digital humans is as the interface module for human-computer interaction: generating videos, hosting live streams, and presenting information and content in a humanized way.

With the breakthrough of large models, we have now arrived at the "assisted driving" stage. Because the content generated by a large model still needs to be reviewed and adjusted by a person, it is "assisted driving" rather than "automatic driving." That puts it between L3 and L4: the model produces complete content, but the content still needs fixing. Short video and live streaming, the most common use today, sits between L3 and L4 and is the largest application.
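
A rough illustration of where that mandatory human step sits in an L3-L4 workflow; both functions are hypothetical stand-ins, not any real product's pipeline.

```python
# Rough sketch of the L3-L4 "assisted driving" workflow described above:
# the model produces complete content, but a person still reviews and
# fixes it before release. Both steps are hypothetical stand-ins.
def model_draft(prompt: str) -> str:
    return f"[draft script for: {prompt}]"  # stand-in for a large-model call


def human_review(draft: str) -> str:
    # The step that separates L3-L4 from true "automatic driving":
    # a person checks the draft and corrects anything wrong.
    return draft.replace("[draft", "[approved")


script = human_review(model_draft("short video for a signature set meal"))
print(script)  # [approved script for: short video for a signature set meal]
```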

Another application, in customer service scenarios, is closer to the L4 stage, reaching information-level interaction. For example, open the ICBC app and switch to digital human mode, and every service can be handled by interacting directly with the digital human customer service inside the app. The experience of that scenario is L4, but there is still a gap in intelligence, so for digital humans to truly reach L4, and eventually develop toward L5, technological breakthroughs are needed.

Granted, today's large models are much more capable than before: they used to be clumsy, and now they are very smart. But emotional interaction, providing emotional value, is still weak; the communication is not yet so natural that it cannot be distinguished from a human's.

There are three things this technological breakthrough requires. The first is that digital humans need to be integrated more deeply with specific industries. The knowledge, habits, and technical information of an industry domain need specialized large models to be understood.

Beyond opening up data, the second step is opening up interfaces. For example, even if the system understands what needs to be done, can it actually do it? Take applying for a credit card: if there is no interface connected to the bank's card-issuing system, you cannot get a physical credit card. That requires the interface to be opened.
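
A minimal sketch of what "opening the interface" means in practice: the model can understand a request, but it can only complete the business operation once a backend interface is actually registered. The intent name and bank handler below are invented for illustration.

```python
# Sketch of the "open the interface" point: understanding a request is not
# enough; the digital human also needs a connected backend interface to act.
from typing import Callable, Dict

ACTIONS: Dict[str, Callable[[dict], str]] = {}  # connected interfaces


def register_action(name: str, handler: Callable[[dict], str]) -> None:
    ACTIONS[name] = handler


def handle_request(intent: str, params: dict) -> str:
    if intent not in ACTIONS:
        # Understood, but no interface is connected: the card cannot be issued.
        return f"Sorry, '{intent}' is not connected to a backend interface yet."
    return ACTIONS[intent](params)


print(handle_request("apply_credit_card", {"name": "Zhang San"}))  # fails

# After the bank opens its card-issuing interface, the same request succeeds:
register_action("apply_credit_card",
                lambda p: f"Credit card application submitted for {p['name']}.")
print(handle_request("apply_credit_card", {"name": "Zhang San"}))
```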

With those two things done, there is still more to consider. For example, digital humans can now give medical advice, but they cannot actually prescribe medicine; in terms of authority and responsibility, that logically cannot be allowed. In some industries, digital humans can only give advice; they cannot practice.

The industry has now reached the level of hundreds of billions of parameters, and GPT-4 may have reached the trillion level, at which digital humans can interact more naturally in terms of emotional value. How this stage is to be reached is not yet clear, whether by modifying the network structure or by increasing computing power and the number of network nodes; that is the core breakthrough point still being studied.

Red Star News Reporter: These hundreds of billions or trillions of parameters, do they refer to the density of the data?

Luan Qing: It is the number of nodes in the model, which can be thought of as the neurons of a simulated brain; the human brain should be at the trillion level. So in theory, today's GPT-4 has reached the parameter scale of the human brain. But in terms of intelligence, there is still a gap with the human brain.
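
For a concrete sense of what that node count is, the toy PyTorch snippet below counts a model's trainable parameters in the standard way; the tiny layer sizes are purely illustrative.

```python
# The parameter counts discussed here are the model's trainable weights,
# loosely analogous to the brain's connections. A toy network, counted
# the standard way; the layer sizes are purely illustrative.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 4096),  # 1024*4096 weights + 4096 biases
    nn.ReLU(),
    nn.Linear(4096, 1024),  # 4096*1024 weights + 1024 biases
)

total = sum(p.numel() for p in model.parameters())
print(f"{total:,} trainable parameters")  # ~8.4 million here; frontier
# large models have hundreds of billions to (reportedly) trillions
```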

After the large model breakthrough,

a dozen seconds of footage can make a digital human

Red Star News Reporter: SenseTime has said that AI digital humans are mainly applied in three directions: virtual idols, virtual customer service, and super assistants.

Luan Qing: Those three scenarios were the most common applications of digital humans in previous years. Today, in fact, the biggest applications of digital humans are short videos and live-stream content generation.

Many short videos now are made with digital humans without viewers knowing it: for example, the female anchor presenting Burger King's signature set meal in a live-stream room, or short videos recruiting electricians. Professionals, too, lawyers, doctors, and teachers, use digital humans to generate content.

Red Star News Reporter: What technological upgrades have made digital humans so much more widely used?

Luan Qing: After the emergence of large models, the core value is that digital humans can be mass-produced; production has become very simple.

Four or five years ago, making a digital human required a relatively large amount of data, generally more than ten hours of video footage, which also had to cover multiple angles and actions, and the result would still look stiff once finished. At the time, many TV stations used digital human anchors in daily news reports, especially breaking news coverage, where they were very valuable. But the difficulty and cost of production meant they could not be extended to general marketing scenarios, and it was hard to achieve economies of scale.

Now, after the large model breakthrough, producing a digital human has become much easier: a digital human can be made from a dozen seconds of footage. The technology has kept improving over the past two years. Last year and the year before, it took three to five minutes of footage; this year it takes one or two minutes, or even just tens of seconds.

Red Star News Reporter: What do customers hope to see improved in the digital humans SenseTime provides to various industries?

Luan Qing: There are many demands. On the one hand, richer expressiveness; on the other, running on lighter-weight devices.

On expressiveness: can it move freely? Can it dance? Can actions that were never captured be made richer? Can the digital human be generated directly by AI, without hiring a real model, so that there is no copyright problem?

Lately people often ask whether digital humans can be made to run on any device. Many still run on high-end hardware or in the cloud, which customers find too expensive; can they run on the customer's own phone?

The technical support behind this includes chip adaptation and performance optimization. Turning technology into products means continuously applying it to more scenarios and more complex conditions. Ultimately it is a test of the complexity of AI video generation, which I think is the next hurdle artificial intelligence has to clear.

Red Star News Reporter: Imagining the future, what can we expect digital humans to evolve into within five years?

Luan Qing: Film directors now often ask me: when will digital humans be able to take a script and generate a finished film?

Some of today's so-called digital human stars are really just "face swaps": a human gives the performance, and the face is replaced against a green screen. That doesn't actually save costs; it is a gimmick. What the industry should really do is make some content entirely with AI, shortening production time and reducing the cost of trial and error.

At present, movie-grade digital humans still face great challenges. We are making preliminary attempts with some stars and have found promise in short videos and short dramas, but truly high-quality footage has not yet seen a breakthrough. Right now we are working on animated films: through artificial intelligence, live-action content can be converted into animation in a specific style, which I think is the most promising direction in the short term.

Red Star News reporters Cheng Luyang and Yu Yao

Edited by Yu Dongmei
