How "smart" can AI be now?

A full-size humanoid bionic robot, 1.77 meters tall and weighing 52 kilograms. Photo / reporter Li Na

At the ongoing 2023 Zhongguancun Forum, "artificial intelligence" is undoubtedly the hottest keyword. Whether in autonomous driving, smart wearables, quantum computing, 5G communication, or even carbon neutrality, many cutting-edge technologies depend on artificial intelligence. Over the next decade, artificial intelligence will continue to reshape every industry and the lives of ordinary people. In the forum's international technology fair conference section, its science fair exhibition section, and the AI-related parallel forums, a Beijing Youth Daily reporter saw that major companies had brought their latest AI achievements, including the universal visual segmentation model SegGPT, 5G audio and video interactive applications, and bilingual digital sapiens.

New applications for 5G communication

Visual self-service brings new interactive experiences

According to the latest data, mainland China has 561 million 5G users and has built and opened a total of 2.312 million 5G base stations, more than 60% of the global total. In the first quarter, the national average 5G download speed was 334.98 Mbps, and the peak download rate was 472.92 Mbps. With speeds this fast and the support of artificial intelligence, what else can 5G be used for besides scrolling social media and everyday work and study?

China Unicom's "5G New Communication Intelligent Interaction Platform", exhibited at the forum, builds on 5G's high bandwidth, low latency, and ubiquitous connectivity. It uses 5G audio and video interaction and atomic AI capabilities, combined with AR and VR, three-dimensional modeling, and intelligent interaction, to deliver audio and video interactive applications as 5G-native services. Drawing on multimedia, 3D modeling, real-time tracking, sensing, and intelligent interaction technologies, the platform offers end-to-end visual and intelligent new communication services, and provides enterprises and governments with functions such as 5G audio and video interaction and intelligent avatars.

For example, on the financial industry version of the platform, users can access bank counter services remotely, with the same experience and privacy protection as an in-person visit. The energy industry version rebuilds the application system for intelligent wellsite linkage, enabling digital management of wellsite resources and digital scheduling of wellsite inspections. The transportation industry version provides barrier-free intelligent communication services based on 5G new communications, including visual, interactive assistance services for elderly passengers.

It is worth mentioning that, in terms of domestic adaptation and independent controllability, the platform has been adapted to domestic systems and mobile phone chips. The platform side supports domestic systems, while the handset side is adapted to Huawei Kirin and MediaTek Dimensity chips and supports domestic phones from Huawei, Xiaomi, OPPO, VIVO, and Meizu.

"Digital Homo sapiens" are smarter

Integration with large models lets them "understand what you say"

Digital sapiens, simply put, are virtual humans. With an anthropomorphic appearance and artificial intelligence at their core, they have begun to be commercialized in many industries, assisting human staff and improving enterprise operating efficiency. For example, in finance, cultural tourism, media, public services, medical care, and retail, digital sapiens can serve as customer service agents, financial advisors, news anchors, and tour guides. In cultural and entertainment scenarios, they can become IP assets as virtual idols and virtual singers. In scenarios such as smart vehicles, smart transportation, and smart homes, they can be combined with smart devices to provide users with intelligent services.

Tencent Cloud's small-sample digital sapiens production platform was recently released for the first time. With only 3 minutes of video of a real person speaking to camera and 100 sentences of voice material, the platform can model and generate high-definition portraits in real time from multimodal audio and text input, and produce a "digital sapiens" resembling the real person within 24 hours. Compared with digital humans generated from photos, which show only facial shapes, small-sample digital sapiens can have gestures designed from text and can reproduce a real person's style through lip movements, mouth shapes, and expressions.

At this year's Zhongguancun Forum, Beijing Youth Daily reporters also tried using digital sapiens virtual anchors in place of real anchors on camera, running a 7×24-hour live broadcast service that drew considerable attention from viewers.

In the past, however, digital sapiens were significantly weaker than real people at thinking. The "Zhipu AI Brain Digital Sapiens" launched by Zhipu AI at the Zhongguancun Forum is more intelligent: it is no longer bound to fixed modes of interaction, but can understand the intent behind human instructions. Zhipu AI was spun out of technical achievements from the Department of Computer Science of Tsinghua University. In 2022 the company co-developed GLM-130B, a bilingual pre-trained model at the 100-billion-parameter scale, and led the construction of a high-precision general knowledge graph. It combined the two into a cognitive engine driven by both data and knowledge, and built ChatGLM on this 100-billion-scale base model. According to the company, the cognitive large model connects hundreds of millions of users in the physical world, empowers metaverse digital humans, serves as the base for embodied robots, and gives machines the ability to "think" like a human. In addition, it is a bilingual digital sapiens that can speak both Chinese and English.

"Unmanned" on the streets

The latest pedestrian prediction model is on the horizon

In Yizhuang and other places, Baidu's self-driving vehicles can already be hailed. In the future, as the technology develops and policies allow, the safety officers on board will be withdrawn and the vehicles will become truly driverless.

According to Baidu, the core of its driverless technology is the "Baidu Automotive Brain Apollo Platform", which comprises four modules: high-precision maps, positioning, perception, and intelligent decision-making and control. The latest version of Apollo introduces multiple deep learning-based models, releases a low-speed pedestrian prediction model based on semantic maps, and adds imitation learning based on semantic maps.

At the Zhongguancun Forum, Megvii Technology released a self-developed intelligent pallet four-way shuttle system. As a discrete device in a flexible logistics system, the Megvii intelligent pallet four-way shuttle allows "one vehicle to run the whole warehouse". Why "flexible logistics"? Megvii said the first reason is that the system combines discrete equipment with distributed control, so user enterprises can combine and deploy it as needed, like building blocks. The second is the "dynamic scalability" of the whole system: user enterprises can add or remove four-way shuttles at any time in response to changes such as peak and off-peak seasons or business growth, so as to improve the system's handling capacity.

Smart cities go lower-carbon

An AI "housekeeper" manages water, electricity, and air conditioning

In the construction of smart cities, AI plays an increasingly important role. For example, AI can be used for infrastructure management in cities, such as automatically monitoring the structural health of roads, bridges, and buildings, as well as detecting and repairing cracks and potholes in roads; AI can help cities manage energy, for example by analyzing energy use data to achieve more efficient energy use, and optimizing cities' energy systems; AI can also help cities protect the environment, for example by improving the quality of their environment through air quality monitoring, waste disposal, and water management.

So how can AI be used to cut carbon emissions from buildings and help meet the goals of carbon peaking and carbon neutrality? Henghua Digital exhibited a carbon management platform based on a "building brain" neural network system. With a view to making full use of clean energy, the platform concentrates on applying cost-effective technical products. Sensing nodes at the building's endpoints and on its main energy-using equipment are coordinated and managed by the building brain's edge computing server, so that the equipment runs efficiently and unnecessary energy waste is eliminated as far as possible. Based on analysis by the edge computing model, the energy consumption curve of each of the building's energy subsystems is kept in a stable operating state, and overall energy consumption is minimized.

Electricity accounts for the largest share of building energy consumption. In view of the characteristics of building low-voltage systems, the company developed a monitoring and AI control system that is compact, measures accurately, and is easy to install, without requiring additional renovation work. It dynamically monitors the building's power system and ensures that power to unoccupied areas is cut off promptly, avoiding unnecessary waste. Air conditioning accounts for 40% of a building's total energy consumption. Through in-depth cooperation with universities to establish industry-university-research bases, Henghua Digital has developed a strategy algorithm for optimizing building cooling and heating source systems and formed a mature data algorithm model, bringing the energy saving rate of the air conditioning system to more than 10%. The project has so far been implemented in Guangdong, Tianjin, Jiangxi, Sichuan, Hubei, Anhui, and other provinces. In the future, residential areas, office buildings, shopping malls, and similar buildings will "evolve" in a green, low-carbon direction.
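To put the two figures above together, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that the 10% saving applies to the whole of the air conditioning load and that every other subsystem's consumption stays unchanged; the function and variable names are hypothetical and do not come from Henghua Digital.

```python
def whole_building_saving(subsystem_share: float, subsystem_saving_rate: float) -> float:
    """Fraction of total building energy saved when one subsystem becomes more efficient.

    Assumes all other subsystems keep consuming exactly what they did before.
    """
    return subsystem_share * subsystem_saving_rate


# Figures quoted in the article.
ac_share = 0.40        # air conditioning as a share of total building energy
ac_saving_rate = 0.10  # lower bound: "more than 10%"

saving = whole_building_saving(ac_share, ac_saving_rate)
print(f"Whole-building saving: at least {saving:.1%}")
```

Under these assumptions, a 10% saving on air conditioning alone corresponds to roughly 4% of the building's total energy use.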

AR glasses "simultaneous interpretation"

Smart wearables help with accessibility

With the integration of artificial intelligence into all aspects of life, devices equipped with artificial intelligence also tend to be miniaturized, such as smart watches that can answer calls, reply to WeChat, and monitor exercise. Smart glasses are shaped like ordinary glasses, and after wearing them, they can make and receive phone calls, listen to music, etc.

However, the smart glasses displayed at the Zhongguancun Forum are more practical. Called "Bright Listener Smart Glasses", they are binocular waveguide AR smart glasses.

VR glasses immerse the wearer in a virtual world, whereas AR glasses do not block the wearer's view. Instead they blend the real world with the virtual one, enabling functions that are not possible in the real world alone. People with hearing impairment often run into difficulties at work, in social settings, and in study because they cannot hear, or cannot hear clearly. These glasses convert sound into text and display it in front of the wearer's eyes. They also offer simultaneous interpretation: they can recognize the languages of different countries and convert them into Chinese or other languages' text for display, helping users follow international conversations with ease. The glasses are light and portable, weighing only 79g compared with the 200-300g of AR glasses currently on the market, which makes them well suited to long-term wear. They can be fitted with lenses for nearsightedness, farsightedness, astigmatism, presbyopia, and so on. No light leaks from the outside of the glasses, so the content is visible only to the wearer and privacy is protected. The glasses also provide millisecond-level real-time subtitles, noise reduction algorithms, and accurate sound pickup within 5 meters, and their translation accuracy can exceed 95%. The product is reported to be ready for mass production.

Privacy-preserving computing technology is open source

It is used in finance, medical insurance and other fields

Privacy computing, also known as privacy-preserving computing, refers to a family of information technologies that analyze and compute on data without the data provider disclosing the original data. It makes data "usable but invisible" as it circulates and is combined, so that the value of the data can be transformed and released. Privacy-preserving computing provides the protection for private data that the industry will need in the future. At the Zhongguancun Forum Exhibition (Science Fair), Ant Group announced for the first time its complete open source layout centered on key basic software. All 9 core technologies are open source, including the privacy computing technology "Hidden Language". In other words, the technology platform is open to users worldwide, who can use product functions directly without writing or calling code, which helps them explore privacy computing application scenarios at low cost.

According to reports, Hidden Language has been applied in scenarios such as finance, medical care, and insurance. For example, Shanghai Pudong Development Bank and Ant Group's Hidden Language platform identified more than 145,000 high-risk users and prevented billions of yuan in high-risk loans. In healthcare, the Ant Privacy Computing Platform and the Alibaba Cloud Digital Healthcare team built a data fusion platform for hospital operation and management. It provides managers with digital performance management analysis, helps hospitals establish refined operation management systems, and reduces hospitals' economic and clinical risks. In insurance, institutions settling a claim used to query medical institutions about the insured's diagnosis and treatment in plaintext (that is, with the data unencrypted), and in the process obtained raw data they did not need. By setting up logical data queries and using privacy computing technologies such as secure multi-party computation, Ant's solution lets insurance companies obtain only the query result, namely whether the claim should be paid, without receiving any of the raw data, which protects the privacy of claimants.
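As a rough illustration of the "usable but invisible" idea, the toy sketch below uses additive secret sharing, one basic building block of secure multi-party computation. It is not Ant Group's protocol or the Hidden Language API; the scenario (three hospitals jointly computing a total bill), the modulus, and all names are assumptions made for the example.

```python
import secrets

PRIME = 2**61 - 1  # assumed modulus; all arithmetic is done modulo this prime


def share(value: int, n_parties: int) -> list[int]:
    """Split a value into n random-looking shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine shares; any incomplete subset looks uniformly random."""
    return sum(shares) % PRIME


def secure_sum(private_values: list[int]) -> int:
    """Each party shares its value with the others; only the total is revealed."""
    n = len(private_values)
    # Party i splits its private value into n shares and sends share j to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the shares it received; this partial sum leaks no single input.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    # Publishing only the partial sums reveals the total and nothing else.
    return reconstruct(partial_sums)


# Hypothetical example: three hospitals each hold one bill for the same patient.
hospital_bills = [1200, 3400, 560]
total = secure_sum(hospital_bills)
print(total)
```

A real claims system would go further, for instance by comparing the hidden total against a policy threshold so that only a yes-or-no answer leaves the computation, but the principle is the same: the parties work on shares, never on one another's raw records.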

Voices

Large models will change the world while developing control technologies

ChatGPT is one of the most talked-about new things in the tech world in 2023, and its release has set off a wave of large language models, with Baidu, Alibaba, Zhihu, SenseTime, JD.com, and other companies launching their own. Another major area of AI, a GPT for vision, also appeared at this year's Zhongguancun Forum: the vision team of the Beijing Academy of Artificial Intelligence (BAAI) officially launched the universal segmentation model SegGPT, described as the first universal visual model to use visual prompts to complete arbitrary segmentation tasks.

According to reports, SegGPT sets aside the conventional language-model approach: the user interacts with the machine through images rather than language. For example, the user gives SegGPT one picture with the "rainbow" circled on it and then supplies many other pictures containing rainbows, and SegGPT automatically identifies the rainbow in each and marks out those regions. SegGPT can be said to "master one and master all": given one or a few example images and intent masks, the model grasps what the user wants and completes similar segmentation tasks in context. SegGPT also works "at a touch": given a point or a bounding box as an interactive prompt on the picture to be predicted, it identifies and segments the specified object in the image. This capability supports many applications. For instance, when a robot manipulator reaches for a tomato or a similar object, the robot can quickly work out where the edge of the tomato is and pick it up accurately without crushing it.

At present, China's large models are in a state of "a hundred flowers blooming and a hundred schools of thought contending". Robin Li, founder, chairman, and CEO of Baidu, said at the Zhongguancun Forum that artificial intelligence has once again become the focus of human innovation, and that more and more people recognize that the fourth industrial revolution is coming. He emphasized: "Big models have changed artificial intelligence, and big models are about to change the world." Dai Qionghai, academician of the Chinese Academy of Engineering and chairman of the Chinese Association for Artificial Intelligence, also said that artificial intelligence will change applications in many areas: new paradigms of scientific research (the origin of the universe, natural laws, and the mysteries of life); people's life and health (AI drug development, remote virtual surgery); the main economic battlefield (virtual creation, industrial manufacturing, spiritual interaction); and major national defense needs (multi-source situation analysis, AI ground-air front deployment), among others.

It is worth noting that, in the face of these changes, some have also sounded warnings. Kai-Fu Lee, chairman and CEO of Sinovation Ventures, said: "AI will still make mistakes, and it will talk nonsense with a straight face. It can only be used to generate a first draft of content or to develop ideas, not as the final version. AI needs continuous human intervention to avoid fallacies or disasters." In addition, AI may raise legal and ethical issues, so it is not suitable for every field and can only be used in applications with a high tolerance for error. Lee emphasized: "AI may create false information and may be used by criminals to deceive users, so while developing it, we also need to study the technologies that control AI and the laws and regulations that govern it."

Text for this edition / staff reporter Wen Jing

Co-ordinator/Yu Meiying
