Bilingual digital humans, AI building housekeepers... How will artificial intelligence affect our lives? | 2023 Zhongguancun Forum

Author: Banliang Finance

At the ongoing 2023 Zhongguancun Forum, "artificial intelligence" is undoubtedly the hottest keyword. From autonomous driving and smart wearables to quantum computing, 5G communication, and even carbon neutrality, many cutting-edge technologies depend on the support of AI. It is fair to say that over the next decade, artificial intelligence will continue to reshape entire industries and the lives of ordinary people.

At the forum's International Technology Trade Conference, the Science Fair exhibition section, and the AI-related parallel forums, a Beijing Youth Daily reporter noticed that major companies had brought their latest AI achievements, including the universal visual segmentation model SegGPT, 5G audio and video interactive applications, bilingual digital humans, AI building housekeepers, and AR glasses that display conversations in real time. These undoubtedly represent the hot spots of the present and the direction of the future. Industry insiders also said that artificial intelligence is profoundly changing this era and will bring about many application changes in the future. However, large models and similar technologies may also create false information, which can be used by bad actors to deceive or brainwash users. Therefore, while developing AI, it is also necessary to study the technologies that control it and the laws and regulations that govern it.

New applications of 5G communication: visual self-service brings a new interactive experience

According to the latest data, 5G users in mainland China have reached 561 million, and a total of 2.312 million 5G base stations have been built and put into operation, accounting for more than 60% of the world's total. In the first quarter, the national average 5G download speed was 334.98 Mbps, with a peak download rate of 472.92 Mbps. With such fast network speeds, coupled with the support of artificial intelligence, what can we do beyond social networking, daily office work, and study?

The "5G New Communication Intelligent Interaction Platform" exhibited by China Unicom this time applies the characteristics of 5G "large bandwidth, low latency, and pan-connection", uses 5G audio and video interaction and AI atomic capabilities, combined with advanced technologies such as AR&VR, three-dimensional modeling, and intelligent interaction, to achieve audio and video interactive applications under 5G endogenous services. The platform uses multimedia, 3D modeling, real-time tracking, sensing, intelligent interaction and other technologies to realize end-to-end visualization and intelligent new communication services, and provide 5G audio and video interaction, intelligent avatar and other functions for enterprises and governments.

For example, the financial industry version of the platform provides innovative capabilities such as visual self-service, visual information collection, and 5G remote counters. Through the platform, users can remotely access bank counter services with an experience comparable to in-person processing, while their personal privacy is protected.

In addition, the energy industry version rebuilds the wellsite intelligent linkage system, enabling digital management of wellsite resources and digital scheduling of wellsite inspections to help energy customers go digital. The transportation industry version, built on the new 5G communication capabilities, provides barrier-free intelligent communication services, offering visual interaction and assistance services for elderly passengers, solving communication problems across multiple scenarios, and providing one-click visual guidance for entering stations.

Notably, in terms of localization and independent control, the platform has been adapted to domestic mobile phone chips: the platform side supports domestically developed operating systems, while the handset side supports Huawei Kirin and MediaTek Dimensity chips and works with domestic phones from Huawei, Xiaomi, OPPO, vivo, and Meizu.

"Digital Homo sapiens" is smarter and can understand your words when integrated with large models

A digital human is, simply put, a virtual human: with an anthropomorphic appearance on the outside and artificial intelligence at its core, digital humans have begun to be commercialized in many industries, assisting human workers and improving enterprise efficiency. For example, digital humans can serve as customer service agents, financial advisors, broadcast hosts, and tour guides in industries such as finance, cultural tourism, media, public services, healthcare, and retail. In cultural and entertainment scenarios, they can become IP assets as virtual idols and virtual singers; in scenarios such as smart vehicles, smart transportation, and smart homes, they can be combined with smart devices to provide users with intelligent services. As the application boundary of digital humans continues to expand, their industrial value is also growing.

Not long ago, Tencent Cloud released its intelligent small-sample digital human production platform for the first time. With only 3 minutes of real broadcast video and 100 sentences of voice material, the platform can model and generate a high-definition portrait in real time from multimodal audio and text input, producing a lifelike "digital human" within 24 hours. Compared with premium 2D live-action digital humans, small-sample digital humans do not require material recorded in a professional studio and cost less; compared with digital humans generated from photos, which only show a facial likeness, small-sample digital humans can design gestures from text and reproduce a real person's style with lip movements, mouth shapes, and expressions.

At this year's Zhongguancun Forum, Beijing Youth Daily also experimented with using a digital human virtual anchor instead of a real anchor to appear on camera and carry out 24/7 live broadcast services, which attracted the attention of many viewers.

In the past, however, digital humans had only simple interaction skills, and their ability to think was far weaker than that of real people. The "Zhipu AI Brain digital human" launched by Zhipu AI at the Zhongguancun Forum is more intelligent: it is no longer limited to fixed modes of interaction, but can understand the intent behind human instructions.

It is reported that Zhipu AI was spun off from the technical achievements of Tsinghua University's Department of Computer Science. In 2022 the company co-developed the bilingual 100-billion-parameter-scale pre-trained model GLM-130B and led the construction of a high-precision general knowledge graph, organically combining the two into a cognitive engine driven by both data and knowledge; on this 100-billion-scale base model it built ChatGLM. Through this cognitive large model, it connects hundreds of millions of users in the physical world, empowers metaverse digital humans, serves as the foundation for embodied robots, and gives machines the ability to "think" like humans. To make a digital human's way of thinking more human-like, ChatGLM draws on the design ideas of ChatGPT, injects code pre-training into the GLM-130B base model, and aligns it with human intent through techniques such as supervised fine-tuning. It is also bilingual, able to converse in both Chinese and English.
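
The article contains no code, but the publicly released ChatGLM-6B checkpoint, a smaller open-source sibling of the ChatGLM line built on the GLM base models, gives a feel for the bilingual, instruction-following behavior described above. A minimal sketch, assuming the Hugging Face transformers library and a CUDA GPU; the model ID and the chat() helper follow the model's published usage, while the prompts are arbitrary examples.

```python
from transformers import AutoTokenizer, AutoModel

# Load the open-source ChatGLM-6B checkpoint (roughly 13 GB of GPU memory in fp16).
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# The model is bilingual: ask in Chinese, then follow up in English within the same dialogue.
response, history = model.chat(tokenizer, "你好，请用一句话介绍一下你自己。", history=[])
print(response)

response, history = model.chat(tokenizer, "Now answer the same question in English.", history=history)
print(response)
```

The history list carries the conversation context forward, which is what lets the model treat the second turn as a follow-up rather than an isolated question.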

Smart cities go low-carbon: an AI "housekeeper" manages water, electricity, and air conditioning

In the construction of smart cities, AI plays an increasingly important role. AI can be used for urban infrastructure management, such as automatically monitoring the structural health of roads, bridges, and buildings and detecting cracks and potholes in roads so they can be repaired. AI can help cities manage energy, for example by analyzing energy-use data to use energy more efficiently and by optimizing urban energy systems. AI can also help cities protect the environment, for example through air quality monitoring, waste disposal, and water management.

So, how can AI be used to reduce carbon emissions from buildings and help achieve the goals of carbon peaking and carbon neutrality? At the Zhongguancun Forum, Henghua Digital demonstrated a carbon management platform based on a "building brain" neural network system.

The platform starts from making full use of clean energy and focuses on applying cost-effective technical products. Sensing nodes are deployed throughout the building's terminals and main energy-consuming equipment and are coordinated by the building brain's edge computing server, so that the building's equipment runs efficiently and unnecessary energy waste is eliminated as far as possible. According to analysis by the edge computing model, the energy consumption curves of the building's subsystems remain in a stable operating state and overall energy consumption is minimized.

Electricity accounts for the largest share of a building's energy consumption. Based on the characteristics of a building's low-voltage systems, and without any additional renovation work, a compact, accurately metered, easy-to-install low-voltage monitoring and AI control system can dynamically monitor the building's power systems, cutting off power in unoccupied areas in time to avoid unnecessary waste. Air conditioning accounts for about 40% of a building's total energy consumption, and saving energy there has long been a stubborn problem. By establishing industry-university-research bases with universities, Henghua Digital has jointly developed an optimization strategy algorithm for building cooling and heat source systems; through long-term training, the algorithm has matured into a reliable data-driven model that, combined with AI, raises the air conditioning system's energy-saving rate to more than 10%. A building can operate independently, like an organ of the human body, but it can also act as a node in a neural network, working together with other building brains to form an organic whole and play a greater role. The project has already been implemented in Guangdong, Tianjin, Jiangxi, Sichuan, Hubei, Anhui, and other provinces. In the future, the communities we live in, the office buildings where we work, and the shopping malls where we shop will all evolve toward green, low-carbon operation.
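
To make the "cut power in unoccupied areas" idea concrete, here is a deliberately simplified sketch of the kind of rule an edge controller might apply per zone. It is an illustration only, not Henghua Digital's system; the zone fields, thresholds, and setpoints are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class ZoneReading:
    """One sensing node's snapshot for a building zone (illustrative fields)."""
    zone_id: str
    occupied: bool            # from an occupancy / infrared sensor
    minutes_unoccupied: int   # how long the zone has been empty
    power_on: bool            # whether non-essential circuits are currently energized

UNOCCUPIED_CUTOFF_MIN = 15     # assumed policy: cut power after 15 empty minutes
AC_SETPOINT_OCCUPIED_C = 26.0  # assumed summer setpoint when occupied
AC_SETPOINT_EMPTY_C = 29.0     # assumed relaxed setpoint when empty

def control_zone(reading: ZoneReading) -> dict:
    """Return control actions for one zone based on simple threshold rules."""
    actions = {"zone": reading.zone_id, "cut_power": False, "ac_setpoint_c": AC_SETPOINT_OCCUPIED_C}
    if not reading.occupied and reading.minutes_unoccupied >= UNOCCUPIED_CUTOFF_MIN:
        # Unoccupied long enough: de-energize non-essential circuits, relax the AC setpoint.
        actions["cut_power"] = reading.power_on
        actions["ac_setpoint_c"] = AC_SETPOINT_EMPTY_C
    return actions

if __name__ == "__main__":
    zones = [
        ZoneReading("office-3F-east", occupied=False, minutes_unoccupied=42, power_on=True),
        ZoneReading("lobby-1F", occupied=True, minutes_unoccupied=0, power_on=True),
    ]
    for z in zones:
        print(control_zone(z))
```

A production system would replace these fixed thresholds with the trained optimization model the article describes, but the flow of sensor reading in, control action out is the same.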

"Unmanned" on the street The latest pedestrian prediction model is on the horizon

In Yizhuang and other areas of Beijing, Baidu's autonomous vehicles can already be hailed. After a user calls a car just as they would an ordinary ride-hailing vehicle, the autonomous car arrives promptly and, without manual intervention, takes the passenger to their destination. Although there is a staff member in the car, they are not a driver but a safety officer, and under normal circumstances their hands do not touch the steering wheel. In the future, as the technology matures and policy permits, safety officers will be withdrawn and autonomous vehicles will become truly unmanned.

According to Baidu, the core of its autonomous driving technology is the Apollo automotive brain platform, which includes four modules: high-precision maps, positioning, perception, and intelligent decision-making and control. The latest version of Apollo introduces multiple deep-learning-based models, including a low-speed pedestrian prediction model based on semantic maps and imitation learning based on semantic maps. It could be said that autonomous driving is only as advanced as the artificial intelligence behind it.
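
Apollo's actual pedestrian model is a deep network conditioned on semantic maps, which the article does not detail. As a much-simplified intuition for what "trajectory prediction" means, the sketch below uses only a constant-velocity baseline over observed positions; it is illustrative and is not Baidu's method.

```python
import numpy as np

def predict_constant_velocity(track_xy: np.ndarray, dt: float, horizon_steps: int) -> np.ndarray:
    """
    Toy pedestrian predictor: extrapolate the last observed velocity.
    track_xy: (T, 2) array of observed (x, y) positions, most recent last.
    Returns a (horizon_steps, 2) array of predicted future positions.
    """
    velocity = (track_xy[-1] - track_xy[-2]) / dt            # last-step velocity estimate
    steps = np.arange(1, horizon_steps + 1).reshape(-1, 1)   # 1, 2, ..., H
    return track_xy[-1] + steps * velocity * dt              # straight-line extrapolation

if __name__ == "__main__":
    # A pedestrian observed at 10 Hz, walking roughly 1.2 m/s along x.
    observed = np.array([[0.00, 0.00], [0.12, 0.01], [0.24, 0.01], [0.36, 0.02]])
    future = predict_constant_velocity(observed, dt=0.1, horizon_steps=5)
    print(future.round(3))
```

The learned models Apollo describes exist precisely because pedestrians do not move in straight lines; a semantic map tells the network where crosswalks, curbs, and signals are, which a kinematic baseline like this cannot use.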

At the Zhongguancun Forum, Megvii Technology released a self-developed intelligent pallet four-way shuttle system. As a discrete device in a flexible logistics system, Megvii's intelligent pallet four-way shuttle allows "one vehicle to run the whole warehouse." It offers high flexibility, strong site adaptability, energy savings and environmental friendliness, and large headroom for capacity expansion, meeting customers' needs for pallet storage, handling, and picking. Why "flexible logistics"? According to Megvii, it is mainly because the system combines discrete equipment with distributed control, so customer enterprises can combine and deploy it flexibly as needed, like building blocks. Unlike AS/RS stacker cranes, which can only run on fixed paths, the four-way shuttle system uses standardized hardware, namely the shuttles themselves, which can be swapped for new vehicles at any time in the event of a failure. Flexibility is also reflected in the "dynamic scalability" of the whole system: customer enterprises can add or remove shuttles at any time in response to peak and off-peak seasons or business growth, thereby adjusting the system's carrying capacity.

AR glasses display conversations in real time: smart wearables help with accessibility

Artificial intelligence has in fact long been woven into everyday life, and AI-equipped devices keep getting smaller. Smart watches, for example, have replaced traditional mechanical watches and become standard for many people, who use them to answer calls, reply on WeChat, and track workouts; smart glasses, shaped like ordinary glasses, can make and receive calls and play music once worn.

The smart glasses displayed at the Zhongguancun Forum, however, are more practical. Called "Bright Listener Smart Glasses," they are binocular waveguide AR glasses. Unlike VR glasses, which immerse the wearer in a virtual world, AR glasses do not block the wearer's view; instead they blend the real world with the virtual one to achieve functions that are impossible in the real world alone.

Hearing-impaired people often encounter difficulties at work, in social situations, and in learning because they cannot hear sounds or cannot hear them clearly. The Bright Listener glasses provide a new assistive tool: after putting on the glasses, the wearer sees spoken audio converted into text displayed in front of their eyes, helping hearing-impaired users understand what others are saying by reading.

The glasses also offer simultaneous interpretation: they can recognize speech in different languages and convert it into Chinese or other languages' text displayed in front of the user's eyes, helping users easily understand what others are saying in international settings. They can play an important role in scenarios such as international business meetings, overseas travel, and cross-border trade.
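
The product's internal pipeline is not disclosed; conceptually, though, such glasses chain speech recognition, translation, and display. The sketch below is a minimal illustration of that loop, with stand-in ASR, translation, and display functions; every name here is a hypothetical placeholder, not the product's API.

```python
import queue

def recognize_speech(audio_chunk: bytes) -> str:
    """Stand-in for a real ASR engine; for demo purposes the 'audio' is just UTF-8 text."""
    return audio_chunk.decode("utf-8")

def translate(text: str, target_lang: str = "en") -> str:
    """Stand-in for a real machine-translation engine; a tiny phrasebook keeps the demo self-contained."""
    phrasebook = {"你好": "Hello", "谢谢": "Thank you"}
    return phrasebook.get(text, text) if target_lang == "en" else text

def render_subtitle(text: str) -> None:
    """Stand-in for pushing a line of text to the AR waveguide display."""
    print(f"[subtitle] {text}")

def subtitle_loop(chunks: "queue.Queue[bytes]", target_lang: str = "en") -> None:
    """Consume audio chunks in order: transcribe, translate, display."""
    while True:
        chunk = chunks.get()
        if chunk is None:          # sentinel: the microphone stream has ended
            break
        render_subtitle(translate(recognize_speech(chunk), target_lang))

if __name__ == "__main__":
    stream: "queue.Queue[bytes]" = queue.Queue()
    for utterance in ["你好", "谢谢", None]:
        stream.put(utterance.encode("utf-8") if utterance else None)
    subtitle_loop(stream)
```

The real device has to do all of this within the millisecond-level latency budget mentioned below, which is why on-device noise reduction and directional sound pickup matter so much.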

The glasses are light and portable: the body weighs only 79 g, compared with the 200-300 g of AR glasses currently on the market, making them suitable for long wear, and the curvature of the temples can be adjusted for comfort. They can also be fitted with lenses for nearsightedness, farsightedness, astigmatism, and presbyopia. The outside of the glasses does not leak light, so the displayed content is visible only to the wearer, protecting privacy. The glasses offer millisecond-level real-time subtitles, noise-reduction algorithms, accurate sound pickup within 5 meters, and translation accuracy above 95%. The product is reportedly ready for mass production.

Privacy-preserving computing technology goes open source and is used in finance, healthcare, insurance, and other fields

What is privacy-preserving computing? The classic illustration is the famous millionaires' problem posed by Yao Qizhi (Andrew Chi-Chih Yao), Turing Award winner and founder of Tsinghua University's "Yao Class," in his work on secure computation protocols: Zhang San and Li Si are both rich, but neither will disclose his wealth and neither trusts a third party; the two want to determine who is richer without revealing the actual amounts. This is where privacy-preserving computing (secure multi-party computation) comes in.

Privacy computing, also known as privacy-preserving computing, refers to a family of information technologies that analyze and compute on data while ensuring the data provider does not expose the original data, making data "usable but invisible" as it circulates and is combined, and thereby unlocking and releasing its value. Privacy-preserving computing provides the protection capabilities that industry will need as private data is put to use.
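
Yao's actual millionaires' protocol involves cryptographic machinery beyond the scope of this article. As a minimal sketch of the "usable but invisible" idea, the toy below uses additive secret sharing among three semi-honest parties to compute a joint sum without any party revealing its own input. It illustrates the principle only and is not any production privacy-computing framework.

```python
import random

MODULUS = 2**61 - 1  # arithmetic is done modulo a large prime so individual shares look random

def share(secret: int, n_parties: int) -> list:
    """Split a secret into n additive shares that sum to the secret mod MODULUS."""
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs: list) -> int:
    """
    Each party splits its input into shares and hands one share to every party.
    Each party then publishes only the sum of the shares it holds; adding those
    partial sums reveals the joint total, but never any individual input.
    """
    n = len(private_inputs)
    all_shares = [share(x, n) for x in private_inputs]   # all_shares[i][j]: party i's share sent to party j
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS

if __name__ == "__main__":
    incomes = [300_000, 520_000, 410_000]      # three parties' private values
    print("joint total:", secure_sum(incomes))  # 1_230_000, computed without pooling raw values
```

Comparing two values, as in the millionaires' problem, needs stronger tools such as oblivious transfer or garbled circuits, but the principle is the same: only the agreed result leaves each party's hands.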

At the Zhongguancun Forum exhibition (the Science Fair), Ant Group announced for the first time its complete open-source layout centered on key foundational software: all nine core technologies have been open-sourced, including its privacy computing technology "hidden language" (Yinyu). In other words, the technology platform is open to users worldwide, who can use its product functions directly, without writing or calling code, to explore privacy computing application scenarios at low cost.

According to reports, Yinyu has already been applied in scenarios such as finance, healthcare, and insurance. For example, Shanghai Pudong Development Bank used Ant Group's Yinyu platform to run credit risk management on a risk model computed via secure multi-party computation, identifying more than 145,000 high-risk users and preventing the issuance of billions of yuan in high-risk loans. In healthcare, against the backdrop of medical insurance payment reform, Ant's privacy computing platform and Alibaba Cloud's digital healthcare team built a data fusion platform for hospital operations and management. The platform uses intelligent algorithms, including image recognition, knowledge graphs, and text mining, to dynamically standardize clinical behavior across the whole care process, provide managers with digital performance analysis, help hospitals establish refined operational management systems, and reduce hospitals' economic and clinical risks. In insurance, claims handling has traditionally required insurers to query medical institutions in plaintext (i.e., with unencrypted data) about the insured's diagnosis and treatment, obtaining raw data they do not actually need. Ant's solution connects insurers to a set of data interfaces; by defining logical data queries and using privacy computing techniques such as secure multi-party computation, the insurer obtains only the result of whether a claim should be paid and never the underlying raw data, so the data stays "usable but invisible" (it never leaves its domain) and the claimant's privacy is protected.
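
The "result-only" query pattern in the insurance example can be sketched conceptually as follows. The class and field names are hypothetical, and in a real deployment the eligibility check would itself run under secure multi-party computation rather than inside the hospital's own process; the point is simply that only a boolean crosses the boundary.

```python
from dataclasses import dataclass

@dataclass
class DiagnosisRecord:
    """Raw clinical data held by the hospital; it never leaves the hospital's domain."""
    patient_id: str
    icd_code: str
    inpatient_days: int

class HospitalDataGateway:
    """Exposes only pre-agreed logical queries, never the underlying records."""
    def __init__(self, records):
        self._records = records   # private: the insurer has no interface to read this

    def claim_eligible(self, patient_id: str, covered_icd_prefixes: tuple) -> bool:
        """Answer a yes/no eligibility question; only the boolean leaves the domain."""
        return any(
            r.patient_id == patient_id and r.icd_code.startswith(covered_icd_prefixes)
            for r in self._records
        )

if __name__ == "__main__":
    gateway = HospitalDataGateway([DiagnosisRecord("P001", "S82.3", inpatient_days=5)])
    # The insurer learns only True or False, not the diagnosis or the length of stay.
    print(gateway.claim_eligible("P001", covered_icd_prefixes=("S82",)))
```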

Large models will change the world; technologies to control them should be studied as they are developed

ChatGPT is one of the most talked-about new things in the tech world in 2023, and its release has set off a frenzy around large language models. Not to be outdone, domestic companies including Baidu, Alibaba, Zhihu, SenseTime, and JD.com have launched their own large models. When will the GPT moment arrive for another major field of AI: vision? At this year's Zhongguancun Forum, the vision team at KLCII officially launched the universal segmentation model SegGPT, the first general-purpose visual model that uses visual prompts to complete arbitrary segmentation tasks.

According to reports, SegGPT is a derivative of KLCII's general-purpose vision model Painter, optimized for the goal of segmenting everything. Once trained, SegGPT needs no fine-tuning: given a few examples, it automatically infers and completes the corresponding segmentation task, covering instances, categories, parts, contours, text, faces, and more in images and videos.

Using it requires abandoning the usual habits of language models: with a visual model, interaction with the machine happens through images. For example, a user gives SegGPT a picture and circles the rainbow in it; when the user then supplies many pictures containing rainbows, SegGPT automatically identifies and outlines the rainbow in each of them. If the user roughly circles a planet's ring with a brush, SegGPT can accurately output the planet's ring in the target image in its prediction.

So how does SegGPT do it? According to reports, SegGPT unifies different segmentation tasks into a single in-context learning framework, converting the various forms of segmentation data into images of the same format.

Specifically, SegGPT's training is framed as an in-context coloring problem, with a random color mapping for each data sample: the goal is to complete the task based on the context rather than relying on any specific color. After training, SegGPT can perform arbitrary segmentation tasks on images or videos through in-context inference, covering instances, categories, parts, contours, text, and so on.
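
As a rough intuition for the "in-context coloring" framing, the toy below (assuming PyTorch) stitches a prompt image with its randomly colored mask and a target image into one canvas, then trains a tiny network to paint the target's mask region. It illustrates only the data layout and the loss; it is nothing like SegGPT's real vision-transformer architecture.

```python
import torch
import torch.nn as nn

def make_canvas(prompt_img, prompt_mask_rgb, target_img, target_mask_rgb=None):
    """Stack [image | mask] pairs into a single 6-channel canvas.
    If target_mask_rgb is None (inference), the target's mask half is left blank."""
    if target_mask_rgb is None:
        target_mask_rgb = torch.zeros_like(prompt_mask_rgb)
    top = torch.cat([prompt_img, prompt_mask_rgb], dim=0)      # 6 x H x W
    bottom = torch.cat([target_img, target_mask_rgb], dim=0)   # 6 x H x W
    return torch.cat([top, bottom], dim=1)                     # 6 x 2H x W

class TinyPainter(nn.Module):
    """A very small conv net standing in for the real vision transformer."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),   # predict the 3-channel color map
        )
    def forward(self, canvas):
        return self.net(canvas)

if __name__ == "__main__":
    H = W = 32
    prompt_img, target_img = torch.rand(3, H, W), torch.rand(3, H, W)
    color = torch.rand(3, 1, 1)                                 # random color for this sample
    prompt_mask = (torch.rand(1, H, W) > 0.7).float() * color   # colored prompt mask
    target_mask = (torch.rand(1, H, W) > 0.7).float() * color   # ground-truth target mask

    model = TinyPainter()
    canvas = make_canvas(prompt_img, prompt_mask, target_img).unsqueeze(0)
    pred = model(canvas)                                        # 1 x 3 x 2H x W
    # Supervise only the bottom half, i.e. the target's mask region to be "colored in".
    loss = nn.functional.mse_loss(pred[:, :, H:, :], target_mask.unsqueeze(0))
    loss.backward()
    print("toy loss:", float(loss))
```

Because the color is randomized per sample, the model cannot memorize "rainbows are blue"; it has to read the prompt half of the canvas to know what to paint, which is exactly the in-context behavior the article describes.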

It can be said that SegGPT is "all-in-one": given one or a few example images and intent masks, the model can get the user's intent and complete similar segmentation tasks "like a sample". Users can identify and segment similar objects in batches by marking and identifying a type of object on the screen, whether in the current screen or other screens or video environments.

In addition, SegGPT supports prompting "at a touch": with a single point or bounding box given as an interactive prompt on the image to be predicted, it recognizes and segments the specified object. Many functions can be built on this capability.
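
The two prompting styles can be pictured with a small wrapper sketch. The function names, the DummySegModel, and its methods are hypothetical stand-ins for illustration only, not SegGPT's released API.

```python
from dataclasses import dataclass

@dataclass
class MaskPrompt:
    image_path: str   # example image
    mask_path: str    # mask marking the object of interest in that image

class DummySegModel:
    """Stand-in model so the sketch runs end to end; returns fake mask identifiers."""
    def predict(self, prompts, target_image):
        return f"mask_for_{target_image}"
    def predict_from_point(self, target_image, point_xy):
        return f"mask_for_{target_image}_at_{point_xy}"

def segment_by_example(model, prompts, target_images):
    """'One prompt, many tasks': propagate the example masks to every target image."""
    return {t: model.predict(prompts, t) for t in target_images}

def segment_by_click(model, target_image, point_xy):
    """'At a touch': a single clicked point becomes the prompt for that image."""
    return model.predict_from_point(target_image, point_xy)

if __name__ == "__main__":
    model = DummySegModel()
    prompts = [MaskPrompt("rainbow_example.jpg", "rainbow_mask.png")]
    print(segment_by_example(model, prompts, ["scene1.jpg", "scene2.jpg"]))
    print(segment_by_click(model, "scene3.jpg", (120, 88)))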

For example, if a robot manipulator cannot perceive an object clearly, it may crush a tomato or other delicate item when picking it up; with segmentation, the robot quickly knows exactly where the tomato's edges are and can grasp it accurately without crushing it.

At present, domestic large models are in a state of a hundred flowers blooming and a hundred schools of thought contending. Baidu founder, chairman, and CEO Robin Li said at the Zhongguancun Forum that large computing power, large models, and big data give rise to "emergent intelligence." What is emergent intelligence? In the past, AI worked like this: whatever skill you wanted a machine to have, you had to teach it; what it had been taught it might manage, but what it had not been taught it could not do. With "emergence" in large models, the model can also handle skills it was never explicitly taught. At the same time, the direction of AI development has shifted from discriminative to generative. Generative AI will greatly improve productivity; some research institutions believe that over the next 10 years the efficiency of knowledge workers could increase fourfold.

Robin Li said that artificial intelligence has recently once again become the focus of human innovation, and that more and more people recognize the fourth industrial revolution is coming, a revolution marked by artificial intelligence. "It is in the spotlight because of large models. Large models have successfully compressed human knowledge of the whole world, letting us see a path to artificial general intelligence," he said. "We are now at a new starting point. This is a new era of artificial intelligence with large models at its core: large models have changed artificial intelligence, and large models are about to change the world."

Dai Qionghai, academician of the Chinese Academy of Engineering, president of the Chinese Association for Artificial Intelligence, and dean of the School of Information Science and Technology at Tsinghua University, said at the forum's session on large AI models that artificial intelligence is profoundly changing this era. Machine translation has replaced most human translation, speech recognition has taken over most transcription, face recognition has become a standard tool in the security field, and autonomous cars can already drive on urban roads. In the future, artificial intelligence will support the development of the metaverse through perception, computing, reconstruction, collaboration, interaction, and other dimensions. AI will bring application changes on many fronts: new paradigms of scientific research (the origin of the universe, the laws of nature, the mysteries of life); people's life and health (AI drug development, remote virtual surgery); the main economic battlefield (virtual creation, industrial manufacturing, immersive interaction); and major national defense needs (multi-source situation analysis, AI air-ground front-line deployment), among others.

Kai-Fu Lee, chairman and CEO of Sinovation Ventures, said that the AI 2.0 platforms and applications represented by large models will disrupt many industries, including search engines, e-commerce and advertising, finance, education, film and entertainment, the metaverse and games, and healthcare. However, he also noted that AI still makes mistakes, sometimes "talking nonsense with a straight face," so it should only be used to generate first drafts and develop ideas, not final versions; AI needs continuous human intervention to avoid fallacies or disasters. In addition, AI may raise legal and ethical issues, so it is not suitable for every field, such as finance or training; AI should only be used in applications with a high tolerance for error.

Kai-Fu Lee raised similar concerns at the forum: "AI 2.0 may create false information, and this flaw cannot be completely eliminated. AI 2.0 can be used by bad actors to deceive or brainwash users. Therefore, while developing AI 2.0, we must also study the technologies that control it and the laws and regulations that govern it."

Text: Beijing Youth Daily reporter Wen Jing

Editor: Tian Ye

【Copyright Statement】The copyright of this article (including the right of information network transmission) belongs to Beijing Youth Daily and shall not be reproduced without authorization
