Perceptive AI robots and the essence of large models: what does the future of AI look like to a Turing Award winner and Tsinghua professors?

Author: Smart stuff

Compiled by: Mingyi | Edited by: Xu Shan

The 2023 World Artificial Intelligence Conference (WAIC) recently concluded in Shanghai. At this summit, the highest-level artificial intelligence gathering in China, 11 heavyweight guests shared their observations and judgments on the field of artificial intelligence from different angles.

In the conference's summit dialogue session, Xu Li, Chairman and CEO of SenseTime, was joined by four researchers: Yao Qizhi (Andrew Chi-Chih Yao), Turing Award winner and Dean of the Shanghai Qi Zhi Institute; Yuan Yang, Assistant Professor at the Institute for Interdisciplinary Information Sciences of Tsinghua University; Yang Zhilin, Assistant Professor at the same institute and founder of Moonshot AI; and Pan Xingang, first author of DragGAN and Assistant Professor at the School of Computer Science and Engineering, Nanyang Technological University. They discussed the development of artificial intelligence and its recent breakthroughs, including advances in the core theory of large models, multimodal optimization of large models, safe and controllable algorithmic approaches, and comparative analysis of existing cases.

1. Yao Qizhi: Chinese scientists are advancing reinforcement learning in AI

Yao Qizhi, Turing Award winner and Dean of the Shanghai Qi Zhi Institute, said that Chinese scientists have made many breakthrough contributions to the development of AI. Gao Yang, assistant professor at the Institute for Interdisciplinary Information Sciences at Tsinghua University, made a very important algorithmic breakthrough more than a year ago that sped up reinforcement learning by a factor of several hundred. His research is not only an applied advance but also a theoretical contribution to algorithm research, and it has received international attention as a result.

He believes that after ChatGPT, the next important goal of AI research is robots that have multiple perception capabilities, such as vision and hearing, and that can learn new skills independently in different environments. Gao Yang's breakthrough has increased robot learning speed by hundreds of times, so that a robot can complete its learning in a few hours.

This not only solves a practical problem in robot learning but also contributes to theory. Yao Qizhi said that over the past six or seven years, senior AI researchers have debated whether the reinforcement learning route is the right one. Gao Yang's breakthrough has tipped the balance of that debate toward the other side. Even so, there is still a long way to go in improving artificial intelligence.

2. Yuan Yang: The understanding of multimodality should be based on solving specific problems in specific industries

On the cross-disciplinary application of large models, Yuan Yang, assistant professor at the Institute for Interdisciplinary Information Sciences of Tsinghua University, said that multimodality should be understood in terms of solving specific problems in specific industries. In text-to-image generation, for example, the generated image may not be what the user wants, and the user needs to adjust it with the mouse. That mouse drag is itself a new modality: the user is telling the large model what they mean through a new kind of input that the model can understand. This kind of multimodal input is very important in applications.

In a specific industry, therefore, the training of large models should focus on that industry's core problems and on finding the data needed to solve them. This is what he calls modality completion. Yuan Yang believes that once modality alignment and modality completion are done well, large models can become more capable and solve more core cross-domain problems.

3. Yang Zhilin: Problems with general-purpose large models should be solved at a more fundamental level

Yang Zhilin, assistant professor at the Institute for Interdisciplinary Information Sciences at Tsinghua University and founder of Moonshot AI, said that existing large models still have many unsolved problems, such as safety, controllability, hallucination (fabricating things that do not exist), and the inability to create the way scientists do. He believes that when thinking about these problems for general-purpose large models, we should not treat each symptom in isolation. Instead, we should generalize from one case to the others, ask what common issues underlie these problems, and solve them at a more fundamental level.

4. Pan Xingang: Diffusion models and GANs may complement each other in the future

Pan Xingang, first author of DragGAN and assistant professor in the School of Computer Science and Engineering, Nanyang Technological University, compared two AI image-generation approaches, diffusion models and GANs, in terms of the framework of the generative model and its optimization goal. The first difference is performance and efficiency. During generation, the iterative computation of diffusion models takes more time and incurs greater computational overhead; in return, the quality of the generated images is higher. Pan Xingang believes that diffusion models have a higher ceiling than GANs, with a more obvious quality advantage and broader application prospects. In cases where performance and computational overhead are constrained, however, a GAN remains a viable compromise.

The second difference is how each model maps its input to image content. In diffusion models, the effect on image content is more random and less structured, whereas a GAN can effectively edit attributes in an image, such as an animal's pose. How to extend diffusion models in this respect is a question worth exploring.

The third difference is the continuity of the generated space. The image space of diffusion models is relatively discontinuous, whereas image control with a GAN is relatively smooth and natural. Combining the strengths of the two kinds of model would be a very interesting direction for the future.
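The first of the three comparisons above, generation cost, can be illustrated with a toy sketch. It is not either model's real architecture: `CountingNetwork`, the placeholder arithmetic inside it, and the step count of 50 are all hypothetical. The only thing it shows is how many network evaluations each style of generation needs per sample, assuming a GAN uses one forward pass and a diffusion sampler applies its denoiser once per step.

```python
import random


class CountingNetwork:
    """Stand-in for a neural network that counts its own evaluations."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        # Placeholder computation (assumption): shrink each value toward zero.
        return [0.9 * v for v in x]


def gan_generate(generator, latent):
    """A GAN maps a latent vector to a sample in a single forward pass."""
    return generator(latent)


def diffusion_generate(denoiser, noise, num_steps=50):
    """Diffusion sampling refines noise over many denoising steps."""
    x = noise
    for _ in range(num_steps):
        x = denoiser(x)
    return x


def count_evaluations(num_steps=50, dim=4, seed=0):
    """Return (GAN evaluations, diffusion evaluations) for one sample each."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    generator, denoiser = CountingNetwork(), CountingNetwork()
    gan_generate(generator, z)
    diffusion_generate(denoiser, z, num_steps=num_steps)
    return generator.calls, denoiser.calls


if __name__ == "__main__":
    gan_calls, diffusion_calls = count_evaluations()
    print(f"GAN network evaluations: {gan_calls}")
    print(f"Diffusion network evaluations: {diffusion_calls}")
```

Real samplers use varying step counts, and some methods reduce them considerably, so the gap in this sketch is illustrative rather than a measurement.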

5. Future directions for large models in vertical fields

On the development of large models in vertical fields, Yao Qizhi believes that, given the language ability of large models, more paperwork can be handed over to machines in the future. Yuan Yang, drawing on his professional background and his assessment of large models built on the pre-training paradigm, believes that large models may outperform both humans and machines in the medical domain, which suggests he is more optimistic about intelligent healthcare. Yang Zhilin leans toward personal use: for example, people can provide context to an AI, and everything a person sees can also be seen by the AI through screen recording. Pan Xingang believes that video and 3D content generation have great prospects and can help creative professionals and others produce higher-quality content.

In addition, during WAIC 2023, Tesla founder and CEO Elon Musk; Professor Tang Xiaoou of the Chinese University of Hong Kong; Ken Hu, rotating chairman of Huawei; Dr. Hou Yang, senior vice president of Microsoft and chairman and CEO of Microsoft Greater China; Yann LeCun, 2018 Turing Award winner and chief AI scientist of Meta's Fundamental AI Research (FAIR) team; and Yu Kai, founder and CEO of Horizon Robotics, also shared their observations and thoughts on the field of artificial intelligence.

Conclusion: Artificial intelligence will develop in the direction of the "human brain"

This WAIC brought together not only a large number of large models but also new and veteran talent in the field of artificial intelligence. AI research spans many disciplines, including computer science, data analysis and statistics, hardware and software engineering, linguistics, neuroscience, and even philosophy and psychology. As a result, no consensus has yet formed on the future direction of artificial intelligence.

But whether the concept is artificial general intelligence or robotic AI, the various forms of artificial intelligence will develop in the direction of the "human brain" or even the "human". This means that the goal of AI is not only to imitate human behavior but also to truly understand complex abstractions such as human thoughts, emotions, and behavior. Analyzing these abstractions may involve not only computer science and data analysis but also brain science, as well as deeper philosophical and psychological questions.