Researchers propose a new concept of artificial intelligence that allows large language models to interact with the real physical world

Author: DeepTech

Xu Huatao, a PhD student at Nanyang Technological University in Singapore and a research assistant at the Hong Kong University of Science and Technology, and his team recently launched a project called "Penetrative AI".

Figure | Xu Huatao (source: Xu Huatao)

The project aims to push the boundaries of large models such as ChatGPT and extend their use beyond text processing to a wider range of scenarios.

The research is motivated by the success of large models in fields such as writing and programming, where they can not only understand human language but also produce impressively human-like text.

However, most of these applications are still confined to the digital world. By feeding real data from the physical world into a large model, the research group hopes to connect digital intelligence seamlessly with the real world.

According to the researchers, this work aims not only to broaden the range of applications for large models, but also to open up a new paradigm in which AI directly interacts with, interprets, and reacts to sensor data from the physical world around us.

Such advances could fundamentally change how people solve problems and automate cyber-physical systems. For example, a large model could infer people's activities and surroundings directly from mobile phone data.

(Source: arXiv)

The team expects Penetrative AI to play an important role in numerous cyber-physical systems, including but not limited to the following areas:

First, it can be used for smart medical care.

Using sensor data from wearable devices, Penetrative AI could not only track health metrics and predict potential health issues, but also support personalized treatment. Accurate analysis of health data could open up new avenues for continuous monitoring.

Second, it can be used for home automation.

By managing the home intelligently through IoT data, for example adjusting lighting and optimizing energy use, Penetrative AI could greatly improve household efficiency and safety. Because it can recognize a household's daily patterns, it could make modern living more convenient and secure.

Third, it can be used for factory automation.

Through in-depth analysis of machine sensor data, Penetrative AI has the potential to automate processes and optimize production lines. For example, it could adapt quickly to complex production environments, reducing downtime and helping to keep production safe.

Fourth, it can be used for environmental monitoring.

Here, Penetrative AI could contribute to ecological protection and public safety by analyzing complex sensor data in depth, detecting environmental changes early, and predicting potential risks.

Fifth, it can be used in decision-making and command systems.

In this scenario, Penetrative AI could integrate sensor data from multiple sources to support rapid decision-making and command in complex environments.

Its ability to interpret and process data could significantly improve the efficiency of emergency response and the accuracy of decisions, providing technical support for urban management and emergency rescue.

So why did Xu Huatao and his colleagues carry out this study?

According to reports, after ChatGPT was released at the end of 2022 and achieved remarkable success, Xu Huatao's advisor, Professor Li Mo, recognized that it could become a disruptive technology (a game changer).

During the Spring Festival in 2023, Li Mo repeatedly encouraged Xu Huatao and others to deeply understand and actively use ChatGPT.

Inspired by his advisor, Xu Huatao began to think deeply about the characteristics and strengths of large models. According to reports, his research focuses mainly on designing algorithms that process sensor data, such as using mobile phone sensor data to perceive human behavior.

This prompted him to ask whether large models could help interpret such sensor data. Preliminary experiments showed that large models like ChatGPT can indeed make sense of various signals, including WiFi.

When he reported these findings to his advisor, Li Mo was equally excited and encouraged him to explore further.

They then set out to test whether large models could handle other types of tasks and chose heartbeat detection as the next attempt.

In the heartbeat detection task, however, the large model has to process a long series of numbers, and applying the model directly did not give good results.

Xu Huatao then tried to guide the large model with the reasoning of traditional algorithms, but the results were lackluster and the research reached an impasse.

During a walk through the green spaces of the Nanyang Technological University campus, Xu Huatao had a flash of inspiration and began to think about how humans accomplish such tasks.

He asked himself: if a child were standing in front of him, how would he guide the child through this task? So he began to treat the large model as a "human" and helped it process signals by describing the signal patterns in text. In the end, he found that GPT-4 could perform the task very well.

During the research, the team also found that the two tasks involved signal processing at different levels, which they summarized as two levels: "textual signal" and "digital signal".
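The idea of guiding a model with a textual description of the signal pattern can be sketched roughly as follows. This is only an illustration, not the team's actual prompt or code: the function names `downsample` and `build_prompt`, the wording of the pattern hint, and the sample values are all assumptions made for this example, and no model is actually called.

```python
from typing import Sequence


def downsample(samples: Sequence[float], step: int) -> list[float]:
    """Keep every `step`-th sample so the number series stays short."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(samples[::step])


def build_prompt(samples: Sequence[float], sample_rate_hz: float) -> str:
    """Compose a plain-text prompt: pattern description first, numbers last.

    The hint wording is hypothetical; it only illustrates describing a
    signal pattern in words instead of encoding an algorithm.
    """
    series = ", ".join(f"{v:.2f}" for v in samples)
    return (
        "You are analysing a heartbeat signal.\n"
        f"The values below were sampled at {sample_rate_hz:g} Hz.\n"
        "Pattern hint: each heartbeat shows up as a short, sharp peak that "
        "rises well above the neighbouring values.\n"
        "Task: list the zero-based indices of the peaks.\n"
        f"Signal: {series}"
    )


if __name__ == "__main__":
    # Made-up readings, not real sensor data.
    raw = [0.0, 0.1, 0.9, 0.1, 0.0, 0.1, 1.0, 0.1]
    print(build_prompt(downsample(raw, 2), sample_rate_hz=50))
```

The resulting string would then be sent to a large model of the reader's choice. Shortening the series before building the prompt reflects the difficulty mentioned above with long runs of numbers.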

Later, they collaborated with the research group of Professor Mani Srivastava of the University of California, Los Angeles, to finalize the study.

Xu Huatao and his colleagues also spent a long time thinking about what to call the concept. After much deliberation, they settled on the name "Penetrative AI".

The name has a double meaning:

On the one hand, "Penetrative" suggests "deep understanding": the team wants new forms of intelligence built on large models to understand the physical world in depth.

On the other hand, "Penetrative" also means "penetration", symbolizing the ability of this new intelligence to penetrate various industries and applications.

Recently, the related paper was published on arXiv under the title "Penetrative AI: Making LLMs Comprehend the Physical World" [1].

Xu Huatao is the first author, and Li Mo and Mani Srivastava serve as co-corresponding authors.

Figure | Related paper (source: arXiv)

The team also says that Penetrative AI is a very broad direction. They will actively explore new applications that large models can support in Internet of Things scenarios, and will work to popularize the concept so that more scholars can explore further possibilities together.

References:

1. https://arxiv.org/pdf/2310.09605.pdf

Operation/Typesetting: He Chenlong
