Preface
With the rapid development of Internet of Things (IoT) technology, interaction between people and devices, and between devices themselves, is no longer difficult. The new challenge for the IoT field is making that interaction more natural, efficient, and intelligent.
Recently, advanced large language models (LLMs) such as OpenAI's GPT-3.5 and GPT-4, together with applications built on them such as ChatGPT, have rapidly become popular around the world, opening up more possibilities for combining artificial general intelligence (AGI) with the Internet of Things.
As an advanced natural language processing application, ChatGPT enables natural dialogue between humans and machines. MQTT (Message Queuing Telemetry Transport), the mainstream protocol in the IoT field, supports real-time data transmission and efficient processing through lightweight, low-bandwidth communication and a publish/subscribe model.
From this, we can imagine that combining the MQTT protocol with ChatGPT could make human-machine interaction in IoT easier to achieve:
- In the smart home field, users can control smart devices in their homes and improve their quality of life by talking naturally with ChatGPT.
- In industrial automation, ChatGPT can help engineers analyze equipment data faster and improve productivity.
- …
With this in mind, this article explores how to combine the MQTT protocol with natural language processing applications such as ChatGPT. It walks through a simple example that we build step by step, and offers some ideas for readers who want to explore intelligent IoT applications.
Basic concepts
Before we start, we need to briefly understand some basic concepts about MQTT and ChatGPT.
MQTT protocol
As mentioned above, MQTT is a lightweight messaging protocol based on the publish/subscribe pattern. It is widely used in the Internet of Things, mobile Internet, intelligent hardware, Internet of Vehicles, smart cities, telemedicine, power, oil, energy, and other fields.
Connecting a large number of IoT devices over MQTT requires an MQTT server as a key component. In the design that follows, we will use EMQX, a large-scale distributed MQTT message server for IoT, to connect massive numbers of IoT devices efficiently and reliably and to process and distribute message and event stream data in real time.
We can then use an MQTT client to connect to the MQTT server and communicate with IoT devices. This article uses MQTTX, an open-source, cross-platform MQTT client available as desktop, command-line, and web applications. It makes it easy to test connections to MQTT servers and helps developers quickly develop and debug MQTT services and applications.
ChatGPT
ChatGPT (https://openai.com/blog/chatgpt) is a natural language processing application built on advanced large language models such as OpenAI's GPT-3.5 and GPT-4. GPT (Generative Pre-trained Transformer) is a deep learning model known for its powerful text generation and comprehension capabilities. ChatGPT understands and generates natural language, which allows smooth, natural conversations with users. To get ChatGPT-like natural language processing in our own application, we need to use the APIs (https://platform.openai.com/docs/api-reference) provided by OpenAI to interact with GPT models.
Scheme design and preparation
Based on the capabilities of the MQTT protocol and ChatGPT, we will design a solution that combines the two and lets them interoperate.
To implement ChatGPT-like natural language processing, we will write a separate client-side script that uses the OpenAI API to interact with the GPT model. When the MQTT client in this script receives a message, it forwards the message to the API, which generates a natural language response. The script then publishes the response to a specific MQTT topic, completing the interaction loop between ChatGPT and the MQTT client.
With this design, we will demonstrate how ChatGPT and the MQTT protocol interoperate for message reception, processing, and forwarding.
First, follow these steps to prepare the required tools and resources.
- Install EMQX:
You can use Docker to quickly install and launch EMQX 5.0:
docker run -d --name emqx -p 1883:1883 -p 8083:8083 -p 8883:8883 -p 8084:8084 -p 18083:18083 emqx/emqx:latest
In addition to Docker, EMQX can also be installed from RPM or DEB packages. Refer to the EMQX 5.0 Installation Guide (https://www.emqx.io/docs/zh/v5.0/deploy/install.html) for the specific installation methods.
- Install the MQTTX desktop application:
Go to the MQTTX official website (https://mqttx.app/zh), select the version that matches your operating system and CPU architecture, then download and install it.
- Sign up for an OpenAI account and get an API key:
Go to OpenAI (https://platform.openai.com/overview) and create an account or log in. Then click the upper right corner, select View API Keys, and under the API keys column click Create new secret key to generate a new API key. Keep this key safe, as it will be used for API authentication in the steps that follow.
After completing the above steps, we have the tools and resources we need to combine the MQTT protocol with ChatGPT applications. You can consult the OpenAI documentation (https://platform.openai.com/docs/introduction) for detailed guidance and learning materials on how to interact with GPT language models using OpenAI's APIs.
Code implementation
With the resources and environment ready, we will use Node.js to build an MQTT client that receives messages on MQTT topics, sends the data to the OpenAI API, and gets natural language generated by the GPT model. The generated text is then published to a specified MQTT topic to complete the integration. You can also choose other programming languages, such as Python or Go, according to your needs and familiarity. To keep the demonstration transparent, we call the HTTP API directly, but you can also use the official libraries, which provide a cleaner interface for Node.js and Python.
For more information, see: OpenAI Libraries (https://platform.openai.com/docs/libraries/libraries).
- Prepare your Node.js environment: Ensure that Node.js is installed (v14.0 or later recommended). Create a new project folder and initialize the project with the npm init command. Then install the necessary dependencies:
npm init -y
npm install axios mqtt dotenv
axios is used to send HTTP requests, mqtt is used to connect to MQTT servers, and dotenv is used to load environment variables.
- Use environment variables: Create a file named .env and add your OpenAI API key to it:
OPENAI_API_KEY=your_openai_api_key
- Write code: Create a new index.js file. In it, connect to the MQTT server, subscribe to the specified MQTT topic, and listen for messages. When a message is received, use axios to send an HTTP request to the OpenAI API, generate a natural language reply, and publish the reply to the specified MQTT topic. The key code for each step is listed below for reference:
Use the mqtt library to connect to the MQTT server. After the connection succeeds, subscribe to the chatgpt/request/+ topic to receive incoming MQTT messages:
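const mqtt = require("mqtt");
const axios = require("axios");
require("dotenv").config(); // Load OPENAI_API_KEY from .env

const host = "127.0.0.1";
const port = "1883";
const clientId = `mqtt_${Math.random().toString(16).slice(3)}`;
const OPTIONS = {
  clientId,
  clean: true,
  connectTimeout: 4000,
  username: "emqx",
  password: "public",
  reconnectPeriod: 1000,
};
const connectUrl = `mqtt://${host}:${port}`;
const chatGPTReqTopic = "chatgpt/request/+";
// Prefix used by the message handler to extract the user ID from a topic
const chatGPTReqTopicPrefix = "chatgpt/request/";
const client = mqtt.connect(connectUrl, OPTIONS);
// Subscribe to the request topic once the connection is established
client.on("connect", () => {
  client.subscribe(chatGPTReqTopic, (error) => {
    if (error) console.error(error);
  });
});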
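To make the + in chatgpt/request/+ concrete, here is a minimal sketch of how a topic filter can be compared with a topic name, level by level. The function name topicMatches is hypothetical and is not part of the mqtt library. In practice the broker does this matching for us, and the sketch deliberately ignores special cases such as topics that begin with $.

```javascript
// Illustrative only: compare an MQTT topic filter with a concrete topic name.
// "+" matches exactly one level; "#" matches all remaining levels.
function topicMatches(filter, topic) {
  const filterLevels = filter.split("/");
  const topicLevels = topic.split("/");
  for (let i = 0; i < filterLevels.length; i++) {
    const level = filterLevels[i];
    if (level === "#") return true; // Multi-level wildcard covers the rest
    if (i >= topicLevels.length) return false; // Topic is shorter than the filter
    if (level !== "+" && level !== topicLevels[i]) return false;
  }
  // Without "#", both sides must have the same number of levels
  return filterLevels.length === topicLevels.length;
}

console.log(topicMatches("chatgpt/request/+", "chatgpt/request/demo")); // true
console.log(topicMatches("chatgpt/request/+", "chatgpt/request/demo/extra")); // false
console.log(topicMatches("chatgpt/request/+", "chatgpt/response/demo")); // false
```

Because + covers a single level only, every user gets a distinct request topic while one subscription still receives all of them.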
Write an asynchronous genText function that takes a userId parameter. Create an HTTP client instance with axios, authenticate with the OpenAI API key in the HTTP headers, and send a POST request to the OpenAI API to generate a natural language reply. The reply is then published to the user-specific response topic, which the user's MQTT client subscribes to. Conversation history is stored in messages, keyed by userId:
// Add your OpenAI API key to your environment variables in .env
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const messages = {}; // Conversation history, keyed by userId
const maxMessageCount = 10;
const http = axios.create({
  baseURL: "https://api.openai.com/v1/chat",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${OPENAI_API_KEY}`,
  },
});
const genText = async (userId) => {
  try {
    // Send the user's conversation history to the Chat Completions endpoint
    const { data } = await http.post("/completions", {
      model: "gpt-3.5-turbo",
      messages: messages[userId],
      temperature: 0.7,
    });
    if (data.choices && data.choices.length > 0) {
      const { content } = data.choices[0].message;
      messages[userId].push({ role: "assistant", content: content });
      if (messages[userId].length > maxMessageCount) {
        messages[userId].shift(); // Remove the oldest message
      }
      // Publish the reply to this user's response topic
      const replyTopic = `chatgpt/response/${userId}`;
      client.publish(replyTopic, content, { qos: 0, retain: false }, (error) => {
        if (error) {
          console.error(error);
        }
      });
    }
  } catch (e) {
    console.error(e);
  }
};
- Finally, listen for messages on topics matching chatgpt/request/+. Each received message is appended to that user's history in messages, and genText is called to generate a natural language reply, which it publishes to the response topic that the user subscribes to. At most 10 historical messages are kept per user:
client.on("message", (topic, payload) => {
  // Only handle request topics, never the response topics we publish to
  if (topic.startsWith(chatGPTReqTopicPrefix)) {
    // The topic suffix identifies the user
    const userId = topic.replace(chatGPTReqTopicPrefix, "");
    messages[userId] = messages[userId] || [];
    messages[userId].push({ role: "user", content: payload.toString() });
    if (messages[userId].length > maxMessageCount) {
      messages[userId].shift(); // Remove the oldest message
    }
    genText(userId);
  }
});
- Run the script service:
node index.js
At this point, the basic functionality of the demo project is complete. Beyond that, the code also isolates users from one another: each user simply uses a different suffix in the request and response topics. Because previous messages are stored, the GPT model can also take the conversation context into account and generate more coherent, context-aware responses.
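The per-user isolation and the bounded history can be sketched in isolation from MQTT and the OpenAI API. The helper names userIdFromTopic and appendToHistory are hypothetical and do not appear in the project; the prefix and the cap of 10 follow the code in this article.

```javascript
// Hypothetical helpers; the names are illustrative, not taken from the project.
const REQUEST_PREFIX = "chatgpt/request/";
const MAX_MESSAGE_COUNT = 10; // Same cap as maxMessageCount in the article

// Extract the user ID (the topic suffix), or return null for other topics.
function userIdFromTopic(topic) {
  return topic.startsWith(REQUEST_PREFIX)
    ? topic.slice(REQUEST_PREFIX.length)
    : null;
}

// Append a message to one user's history and drop the oldest entries beyond the cap.
function appendToHistory(histories, userId, message, maxCount = MAX_MESSAGE_COUNT) {
  if (!histories[userId]) histories[userId] = [];
  const history = histories[userId];
  history.push(message);
  while (history.length > maxCount) history.shift();
  return history;
}

const histories = {};
for (let i = 1; i <= 12; i++) {
  appendToHistory(histories, "alice", { role: "user", content: `message ${i}` });
}
appendToHistory(histories, "bob", { role: "user", content: "Hello" });

console.log(histories.alice.length); // 10
console.log(histories.alice[0].content); // message 3
console.log(histories.bob.length); // 1
console.log(userIdFromTopic("chatgpt/request/alice")); // alice
```

Each user's history lives under its own key, so one user's conversation never reaches the model as context for another user. A count-based cap is simple, but it does not bound the number of tokens sent to the API.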
The complete code can be found in GitHub's openai-mqtt-nodejs (https://github.com/emqx/openai-mqtt-nodejs).
Another scenario
In addition to the approach above, we can also use the rule engine and data bridging features provided by EMQX directly to speed up development.
EMQX supports rules that trigger a webhook callback when a message is published to a specific topic. We only need to write a simple web service that uses the OpenAI API to interact with the GPT model and generate a reply. The reply can then be published to a specified topic, either by creating a new MQTT client or by calling EMQX's Publish API directly, which completes the integration.
For users who already have web services, this approach keeps development costs low and makes it quick to build a PoC or demo. Its advantage is that there is no need to write a separate MQTT client, and the EMQX rule engine simplifies the integration and allows flexible data processing. However, a web service still has to be written and maintained, and webhooks may not be convenient enough for complex scenarios.
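As a rough sketch of the last step in this webhook approach, the function below builds the HTTP request that a web service could send to EMQX to publish the reply. It rests on several assumptions that should be checked against the EMQX documentation for your version: that the HTTP API exposes a publish endpoint at /api/v5/publish accepting topic, payload, qos, and retain; that it uses Basic authentication with an API key and secret; and that the webhook event forwarded by the rule contains a topic field. The function name buildPublishRequest, the base URL, and the credentials are hypothetical.

```javascript
// Sketch only: the event shape, endpoint path, and credentials are assumptions.
function buildPublishRequest(event, replyText, { baseUrl, apiKey, apiSecret }) {
  const requestPrefix = "chatgpt/request/";
  if (typeof event.topic !== "string" || !event.topic.startsWith(requestPrefix)) {
    return null; // Not a request topic, so there is nothing to publish
  }
  const userId = event.topic.slice(requestPrefix.length);
  const credentials = Buffer.from(`${apiKey}:${apiSecret}`).toString("base64");
  return {
    url: `${baseUrl}/api/v5/publish`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Basic ${credentials}`,
    },
    body: JSON.stringify({
      topic: `chatgpt/response/${userId}`, // Same response topic as the MQTT client version
      payload: replyText,
      qos: 0,
      retain: false,
    }),
  };
}

const request = buildPublishRequest(
  { topic: "chatgpt/request/demo", payload: "Hello" },
  "Hi, how can I help?",
  { baseUrl: "http://127.0.0.1:18083", apiKey: "my-key", apiSecret: "my-secret" }
);
console.log(request.url); // http://127.0.0.1:18083/api/v5/publish
console.log(JSON.parse(request.body).topic); // chatgpt/response/demo
```

Keeping the request construction in a pure function makes it easy to test without a running broker. The web service would pass the returned object to whichever HTTP client it already uses, for example axios.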
Each of these solutions therefore has its own advantages, and we can choose the one that best fits the actual business needs and the technical background of the developers. Either way, EMQX, as the MQTT infrastructure, provides important support for system integration, allowing developers to quickly prototype projects and drive digital transformation.
Demo presentation
With the integration between the MQTT client and the GPT model in place, we can use the MQTTX desktop client to test the demo project. MQTTX's user interface resembles chat software, which keeps the workflow simple and makes it well suited to demonstrating interactions with conversational bots.
First, create a new connection in MQTTX to the same MQTT server used in the code above, for example 127.0.0.1. Then subscribe to the chatgpt/response/demo topic to receive replies, and send messages to the chatgpt/request/demo topic. The demo suffix can be replaced with any other string to isolate users from one another. We can test the setup by sending a Hello message.
Next, we can simulate a more complex scenario: if the temperature reported by a sensor exceeds a preset threshold, the ChatGPT bot sends an alarm message to another MQTT topic that a monitoring device, such as a smart watch or smart speaker, is subscribed to. When the monitoring device receives the alarm, it can use natural language technology to convert the alarm into speech so that users can receive and understand it more conveniently.
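A minimal sketch of the threshold check in this scenario might look like the following. The payload format (JSON with sensorId and temperature fields), the threshold of 30, and the function name buildAlarmPrompt are all assumptions made for illustration; the article does not define them.

```javascript
// Hypothetical threshold, chosen only for illustration.
const TEMPERATURE_THRESHOLD_C = 30;

// Return a chat message asking the model for an alert, or null if no alarm is needed.
function buildAlarmPrompt(payload, threshold = TEMPERATURE_THRESHOLD_C) {
  let reading;
  try {
    reading = JSON.parse(payload.toString());
  } catch (e) {
    return null; // Ignore payloads that are not valid JSON
  }
  if (reading === null || typeof reading.temperature !== "number") return null;
  if (reading.temperature <= threshold) return null; // Within the normal range
  return {
    role: "user",
    content:
      `Sensor ${reading.sensorId} reports ${reading.temperature} degrees Celsius, ` +
      `above the threshold of ${threshold}. Write a one-sentence spoken alert.`,
  };
}

const prompt = buildAlarmPrompt('{"sensorId":"s1","temperature":35.5}');
console.log(prompt.role); // user
console.log(buildAlarmPrompt('{"sensorId":"s1","temperature":22}')); // null
```

When the function returns a message, the script could send it to the model in the same way as genText does and publish the reply to an alarm topic, for example alarm/temperature (a hypothetical name), that the monitoring device subscribes to.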
As another example, we can create a smart home environment with multiple MQTT topics for different types of devices (lights, air conditioning, audio, and so on). We can use ChatGPT to generate natural language commands and interact with these devices in real time through MQTT clients.
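One way to connect model output to device topics is to ask the model to answer with a small JSON command and then route it. The sketch below assumes the model has been instructed, for example through a system message, to reply only with JSON such as {"device":"light","action":"on"}. The topic names, the command format, and the function name routeCommand are hypothetical.

```javascript
// Hypothetical mapping from device type to MQTT command topic.
const DEVICE_TOPICS = {
  light: "home/light/set",
  air_conditioner: "home/ac/set",
  speaker: "home/speaker/set",
};

// Turn a model reply into a publishable command, or null if it cannot be used.
function routeCommand(replyText) {
  let command;
  try {
    command = JSON.parse(replyText);
  } catch (e) {
    return null; // The model did not return valid JSON
  }
  if (command === null || typeof command !== "object") return null;
  if (!Object.prototype.hasOwnProperty.call(DEVICE_TOPICS, command.device)) {
    return null; // Unknown device type
  }
  if (typeof command.action !== "string") return null;
  return {
    topic: DEVICE_TOPICS[command.device],
    payload: JSON.stringify({ action: command.action }),
  };
}

const routed = routeCommand('{"device":"light","action":"on"}');
console.log(routed.topic); // home/light/set
console.log(routed.payload); // {"action":"on"}
console.log(routeCommand("Sure, turning on the light!")); // null
```

Validating the reply against a fixed list of devices means that free-form or unexpected model output is dropped instead of being published to a device.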
Outlook for the future
Combining ChatGPT with the MQTT protocol makes intelligent IoT systems possible, with broad application potential in smart homes and industrial automation. Through natural language interaction, users can control the power state, brightness, color, and other parameters of home equipment to create a more intelligent and comfortable living environment. In industrial automation, intelligent equipment maintenance and control using ChatGPT and MQTT can lead to a more efficient and intelligent manufacturing process.
In the future, we can envision ChatGPT or smarter AGI tools playing more roles in improving efficiency and productivity in the IoT space, such as:
- Message parsing: Parses messages transmitted through MQTT to extract the required data and prepare for subsequent processing and analysis.
- Semantic understanding: Semantic understanding and processing of messages received from MQTT to extract more accurate information.
- Intelligent processing: Through AI technology, the received MQTT messages are intelligently processed to help users obtain the appropriate solution faster.
- User feedback: As a representative of intelligent interaction, receive user feedback through MQTT and provide corresponding responses.
- Virtual assistant: As a virtual assistant, it controls smart home devices through language recognition technology, provides users with smarter and more efficient services, and improves the convenience and comfort of life.
Epilogue
In this blog, we briefly explored the combination of MQTT and ChatGPT and their potential applications. Using EMQX and MQTTX together with the API provided by OpenAI, we built an AI application similar to ChatGPT and demonstrated the integration of MQTT and ChatGPT by receiving messages over an MQTT connection, processing them, and forwarding the results.
Although these technologies have not yet been put into production, more AI-integrated products are becoming available (such as New Bing, which integrates GPT models into a search engine, and GitHub Copilot). We believe that the future development of artificial intelligence (AI) and Internet of Things (IoT) technologies will also include optimized natural language interaction, intelligent device control, and more innovative application scenarios.
In conclusion, the combination of MQTT and ChatGPT reveals an area worth paying attention to and exploring in depth. We look forward to these evolving innovations bringing us a better world.