In November 2022, OpenAI, a US artificial intelligence research laboratory, launched ChatGPT, a chatbot built on natural language processing technology. Compared with earlier conversational products, it is dramatically more capable: trained on a very large body of text, it not only imitates the natural way humans talk but can also carry out more complex language work. After launch it quickly became what was widely reported as the fastest-growing consumer application in history, drew sustained attention from many industries, and is seen by many as kicking off the era of general artificial intelligence.

What is ChatGPT?

To answer this question, we first need one concept: artificial general intelligence (AGI), also called general artificial intelligence. AGI refers to an intelligent system that, like a human, can perform many kinds of tasks across many fields, such as speech recognition, natural language processing, and visual recognition, and that has human-like abilities to learn and make decisions autonomously. At this stage, humans are still far from developing a true AGI, but ChatGPT already lets us glimpse a "spark" of it.

From an application point of view, ChatGPT is an AIGC (AI-generated content) product. The core idea of AIGC is to use artificial intelligence models to automatically generate text, images, audio, video, and other content according to a given theme, keywords, format, style, and other conditions. AIGC technology is widely used in media, education, entertainment, marketing, scientific research, and other fields. It can generate content that matches users' needs and preferences, save labor and time, and improve the efficiency and scale of content production.

From a technical point of view, ChatGPT is a chatbot based on a large language model. The key lies in the three letters GPT, which stand for Generative Pre-trained Transformer: a model that, after pre-training on massive amounts of text, can generate high-quality natural language and adapt to many different natural language processing tasks. A language model is a model of the probability distribution over word sequences; that is, it uses the words already said as a condition to predict the probability of each possible next word. Language models can measure how well a sentence conforms to the grammar of a language, and they can also be used to generate new sentences. Take the sentence "It's 12 o'clock and work is over, so let's go to the cafeteria together to ...": a language model can predict that the words after "cafeteria" are likely to be about eating, and a powerful language model can also capture the time information and predict the word "lunch", which fits the context.
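The idea of predicting the next word from the words already said can be illustrated with a very small sketch. The code below is not how GPT works internally; it is a toy bigram model that conditions only on the single previous word, and the corpus and the function names (`train_bigram_model`, `next_word_distribution`) are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Return P(next word | previous word) as a dict of probabilities."""
    followers = counts.get(prev.lower())
    if not followers:
        return {}  # the model has never seen this word
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}


# Toy corpus, made up for this example only.
corpus = [
    "we eat lunch at noon",
    "we eat lunch in the cafeteria",
    "we eat dinner at home",
]
model = train_bigram_model(corpus)
distribution = next_word_distribution(model, "eat")
# "lunch" follows "eat" in 2 of the 3 sentences, "dinner" in 1 of 3.
print(distribution)
```

A large language model does the same kind of thing in principle, but it conditions on a long stretch of preceding text rather than one word, which is what lets it use a clue such as "12 o'clock" to prefer "lunch".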

Why is ChatGPT so powerful?

The underlying logic of the GPT model is loosely modeled on the human brain. A neuron in the brain receives signals through its dendrites, the cell body combines them, and the result is passed to the next neuron through the axon. Artificial neural networks, including GPT, are built from units with a similar structure: each unit computes a weighted sum of its inputs and passes the result on. What sets ChatGPT apart from earlier artificial intelligence is scale: its model learns from a huge corpus, and GPT-3, released in May 2020, already had 175 billion parameters. OpenAI has not published the parameter count of GPT-4; the frequently quoted figure of 1.6 trillion is an unconfirmed estimate. GPT-4 also uses new training methods and optimization techniques to improve the efficiency and stability of the model, and it is a multimodal large model that accepts images as well as text as input.
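The neuron analogy above can be made concrete with a short sketch. It shows only the basic building block, a weighted sum followed by an activation function; it is not GPT's actual architecture, which stacks many such units together with a Transformer's attention layers. The sigmoid activation, the weights, and the function names are assumptions chosen for illustration.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Two inputs feed a hidden layer of two neurons, whose outputs feed one
# output neuron. All numbers are arbitrary illustrative values.
hidden = layer([1.0, 0.5], [[0.4, -0.2], [0.0, 0.0]], [0.1, 0.0])
output = neuron(hidden, [1.0, -1.0], 0.0)
print(hidden, output)
```

The "parameters" counted in figures such as 175 billion are the weights and biases of units like these; training consists of adjusting them.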

OpenAI trains the models behind ChatGPT with a method called "reinforcement learning from human feedback" (RLHF), which uses human feedback for targeted optimization so that ChatGPT's answers better match what people want. The training process can be roughly divided into three steps.

The first step is to train a supervised policy model. Starting from a GPT model that has already been pre-trained on text from the internet, questions are randomly sampled from a dataset and human annotators write high-quality answers to them. These manually labeled examples are then used to fine-tune the GPT model, so that it understands human intent to a certain extent.

The second step is to train a reward model. Questions are again randomly drawn from the dataset, and for each one the model from the first step generates several different responses. Human labelers rank these responses from best to worst, much as a coach or teacher would. The ranking data are then used to train a reward model that imitates the human scoring standard and assigns a score to any response.

The third step uses PPO (Proximal Policy Optimization), a reinforcement learning algorithm, to optimize the policy. Questions are randomly selected from the dataset, the current policy model generates responses, and the reward model from the second step gives each response a quality score. That score is fed back as the reward signal, and the policy model's parameters are updated by reinforcement learning so that high-scoring answers become more likely.

By repeating the second and third steps, a progressively better GPT model is obtained. In essence, the whole process resembles teaching children mathematics: first give worked examples in class, then assign homework, then mark the homework, pick out the questions that keep going wrong, and practice those specifically.
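The second step, training a reward model from human rankings, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpenAI's implementation: the reward model here is a hypothetical linear function of two hand-made features per response, and the pairwise logistic loss is one common way to learn from "response A was ranked above response B" (the article itself does not say which loss is used). A real reward model is a large neural network that reads the full text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reward(weights, features):
    """Hypothetical linear reward model: one score per response."""
    return sum(w * f for w, f in zip(weights, features))


def pairwise_loss(weights, preferred, rejected):
    """-log sigmoid(r_preferred - r_rejected); small when the ranking holds."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    return -math.log(sigmoid(margin))


def train_reward_model(pairs, n_features, lr=0.5, epochs=200):
    """Plain gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # d(loss)/d(margin) = sigmoid(margin) - 1
            grad_scale = sigmoid(margin) - 1.0
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (preferred[i] - rejected[i])
    return weights


# Each pair is (features of the response labelers ranked higher,
# features of the response they ranked lower). The two features are
# invented stand-ins, e.g. [relevance, rudeness].
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.6]),
    ([0.9, 0.2], [0.2, 0.2]),
]
weights = train_reward_model(pairs, n_features=2)
print(weights)
```

In the third step, scores from a model like this would serve as the reward signal that PPO uses to update the policy model; that update is not shown here.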

What are the limitations of ChatGPT and what are the risks?

Although ChatGPT is already powerful, it still has many limitations. First, its credibility cannot be guaranteed. At present the model cannot provide sound evidence with which its answers can be verified, and in fields where it has not been trained on a large corpus it may "talk nonsense" with complete seriousness, fabricating answers whose fluent sentences sound reasonable but are in fact wrong, which can mislead readers. Second, its knowledge is not up to date. ChatGPT cannot incorporate new knowledge in real time; what it knows is limited to the period covered by the pre-training data of the underlying large language model, so the range of questions it can answer has a clear boundary. For example, ChatGPT may be unaware of recent news, events, people, and products, or of facts that have since changed. Third, it is expensive to build. ChatGPT needs a great deal of computing power for training and deployment, and serving it to users also requires powerful servers. Fourth, it performs poorly in specific professional areas and struggles with long, complex, or highly specialized language. For questions from very specialized fields such as finance, the natural sciences, or medicine, ChatGPT may handle technical terms badly and find it harder to generate appropriate answers if it has not had adequate training on the relevant corpus.

In fact, ChatGPT's progress is not that it has become truly smart, but that it has begun to imitate human language patterns. It is still far from really understanding what it says; what it has is a kind of "surface intelligence". In essence, it has not escaped the "big data, small tasks" paradigm of artificial intelligence, that is, passive learning from very large samples: when you ask ChatGPT a question, it completes the task by drawing on patterns learned from an enormous amount of data, rather than by thinking, reflecting, and judging as humans do.

In addition, ChatGPT's capabilities are a "double-edged sword": they bring convenience but also raise a series of concerns. ChatGPT's "knowledge" comes from the massive amount of text used to train the language model. If that text is biased, or deliberately skewed toward a certain ideology or point of view, the model may produce slanted answers and text. The threshold for using ChatGPT is low, and many users lack the professional knowledge and discernment needed to verify the accuracy of its output, so a large amount of useless or even misleading information may spread. If misused by lawbreakers, it can quickly generate hate speech, rumors, and similar content that could be used to inflame nationalist sentiment and provoke social conflict, harming social harmony and stability. ChatGPT also raises a series of academic ethics issues, such as fraudulent papers.

ChatGPT is a milestone in the progress of AIGC technology. It greatly increases the technical maturity of AI-based content creation and is expected to become a productivity tool across industries, improving the efficiency and richness of content production. In the long history of human science and technology, ChatGPT has only just lit the "spark" of the era of general artificial intelligence, and there is still a long way to go before true general artificial intelligence is reached.

