
Shin Ji Won reports
Editors: Aeneas, Peaches
OpenAI was not the first to build a chatbot, but it is certainly the one in the spotlight. In a showdown between the chatbots of Google, Meta, DeepMind, and OpenAI, which one comes out on top?
A few days ago, a comment on ChatGPT by Yann LeCun, Meta's chief AI scientist, spread quickly within and beyond the AI community, setting off a wave of discussion.
At a small gathering of media and executives held over Zoom, LeCun offered a surprising verdict: "In terms of the underlying technology, ChatGPT is not particularly innovative."
"It may look revolutionary in the eyes of the public, but we know that it is a well-assembled product, nothing more."
ChatGPT is not innovative
ChatGPT, the undisputed "top stream" among chatbots in recent months, has taken the world by storm and has even changed some people's careers and the status quo of school education.
While the world marveled at it, LeCun's assessment of ChatGPT was strikingly "understated".
But in fact, his remarks are not unreasonable.
Data-driven AI systems like ChatGPT are available to many companies and research labs. LeCun said OpenAI isn't that unique in this space.
"In addition to Google and Meta, there are six startups, basically all with very similar technologies." LeCun added.
Then LeCun got a little snarky:
"ChatGPT uses a Transformer architecture pre-trained in a self-supervised way. Self-supervised learning is something I have advocated for a long time, back before OpenAI even existed."
The Transformer, for its part, is a Google invention. This type of language neural network is the foundation of large language models such as GPT-3.
The first neural network language model was proposed by Yoshua Bengio roughly 20 years ago. Bengio's attention mechanism was later adopted by Google in the Transformer and has since become a key element of all language models.
In addition, ChatGPT uses reinforcement learning from human feedback (RLHF), a technique that was also pioneered by Google's DeepMind lab.
In LeCun's view, ChatGPT is more of an engineering success than a scientific breakthrough.
OpenAI's technology "isn't innovative in terms of basic science, it's just well designed."
"Of course, I won't criticize them for that."
"I'm not criticizing OpenAI's work, or their claims."
"I am trying to correct a perception held by the public and the media, who see ChatGPT as an innovative and unique technological breakthrough, when that is not the case."
At a panel with New York Times reporter Cade Metz, LeCun anticipated the question onlookers were itching to ask.
"You might ask: why don't Google and Meta have similar systems? My answer is that if Google and Meta launched this kind of chatbot that can spout nonsense, the losses would be quite heavy," he said with a smile.
Coincidentally, just as news broke that OpenAI, backed by Microsoft and other deep-pocketed investors, had soared to a $29 billion valuation, Gary Marcus posted a mocking piece on his blog overnight.
In the post, Marcus delivered a zinger: what exactly can OpenAI do that Google can't, to be worth $29 billion?
Google, Meta, DeepMind, OpenAI: the big showdown!
Without further ado, let's line up the chatbots from these AI giants and let the data do the talking.
LeCun is right that many companies and labs have ChatGPT-like AI chatbots.
ChatGPT is not the first AI chatbot based on language models; it has many "predecessors".
Before OpenAI, Meta, Google, DeepMind, and others all released their own chatbots, such as Meta's BlenderBot, Google's LaMDA, and DeepMind's Sparrow.
Other teams have also announced their own open-source chatbot plans. For example, Open-Assistant from LAION.
In a blog post on Hugging Face, several authors surveyed, categorized, and summarized important papers on topics such as RLHF, SFT, IFT, and CoT (all keywords behind ChatGPT).
They made a table comparing AI chatbots such as BlenderBot, LaMDA, Sparrow, and InstructGPT based on details such as public access, training data, model architecture, and evaluation direction.
Note: Because ChatGPT itself is not publicly documented, they use the details of InstructGPT, an instruction-fine-tuned model from OpenAI that can be considered the basis of ChatGPT.
It's not hard to see that despite many differences in training data, underlying models, and fine-tuning, these chatbots all have one thing in common – following instructions.
For example, you can ask ChatGPT to write a poem about fine-tuning.
As you can see, ChatGPT knows how to play the game: even while writing a poem, it does not forget to flatter LeCun and Hinton.
It then waxes lyrical: "Fine-tuning, fine-tuning, you are a beautiful dance."
From predicting text to following instructions
Usually, the base model's language-modeling objective alone is not enough for it to learn how to follow user instructions.
In training, besides the classic NLP tasks (such as sentiment analysis, text classification, summarization, etc.), researchers also apply instruction fine-tuning (IFT), that is, fine-tuning the base model with text instructions on a very diverse set of tasks.
These instruction examples consist of three main parts: an instruction, an input, and an output.
The input is optional; some tasks require only an instruction, such as the open-ended generation in the ChatGPT example above.
When an input and an output are both present, they form an instance, and a given instruction can have multiple input-output instances.
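As an illustration (the field names and values below are made up, not taken from any particular dataset), such instruction examples might look like this:

```python
# A minimal, illustrative instruction example with all three parts.
example_with_input = {
    "instruction": "Classify the sentiment of the following review as positive or negative.",
    "input": "The battery died after two days and support never replied.",
    "output": "negative",
}

# Some tasks need no input, only an instruction (e.g., open-ended generation).
example_without_input = {
    "instruction": "Write a short poem about fine-tuning.",
    "input": "",
    "output": "Fine-tuning, fine-tuning, you are a beautiful dance...",
}
```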
IFT training data is typically a collection of human-written instructions together with instruction examples bootstrapped using language models.
During bootstrapping, the LM is prompted in a few-shot setup with examples (as in the instances above) and instructed to generate new instructions, inputs, and outputs.
In each round, the model is prompted with samples drawn both from the hand-written examples and from those it has generated itself.
The relative contribution of humans and models to creating these datasets spans a spectrum.
At one end are purely model-generated IFT datasets, such as Unnatural Instructions; at the other end are large collections of human-generated instructions, such as Super-Natural Instructions.
In between sit approaches that start from a smaller but higher-quality seed dataset and then bootstrap from it, such as Self-Instruct.
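A minimal sketch of what such bootstrapping might look like, assuming a generic generate(prompt) call to a language model; the helper function and the seed tasks here are hypothetical:

```python
import random

# Hypothetical seed pool of human-written instruction examples.
seed_tasks = [
    {"instruction": "Summarize the following article in one sentence.",
     "input": "<article text>", "output": "<one-sentence summary>"},
    {"instruction": "Write a short poem about fine-tuning.",
     "input": "", "output": "<poem>"},
]

def build_bootstrap_prompt(pool, k=2):
    """Few-shot prompt: show k sampled examples, then ask the LM to write a new one."""
    shots = random.sample(pool, min(k, len(pool)))
    lines = []
    for ex in shots:
        lines += [f"Instruction: {ex['instruction']}",
                  f"Input: {ex['input'] or '(none)'}",
                  f"Output: {ex['output']}",
                  ""]
    lines += ["Now write a new, different task in the same format.", "Instruction:"]
    return "\n".join(lines)

# new_example = generate(build_bootstrap_prompt(seed_tasks))  # LM call, not shown here
# Newly generated examples are mixed back into the pool, so later rounds sample
# from both human-written and model-generated instructions.
```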
Another way to build IFT datasets is to take existing high-quality crowdsourced NLP datasets covering a variety of tasks (including prompting) and convert them into instructions using a unified schema or diverse templates.
Work in this direction includes T0, the Natural Instructions dataset, the FLAN LM, and OPT-IML.
Natural Instructions dataset paper: https://arxiv.org/abs/2104.08773
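As an illustration (the templates and dataset fields below are made up, not taken from T0 or FLAN), converting an existing sentiment-classification example into an instruction might look like this:

```python
# Hypothetical templates for casting a sentiment-classification record
# from an existing NLP dataset into instruction form.
TEMPLATES = [
    "Is the sentiment of the following review positive or negative?\n\n{text}",
    "Review: {text}\n\nHow does the reviewer feel about the product?",
]

def to_instruction(record, template):
    """Turn a {'text': ..., 'label': ...} record into an instruction example."""
    return {
        "instruction": template.format(text=record["text"]),
        "input": "",
        "output": record["label"],
    }

record = {"text": "Great camera, terrible battery life overall.", "label": "negative"}
print(to_instruction(record, TEMPLATES[0]))
```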
Fine-tuning the models
On the other hand, OpenAI's InstructGPT, DeepMind's Sparrow, and Anthropic's Constitutional AI all employ reinforcement learning from human feedback (RLHF), which relies on annotations of human preferences.
In RLHF, a set of model responses is ranked based on human feedback (e.g., choosing the short text passage that humans prefer).
Next, a preference model is trained on these annotated responses to return a scalar reward for the RL optimizer.
Finally, the chatbot is trained with reinforcement learning to optimize against this preference model.
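A minimal sketch of the preference-model step in PyTorch; the pairwise loss below is the commonly used Bradley-Terry-style formulation, and the model, names, and dummy tensors are illustrative assumptions rather than any lab's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a pooled text representation to a scalar reward."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, pooled_text):                  # (batch, hidden_size)
        return self.score(pooled_text).squeeze(-1)   # (batch,)

reward_model = RewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-5)

# Dummy encodings of a human-preferred and a rejected response for 4 prompts.
chosen, rejected = torch.randn(4, 768), torch.randn(4, 768)

# Pairwise loss: push the reward of the preferred response above the rejected one.
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
# The resulting scalar reward is what the RL optimizer (e.g., PPO) then maximizes.
```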
Chain-of-thought (CoT) prompting is a special case of instruction examples: it induces the chatbot to reason step by step before producing its output.
Models fine-tuned with CoT use instruction datasets annotated by humans with step-by-step reasoning.
This is the origin of the famous prompt, "let's think step by step."
The following example is taken from "Scaling Instruction-Finetuned Language Models": orange highlights the instruction, pink the input and output, and blue the CoT reasoning.
The paper points out that models fine-tuned using CoT perform better on tasks involving common sense, arithmetic, and symbolic reasoning.
In addition, CoT fine-tuning is also very effective on sensitive topics (sometimes better than RLHF), especially in keeping the model from copping out with "Sorry, I can't answer that."
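A minimal illustration of the difference between a plain prompt and a zero-shot CoT prompt (the arithmetic question is made up):

```python
question = ("A library had 120 books, lent out 45, and received 30 new ones. "
            "How many books does it have now?")

# Plain prompt: the model is expected to answer directly.
plain_prompt = f"Q: {question}\nA:"

# Zero-shot CoT prompt: the added phrase induces step-by-step reasoning,
# which the model spells out before giving the final answer.
cot_prompt = f"Q: {question}\nA: Let's think step by step."
```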
Following instructions safely
As mentioned earlier, an instruction-fine-tuned language model does not always produce helpful and safe responses.
For example, it may dodge with an unhelpful answer such as "I'm sorry, I don't understand," or output an unsafe response when a user raises a sensitive topic.
To improve this behavior, researchers fine-tune the underlying language model on high-quality human-annotated data via supervised fine-tuning (SFT), thereby making the model more helpful and harmless.
SFT and IFT are closely linked, and IFT can be seen as a subset of SFT. In the recent literature, the SFT phase is often used for safety topics rather than for instruction-specific topics, which are handled after IFT.
Going forward, the classification and use cases of the two should become clearer.
Google's LaMDA is fine-tuned on a dialogue dataset with safety annotations based on a set of rules.
These rules, usually pre-defined and developed by researchers, cover a broad set of topics including harm, discrimination, and misinformation.
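A minimal SFT sketch using the Hugging Face transformers library; the model name, the chat format, and the tiny dataset are placeholders, and a real run would add padding, truncation, batching, and far more examples:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; real SFT starts from a much larger base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Tiny illustrative set of human-annotated (prompt, helpful-and-safe response) pairs.
sft_pairs = [
    ("How do I reset my router?",
     "Unplug it, wait 30 seconds, plug it back in, and wait for the lights to settle."),
]

model.train()
for prompt, response in sft_pairs:
    text = f"User: {prompt}\nAssistant: {response}{tokenizer.eos_token}"
    batch = tokenizer(text, return_tensors="pt")
    # Standard causal-LM objective: the labels are the input ids themselves.
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```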
The next step for AI chatbots
There are still many open questions to explore about AI chatbots, such as:
1. How important is RL in learning from human feedback? Can we match the performance of RLHF by training with higher-quality data in IFT or SFT?
2. How does SFT+RLHF in Sparrow compare with SFT in LaMDA?
3. Given that we already have IFT, SFT, CoT, and RLHF, how much pre-training is necessary? What are the trade-offs? Which is the best base model (both public and non-public)?
4. These models are carefully engineered: researchers specifically search for failure modes and shape future training (including prompts and methods) based on the issues they uncover. How can we systematically record the effects of these methods and reproduce them?
To sum up
1. Compared to the pre-training data, only a tiny fraction of data is needed for instruction fine-tuning (on the order of a few hundred examples).
2. Supervised fine-tuning uses human annotations to make the model's output safer and more helpful.
3. CoT fine-tuning improves the model's performance on tasks that require step-by-step reasoning and makes it less likely to always dodge sensitive questions.
Resources:
https://huggingface.co/blog/dialog-agents