AI-trained Sun Yanzi is going viral. How is the training done?

Author: Erudite and talented human Zheng Dao, 2023-05-30 15:02:00


The voice sounds very much like the famous singer Sun Yanzi. Sun Yanzi singing Jay Chou's songs has gone viral: she became popular twenty years ago, and in 2023 she is popular across the whole Internet again. But did you know? This is an AI-trained Sun Yanzi, and the cost is less than about 500 yuan. How is such training done? In 5 key points, I will explain the process OpenAI uses to train a model like ChatGPT.

1. Training uses reinforcement learning from human feedback (RLHF). It is like raising a child, who learns by exploring the world and observing how it works. Along the way, parents should give timely feedback, telling the child what can be done, what cannot be done, and what is off limits.

Shaping a child's behavior strongly with relatively little intervention in this way is the idea behind RLHF.

2. Pretraining. At this stage the model is trained on a large amount of raw text from the Internet, trying to predict the next word. These texts come from a variety of sources, including books, web pages, pre-trained models, and other readable, in-depth texts. Training at this stage requires a lot of computing resources.

3. Supervised fine-tuning. At this stage, human trainers act as prompt writers: they pose specific questions and write the answers, and the model is trained to predict the trainers' answers. This helps the model better understand and answer various questions.

4. Reward modeling. At this stage, human trainers compare different answers the model produces for the same question and score which one is better. A reward model is then trained on this comparison data to predict the trainers' preferences.

5. Reinforcement learning. At this stage the model uses the reward model trained in the previous stage to generate better responses, trying to produce the ones that receive higher rewards and scores. The resulting RL model can be deployed directly. The whole process requires a lot of computing resources as well as data and time.
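To make the stages above more concrete, here is a minimal toy sketch in Python. It is not how OpenAI actually trains ChatGPT. Everything in it is an assumption made for illustration: the tiny corpus, the two hand-made answer features ("helpful" and "rude"), the three candidate answers, and all function names. It shows next-word prediction by counting word pairs (point 2), a reward model fitted to trainer comparisons with a pairwise logistic loss (point 4), and a policy that shifts probability toward higher-reward answers (point 5).

```python
import math
from collections import Counter, defaultdict

# Point 2 (pretraining), toy version: count which word follows which.
def train_bigram(corpus):
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return followers

def predict_next(followers, word):
    # Return the most frequent follower, or None for an unseen word.
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

# Point 4 (reward modeling): a linear score over answer features.
def reward(weights, feats):
    return sum(w * x for w, x in zip(weights, feats))

def train_reward_model(comparisons, n_features, lr=0.5, epochs=200):
    # Each comparison is (preferred_features, rejected_features).
    # Gradient descent on the pairwise loss -log(sigmoid(margin)).
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in comparisons:
            margin = reward(weights, better) - reward(weights, worse)
            coeff = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(n_features):
                weights[i] += lr * coeff * (better[i] - worse[i])
    return weights

# Point 5 (reinforcement learning): a softmax policy over fixed candidates.
def softmax(logits):
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def improve_policy(rewards, steps=200, lr=0.5):
    # Exact policy-gradient ascent on the expected reward.
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        logits = [l + lr * p * (r - expected)
                  for l, p, r in zip(logits, probs, rewards)]
    return softmax(logits)

# Hypothetical data: features are [helpful, rude].
polite_answer = [1.0, 0.0]
rude_answer = [0.0, 1.0]
empty_answer = [0.0, 0.0]
candidates = [polite_answer, rude_answer, empty_answer]

# Trainer preferences, written as (better, worse).
comparisons = [(polite_answer, rude_answer),
               (polite_answer, empty_answer),
               (empty_answer, rude_answer)]

followers = train_bigram(["the model predicts the next word",
                          "the model learns"])
weights = train_reward_model(comparisons, n_features=2)
rewards = [reward(weights, c) for c in candidates]
probs = improve_policy(rewards)

print(predict_next(followers, "the"))
print(rewards)
print(probs)
```

In this sketch the reward model ends up scoring the polite answer highest, and the policy puts most of its probability on that answer. Real systems replace the counting table and the linear score with large neural networks, and the fixed candidate list with freely generated text.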

But with this process, Muing is not the only one who can build a powerful chat assistant that handles complex questions. For example, if you want to build a customer-service AI for your company, you can start by learning prompt keywords, so Dr. Hao has also compiled 200 keywords for various industries. If you need them too, reply in the comment area: "I want 200 keywords".
