
26 golden rules of Prompt to significantly improve the output quality of ChatGPT!

Author: Xi Xiaoyao Technology said

As a humble brick mover, Milk Tea has already developed a ChatGPT dependency! So how can you get these large models to deliver better work when you use them?

Today, Milk Tea has dug up a paper for you on how to write prompts for large language models~ (It comes with experimental data, and the results are impressive!)


The paper introduces 26 guiding principles aimed at simplifying the formulation of questions for large language models of different sizes, testing their capabilities, and enhancing users' understanding of how models of different sizes behave when given different prompts. Extensive experiments were conducted on LLaMA-1/2 (7B, 13B, and 70B) and GPT-3.5/4 to verify the effectiveness of these principles in instruction and prompt design.

The paper points out that large language models such as ChatGPT have demonstrated excellent capabilities across many domains and tasks, but it is often unclear to ordinary users how to design optimal instructions or prompts for them. The authors' work aims to open the "mystery box" of querying and interacting with LLMs for developers and casual users, and to further improve the response quality of pre-trained LLMs simply by curating better prompts. The research team proposed 26 principles for LLM prompting; let's take a look at them next~

Title of the paper:

Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4

Paper link:

https://arxiv.org/pdf/2312.16171.pdf

26 principles

  1. When communicating with LLMs, there is no need to use polite language such as "please", "if you don't mind", "thank you", etc., and state the main points directly.
  2. Integrate the intended audience in the prompt, such as "the audience is an expert in the field."
  3. Break down complex tasks into a series of simpler prompts and deliver them through interactive conversations.
  4. Use affirmative instructions such as "do" and avoid negative language such as "don't".
  5. When you need a clear understanding of a topic, idea, or any piece of information, use prompts such as: "Explain [specific topic] in simple terms." "Explain to me like I'm 11 years old." "Explain to me as if I'm new to [field]." "Write the [essay/text/paragraph] in simple English, as if you were explaining it to a 5-year-old."
  6. Add "I'm going to tip $xxx for a better solution!".
  7. Implement example-driven (few-shot) prompting, with a handful of example prompts.
  8. When formatting a prompt, first use '###Instruction###', followed by '###Example###' or '###Question###' (if relevant), and then render the content. Use one or more line breaks to separate instructions, examples, questions, contexts, and input data.
  9. Add the following phrases: "Your task is" and "You MUST".
  10. Add the phrase "You will be penalized".
  11. Use the phrase "Answer a question in a natural, human-like manner" in your prompts.
  12. Use leading words, such as "think step by step".
  13. Add the phrase "Ensure that your answer is unbiased and does not rely on stereotypes" to your prompt.
  14. Allow the model to elicit the precise details it needs by asking you questions until it has enough information to produce the desired output, e.g., "From now on I would like you to ask me questions to...".
  15. To learn about a specific topic, idea, or any piece of information, and to test your understanding, use: "Teach me the [any theorem/topic/rule name] and include a test at the end, but don't give me the answers and then tell me if I got the answer right when I respond".
  16. Assign a role to the large language model.
  17. Use delimiters.
  18. Repeat a specific word or phrase multiple times in a prompt.
  19. Combine Chain of Thought (CoT) with a handful of example prompts.
  20. Use an output primer: end your prompt with the beginning of the desired output.
  21. To write a detailed [essay/text/paragraph/article] or any type of detailed text: "Write a detailed [essay/text/paragraph] on [topic] for me, adding all the necessary information".
  22. To correct/change specific text without changing its style: "Try to revise every paragraph sent by the user. You should only improve the user's grammar and vocabulary and make sure it sounds natural. You should not change the writing style, such as making a formal paragraph casual".
  23. When you have a complex coding prompt that may involve different files: "From now on, whenever you generate code that spans multiple files, generate a [programming language] script that can be run to automatically create the specified files or make changes to existing files to insert the generated code. [Your question]".
  24. When you want to initiate or continue a text using specific words, phrases, or sentences, use the following prompt: "I'm providing you with the beginning [song lyrics/story/paragraph/essay...]: [insert lyrics/words/sentence]. Finish it based on the words provided. Keep the flow consistent".
  25. Clearly state the requirements that the model must follow in order to generate content based on keywords, rules, prompts, or directives.
  26. To write any text, such as an essay or paragraph, that is intended to be similar to a provided sample, include the instruction: "Please use the same language based on the provided paragraph [/title/text/essay/answer]".
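Several of these principles are purely mechanical and can be applied programmatically before a prompt is sent to a model. The sketch below assembles a prompt that combines principle 8 (### delimiters), principle 12 ("think step by step"), principle 16 (assign a role), and principle 20 (an output primer); the helper function and its names are our own illustration, not code from the paper.

```python
def build_prompt(role: str, instruction: str, example: str, question: str) -> str:
    """Assemble a prompt applying principles 8, 12, 16, and 20."""
    parts = [
        f"You are {role}.",                    # principle 16: assign a role
        "###Instruction###",                   # principle 8: delimiters
        instruction + " Think step by step.",  # principle 12: leading words
        "###Example###",
        example,
        "###Question###",
        question,
        "Answer:",                             # principle 20: output primer
    ]
    return "\n\n".join(parts)

prompt = build_prompt(
    role="an expert Python tutor",
    instruction="Explain what the following code prints.",
    example="print(2 + 3)  -> prints 5",
    question="What does print('a' * 3) output?",
)
print(prompt)
```

Because the prompt ends with "Answer:", the model is nudged to continue directly with the answer rather than restating the question.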

Based on their nature, the research team grouped these principles into five categories:

(1) Prompt structure and clarity, e.g., integrating the intended audience into the prompt, such as "the audience is an expert in the field";

(2) Specificity and information, e.g., adding the phrase "Ensure that your answer is unbiased and does not rely on stereotypes" to your prompt;

(3) User interaction and engagement, e.g., allowing the model to elicit precise details and requirements by asking you questions until it has enough information to provide the desired output: "From now on, I would like you to ask me questions to...";

(4) Content and language style, e.g., there is no need to be polite with an LLM, so skip phrases such as "please", "if you don't mind", "thank you", and "I would like to", and cut straight to the point;

(5) Complex tasks and coding prompts, e.g., breaking down complex tasks into a sequence of simpler prompts in an interactive conversation.

Experiments

The authors conducted experiments to verify the effectiveness of the proposed principles for instruction and prompt design. The experiments were run on LLaMA-1/2 (7B, 13B, 70B) and GPT-3.5/4 using ATLAS, a hand-crafted benchmark containing 20 human-selected questions for each principle, each posed with and without a prompt applying that principle.

The evaluation was divided into two parts: boosting and correctness.

  • Boosting assesses, via human evaluation, the improvement in response quality across the different LLMs after applying the principles. The original, unmodified prompts serve as the baseline against which this improvement is measured.
  • Correctness concerns the accuracy of the model's outputs or responses, ensuring they are accurate, relevant, and free of errors; it is also measured by human evaluation.
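As a rough illustration of how a boosting score could be tallied for one principle (our own sketch, not the paper's evaluation code): each of the 20 questions yields a human judgment of whether the principled prompt's response beat the baseline, and the score is the fraction of wins.

```python
def boosting_rate(judgments: list) -> float:
    """Fraction of paired comparisons in which the response to the
    principled prompt was judged better than the baseline response."""
    if not judgments:
        raise ValueError("no judgments given")
    return sum(judgments) / len(judgments)

# 20 hypothetical human judgments for one principle (True = improved)
judgments = [True] * 15 + [False] * 5
print(f"boosting: {boosting_rate(judgments):.0%}")  # prints "boosting: 75%"
```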

Figure 2 illustrates GPT-4's explanation of climate change and its potential impact on the environment in both settings: the top half shows the response without the principle, the bottom half with it. When the principle is not applied, GPT-4 simply describes climate change and its impacts. With Principle 13 applied, GPT-4 provides a more comprehensive and balanced perspective, covering both the scientific consensus and differing opinions. This suggests that well-designed principles can improve the quality of the model's responses, making them more comprehensive and objective.


Figure 3 shows GPT-4's assessment of the usefulness of suggestions before and after applying Principle 7. Without the principle, GPT-4 simply labels a suggestion "useful" without giving supporting reasons. After applying Principle 7, GPT-4 provides a detailed evaluation of each recommendation, for example judging the "Start Work" suggestion to be "not useful". This shows that principled prompts make the model's responses more in-depth and analytical, improving their granularity and accuracy.


The experimental results detail the improvements obtained after applying these principles on small, medium, and large LLMs. Overall, all of the principles yield significant improvements across the three model scales. In particular, under principles 2, 5, 15, 16, 25, and 26, the large models achieved the greatest gains from principled prompting. In terms of correctness, applying the principles typically yields improvements of more than 20% on average across the models.


Conclusion

These principles help the model focus on the key elements of the input context and can guide the LLM before the input is processed, facilitating better responses. Experimental results show that these principles improve the context that shapes output quality, making responses more relevant, concise, and objective. The authors note that future research will explore how to further adapt models to these principles, possibly through methods such as fine-tuning, reinforcement learning, and preference optimization. Successful strategies might then be integrated into the LLM's standard operation, for example by training on principled prompts and responses. Discussing limitations, the authors mention that these principles, while helpful for improving response quality, may be less effective on very complex or highly domain-specific questions; this likely depends on each model's reasoning ability and level of training.

