
OpenAI has released a guide to using GPT-4, and all the practical tips are here

Author: Love Fan'er

Since its debut, ChatGPT's groundbreaking capabilities have led countless people to put it on a pedestal as the face of generative AI.

We always expect it to understand our intentions accurately, yet its answers or creations often fall short of what we had in mind. The gap may come from expecting too much of the model, or from failing to find the most effective way to communicate with it.

Just as explorers need time to adapt to new terrain, interacting with ChatGPT takes patience and skill. OpenAI's official GPT-4 guide, Prompt engineering, documents six strategies for getting better results out of GPT-4.

I believe that with it, your communication with ChatGPT will be smoother in the future.

Here's a quick summary of these six strategies:

  • Write clear instructions
  • Provide reference text
  • Split complex tasks into simpler subtasks
  • Give the model time to "think"
  • Use external tools
  • Test changes systematically

Write clear instructions

Include details in your query

ChatGPT cannot infer what we leave implicit, so we should spell out our requirements as clearly as possible: the length of the response, the level of writing, the output format, and so on.

The less we force ChatGPT to guess and infer our intentions, the more likely the output is to meet our requirements. For example, when asking it to write a psychology paper, the prompt should look like this:

Please help me write a psychology paper on "The Causes and Treatment of Depression". You need to consult the relevant literature, avoid any plagiarism, and follow the standard academic paper format, including an abstract, introduction, body, and conclusion, with a length of more than 2,000 words.
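If you are calling the API directly rather than chatting in the web interface, the same detailed prompt might be sent like this. A minimal sketch using the OpenAI Python SDK; the model name and exact wording are illustrative assumptions, not prescribed by the guide:

```python
# A minimal chat completion with a detailed, explicit prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            "Write a psychology paper on 'The Causes and Treatment of Depression'. "
            "Consult relevant literature, no plagiarism, follow academic paper "
            "format (abstract, introduction, body, conclusion), more than 2,000 words."
        ),
    }],
)
print(response.choices[0].message.content)
```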


Let the model play a role

Every trade has its specialists. Assigning the model a specialized role makes its output appear more professional.

For example, ask it to play a detective novelist who describes a bizarre murder case with Conan-style suspense. Requirements: keep the characters anonymous, write more than 1,000 words, and give the plot plenty of twists and turns.
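In the API, a persona like this is usually set through the system message. A sketch, with the persona text paraphrased from the example above:

```python
# Assigning a persona via the system message (OpenAI Python SDK).
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are a detective novelist who writes Conan-style mysteries."},
        {"role": "user",
         "content": ("Describe a bizarre murder case. Requirements: keep the "
                     "characters anonymous, more than 1,000 words, and a plot "
                     "full of twists.")},
    ],
)
print(response.choices[0].message.content)
```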


Use delimiters to clearly mark distinct parts of the input

Delimiters such as triple quotes, XML tags, and section headings can mark off sections of text that need to be treated differently, helping the model disambiguate them.
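Here are two small sketches of delimiters in practice; the placeholder text is an assumption:

```python
# Triple quotes mark off the text the instruction operates on.
article = "..."  # the text to be summarized

prompt = f'Summarize the text delimited by triple quotes in one sentence.\n\n"""{article}"""'

# XML-style tags keep two inputs that must be treated separately apart.
essay_a, essay_b = "...", "..."
prompt_xml = (
    "Compare the two essays below and say which argument is stronger.\n\n"
    f"<essay1>{essay_a}</essay1>\n<essay2>{essay_b}</essay2>"
)
```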


Specify the steps required to complete the task

Breaking down some tasks into a series of well-defined steps makes it easier for the model to execute those steps.
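OpenAI's guide illustrates this tactic with a numbered procedure placed in the system message; a sketch along those lines:

```python
# Spelling the task out as explicit numbered steps for the model to follow.
messages = [
    {"role": "system", "content": (
        "Use the following step-by-step instructions to respond to user input.\n"
        "Step 1 - Summarize the text the user provides in one sentence, "
        "prefixed with 'Summary: '.\n"
        "Step 2 - Translate the summary from Step 1 into French, prefixed "
        "with 'Translation: '."
    )},
    {"role": "user", "content": "...text to process..."},
]
```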


Provide examples

Providing general instructions that apply to every case is usually more efficient than demonstrating through examples, but in some cases showing examples is easier.

As an analogy: telling the model that swimming is just kicking your legs and swinging your arms is a general statement; showing it a swimming video with the specific kicking and arm movements is teaching by example.
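In the Chat Completions API, few-shot examples are expressed as fabricated assistant turns. A sketch following the pattern in OpenAI's guide:

```python
# Teaching by example: one demonstration pair, then the real query.
messages = [
    {"role": "system", "content": "Answer in a consistent style."},
    {"role": "user", "content": "Teach me about patience."},
    {"role": "assistant", "content": (
        "The river that carves the deepest valley flows from a modest spring; "
        "the grandest symphony originates from a single note."
    )},
    {"role": "user", "content": "Teach me about the ocean."},
]
```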

Specify the output length

We can tell the model how long we want its output to be, measured in words, sentences, paragraphs, bullet points, and so on.

Because of the model's internal mechanics and the complexity of language, exact word counts are only hit approximately; asking for a number of paragraphs or bullet points works more reliably.

Provide reference text

Have the model answer using the reference text

If we have more reference information at hand, we can "feed" the model and have it answer using the information provided.


Have the model answer with citations from the reference text

If the input already contains relevant knowledge documents, the user can directly ask the model to cite passages from those documents in its answer, minimizing the chance of the model making things up.

In this case, the citations in the output can also be verified programmatically, i.e., their accuracy is confirmed by matching the cited strings against the provided document.
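A sketch of both halves: a prompt that demands citations, and a programmatic check that every cited string really occurs in the document. The {"citation": ...} format follows the guide; the regex parsing is one possible implementation:

```python
# Demanding citations, then verifying them by simple string matching.
import re

document = "..."   # the provided reference text
question = "..."

prompt = (
    "You will be provided with a document delimited by triple quotes and a "
    "question. Answer using only the provided document, and cite the passages "
    'used to answer in the format {"citation": "..."}. If the document does not '
    "contain the answer, reply 'Insufficient information.'\n\n"
    f'"""{document}"""\n\nQuestion: {question}'
)

def citations_verified(answer: str, document: str) -> bool:
    """True only if every cited string occurs verbatim in the document."""
    cited = re.findall(r'\{"citation":\s*"(.+?)"\}', answer)
    return all(quote in document for quote in cited)
```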


Split complex tasks into simpler subtasks

Use intent classification to identify the instructions that are most relevant to the user's query

When dealing with tasks that require many different operations, we can take a smarter approach. First, divide the queries into different types and work out what operations each type requires. It's like organizing things: we put similar items together.

Then we can define standard operations for each type, like labeling each bin, so that common steps such as find, compare, and understand can be prescribed in advance.

This can be done in a step-by-step manner, and if we want to ask more specific questions, we can further refine them based on previous operations.

The advantage is that each stage of answering a user's question only needs to do the work required for the current step, rather than handling the entire task at once. This not only reduces the likelihood of errors, it also costs less, since completing the entire task in a single pass can be expensive.
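A two-call sketch of this pattern: the first, cheap call classifies the query, and the second answers it with instructions chosen for that category. The categories and system prompts here are illustrative assumptions:

```python
# Intent classification as a routing step before answering.
from openai import OpenAI

client = OpenAI()

CLASSIFIER = ("Classify the customer query as one of: billing, tech_support, "
              "account, general. Reply with the category name only.")

INSTRUCTIONS = {
    "billing": "You are a billing specialist. Ask for the invoice number first.",
    "tech_support": "You are a support engineer. Diagnose step by step.",
    "account": "You are an account manager. Verify identity before any change.",
    "general": "You are a helpful customer service assistant.",
}

def answer(query: str) -> str:
    category = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": CLASSIFIER},
                  {"role": "user", "content": query}],
    ).choices[0].message.content.strip().lower()
    system = INSTRUCTIONS.get(category, INSTRUCTIONS["general"])
    return client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": query}],
    ).choices[0].message.content
```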


For long conversations, summarize or filter the previous dialogue

When dealing with conversations, the model is constrained by a fixed context length and cannot remember all the conversation history.

One way to solve this is to summarize the earlier conversation: when the dialogue reaches a certain length, the system can automatically summarize the earlier exchanges and carry part of that summary forward, or it can quietly summarize the past chat in the background while the conversation continues.

Another workaround is to dynamically select the parts of the conversation most relevant to the current question. This approach uses the strategy called "use embeddings-based search for efficient knowledge retrieval", covered below.

Put simply, it means using the content of the current question to find the parts of the earlier conversation relevant to it. That makes more effective use of prior information and keeps the conversation on point.
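A sketch of the summarization variant, where older turns get folded into a running summary once the transcript grows too long. Character counts stand in for tokens here; real code would count tokens (e.g. with tiktoken):

```python
# Compacting a long conversation by summarizing its older turns.
from openai import OpenAI

client = OpenAI()

def summarize(text: str) -> str:
    return client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"Summarize this conversation concisely:\n\n{text}"}],
    ).choices[0].message.content

def compact(messages: list[dict], limit: int = 12000) -> list[dict]:
    if sum(len(m["content"]) for m in messages) < limit or len(messages) <= 4:
        return messages  # still fits; leave the history untouched
    older, recent = messages[:-4], messages[-4:]
    summary = summarize("\n".join(f"{m['role']}: {m['content']}" for m in older))
    return [{"role": "system",
             "content": f"Summary of the earlier conversation: {summary}"}] + recent
```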

Summarize long documents in segments and recursively construct full summaries

Since the model can only keep a limited amount of text in context, it cannot summarize a very long document directly. To summarize a long document, we can take a step-by-step approach.

Just like when we read a book, we can summarize each section by asking questions chapter after chapter. Summaries of each section can be concatenated to form a summary of the entire document. This process can be recursive, layer by layer, all the way to summarizing the entire document.

If understanding later sections depends on earlier ones, another useful trick is to keep a running summary of the text so far and consult it before reading each new section.
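A minimal sketch of recursive summarization, chunking by characters for simplicity (production code would chunk by tokens):

```python
# Summarize chunks, then summarize the concatenated summaries, until
# everything fits in a single pass.
from openai import OpenAI

client = OpenAI()

def summarize(text: str) -> str:
    return client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"Summarize the following text concisely:\n\n{text}"}],
    ).choices[0].message.content

def summarize_document(text: str, chunk_size: int = 8000) -> str:
    while len(text) > chunk_size:  # each round shrinks the text
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        text = "\n\n".join(summarize(chunk) for chunk in chunks)
    return summarize(text)
```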

Give the model time to "think"

Instruct the model to come up with its own solution before rushing to a conclusion

In the past, we might show the model a student's answer and ask whether it is correct. But the student's answer is sometimes wrong, and asking the model to judge it directly may yield an inaccurate verdict.

To make the judgment more accurate, we can first have the model work the math problem out on its own, and only then ask it to compare the student's answer with its own.

Having done the math itself, the model can more easily determine whether the student's answer is correct: if the student's answer differs from its own, it knows the student got it wrong. Starting from the most basic first step rather than judging the student's answer outright improves the accuracy of the verdict.
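The guide expresses this tactic as a system message along the following lines; a sketch, paraphrasing only lightly:

```python
# Make the model produce its own solution before it sees the student's.
messages = [
    {"role": "system", "content": (
        "First work out your own solution to the problem. Then compare your "
        "solution to the student's solution and evaluate whether the student's "
        "solution is correct or not. Don't decide if the student's solution is "
        "correct until you have done the problem yourself."
    )},
    {"role": "user", "content": "Problem statement: ...\n\nStudent's solution: ..."},
]
```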


Use an inner monologue to hide the model's reasoning process

Sometimes it is important for the model to reason about the problem in detail when answering a particular question. However, for some use cases, the model's inference process may not be suitable for sharing with users.

To solve this, there is a strategy called inner monologue: instruct the model to put the parts of its output that the user shouldn't see into a structured format, and then show the user only the remainder.

For example, when tutoring students, directly revealing all of the model's reasoning would mean the students never have to work anything out themselves.

Therefore, we can use the strategy of "inner monologue": first let the model think about the problem completely, think through all the solutions clearly, and then select only a small part of the model's ideas and tell the students in simple language.

Or we can design a series of questions: first let the model think about the whole solution by itself, and then ask the students a simple similar question according to the model's ideas, and after the students answer, let the model judge whether the students' answers are correct.

Finally, the model explains the correct solution in easy-to-understand language. This exercises the model's reasoning while still making students think for themselves, rather than handing them all the answers outright.
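A sketch of the inner-monologue pattern: the reasoning lives inside triple quotes, and the application strips it out before the student sees anything. The parsing convention is an assumption:

```python
# Hide the model's reasoning inside triple quotes; show only what follows.
def visible_part(model_output: str) -> str:
    """Return only what follows the last triple-quoted reasoning block."""
    return model_output.split('"""')[-1].strip()

system = (
    'Follow these steps, enclosing all work for steps 1-3 in triple quotes (""") '
    "so it can be hidden from the student.\n"
    "Step 1 - Work out your own solution to the problem.\n"
    "Step 2 - Compare your solution to the student's and check its correctness.\n"
    "Step 3 - Decide on a hint that does not give the answer away.\n"
    "Step 4 - Outside the triple quotes, write the hint for the student."
)
```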


Ask if the model has missed anything in the previous process

Suppose we ask the model to find sentences related to a problem from a large file, and the model will tell us one sentence at a time.

But sometimes the model makes a mistake and stops when it should continue to look for relevant sentences, resulting in related sentences being missed and not told to us.

At this point, we can prompt the model with "Are there any other related sentences?", and it will keep searching, so that the retrieved information is more complete.
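A sketch of that follow-up loop; the stop phrase and the cap on rounds are illustrative assumptions:

```python
# Keep asking for missed items until the model says it is done.
from openai import OpenAI

client = OpenAI()

messages = [{"role": "user", "content":
             "List every sentence in the document below relevant to X.\n\n..."}]
excerpts = []
for _ in range(5):  # safety cap on follow-up rounds
    content = client.chat.completions.create(
        model="gpt-4", messages=messages
    ).choices[0].message.content
    if "no more relevant sentences" in content.lower():
        break
    excerpts.append(content)
    messages += [
        {"role": "assistant", "content": content},
        {"role": "user", "content": ("Are there more relevant sentences? Take care "
                                     "not to miss any. If none remain, reply "
                                     "'No more relevant sentences.'")},
    ]
```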

Use external tools

Use embeddings-based search for efficient knowledge retrieval

If we add some external information to the model's inputs, the model will be able to answer questions more intelligently. For example, if a user asks a question about a movie, we can input some important information about the movie (such as actors, directors, etc.) into the model, so that the model can give a smarter answer.

A text embedding is a vector that measures how related pieces of text are: vectors of similar or related texts sit close together, while unrelated texts sit farther apart. This means we can use embeddings for efficient knowledge retrieval.

Specifically, we can cut the text corpus into chunks, then embed and store each chunk. Given a query, we embed it as well and run a vector search to find the chunks in the corpus closest to the query in embedding space, i.e., the most relevant ones.
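A miniature version of this pipeline, assuming numpy and the text-embedding-ada-002 model (swap in whichever embedding model is current):

```python
# Embed the chunks once, embed the query, rank by similarity.
from openai import OpenAI
import numpy as np

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([item.embedding for item in resp.data])

chunks = ["First chunk of the corpus ...", "Second chunk ...", "Third chunk ..."]
chunk_vectors = embed(chunks)          # embed and store once, up front

query_vector = embed(["the user's question"])[0]
# OpenAI embeddings are unit-length, so a dot product is cosine similarity.
scores = chunk_vectors @ query_vector
print(chunks[int(np.argmax(scores))])  # most relevant chunk
```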

Use code execution to make more accurate calculations or call external APIs

Language models are not always able to accurately perform complex mathematical operations or calculations that take a long time. In this case, we can tell the model to write some code to accomplish the task, rather than letting it do the calculations on its own.

Specifically, we can instruct the model to write down the code that needs to be run in a certain format, such as enclosing it in triple backticks. When the code has generated the result, we can extract it and execute it.

Finally, if desired, the output of a code execution engine (such as a Python interpreter) can be used as input to the next problem in the model. This allows for more efficient completion of tasks that require computation.
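A sketch of extracting and running the fenced code, which is exactly the pattern the warning below applies to; run it only inside a sandbox:

```python
# Pull the model's fenced code out of its reply and execute it.
import re
import subprocess

FENCE = "`" * 3  # triple backticks
reply = f"Here you go:\n{FENCE}python\nprint(sum(i * i for i in range(1, 11)))\n{FENCE}"

match = re.search(rf"{FENCE}(?:python)?\n(.*?){FENCE}", reply, re.DOTALL)
if match:
    result = subprocess.run(           # isolate this: container, VM, seccomp
        ["python", "-c", match.group(1)],
        capture_output=True, text=True, timeout=10,
    )
    print(result.stdout)  # can be fed back into the next model request
```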


Another good example of using code execution is the use of external APIs (Application Programming Interfaces). If we tell the model how to use an API correctly, it can write code that can call that API.

We can provide the model with documentation or code examples that show how to use the API, so that the model can learn how to use the API. Put simply, by giving the model some guidance on the API, it can create code that implements more functionality.


Warning: Executing code generated by the model is inherently unsafe, and any application attempting to do so should take precautions. In particular, you need to use a sandboxed code execution environment to limit the potential harm that untrusted code can cause.

Give the model access to specific functions

We can pass the model a list of function descriptions in an API request. The model then generates function arguments that conform to the provided schemas; the arguments are returned as JSON, which we use to perform the actual function call.

Then, by feeding the function's output back to the model in the next request, we close the loop. This is the recommended way to call external functions with OpenAI models.
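A minimal function-calling round trip with the SDK's tools interface; the weather function is the usual illustrative stub, not a real API:

```python
# One tool call, executed locally, with the result fed back to the model.
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    return json.dumps({"city": city, "forecast": "sunny", "high_c": 23})  # stub

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
resp = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
msg = resp.choices[0].message

if msg.tool_calls:
    call = msg.tool_calls[0]
    messages.append(msg)  # keep the assistant's tool-call turn in the history
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": get_weather(**json.loads(call.function.arguments)),
    })
    # Feed the function result back; the model now answers in plain text.
    final = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```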

Test changes systematically

When we change a system, it's hard to tell whether the change helps or hurts. With only a few examples, it's hard to know whether results genuinely improved or we simply got lucky. Sometimes a change helps in some cases and hurts in others.

So how do we evaluate the quality of the system's output? If there is only one standard answer to a question, the computer can automatically determine whether it is right or wrong. If there is no standard answer, other models can be used to judge quality.

In addition, we can have humans evaluate subjective quality, or combine automated and human evaluation. When answers are long and their quality differs little, we can also have a model grade the quality itself.

Of course, as models become more capable, more and more evaluation can be automated and less human review is needed. Improving the evaluation pipeline itself is hard, and combining automated checks with human judgment remains the best method.


Evaluate model outputs with reference to gold standard answers

Suppose we face a question and need to give an answer whose correctness rests on a set of known facts. For example, if the question is "Why is the sky blue?", the gold-standard answer might be "Because when sunlight passes through the atmosphere, blue light is scattered more strongly than other colors."

The answer is based on the following facts:

Sunlight contains light of different colors (wavelengths)

Blue light is scattered more strongly as it passes through the atmosphere

Once we have the question and the gold-standard answer, we can use a model to check which of these underlying facts a candidate answer contains, and how much each matters to its correctness.

For example, the fact that sunlight contains different colors is essential to the answer's correctness, and so is the fact that blue light is scattered more strongly. This way we know which key facts the answer to this question depends on.
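A sketch of model-based grading against these gold-standard facts; the yes/no prompt and the coverage score are illustrative assumptions:

```python
# Ask a model, fact by fact, whether the candidate answer contains it.
from openai import OpenAI

client = OpenAI()

GOLD_FACTS = [
    "Sunlight contains light of different colors (wavelengths).",
    "Blue light is scattered more strongly by the atmosphere than other colors.",
]

def fact_coverage(answer: str) -> float:
    hits = 0
    for fact in GOLD_FACTS:
        verdict = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": (
                'Does the answer below state or imply this fact? Reply "yes" or "no".\n\n'
                f"Fact: {fact}\n\nAnswer: {answer}")}],
        ).choices[0].message.content.strip().lower()
        hits += verdict.startswith("yes")
    return hits / len(GOLD_FACTS)  # fraction of gold facts covered
```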


In the digital age, prompts are the starting point for splitting requirements, and by designing clever prompts, we can split the entire task into a series of concise steps.

This decomposition not only helps the model better understand the user's intent, but also provides the user with a clearer path to action, as if given a clue that leads us to unravel the mystery of the problem step by step.

User needs flow like a surging river, and prompts are the sluice gates that direct the water: the hub connecting the user's thinking with the machine's understanding. It is no exaggeration to say that a good prompt reflects both deep insight into what the user wants and a tacit rapport between human and machine.

Of course, reading Prompt engineering alone is not enough to truly master prompting, but OpenAI's official guide offers a valuable starting point.
