Original | Xi Xiaoyao Technology Says

Author | IQ dropped, Python

With AIGC technology advancing rapidly, chatbots such as ChatGPT are now used frequently in everyday life and work. When you are stuck writing code or cannot track down a bug, a system built on a large language model (LLM) can often help: thanks to its rich training data and strong reasoning ability, it performs well on simple code generation tasks and can offer inspiration. However, more complex code generation tasks, such as competition-level problems, remain a challenge.

One trick lets ChatGPT learn complex programming, bringing its level close to that of human programmers!

A new framework called BRAINSTORM may address this problem better. It combines high-level algorithmic blueprints, a neural ranking model, and the reasoning power of LLMs to generate diverse ideas and pick the best solution, and can even perform on par with human programmers. Let's take a look together~

Paper title:

Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation

Background

Program synthesis

Program synthesis is a technology that addresses problems posed by developers by automatically synthesizing complete, functional programs. It is widely used in automated software discovery, program analysis and validation, and human-computer interaction.

Traditional methods typically search for programs that satisfy task-specific constraints in a search space defined by the underlying programming language. However, this approach suffers from a complex search space and a lack of formal specifications. Deep learning-based program synthesis can instead generate programs from informal specifications such as natural language, partial code, input-output examples, or pseudocode. Currently, deep learning methods are mainly used to generate short programs in a specific domain or single lines of code in general-purpose programming languages.
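To make the traditional search-based approach concrete, here is a minimal sketch of enumerative synthesis from input-output examples. The four-primitive DSL and the names `PRIMITIVES`, `run`, and `synthesize` are assumptions made up for illustration; they do not come from the paper. The sketch tries every sequence of primitives, shortest first, and returns the first one consistent with all examples.

```python
from itertools import product

# Hypothetical toy DSL: each primitive maps an int to an int.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def run(program, x):
    """Apply the primitives named in `program` to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x

def synthesize(examples, max_length=3):
    """Return the first shortest program consistent with all (input, output) pairs, or None."""
    for length in range(1, max_length + 1):
        # The number of candidates grows as len(PRIMITIVES) ** length.
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return program
    return None

# Target behaviour: f(x) = 2 * (x + 1)
examples = [(1, 4), (2, 6), (5, 12)]
print(synthesize(examples))  # ('inc', 'double')
```

Even in this tiny setting the candidate count grows exponentially with program length, which is the search-space problem described above.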

Competition-level code generation

Competition-level code generation means generating solutions to programming problems at the difficulty of programming competitions. This demands stronger reasoning and abstraction from the model, which must understand and digest more complex task descriptions, produce longer code, and handle more contextual information. Studies have shown that LLMs generating code in a zero-shot setting generalize better than models fine-tuned on a specific dataset. Even so, evaluations of GPT-4 show that it solves competition-level programming problems with only single-digit pass rates.

A quick overview of the paper

The BRAINSTORM framework is a method for competition-level code generation that makes effective use of an LLM's algorithmic reasoning by generating diverse ideas and selecting the high-quality ones. To produce diverse ideas, the framework designs several types of instructions and feeds them to the LLM together with the problem description.

Figure 1: Example of the BRAINSTORM framework

As shown in Figure 1, the BRAINSTORM framework consists of the following steps:

Brainstorming: This step is the heart of the BRAINSTORM framework and aims to generate a variety of ideas that might help solve the given problem. To achieve this, as shown in Figure 2, the authors design several types of instructions and feed them to the LLM along with the problem description.
Idea Selection: The ideas generated in the previous step need to be filtered and ranked. Specifically, as shown in Figure 3, an evaluation function scores the quality of each idea, and the highest-quality idea is selected as the final output.
Code Generation: The best idea chosen in the previous step must then be converted into code. Specifically, the authors use a code generator to translate the idea into executable code. As shown in Figure 4, the BRAINSTORM framework works in a zero-shot setting: unlike the few-shot approach, it needs no examples for efficient code generation.
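The three steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the authors' implementation: the instruction texts in `INSTRUCTIONS`, the `llm` and `score` callables, and the stand-ins `fake_llm` and `overlap_score` are all hypothetical. In the paper the ideas come from a real LLM and are scored by the evaluation function shown in Figure 3.

```python
from typing import Callable, List

# Hypothetical instruction templates; the paper's actual prompts are not reproduced here.
INSTRUCTIONS = [
    "Describe a high-level algorithm for this problem.",
    "Name the key data structure and explain how to use it.",
    "Outline a step-by-step plan before writing any code.",
]

def brainstorm(problem: str, llm: Callable[[str], str]) -> List[str]:
    """Step 1: pair each instruction with the problem description to elicit diverse ideas."""
    return [llm(f"{instruction}\n\nProblem:\n{problem}") for instruction in INSTRUCTIONS]

def select_idea(problem: str, ideas: List[str], score: Callable[[str, str], float]) -> str:
    """Step 2: score every idea and keep the highest-scoring one."""
    return max(ideas, key=lambda idea: score(problem, idea))

def generate_code(problem: str, idea: str, llm: Callable[[str], str]) -> str:
    """Step 3: zero-shot prompt containing only the problem and the chosen idea."""
    prompt = f"Problem:\n{problem}\n\nIdea:\n{idea}\n\nWrite a complete solution."
    return llm(prompt)

def solve(problem: str, llm: Callable[[str], str], score: Callable[[str, str], float]) -> str:
    """Run brainstorming, idea selection, and code generation in sequence."""
    ideas = brainstorm(problem, llm)
    best = select_idea(problem, ideas, score)
    return generate_code(problem, best, llm)

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: echoes the first line of the prompt."""
    return "reply to: " + prompt.splitlines()[0]

def overlap_score(problem: str, idea: str) -> float:
    """Toy scorer (an assumption): counts words shared by the problem and the idea."""
    return float(len(set(problem.lower().split()) & set(idea.lower().split())))

print(solve("Find the shortest path in a graph.", fake_llm, overlap_score))
```

Passing the model and the scorer in as plain callables keeps the sketch independent of any particular API; a real setup would replace `fake_llm` with an actual model call and `overlap_score` with a learned ranker.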

Experiments

A number of experiments were carried out in this paper, and the results are summarized as follows:

As shown in Tables 1 and 2, the authors use the BRAINSTORM framework to generate code on the APPS and CodeContests benchmarks and compare it with other methods. The results show that the BRAINSTORM framework significantly improves LLM performance on both benchmarks. As shown in Figure 5, the relative improvement is significant on problems involving probability, shortest paths, and graphs, and the method consistently outperforms ChatGPT and CoT as problem ratings increase.
The authors also applied the BRAINSTORM framework to code generation in real programming competitions. The results in Table 3 show that the framework can raise ChatGPT's algorithmic reasoning to a level comparable to that of human programmers.

These results show that, with the BRAINSTORM framework, an LLM produces higher-quality code that is closer to code written by human programmers.

Brief summary

This article introduces BRAINSTORM, a new framework for competition-level code generation. Its highlights:

Proposes BRAINSTORM, a competition-level code generation framework: it achieves efficient code generation by generating diverse ideas and selecting high-quality ones.
Leverages the LLM's algorithmic reasoning capabilities: it generates diverse ideas by designing multiple types of instructions, then uses an evaluation function and a code generator to turn the selected idea into executable code.
Zero-shot: it works in a zero-shot setting and needs no examples for efficient code generation.

With further exploration and development in this field, we believe that generating more accurate, higher-quality code will become an inevitable trend. As more researchers focus on this area, we will move one step closer to the beautiful world of hands-free code generation~
