
Using low-code together with ChatGPT for development, I get an extra hour a day to slack off


Introduction

After GPT appeared, many people speculated that large amounts of software would be rewritten because of it. This article records some thoughts and practices on combining low-code platforms with ChatGPT. I look forward to catching the AI train with all readers and speeding up development~

Contents

1 Background

2 Demo

3 Ideas

3.1 ChatGPT+ code generation tool combination mode

3.2 ChatGPT code generation status

3.3 Feasible ideas at this stage

3.4 Case study

4 Design and implementation

4.1 Architecture Layering

4.2 Plugins

4.3 R&D adjustments

5 Summary

01

Background

Ever since I started exploring model-driven development, I have been thinking about one question: "Can software be generated in a simpler, more user-friendly way?" ChatGPT gave me an affirmative answer.

We previously explored generating code from domain models, hoping that modeling time would pay back coding time at a high multiple. But as code tools keep improving, further efficiency gains get harder and harder: the model is abstract and the implementation is concrete, so the information the model carries is not enough to generate code directly. A person must supplement that information, and that is work the tool cannot do in place of the person.


Then we experienced ChatGPT. While being shocked by its powerful capabilities, we also asked ourselves how to introduce ChatGPT into our code generation tools to improve R&D efficiency, and we quickly built some demos to verify the effect.

02

Demo

In steps 3 to 5, the tool integrates a plugin built on the ChatGPT API. It automatically extracts the Chinese class names, member variable names, and member function names from the model, feeds those Chinese names into ChatGPT together with the translation purpose and the naming style, fills the translation results back into the tool automatically, and finally generates the code.

Imagine a project with dozens of class names and hundreds of member variable and function names that need English names derived from their Chinese. Some words have to be looked up in translation software, then adapted to their usage (nouns or noun phrases for class names, verbs for method names), then adjusted to a style such as UpperCamelCase or snake_case. This is boring, tedious work; now it takes a single click to fill everything in, followed by minor adjustments.
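The style adjustment mentioned above is mechanical enough to automate. As a minimal sketch (the function names are illustrative, not the tool's actual API), converting a list of translated English words into the two naming styles might look like:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Join translated words as UpperCamelCase, e.g. for class and method names.
std::string ToUpperCamel(const std::vector<std::string>& words) {
    std::string out;
    for (std::string w : words) {
        if (w.empty()) continue;
        w[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(w[0])));
        out += w;
    }
    return out;
}

// Join translated words with underscores, e.g. for variable names.
std::string ToSnakeCase(const std::vector<std::string>& words) {
    std::string out;
    for (const std::string& w : words) {
        if (w.empty()) continue;
        if (!out.empty()) out += '_';
        out += w;
    }
    return out;
}
```

With this, ToUpperCamel({"arrive", "car"}) produces "ArriveCar" and ToSnakeCase({"arrive", "car"}) produces "arrive_car"; the part-of-speech choice still comes from the translation step.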

If merely connecting ChatGPT for translation brings such an obvious benefit, what effect could be achieved if ChatGPT's capabilities were encapsulated as plugins embedded throughout the development process?

03

Ideas

3.1 ChatGPT+ code generation tool combination mode

3.1.1 Mode 1: Generate software directly

In this mode, ChatGPT understands human language and writes the software itself; for example, ChatGPT can generate a complete, runnable Snake game. Strictly speaking, this mode is not a combination of ChatGPT and code generation tools, since no code generation tool participates; it is undoubtedly the simplest and most natural way to develop software.

Unfortunately, testing showed that at this stage ChatGPT cannot write complete, complex software through dialogue alone. Software embodies its own core domain knowledge, and different teams have their own specifications, environments, and other requirements: Google uses the gRPC framework and deploys on Google Cloud, while Amazon's R&D framework and deployment environment are completely different. It is impractical to feed all of this information into ChatGPT — there is so much of it that describing it through sessions takes enormous work, and besides performance there are worries about leaking sensitive information. This mode cannot be implemented at this stage. The advent of AutoGPT does show that AI can take a project from 0 to 1, but I doubt anyone would dare put the artifact it generates straight into production.

3.1.2 Mode 2: Generate code snippets

In this mode, code context is fed into ChatGPT through a session and used to complete and edit code, as the Copilot plugin does. Testing found that the code snippets ChatGPT generates are of high and stable quality.

The difference from mode one is that the tool organizes the "code snippets" generated by ChatGPT into complete software.

3.1.3 Mode 3: Generate DSL

The difference from mode two is that ChatGPT does not generate code directly; instead, the tool generates the code from the DSL that ChatGPT produces. ChatGPT generates DSLs relatively stably, so the code quality of this mode is more reliable than that of the previous two.

3.2 ChatGPT code generation status

"Know your enemy and know yourself." We must first have a clear understanding of ChatGPT's capabilities in order to choose the right mode. We read some reviews testing GPT-4's abilities and also ran plenty of experiments of our own. A few interesting findings:

  • ChatGPT "understands" code: given a piece of code, it can add correct comments, and even optimize variable naming and improve the code based on context;
  • ChatGPT can "guess" code: given only a function declaration, it can infer the function's behavior from the function and parameter names and generate test cases;
  • ChatGPT generates common code (such as base-library code) more readily, but the domain-specific code it generates may not follow that domain's best practices;
  • The quality of the generated code depends on the user: the more precise the input, the higher the quality; both too much and too little input lead to poor results.

In real scenarios, writing code relies on a great deal of information: beyond the current file's context, it may span files, systems, and repositories. Given ChatGPT's limit on input length, feeding all of the dependent information into ChatGPT is unrealistic (time cost, sensitive-code leakage). Interaction mode is another problem: offline code generation is fine, but "pair programming with ChatGPT" demands low latency — imagine if Copilot took a minute to produce each suggestion.

3.3 Feasible ideas at this stage

Combining the above, we believe that although GPT-4 is currently very powerful, it cannot fully automatically generate applications — especially for an industry where the code must match the industry's best practices and domain knowledge and follow the team's R&D specifications. That is beyond ChatGPT at this stage. So how can ChatGPT's existing capabilities be embedded into code generation tools? My rough idea:

  • Use ChatGPT to assist in generating DSLs, then import the DSLs into the low-code platform to generate business code that meets team specifications;
  • Feed the code generated in the first step into ChatGPT, which supplements code snippets according to context and fills them into the corresponding positions.

Reading this, you may wonder: this is obviously a combination of modes two and three, so why have ChatGPT generate code twice? I will explain in detail with the case study below.

3.4 Case study

3.4.1 Information Extraction

[Figure: analysis sequence diagram for the use case "receive parking space status change"]

As shown in the figure above, this is the analysis sequence diagram for the system use case "receive parking space status change". By analyzing the sequence diagram we can obtain the following information:

  • The control class has a method called "parking charge";
  • The control class's "parking charge" method depends on the entity class "parking space" (泊位);
  • The member variables of the entity classes in the sequence diagram can be obtained from the class diagram, and every arrow pointing to an entity maps to one method;
  • Pseudocode for the control-class and entity-class methods can be derived from the sequence diagram; for example, the pseudocode for "car arrives" is as follows:

int 泊位::来车(){                      // i.e. ParkingSpace::ArriveCar
  // 1. Get the attendant on duty
  排班(时间).取值班人员(值班人员序号);     // Scheduling(time).GetShiftPersonnel(number)
  if (失败) {                          // on failure
    打印日志;                           // log the error
    返回失败错误码;                      // return the error code
  }
  return 0;
}

3.4.2 Build the DSL and generate code

Obviously, the information above is not enough to generate code. Taking the pseudocode of "泊位::来车" (ParkingSpace::ArriveCar) as an example, mapping it to code that conforms to C++ syntax requires at least the following additional information:

  • Translation: the words in the pseudocode must be translated into English;
  • Field types: for example, what type is the "time" field;
  • Failure conventions: how is failure defined for a method such as "get the attendant on duty" — by return code, or by some output parameter?

Once this is done, we can describe the pseudocode above in a structured language from which code can be generated, for example:

{
  "return_type": "int",
  "function_name": "ArriveCar",
  "param": [],
  "impl": [
    {
      "entity": "Scheduling",
      "function": "GetShiftPersonnel",
      "return_type": "int",
      "param": [
        {
          "type": "int",
          "name": "number"
        }
      ]
    }
  ]
}
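To make the mapping concrete, here is a minimal sketch of how a generator might render one entry of this DSL into a C++ declaration (the struct fields mirror the JSON above; the struct and function names are assumptions for illustration):

```cpp
#include <string>
#include <vector>

// Mirrors one function entry of the DSL shown above.
struct ParamDsl {
    std::string type;
    std::string name;
};
struct FunctionDsl {
    std::string return_type;
    std::string function_name;
    std::vector<ParamDsl> params;
};

// Render a DSL entry as the C++ declaration the engine would emit.
std::string RenderDeclaration(const FunctionDsl& f) {
    std::string out = f.return_type + " " + f.function_name + "(";
    for (size_t i = 0; i < f.params.size(); ++i) {
        if (i > 0) out += ", ";
        out += f.params[i].type + " " + f.params[i].name;
    }
    return out + ");";
}
```

For the DSL above, the outer entry renders as "int ArriveCar();" and the inner dependency as "int GetShiftPersonnel(int number);".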

In practice, far more information needs to be configured than this, and the knowledge required to do it lives in the "human brain" — only people can supply it. Building a DSL that can generate code is not simple. We can reduce the human workload in several ways: turn fill-in-the-blank questions into multiple-choice ones (most configuration becomes check-box operations with no typing), distill best practices into default selections (for example, member variables do not generate Get functions by default), and so on. But some of the work remains tedious, repetitive, and inefficient while still requiring a person — such as translating the Chinese in the steps above into English words with the right part of speech and format for their usage context. This is where ChatGPT comes in: translating 500 Chinese terms takes a person about 50 minutes (at 10 per minute), while ChatGPT needs only a few seconds.


3.4.3 Improve the code

With information extracted from the model in step one and converted into the code generation DSL in step two, we can generate the code. Here is an example of the generated code directory:

[Figure: example of the generated code directory]

We opened the header files and proto files in the directory above and could not help nodding in satisfaction. But when we clicked into the parking class's .cpp file, we could not help complaining at what we saw: the generated code cannot run as-is!

// Car arrives
int ParkingSpace::ArriveCar() {
  //// MDD-TAG-BEGIN:[flow][slot-ArriveCar][函数实现]
  int ret = 0;
  // Get the attendant on duty
  Scheduling scheduling(/* fill in the parameters */);
  ret = scheduling.GetShiftPersonnel(number);
  if (ret != 0) {
    LOG_VERR("--->>error event name<<---", ret, "GetShiftPersonnel_ERR");
    return ret;
  }
  return ret;
  //// MDD-TAG-END:[flow][slot-ArriveCar]
}

Indeed, up to this point most of the method implementations have not been generated. Let's analyze why, using the pseudocode above as an example: what information would a person need to add before this code could be generated?

  • The Scheduling entity may have more than one constructor — which one should be called?
  • Does a return value of 0 from the Scheduling entity's "get the attendant on duty" method mean success?
  • Where does the value of the parameter "number" come from — a member variable of the parking class, or a global variable?
  • Which variables should be printed in the error log?

Just a few lines of pseudocode require this much supplementary information. If a person has to configure all of it, how is that different from writing the code directly? With an IDE's friendly prompts, writing the code directly would be even more efficient than configuring the tool to generate it!

Fortunately, no manual configuration is needed: as long as we feed ChatGPT the header-file definitions required to turn the pseudocode into business code, it can derive the implementation and generate the code automatically. That is why we "have ChatGPT generate code in two passes": the first pass generates a DSL so that the code generation tool produces a quality-assured code framework, header definitions, and so on; the second pass continues refining the business code on top of what has already been generated.

The idea is now clear. To turn it into a concrete solution, the code generation tool needs a good design — good meaning "modular, loosely coupled, extensible". The following sections introduce the tool's design and implementation.

04

Design and implementation

4.1 Architecture Layering

[Figure: architecture layering of a low-code platform (image source: Internet)]

In the figure above, the part highlighted in red states that "the model and engine together determine how completely and how extensibly the application can be implemented". This article is not about domain modeling, so modeling knowledge is not discussed here; we focus on the design of the engine. The system's architecture layering is as follows:

[Figure: the system's architecture layering]

Protocol stack: defines the formatted structure of the code generation engine's input. Ideally this structure could serve as a common specification for various low-code platforms; solving low-code platform interoperability this way would be difficult but correct, and would benefit the whole industry.

Code generation engine: the implementation of the protocol stack. It defines the code generation templates, processes the input data, fills it into different templates as required, and generates C++, TS, and other code.

Engine plugins: extensions built on the engine, for example to make it more flexible or support more protocols. Our integration with ChatGPT is exactly this: we encapsulate ChatGPT's capabilities as plugins that assist with parts of the work and improve the code generation tool's ease of use.

Code generation tool: the user-facing product that packages the engine and the plugins.

4.2 Plugins

With the architecture layered, the scheme for integrating ChatGPT follows naturally: decompose the model-to-code mapping process into tasks, identify the tasks that people currently do but that ChatGPT is already capable of, and wrap those capabilities as plugins that assist people with the corresponding tasks and improve efficiency. Below we explain the thinking behind several plugins.

4.2.1 Code Generation Plugins

To have ChatGPT generate the code, the prompt needs to contain three kinds of information: what's there, what to use, and what to do.

What's there: as explained above, we have ChatGPT generate the kind of code snippet whose quality is relatively stable, i.e. the implementation of a single function. "What's there" means the function's input parameters, the class's member variables, and the global variables. The input parameters and member variables can be obtained by parsing the domain model, while the global variables are defined in the code generation template as fixed values (usually global configuration).

What to use: which entity-class methods, boundary-class methods, and base libraries are needed to implement the method. Section 3.4.2 shows that once the DSL has generated code, header files for the entity-class methods already exist; boundary-class methods are encapsulations of external system interfaces and are managed uniformly, so they are also easy to obtain.

Base libraries divide into standard libraries and third-party libraries. Standard libraries need not be supplied — ChatGPT can infer and use the correct methods. Third-party libraries are too numerous, so we do not handle them for now; our goal is not to generate 100% of the code, and whatever cannot be generated is handed to R&D to fill in (remember that R&D also has sharp tools like Copilot).

What to do: the code logic to be completed, i.e. the pseudocode shown in 3.4.3; not repeated here.
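The three parts can then be stitched together mechanically. A minimal sketch, assuming a plain-text prompt layout (the section headers and wording are illustrative, not the plugin's actual format):

```cpp
#include <string>
#include <vector>

// Stitch "what's there", "what to use", and "what to do" into one prompt.
std::string BuildPrompt(const std::vector<std::string>& variables,     // what's there
                        const std::vector<std::string>& declarations,  // what to use
                        const std::string& pseudocode) {               // what to do
    std::string prompt = "Available variables:\n";
    for (const std::string& v : variables) prompt += "  " + v + "\n";
    prompt += "Available declarations:\n";
    for (const std::string& d : declarations) prompt += "  " + d + "\n";
    prompt += "Implement the following pseudocode as C++:\n";
    prompt += pseudocode + "\n";
    return prompt;
}
```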

The plugin stitches the above information into a prompt, and the rest is left to ChatGPT. For example, the "泊位::来车" (ParkingSpace::ArriveCar) method above is finally completed by ChatGPT as:

// Car arrives
int ParkingSpace::ArriveCar() {
  //// MDD-TAG-BEGIN:[flow][slot-ArriveCar][函数实现]
  int ret = 0;
  // Get the attendant on duty
  Scheduling scheduling(time(nullptr));
  ret = scheduling.GetAttendant(number_);
  if (ret != 0) {
    LOG_VERR("GetAttendant", ret, number_);
    return ret;
  }
  return ret;
  //// MDD-TAG-END:[flow][slot-ArriveCar]
}

4.2.2 Translation Plugins

As shown in Chapter 2, we have already implemented this plugin. Most modeling tools support filling in English names, but the field is optional and most people are not used to modeling in English. So we extract the Chinese terms in the model that have no associated English, stitch them into a prompt, call ChatGPT to translate them all in one batch, and fill the results back with one click.
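A minimal sketch of that batch prompt (the wording is illustrative; the actual plugin also passes each term's usage and target naming style):

```cpp
#include <string>
#include <vector>

// Combine every untranslated Chinese term into a single request so that
// one ChatGPT call covers the whole model.
std::string BuildTranslationPrompt(const std::vector<std::string>& terms,
                                   const std::string& style) {
    std::string prompt =
        "Translate these Chinese identifiers into English, style: " + style + "\n";
    for (const std::string& t : terms) prompt += "- " + t + "\n";
    return prompt;
}
```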

Of course, we need more than these two plugins; we also brainstormed a series of efficiency plugins, such as a unit-test generation plugin and a SQL generation plugin.

4.3 R&D adjustments

We estimate that, following this approach, the code generation tool can save more than 90% of the work. The remaining roughly 10% is as follows:

Code walkthrough: as mentioned above, the code ChatGPT generates is only relatively stable, so generated code is commented out by default with /* and */; only after R&D confirms the code is correct and adjusts it can the comment markers be removed and the code put into use.
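The default commenting-out can be done by the generator itself. A minimal sketch (the marker text is illustrative, and a production version would have to handle snippets that already contain "*/"):

```cpp
#include <string>

// Wrap a ChatGPT-generated snippet in a block comment so it stays inert
// until a reviewer confirms it. Assumes the snippet contains no "*/".
std::string CommentOutForReview(const std::string& snippet) {
    return "/* TODO(review): ChatGPT-generated, verify before enabling\n"
           + snippet + "\n*/";
}
```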

Code improvement: ChatGPT cannot generate 100% of the code. For example, an intermediate result produced by a previous interface may be cached and used by the current interface; considering ChatGPT's performance, the complexity of the plugin implementation, OpenAI's API pricing, and other factors, we cannot stitch the previous interface's information into the prompt. For this part of the logic, the cost of writing the code directly is much lower than the cost of using ChatGPT.

Unit tests: the quality of the unit tests ChatGPT writes far exceeded my expectations — it considers boundary conditions and other factors. Test cases generated for base libraries can be used almost without modification, but test cases for control classes and other business code cannot. For example, the js_code in the interface above is generated by the mini-program; ChatGPT obviously cannot construct such a parameter. Whether to mock it or obtain it through a testability interface must be decided and adjusted by R&D according to the actual situation.

In any case, given the complexity of the business itself, we cannot hope for a single tool to generate all the code; there will always be code the tool cannot complete, and people still need to participate.

However, this part of the work can be assisted by tools such as Copilot and IDE plugins.

05

Summary

One more note: since using ChatGPT directly carries a risk of leaking sensitive information, each team can deploy its own AI model as needed. For individual developers, there are open-source models available, such as Vicuna and LLaMA.

So far, AI tools have emerged primarily to assist humans, not to replace them. Learning and using these tools may help speed up development. Developers are welcome to discuss in the comments.

Author: Wu Junjie

Source: WeChat public account: Tencent Cloud Developer

Source: https://mp.weixin.qq.com/s/3I3PHTkZ-bafYfhm7KrxTA
