
One intelligent assistant for the entire software development process: from design to operations, hand it all to AI

Contributed by the Ant Codefuse team

QbitAI | WeChat official account QbitAI

From design, to coding, to testing, deployment, and even operations: the whole software development process can be handed over to AI!

It is an end-to-end AI assistant covering the entire software development life cycle, turning scattered development and operations work into an integrated, intelligent workflow.

The assistant is purpose-built for the development field, avoiding the common pitfalls of general-purpose large models: unreliable answers, out-of-date information, and poor coverage of domain tasks.


The assistant, DevOps-ChatBot, was developed by the Ant Codefuse project team. Installation is quick and simple, and it can also be deployed with one click via Docker.

For the specific features of DevOps-ChatBot and how it performs, read on for the authors' write-up.

Fixing the shortcomings of general-purpose large models

With the emergence of general-purpose models such as ChatGPT, and of various vertical-domain models, product interaction and the way users obtain information are gradually changing across many fields.

However, DevOps places high demands on factual accuracy, information timeliness, problem complexity, and data security.

The Codefuse team therefore initiated and open-sourced DevOps-ChatBot, an end-to-end AI assistant designed for the full software development life cycle:

  • A DevOps vertical knowledge base, knowledge-graph enhancement, a SandBox execution environment, and related techniques ensure the accuracy and timeliness of generated content, and let users interactively modify, compile, and execute code to verify the reliability of answers;
  • Static analysis plus retrieval-augmented generation (RAG) lets the large model perceive context, understand components at the codebase level, and modify and generate code files at the repository level, rather than only completing code at the function-fragment level;
  • An improved link-level multi-agent scheduling design, working together with the knowledge base, code base, tool library, and sandbox environment, enables the large model to carry out complex, multi-step tasks in the DevOps domain;
  • DevOps domain-specific models and evaluation datasets, together with support for private deployment, ensure data security and high availability on specific tasks.
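The SandBox idea in the first point can be illustrated with a minimal sketch: code proposed by the model runs in a separate interpreter process with a timeout, so its output can be checked before being trusted. This is process isolation only (a real sandbox would add filesystem and network restrictions), and `run_in_sandbox` is an illustrative name, not the project's API.

```python
import subprocess
import sys
import tempfile


def run_in_sandbox(code: str, timeout: float = 5.0) -> dict:
    """Run a code snippet in a separate interpreter process with a timeout.

    Sketch only: isolates the process and caps its runtime, nothing more.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout,
        )
        return {"ok": proc.returncode == 0,
                "stdout": proc.stdout, "stderr": proc.stderr}
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timeout"}
```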

Through this project, the Codefuse team hopes to gradually change established DevOps habits: from the traditional mode of querying data and operating separate, scattered platforms to an intelligent, large-model Q&A mode of DevOps, so that "there are no difficult coders in the world".

Five core modules

A simplified diagram of the overall DevOps-ChatBot architecture is shown below:


Specifically, it consists of the following 9 functional modules:

  • Multi Source Web Crawl: a web crawler that fetches relevant information from specified URLs
  • Data Process: the data-processing module, providing document loading, data cleaning, and text segmentation to process and integrate documents in multi-source formats
  • Text Embedding Index: the core of document analysis; uploaded documents become retrievable
  • Vector Database & Graph Database: vector and graph databases for data management
  • Multi-Agent Schedule Core: the multi-agent scheduling core; the desired interactive agents can be built with simple configuration
  • Prompt Control: the prompt control and management module, which defines the agents' context management
  • SandBox: a sandbox module providing an environment for code compilation and action execution
  • LLM: the brain of the agents, supporting a range of open-source models and LLM interfaces
  • API Management: API-management components for quick compatibility with related open-source components and O&M platforms
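As a toy illustration of the Text Embedding Index idea, the sketch below stands in for a real embedding model and vector database with bag-of-words vectors and cosine similarity; all names are illustrative, not the project's API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts. A real system would use
    # a text-embedding model and persist vectors in a vector database.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and return the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```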

Beyond assembling and coordinating the functional modules above, the DevOps-ChatBot project also offers the following core differentiating technologies and features:

  • Intelligent scheduling core: a scheduling core with complete system links, with multiple modes configurable in one click
  • Whole-codebase analysis: repository-level code understanding, plus project-file-level code writing and generation
  • Document analysis enhancement: a document knowledge base enhanced with retrieval and reasoning over a knowledge graph
  • Vertical exclusive knowledge: a dedicated DevOps knowledge base, plus one-click self-service construction of vertical knowledge bases
  • Vertical model compatibility: compatible with small DevOps-domain models and surrounding DevOps platforms

Intelligent scheduling core

When dealing with complex problems, an agent can select and invoke tools and act on their feedback through a ReAct process, enabling multiple rounds of tool use and multi-step execution.
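A minimal sketch of such a ReAct loop, with a stub policy standing in for the LLM; the tool set and policy here are invented for illustration only.

```python
# Minimal ReAct-style loop. The "LLM" is a stub policy that, at each
# step, either calls a tool or finishes with the last observation.

TOOLS = {
    # Toy calculator tool; eval with empty builtins, for illustration only.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}


def stub_policy(question: str, observations: list) -> dict:
    # Thought: with no observations yet, try the calculator; otherwise stop.
    if not observations:
        return {"action": "calculator", "input": question}
    return {"action": "finish", "input": observations[-1]}


def react(question: str, max_steps: int = 5) -> str:
    observations = []
    for _ in range(max_steps):
        step = stub_policy(question, observations)   # Thought -> Action
        if step["action"] == "finish":
            return step["input"]
        tool = TOOLS[step["action"]]
        observations.append(tool(step["input"]))     # Observation
    return observations[-1] if observations else ""
```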

However, for more complex scenarios, such as developing complex code, a single LLM agent is not up to the task.

The research team therefore set out to build a scalable, easy-to-use multi-agent framework that can be configured with little effort to assist with general tasks such as daily office work, data analysis, and development and operations.

The project's multi-agent framework draws on strong designs from several existing frameworks, such as the message pool in MetaGPT and the agent selector in AutoGen.


The core elements of the multi-agent framework in DevOps-ChatBot include the following six aspects:

  • Agent Communication: effective information exchange between agents, essential for context management and Q&A efficiency. Two communication modes are provided: a concise, intuitive chain-of-dialogue mode, and a message-pool framework borrowed from MetaGPT;
  • Standard Operating Procedure (SOP): defines each agent's input and output ranges and SOP identifiers such as Tool, Planning, Coding, Answering, and Finished, to standardize and post-process the LLM's generated results;
  • Plan and Executor: extends the large model's tool use, agent scheduling, and code generation;
  • Long- and short-term memory management: to mimic how human teams collaborate, a dedicated content-summarization agent (similar to a meeting assistant) condenses long-term memory and extracts the most useful information to pass along;
  • Human-agent interaction: in complex scenarios, humans can step into the agent interaction loop and give feedback, so the large model grasps human intent accurately and completes tasks more effectively;
  • Prompt Control and Management: coordinates and manages prompt interaction between agents, keeping system complexity under control and improving interaction efficiency. Inputs and outputs use a Markdown structure for clear, standardized results that are easy to read and parse.
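The MetaGPT-style message-pool mode mentioned under Agent Communication can be sketched roughly as follows: agents publish tagged messages, and each agent pulls only the tags it subscribes to. The class names are illustrative, not the project's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    content: str
    tags: set = field(default_factory=set)


class MessagePool:
    """Message-pool sketch: instead of forwarding the full dialogue chain,
    each agent pulls only messages whose tags it subscribes to."""

    def __init__(self):
        self._messages = []

    def publish(self, msg: Message) -> None:
        self._messages.append(msg)

    def pull(self, subscriptions: set) -> list:
        return [m for m in self._messages if m.tags & subscriptions]
```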

In actual use, users can combine multiple agents to run a complete, complex project-launch scenario (Dev Phase): for example a requirements chain (CEO), a product-review chain (CPO, CFO, CTO), an engineering-group chain (selector, developers 1~N), and a deployment chain (developer, deployer).

Analyzing the entire codebase

Today, large models are mainly applied to code generation, repair, and component-understanding tasks, where they face the following challenges:

  • Code training data lags behind: information in frequently updated open-source/private repositories is not captured in time.
  • Large models are not aware of code context and codebase dependency structures.

The research team catalogued the main problems encountered during development; as the figure below shows, understanding the existing codebase and its dependency packages, code retrieval, and metadata queries consumed the most time:


To solve these problems, the team uses program analysis to extract the logical structure of the code and stores it in a knowledge graph, then gathers the necessary context through iterative RAG queries; combined with multi-agent role-playing, this organically unites the large model with the codebase.

The overall framework of this section is as follows:

  • Code structure analysis: the raw code is cleaned and deduplicated, keeping the valuable parts. Static analysis then mines the dependency graph between code units from the codebase, and the large model's comprehension ability is used to interpret the code, an important supplement to the generated structured information graph.
  • Code search generation: three retrieval modes are provided. Cypher retrieval-generation mainly serves questions about the structure of the codebase (for example, how many classes there are), while graph retrieval mainly serves questions that mention specific class and method names.
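The static-analysis step can be illustrated with a small sketch that mines import edges from Python source using the standard `ast` module; a fuller pipeline would also extract class and call relationships before writing the graph to a graph database. The function name is illustrative.

```python
import ast


def import_edges(module_name: str, source: str) -> list:
    """Extract (module, imported_module) dependency edges from one
    file's source. Imports alone already yield a repository-level
    dependency graph suitable for storage in a graph database."""
    tree = ast.parse(source)
    edges = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            edges.extend((module_name, alias.name) for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            edges.append((module_name, node.module))
    return edges
```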

At the same time, the team is exploring a multi-agent mode in which one agent iteratively searches the code repository for context while other agents handle tasks such as extracting and summarizing information in stages and generating the final result.

Document analysis enhancements

When large models answer questions in professional fields (such as medicine or telecommunications) or over private knowledge (private-domain data), they are prone to hallucination and their answers cannot be trusted.

The most direct solution is to train on data from the specific/private domain to extend the model's knowledge, but training large models is hugely expensive.

The research team therefore chose a pluggable knowledge base with retrieval-augmented generation: data relevant to the question is retrieved from the knowledge base and fed to the large model as additional knowledge, ensuring reliable, up-to-date results while avoiding training costs.
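At its core, the retrieval-augmented flow amounts to assembling retrieved passages into the prompt; a minimal, generic sketch (not the project's actual prompt template) might look like:

```python
def build_rag_prompt(question: str, retrieved: list) -> str:
    """Assemble a retrieval-augmented prompt: retrieved passages are
    injected as context so the model answers from the knowledge base
    rather than from (possibly stale) parametric memory."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(retrieved))
    return (
        "Answer using only the context below; say 'unknown' if it is "
        "not covered.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )
```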

How to retrieve more accurately is the core problem this module must solve, so the research team proposes the following architecture:


DocSearch as a whole contains three retrieval paths; users can choose any single path, or all three, to obtain different results.

  • Traditional document vector-database query: a document vector database is the most mainstream way to build a knowledge base. Documents are vectorized with a text-embedding model and stored in the vector database; combined with in-context learning, the project can apply different retrieval strategies to extract the corresponding knowledge from the knowledge base.
  • Knowledge-graph query: the project uses the Nebula graph database to store and manage the knowledge graph. It supports importing an existing knowledge graph for retrieval, and can also automatically extract entities and relationships with large models to mine the varied, complex relationships in the data.
  • Knowledge-graph inference + vector-database query: the project also provides a fusion of the two. Tags are first extracted for each document, related tags are expanded in the graph based on the user's question, and finally documents related to the original question are retrieved from the document vector database using the resulting tag set.
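The third, fused path can be approximated as: expand the question's tags one hop through the knowledge graph, then rank documents by overlap with the expanded tag set. The sketch below abstracts away the vector-database step, and the data structures are invented for illustration.

```python
def tag_fusion_search(question_tags: set, graph: dict, doc_tags: dict) -> list:
    """Fusion-search sketch.

    graph:    tag -> set of related tags (one-hop knowledge graph)
    doc_tags: doc id -> set of tags extracted from that document
    Returns doc ids ranked by overlap with the expanded tag set.
    """
    expanded = set(question_tags)
    for tag in question_tags:
        expanded |= graph.get(tag, set())       # one-hop graph expansion
    scored = [(len(tags & expanded), doc) for doc, tags in doc_tags.items()]
    return [doc for score, doc in sorted(scored, reverse=True) if score > 0]
```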

Knowledge base building and DevOps knowledge base

As noted above, plugging in a knowledge base with retrieval-augmented generation handles proprietary/private-domain Q&A well; the next core problem is how to build a better knowledge base.

Building a knowledge base typically raises the following issues:

  • Different data sources are formatted inconsistently and of varying quality
  • How to automatically identify and weed out errors, duplicates, or irrelevant data
  • Building a knowledge base relies on expertise
  • The knowledge base needs to be updated regularly to keep the information accurate and up-to-date

Based on this, the research team proposed the following overall architecture:

  • Crawler: collects data and keeps it up to date.
  • Document Loader: imports multi-source heterogeneous data, flexibly meeting diverse data requirements.
  • Filter Func: filters and cleans data to ensure the accuracy and efficiency of subsequent analysis.
  • TextAnalyzer: performs intelligent analysis of the data, turning complex text into structured (including knowledge-graph) and easy-to-understand information.
  • Pipeline: connects the whole process in series, achieving end-to-end automation from data input to cleaned output.
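Under the assumption that each stage is a plain function, the Crawler → Document Loader → Filter Func → TextAnalyzer pipeline can be sketched with stubbed stages (all names and stage behavior are illustrative):

```python
def crawler() -> list:
    # Stand-in for the web-crawler stage: raw, messy documents.
    return ["  Deploy with Docker  ", "Deploy with Docker", ""]


def document_loader(raw: list) -> list:
    # Stand-in for multi-source import; here just drops missing entries.
    return [doc for doc in raw if doc is not None]


def filter_func(docs: list) -> list:
    # Clean whitespace, drop empties and duplicates.
    seen, out = set(), []
    for doc in (d.strip() for d in docs):
        if doc and doc not in seen:
            seen.add(doc)
            out.append(doc)
    return out


def text_analyzer(docs: list) -> list:
    # Turn cleaned text into minimal structured records.
    return [{"text": d, "tokens": d.lower().split()} for d in docs]


def pipeline() -> list:
    # End-to-end: crawl -> load -> filter -> analyze.
    return text_analyzer(filter_func(document_loader(crawler())))
```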

Going forward, the research team will focus on collecting and building data for the DevOps field, and hopes this standardized data-acquisition, cleaning, and intelligent-processing pipeline will help others build more private knowledge bases.

Platform & model compatibility

With the advent of large language models (LLMs), we are witnessing a change in how problems are solved, for example the shift from intelligent customer-service systems built on small-model fine-tuning and fixed rules to more flexible agent interactions.

The research team aims for compatibility with surrounding open-source DevOps platforms, using API registration, management, and execution to drive specific tasks (data queries, container operations, and so on) through conversational interaction.


To make the project quickly compatible with relevant open-source components and O&M platforms, a tool can be connected by registering a subclass of the baseToolModel class in the Python template and filling in attributes and methods such as Tool_name, Tool_description, ToolInputArgs, ToolOutputArgs, and run.

  • You can start a private inference service with FastChat, or use other RESTful APIs such as Qwen 2.0 and Wenxin Yiyan, and register them as LLMs for scheduling
  • You can also register the APIs of Ant Group's related open-source projects and O&M platforms, completing O&M operations through a simple LLM conversation

At present, the encapsulated tools include: K-sigma anomaly detection, code retrieval, document retrieval, DuckDuckGo search, Baidu OCR recognition, stock-information query, weather query, and time-zone query.
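A rough sketch of the registration pattern described above, under the assumption that tools subclass a base template and are collected in a registry; the base class, registry, and example tool here are illustrative, not the project's actual code.

```python
class BaseToolModel:
    """Sketch of the tool template described in the text; the project's
    real base class and argument models may differ."""
    Tool_name = ""
    Tool_description = ""

    @staticmethod
    def run(**kwargs):
        raise NotImplementedError


TOOL_REGISTRY = {}


def register_tool(cls):
    # Decorator: make the tool discoverable by name for LLM scheduling.
    TOOL_REGISTRY[cls.Tool_name] = cls
    return cls


@register_tool
class TimezoneQuery(BaseToolModel):
    Tool_name = "timezone_query"
    Tool_description = "Return the UTC offset for a known city."

    @staticmethod
    def run(city: str) -> str:
        offsets = {"Beijing": "+08:00", "London": "+00:00"}  # toy data
        return offsets.get(city, "unknown")
```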

Future outlook

The DevOps-ChatBot framework is still in its infancy and has many rough edges; the research team plans core evolution in the following areas:

  • Multi-agent scheduling core: Automatically build agent links
  • Document analysis enhancements: Provides a variety of correction methods and knowledge graph retrieval methods
  • Whole code database analysis: refines the code parsing and extraction function to enrich the code graph schema
  • Knowledge base building: Build knowledge base data for different vertical fields
  • Platform & Model Compatible: Connected with APIs of related open source projects and O&M platforms

Feature display

Driven by these five core modules, DevOps-ChatBot has the following functions:

The first is text knowledge base management:

  • Text loading, text vectorization services, and vector retrieval services for knowledge bases
  • Provides functions such as creating, managing, and downloading multiple knowledge bases
  • Crawlers are supported for real-time URL content crawling

In addition to the text knowledge base, DevOps-ChatBot also supports the upload and management of knowledge graphs and code knowledge base files.


In addition, the R&D team has packaged several agent scenarios, such as chatPhase, docChatPhase, searchChatPhase, and codeChatPhase, supporting features such as knowledge-base Q&A, code Q&A, tool invocation, and code execution.


In addition to being used in DevOps, DevOps-ChatBot is also applicable in other fields!

Under multi-agent scheduling, DevOps-ChatBot can be extended into many interesting use cases.

The following use cases can be assembled from the modules in this project:

Code Interpreter

Upload a data file and DevOps-ChatBot will automatically analyze it:
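A toy version of what such a code-interpreter flow might generate for an uploaded CSV, computing summary statistics for its numeric columns; this is illustrative only, not the assistant's actual generated code.

```python
import csv
import io
import statistics


def analyze_csv(text: str) -> dict:
    """Given CSV text, return per-numeric-column summary statistics,
    skipping columns that cannot be parsed as numbers."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for col in rows[0]:
        try:
            values = [float(r[col]) for r in rows]
        except ValueError:
            continue  # non-numeric column, e.g. hostnames
        summary[col] = {"mean": statistics.mean(values),
                        "max": max(values)}
    return summary
```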


Tool use

For example, querying a server's basic time-series metrics, feeding them to a monitoring tool, and analyzing the result:


Smart Stock Analysis (Tool + Code Interpreter)

With a simple natural language query, users can get detailed information about a specific stock, including historical stock price charts, market performance, and likely market movements.


Generate test cases

DevOps-ChatBot can generate test cases for a given method in the codebase.


Player Savers (Knowledge Base Q&A)

Beyond these use cases, DevOps-ChatBot can also answer questions about specific online games, including hero information, release dates, city-states, and so on.

For example, a knowledge graph of hero relationships in League of Legends:


One More Thing

The Codefuse team has released DevOpsGPT, an open-source project on large models for the DevOps field. It consists of three modules, of which DevOps-ChatBot, covered in this article, is one.

The other two modules, DevOps-Model and DevOps-Eval, are dedicated respectively to a DevOps-domain model and to DevOps evaluation.

The team's goal is to truly combine large models in the DevOps field to improve efficiency and save costs, including development, testing, O&M, monitoring and other scenarios.

The team expects relevant practitioners to contribute their talents to make "no difficult coder in the world", and will also regularly share their experience & attempts in the field of LLM4DevOps.

Welcome to use & Discuss & Co-build

(1) ChatBot - Out-of-the-box DevOps intelligent assistant: https://github.com/codefuse-ai/codefuse-chatbot

(2) Eval - DevOps LLM Industry Standard Evaluation: https://github.com/codefuse-ai/codefuse-devops-eval

(3) Model - DevOps domain-specific large model: https://github.com/codefuse-ai/CodeFuse-DevOps-Model

— END —
