
FedJudge: A Federated Legal Large Language Model

Author: Translation Technology Qianwen

“FEDJUDGE: FEDERATED LEGAL LARGE LANGUAGE MODEL”

There are already many excellent legal LLMs, such as Lawyer LLaMA and ChatLaw. However, although legal LLMs have achieved remarkable results under centralized data training, few works have explored their application in federated learning scenarios.

In the legal field, federated learning brings a range of potential benefits and opportunities for the application of Legal LLM. First, the privacy of legal data is a crucial issue. A large amount of legal data is distributed in organizations such as courts, prosecutors' offices, consulting companies, and legal education and training institutions, and this data contains sensitive information about individuals. By adopting federated learning, Legal LLM can be trained on local devices, and parameters can be aggregated and distributed on a central server, avoiding the sharing of raw data and effectively protecting user privacy.

In addition, there is a scarcity of data in the legal field. Legal data in specific areas may be very limited, such as case data for specific rare cases or legal practice data in specific regions. In traditional centralized learning, this data may not be fully utilized. With federated learning, model training can be performed on local devices and scattered data resources can be used to improve the performance and generalization ability of the model.

This paper proposes FedJudge, a Federated Legal Large Language Model.


Paper address: https://arxiv.org/pdf/2309.08173.pdf

Github address: https://github.com/yuelinan/FedJudge

Abstract

This paper proposes a federated learning framework called FedJudge for efficient and effective fine-tuning of legal large language models. The framework uses a parameter-efficient fine-tuning method that updates only a small number of additional parameters during federated training. In addition, a continual learning method is explored to preserve the important parameters of the global model when training local clients, mitigating the data distribution shift issue. Extensive experimental results on three real-world datasets verify the effectiveness of FedJudge.

Introduction

Large language models are widely used in the field of legal intelligence. By fine-tuning them on legal data, a variety of legal LLMs can be built, such as Lawyer LLaMA and ChatLaw, which help legal professionals improve their work efficiency and provide legal consulting services to ordinary people.

While legal LLMs achieve good results, they still raise data privacy concerns because the training data is centralized. In practice, large amounts of legal data are distributed across different institutions, and direct data sharing may not be feasible, so the data-centralized training paradigm for legal LLMs is often unavailable.

This article describes how to fine-tune LLMs for the legal field under the federated learning (FL) framework. With FL, legal LLMs can be fine-tuned on local devices or clients, such as courts and legal consulting firms. The legal LLMs are then updated by aggregating and distributing parameters, avoiding the sharing of raw data and effectively protecting data privacy.

However, effectively applying FL to the fine-tuning of legal LLMs remains an open question, for two main reasons.

First, LLMs have many parameters, and fully fine-tuning them is very resource-intensive, which brings huge computational overhead and rules out clients with limited computing power. In addition, aggregating and distributing full LLM parameters with traditional FL algorithms such as FedAvg increases the communication overhead of the FL system. Both issues make the standard FL workflow impractical and reduce the efficiency of LLM fine-tuning.

Second, shifts in the distribution of legal data affect the fine-tuning of LLMs: data differences across clients lead to poor aggregation performance during training, reducing the effectiveness of the FL method. For example, text data on court clients is usually written in a professional legal style, while data on consulting clients tends to be described colloquially.


This paper proposes FedJudge, the first federated legal large language model framework, which adopts parameter-efficient fine-tuning methods to fine-tune LLMs effectively in a federated learning environment. First, the LoRA method is used to train each local client separately; the trained parameters are then uploaded to a central server for aggregation; finally, the aggregated global parameters are distributed back to each client, achieving efficient fine-tuning of the federated legal LLM.
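For intuition, the following is a minimal sketch of one such communication round, assuming plain dictionaries of tensors for the LoRA weights and weighting each client by its data size; the `client` object and its `load_lora`, `train_lora_locally`, and `dataset` attributes are hypothetical placeholders, not part of the released code.

```python
def aggregate_lora(client_states, client_sizes):
    """Weighted average of the clients' LoRA parameters (FedAvg-style, weighted by data size)."""
    total = sum(client_sizes)
    return {
        name: sum((size / total) * state[name]
                  for state, size in zip(client_states, client_sizes))
        for name in client_states[0]
    }


def federated_round(clients, global_lora):
    """One FedJudge-Base communication round: local training, upload, aggregate, distribute."""
    states, sizes = [], []
    for client in clients:
        client.load_lora(global_lora)               # start from the distributed global parameters
        states.append(client.train_lora_locally())  # updates only the LoRA parameters (Eq. (1))
        sizes.append(len(client.dataset))
    return aggregate_lora(states, sizes)            # distributed back to every client next round
```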

To address the data distribution shift problem, this paper further proposes a continual learning method based on the global and local models. By constraining the parameters of the local model so that it does not forget the important parameters of the global model, the weight difference between the local and global models is reduced, the knowledge of the global model is preserved, and the data shift problem is mitigated.

The main contributions of this paper are as follows:

  • FedJudge is the first federated legal LLM framework; it considers both the performance degradation caused by the computation and communication overhead of LLM fine-tuning and the heterogeneity of legal data.
  • A parameter-efficient LoRA fine-tuning method is adopted, and a continual learning method is introduced to prevent the important knowledge of the global model from being forgotten during local training.
  • Extensive experiments on court view generation, legal reasoning, and legal consultation tasks verify the effectiveness of FedJudge.

FedJudge: A Federated Legal Large Language Model

Problem definition

In federated learning, there are N clients, each with its own local private data. The goal is to use instruction tuning on the different clients to efficiently fine-tune the parameters of an underlying generative LLM for the legal field, where different clients have different data distributions. The target loss for autoregressive training on client i is:

L_i = − Σ_m log P(y_m | x, y_<m; W_p^i, W_e^i)        (1)

where x and y represent the instruction input and instruction output respectively, y_m denotes the m-th token of y, y_<m denotes the tokens before y_m, W_p^i denotes the frozen LLM parameters, and W_e^i the trainable parameters.
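To make the objective concrete, the sketch below computes this loss with a Hugging Face-style causal LM by masking the instruction-input positions in the labels, so that only the output tokens y_m contribute; the model and tokenizer interfaces are assumptions, not the authors' training code.

```python
import torch

def instruction_tuning_loss(model, tokenizer, instruction_input, instruction_output):
    """Causal LM loss computed only over the output tokens y, with the input x masked out."""
    prompt_ids = tokenizer(instruction_input, add_special_tokens=False)["input_ids"]
    output_ids = tokenizer(instruction_output, add_special_tokens=False)["input_ids"]
    input_ids = torch.tensor([prompt_ids + output_ids])
    # -100 is ignored by the cross-entropy loss, so only the y_m positions contribute
    labels = torch.tensor([[-100] * len(prompt_ids) + output_ids])
    return model(input_ids=input_ids, labels=labels).loss
```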

In federated learning, the parameters of each client are uploaded to a central server, and an aggregation function f_agg(·) aggregates all client parameters into the global parameters Ŵ_e. Finally, Ŵ_e is distributed to each client to complete one round of communication updates and FedJudge training.

FedJudge architecture

FedJudge is a federated instruction-tuning framework for legal LLMs. To reduce computation and communication overhead, a parameter-efficient fine-tuning method is introduced (FedJudge-Base). On top of this, FedJudge-CL is designed to address the data drift problem in FL training.

Parameter-efficient fine-tuning in FedJudge

This section describes how LLMs are fine-tuned for the legal field with parameter-efficient methods under the federated learning framework. Each local client C_i is first trained with the LoRA method: the LLM parameters are frozen, and trainable low-rank decomposition matrices are injected into each layer of the LLM's Transformer architecture. The learning objective for each local client is given in Eq. (1), and only W_e^i is updated during local training.
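As an illustration of such a client-side setup, the sketch below uses the Hugging Face peft library with Baichuan-7B as the frozen backbone; the LoRA hyperparameters other than the rank, and the `target_modules` name for Baichuan's attention projection, are assumptions to be checked against the actual model and released code.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-7B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-7B", trust_remote_code=True)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=4,                        # low-rank dimension (rank 4 is reported in the implementation details)
    lora_alpha=16,              # scaling factor; assumed value
    lora_dropout=0.05,          # assumed value
    target_modules=["W_pack"],  # Baichuan-7B's packed Q/K/V projection; assumption, check the model code
)
model = get_peft_model(base, lora_config)  # base weights frozen, only the LoRA matrices are trainable
model.print_trainable_parameters()
```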

After the local updates, the LoRA parameters of all clients are uploaded to the central server, and parameter aggregation is performed using a weighted average as the aggregation function f_agg(·):

Ŵ_e = f_agg(W_e^1, …, W_e^N) = Σ_{i=1}^{N} (n_i / n) · W_e^i        (2)

where n_i is the number of training samples on client i and n = Σ_i n_i.

For clarity, we write W_e^i as W_e^i(t) (t ≥ 1) in the t-th communication round, and the loss L_i in Eq. (1) as L_i(t). At the end of round t, the aggregated parameters Ŵ_e(t) are distributed to each client, and the local parameters W_e^i(t) are replaced with the global parameters Ŵ_e(t).
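Since only the small LoRA state dict needs to travel between client and server, the distribute-and-replace step can be sketched with peft's state-dict helpers as below; this is an illustrative sketch, not the repository's implementation.

```python
from peft import get_peft_model_state_dict, set_peft_model_state_dict

def upload_lora(model):
    """Extract only the trainable LoRA parameters W_e^i(t) for upload to the server."""
    return {k: v.detach().cpu() for k, v in get_peft_model_state_dict(model).items()}

def receive_global_lora(model, global_state):
    """Replace the local LoRA parameters with the aggregated global parameters Ŵ_e(t)."""
    set_peft_model_state_dict(model, global_state)
```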

Continual learning in FedJudge


Although FedJudge-Base trains FedJudge efficiently, it still suffers from the data distribution shift problem, which reduces its effectiveness. Therefore, FedJudge-Base is extended to FedJudge-CL, which leverages a continual learning approach to mitigate the data shift issue.

On each client, local training continues from the distributed parameters Ŵ_e. The important parameters in Ŵ_e are then constrained to change as little as possible during training, which ensures that the local model does not forget what has been learned globally.

To achieve this, a continual learning constraint is imposed, where Ŵ_e(t−1) denotes the global parameters obtained and distributed in communication round (t−1), and W_e^i(t) denotes the locally trainable parameters in the current round t. The variation of the parameters is assessed with the following equation:

L_CL^i(t) = Σ_k Ω_k · (Ŵ_{e,k}(t−1) − W_{e,k}^i(t))²        (3)

where Ω_k measures the importance of the k-th parameter of the global model.

The parameter importance is estimated with a Jacobian (gradient)-based approach. Ultimately, the goal of each client is to minimize its local loss function together with this continual learning constraint.
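The article does not spell out how the importance weights are computed, so the sketch below uses a generic EWC-style squared-difference penalty with a per-parameter importance dictionary as a stand-in for Eq. (3); treat it as an illustration of this kind of constraint rather than the authors' exact formulation (λ = 1 follows the implementation details).

```python
def continual_learning_penalty(model, global_state, importance):
    """EWC-style constraint: penalize changes to important parameters relative to Ŵ_e(t-1)."""
    penalty = 0.0
    for name, param in model.named_parameters():
        if param.requires_grad and name in global_state:
            penalty = penalty + (importance[name] * (param - global_state[name]) ** 2).sum()
    return penalty

def local_objective(task_loss, model, global_state, importance, lam=1.0):
    """Local objective: the autoregressive loss of Eq. (1) plus the weighted constraint."""
    return task_loss + lam * continual_learning_penalty(model, global_state, importance)
```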

Experiments

Datasets

FedJudge's performance is evaluated on three datasets with different distributions: the court view generation dataset, the legal consultation dataset, and the legal reasoning dataset. These datasets are not shared across clients, simulating the FL scenario.

Court View Generation Dataset (Client 1): A court view explains the human judge's ruling on the facts of a case. In this dataset, the goal is therefore to automatically generate a court view based on the facts of a given case. First, 59,927 cases were collected from the C3VG [17] dataset. Then, following the instruction tuning method [18], the collected data is processed into the form {instruction input: instruction output} (e.g., the example in Figure 1(a)). Finally, the data is divided into training and test sets. Table 1 shows the detailed statistics of the dataset.
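For illustration, a single court-view sample in this format might look like the following; the field names and the example text are hypothetical, not taken from C3VG.

```python
# A hypothetical court-view sample in the {instruction input: instruction output} format
court_view_sample = {
    "instruction_input": (
        "Facts of the case: the defendant took goods worth several thousand yuan "
        "from a supermarket without paying. Please generate the court view."
    ),
    "instruction_output": (
        "This court holds that the defendant's conduct constitutes the crime of theft; "
        "the facts are clear and the evidence is sufficient ..."
    ),
}
```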


Legal Consultation Dataset (Client 2): Legal consultation data is first collected from Lawyer LLaMA [3] as the training set; it naturally comes in the form {instruction input: instruction output}, where the instruction input is a legal question raised by a layperson in a real-world scenario and the instruction output is generated by ChatGPT [1]. Then, 2,797 consultation examples were extracted from a public dataset as the test set. Figure 1(b) shows an example of the consultation dataset.

Legal Reasoning Dataset (Client 3): This dataset is also collected from Lawyer LLaMA, where the instruction input is a question that requires legal reasoning and the instruction output is generated by ChatGPT [1]. It is divided into training and test sets. Figure 3 shows an example of the reasoning dataset.

Experiment setup

Compared methods

Baichuan-7B is used as the pre-trained LLM backbone, and the baseline models are built on Baichuan-7B with LoRA. These models achieve competitive results on Chinese language tasks.

Several training settings are compared: direct prediction with Baichuan-7B (zero-shot), standard centralized training on the mixed data, per-client training that uses only each client's own private data, and federated learning. The federated learning setting yields not only the global federated model FedJudge but also personalized models for the individual clients.

Evaluation metrics

The effectiveness of FedJudge is evaluated with metrics including ROUGE F1, BLEU, and BERTScore.
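A minimal way to compute these metrics, assuming the Hugging Face evaluate package rather than the authors' evaluation script, is sketched below; note that ROUGE and BLEU on Chinese text usually require word segmentation or character-level tokenization beforehand.

```python
import evaluate

predictions = ["generated court view text"]  # model outputs (placeholders)
references = ["reference court view text"]   # ground-truth texts (placeholders)

rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bleu = evaluate.load("sacrebleu").compute(predictions=predictions,
                                          references=[[r] for r in references])
bertscore = evaluate.load("bertscore").compute(predictions=predictions,
                                               references=references, lang="zh")
print(rouge, bleu["score"], sum(bertscore["f1"]) / len(bertscore["f1"]))
```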

Implementation details

FedJudge is implemented with the LoRA and FedJudge-CL algorithms in the federated learning setting. In the experiments, 3 clients are set up; the λ of FedJudge-CL is 1; the LoRA rank is 4; the number of communication rounds is 5; and the Adam optimizer is used with a learning rate of 2e-4. Experiments were run on 2 Tesla A100 40G GPUs with a batch size of 2 per device and 8 gradient accumulation steps.
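Expressed with Hugging Face TrainingArguments, these hyperparameters would look roughly like the sketch below; the output directory, number of epochs, and precision setting are placeholders rather than values reported in the paper.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./fedjudge-client",   # placeholder path
    per_device_train_batch_size=2,    # batch size of 2 per device
    gradient_accumulation_steps=8,    # 8 gradient accumulation steps
    learning_rate=2e-4,               # Adam learning rate reported in the article
    num_train_epochs=1,               # per communication round; assumed value
    bf16=True,                        # assumed mixed-precision setting for A100 GPUs
    logging_steps=10,
)
```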

Experimental results

Overall performance


Baichuan-7B performs reasonably well in the zero-shot setting, indicating that large-scale pre-training has given it some legal summarization and reasoning ability. However, its results are still worse than those of the fine-tuned models, which underlines the need to fine-tune LLMs for the legal field.

The Center model mixes data with different distributions during training, but it still performs worse than the federated models on multiple metrics, indicating that simply mixing data with different distributions for centralized training is not appropriate.

The models trained as Center-ClientE perform well on test data from the same distribution, but poorly on data from other distributions. This shows that, under data privacy constraints, federated learning is needed to obtain a global legal LLM.

FedJudge-Base achieves competitive results compared with Center, while the personalized models Base-ClientE achieve better results than Center-ClientE by fine-tuning LLMs with LoRA in the FL environment. Finally, both FedJudge-CL and CL-ClientE outperform the other models.

With the continual learning constraint, FedJudge-CL and CL-ClientE perform better than FedJudge-Base and Base-ClientE on most metrics, suggesting that the data drift issue can be mitigated. At the same time, the constrained local models do not forget the global knowledge, which helps to update the global model and thus yields a more effective FedJudge-CL model.

Case study


A qualitative analysis compares the text generated by CL-Client3 with the baselines Baichuan-7B and Center. On an example from the legal reasoning dataset, Baichuan-7B's answer is irrelevant to the question and Center's reasoning process is wrong, while CL-Client3 answers the question correctly and gives the corresponding reasoning process, demonstrating the effectiveness of the method.

Conclusion

This paper studies how to fine-tune large language models (LLMs) for the legal field in a federated learning (FL) environment and proposes the first federated legal LLM framework, FedJudge. Specifically, a parameter-efficient fine-tuning method is developed to achieve efficient training of FedJudge. In addition, to alleviate the data distribution shift problem in FL, a continual learning method is incorporated into FedJudge to prevent the important knowledge of the global model from being forgotten during local training. Experimental results on three real-world datasets demonstrate the effectiveness of FedJudge.

Special note: This article is for academic exchange only. If there is any infringement, please contact the editor to have it removed.

- END -

Reprinted from: Lingdu Intelligence

Reprinted by Yang Songmei
