
How can enterprises use Model Fine-tuning (SFT) to customize and optimize large models?

Author: Everyone is a Product Manager
These days, every company is either building its own large model or adapting an existing one to meet enterprise requirements. So how do we use model fine-tuning to customize and optimize large models? This article walks through the training steps of model fine-tuning and gives a worked example, hoping to be of help.

Last time, after we covered tuning models with prompt (instruction) engineering, a reader said the approach was too rudimentary to solve real business problems.

Model fine-tuning (SFT), today's topic, can address that concern to some extent. As before, I will share its concrete effects, applicable scenarios, an example, and the detailed training steps I use in practice.

Without further ado, let's dive in.

01 Definition and effect of model fine-tuning

Model fine-tuning is a key step in the tuning strategy for large models. It comes in two flavors:

  1. Full-parameter fine-tuning (Full Parameter Fine-Tuning)
  2. Partial-parameter fine-tuning (Sparse Fine-Tuning)

Full-parameter fine-tuning adjusts all of the model's weights for a specific domain or task. This strategy is useful when there is a large amount of training data that is highly relevant to the task.

Partial-parameter fine-tuning, on the other hand, selectively updates only certain weights in the model. It is especially useful when we need to preserve most of the pre-trained knowledge, since it reduces the risk of overfitting and improves training efficiency.

The core effect of fine-tuning is to improve the performance of the model on a specific task while retaining its generalization ability.
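The difference between the two strategies can be shown with a deliberately tiny toy model (my own illustration, not tied to any framework): a "pre-trained" linear model y = w·x + b is adapted to new data. Full fine-tuning updates every parameter; partial fine-tuning freezes the weight w and updates only the bias b.

```python
# Toy illustration: adapt y = w*x + b to new task data.
# Full fine-tuning updates both w and b; partial fine-tuning
# freezes the "pre-trained" weight w and updates only b.

def fine_tune(data, w, b, freeze_w=False, lr=0.1, epochs=100):
    """Gradient descent on mean squared error for y = w*x + b."""
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        if not freeze_w:          # full fine-tuning touches w too
            w -= lr * grad_w
        b -= lr * grad_b          # b is always trainable
    return w, b

# "Pre-trained" parameters w=2, b=0; new task data follows y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w_full, b_full = fine_tune(data, w=2.0, b=0.0)                 # full
w_part, b_part = fine_tune(data, w=2.0, b=0.0, freeze_w=True)  # partial
```

Both runs converge to b ≈ 1, but the partial run leaves w untouched at its pre-trained value: exactly the "keep most of the pre-trained knowledge" behavior described above, scaled down to two parameters.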

02 Scenarios where model fine-tuning is and is not applicable

Applicable scenarios

  1. When you have a large amount of domain-relevant labeled data, full-parameter fine-tuning is appropriate.
  2. When the model needs domain specialization while retaining some generalization ability, partial-parameter fine-tuning is the better choice.

Unsuitable scenarios

  1. When training data is limited, or differs greatly from the original pre-training data, full-parameter fine-tuning can lead to overfitting.
  2. If the task requires a broad knowledge background and strong generalization, full-parameter fine-tuning may make the model too narrow.

03 Training steps for model fine-tuning

Three-step method:

1) Determine the fine-tuning strategy: Select full-parameter fine-tuning or partial parameter fine-tuning based on the amount of available training data and task requirements.

2) Prepare the dataset: Prepare the relevant labeled data according to the fine-tuned strategy.

3) Fine-tuning Training:

  • Full-parameter fine-tuning usually takes a long time to train and requires a lot of data.
  • Partial-parameter fine-tuning requires first determining which parameters to update; training of a particular layer or module can then be completed in a short time.
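Step 1, the strategy decision, can be sketched as a small helper. The threshold of 10,000 labeled samples is my own illustrative assumption, not a figure from this article:

```python
# Sketch of step 1: pick a fine-tuning strategy from data volume and
# task needs. The 10_000-sample threshold is illustrative only.

def choose_strategy(num_labeled: int, needs_generalization: bool) -> str:
    """Return 'full' or 'partial' per the heuristics above."""
    if num_labeled >= 10_000 and not needs_generalization:
        return "full"      # lots of task data, narrow goal
    return "partial"       # little data, or broad knowledge must survive

print(choose_strategy(50_000, needs_generalization=False))  # full
print(choose_strategy(2_000, needs_generalization=True))    # partial
```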

In short, full-parameter fine-tuning is typically used when there is a large amount of labeled data and a clear task objective, to get the best possible task performance out of the model.

In scenarios with little data, or where the model's broad knowledge needs to be retained, partial-parameter fine-tuning achieves higher efficiency and avoids overfitting.

04 Example of model fine-tuning: Policy push based on enterprise user behavior

Let's say we have a database of enterprise users on hand that records user clicks and feedback behaviors on various policy notices.

The goal is to fine-tune a language model so that it can infer new policies that users may be interested in based on their historical behavior and push them effectively.

Specific steps for full-parameter fine-tuning

  1. Data preparation: Organize the enterprise-user behavior dataset; each sample includes user behavior features and the corresponding policy feedback.
  2. Data preprocessing: Clean and preprocess the dataset, standardize text content, and encode the classification labels.
  3. Model selection: Choose a pre-trained model suited to text classification tasks, such as a domestic Chinese model like Tongyi Qianwen or Wenxin Yiyan.
  4. Fine-tuning settings: Configure the fine-tuning hyperparameters, such as learning rate, batch size, and number of epochs.
  5. Fine-tuning execution: Fine-tune all of the model's parameters on the curated dataset; this typically requires a GPU-accelerated environment.
  6. Performance monitoring and evaluation: Continuously monitor the model on a validation set using metrics such as accuracy and recall.
  7. Applying the results: Deploy the fine-tuned model to the policy-push system and test its performance in the real environment.
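Step 6 names accuracy and recall as monitoring metrics; here is a minimal from-scratch sketch of both, with binary labels where 1 means "the user clicked the pushed policy" (the sample data is invented for illustration):

```python
# Step 6 sketch: accuracy and recall on a validation set,
# computed from scratch for binary labels (1 = user clicked).

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of true positives the model actually caught."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

y_true = [1, 0, 1, 1, 0, 1]   # invented validation labels
y_pred = [1, 0, 0, 1, 1, 1]   # invented model predictions
print(accuracy(y_true, y_pred))  # 4 of 6 correct
print(recall(y_true, y_pred))    # 3 of 4 positives found
```

In a real push system, recall matters most when missing an interested user is costlier than an irrelevant push.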

Specific steps for partial-parameter fine-tuning

  1. Data collection: The same behavioral and feedback data from enterprise users is required, though it may focus more on specific behavior patterns or key features.
  2. Key parameter selection: Analyze which model parameters are most closely tied to user behavior and select only those for training.
  3. Fine-tuning configuration: The hyperparameter settings may differ from the full-parameter case, since far fewer parameters are updated.
  4. Targeted training: Use the collected data to update only parts of the model's structure, such as the output layer or the attention modules.
  5. Performance evaluation: Quickly evaluate the adjusted model on a small test set.
  6. Deployment: Apply the partially fine-tuned model to the policy-push system and observe its real-world effect.
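Steps 2 and 4 above can be sketched framework-agnostically: keep a table of named parameter groups, mark only the output layer as trainable, and apply gradient updates to just that group. The group names and values here are illustrative, not from any real model:

```python
# Sketch of steps 2 and 4: select parameter groups by name and update
# only those. Group names ("embedding", "attention", "output") and all
# values are illustrative assumptions.

params = {
    "embedding.weight": [0.5, -0.2],
    "attention.weight": [0.1, 0.3],
    "output.weight":    [0.0, 0.0],
}
# Step 2: only the output layer is selected for training.
trainable = {name for name in params if name.startswith("output")}

def apply_updates(params, grads, trainable, lr=0.01):
    """One SGD step that touches only the selected (unfrozen) groups."""
    for name, grad in grads.items():
        if name in trainable:
            params[name] = [w - lr * g for w, g in zip(params[name], grad)]
    return params

grads = {name: [1.0, 1.0] for name in params}   # pretend gradients
apply_updates(params, grads, trainable)
print(params["output.weight"])     # only this group moved
print(params["attention.weight"])  # frozen, unchanged
```

Real frameworks express the same idea more directly (e.g. by flagging parameters as non-trainable before training), but the mechanism is the same: gradients for frozen groups are simply never applied.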

What we actually want the model to learn is the pattern: "when a user has clicked a certain type of policy information multiple times, the system should prioritize pushing that type of policy the next time a similar one is released."
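As a baseline before any fine-tuning, that target pattern can even be stated as a simple counting rule. The category names and the threshold of three clicks are my own illustrative assumptions:

```python
# Sketch of the target pattern: count clicks per policy category and
# surface the category the user engages with most. Category names and
# the 3-click threshold are illustrative assumptions.
from collections import Counter

def preferred_category(click_log, min_clicks=3):
    """Return the most-clicked category once it passes the threshold."""
    if not click_log:
        return None
    category, clicks = Counter(click_log).most_common(1)[0]
    return category if clicks >= min_clicks else None

log = ["tax_relief", "hiring_subsidy", "tax_relief", "tax_relief"]
print(preferred_category(log))   # tax_relief
```

The point of fine-tuning is to go beyond this hard rule: a fine-tuned model can generalize from behavior features to policies the user has never clicked before, which a counter cannot.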

For full-parameter fine-tuning, we set up a supervised learning framework that maps user behavior to policy categories and performs gradient updates across the entire model.

For partial-parameter fine-tuning, we may focus on a small part of the model, such as the decision (output) layer, so the model learns to judge from clusters of user behavior which type of policy is most likely to earn clicks. The main change is thus in how the model weights the different behavior types.

Through such a targeted fine-tuning process, the model can push policies to enterprise users with higher accuracy, delivering personalized service and efficiency gains.

05 Final Words

Overall, the advantage of model fine-tuning is improved performance and adaptability on specific tasks, with output that is not only accurate but also reliable and consistent. The disadvantage is that it is computationally intensive, which can be hard to afford with limited resources, especially for large models.

So how should an enterprise decide? Weigh your data volume, your compute budget, and how much generalization the task requires.

I hope it gives you some inspiration, come on.

Author: 柳星聊产品 (Liu Xing Talks Products); WeChat official account: 柳星聊产品

This article was originally published by @柳星聊产品 on Everyone is a Product Manager. Reproduction without permission is prohibited.

The title image is from Unsplash and is licensed under CC0

The views in this article represent the author's alone; Everyone is a Product Manager only provides an information storage and hosting service.
