Details of the technical principles of mainstream large language models

1. A comparison of the details of LLaMA, ChatGLM, Falcon, and other large language models: tokenizer, positional encoding, layer normalization, activation function, and more. 2. Distributed training techniques for large language models: data parallelism, tensor model parallelism, pipeline parallelism, 3D parallelism, the zero redundancy optimizer ZeRO, CPU offloading with ZeRO-Offload, mixed precision training, activation recomputation, Flash Attention, and Paged Attention. 3. Parameter-efficient fine-tuning techniques for large language models: prompt tuning, prefix tuning, adapters, LLaMA-Adapter, and LoRA.
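As a minimal taste of one of the fine-tuning techniques surveyed below, a LoRA-style low-rank update can be sketched in a few lines. This is an illustrative sketch only (the matrix sizes, helper names, and values are assumptions, not taken from the article): LoRA freezes the pretrained weight W and learns a low-rank delta B @ A, so the effective weight becomes W + (alpha / r) * B @ A.

```python
# Illustrative LoRA-style update sketch (pure Python, toy-sized matrices).
# The pretrained weight W stays frozen; only the low-rank factors A and B
# (rank r) would be trained, and their product is scaled by alpha / r.

def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha, r):
    """Effective weight W + (alpha / r) * (B @ A); W itself is unchanged."""
    delta = matmul(B, A)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Frozen 2x2 pretrained weight and a rank-1 adapter (r = 1).
W = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[1.0],           # d_out x r
     [2.0]]
A = [[0.5, 0.5]]      # r x d_in
W_eff = lora_effective_weight(W, A, B, alpha=2.0, r=1)
print(W_eff)  # -> [[2.0, 1.0], [2.0, 3.0]]
```

Because only A and B are updated, the number of trainable parameters scales with r * (d_in + d_out) instead of d_in * d_out, which is the core idea behind the efficiency of LoRA discussed in section 3.5.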

0. Outline

1. Details of large language models

1.0 Transformer and LLM

1.1 Model structure

1.2 Training objectives

1.3 Tokenizer

1.4 Positional encoding

1.5 Layer normalization

1.6 Activation Functions

1.7 Multi-query Attention and Grouped-query Attention

1.8 Parallel transformer block

1.9 Summary - Training stability

2. Distributed pre-training of LLM

2.0 Peer-to-peer communication and collective communication

2.1 Data parallelism

2.2 Tensor parallelism

2.3 Pipeline parallelism

2.4 3D parallelism

2.5 Mixed Precision Training

2.6 Activation recomputation

2.7 ZeRO: Zero Redundancy Optimizer

2.8 CPU offloading: ZeRO-Offload

2.9 Flash Attention

2.10 vLLM: Paged Attention

3. Parameter-efficient fine-tuning of LLMs

3.0 Why efficient parameter fine-tuning?

3.1 Prompt tuning

3.2 Prefix tuning

3.3 Adapter

3.4 LLaMA-Adapter

3.5 LoRA

3.6 Experimental comparison

4. References

Written by Spring

Source: WeChat public account: Tencent Technology Engineering

Source: https://mp.weixin.qq.com/s/P1enjLqH-UWNy7uaIviWRA
