
The Large Model Survey Is Now a Book

Author: AI Tech Review

In March 2023, we published a survey of large language models. Now in its 13th version, the survey runs to 83 pages of text and cites more than 900 references. Its purpose is to systematically review the research progress and core technologies of large language models, discussing a large body of related work along the way. Since the preprint went online, it has attracted the attention of many readers.

Since the English survey was released, readers have asked whether a corresponding Chinese version is available. To this end, we published a Chinese translation of the survey in August 2023. To provide a better Chinese-language reference for large model technology, we then began compiling a Chinese book at the end of December 2023 and recently completed the first draft. Unlike the English survey, the Chinese book is positioned to give explanations for readers new to large model technology, so we have substantially updated and reorganized the content to present an overall technical framework and roadmap for large models. The book is suitable for senior undergraduates and junior graduate students with a background in deep learning, and can serve as an entry-level reference. Chinese book project page: https://llmbook-zh.github.io/


Chinese book download link 1: https://github.com/LLMBook-zh/LLMBook-zh.github.io/blob/main/LLMBook.pdf
Chinese book download link 2: http://aibox.ruc.edu.cn/zws/index.htm

Chapter organization:

Part 1: Background and Basics
  • Chapter 1 Introduction (development of large models, overview of key technologies)
  • Chapter 2 Fundamentals (scaling laws, development history of the GPT series models)
  • Chapter 3 Large Model Resources (open-source models, data, code libraries)

Part 2: Pre-training
  • Chapter 4 Data Preparation (data collection, cleaning, mixture proportions, curriculum methods)
  • Chapter 5 Model Architecture (Transformer structure, mainstream large model architectures, detail improvements)
  • Chapter 6 Model Pre-training (pre-training tasks, optimization settings, parallel training methods)

Part 3: Fine-tuning and Alignment
  • Chapter 7 Instruction Fine-tuning (instruction data collection and synthesis methods, instruction fine-tuning strategies and effects)
  • Chapter 8 Human Alignment (3H criteria, RLHF algorithms, non-RL algorithms)

Part 4: Using Large Models
  • Chapter 9 Decoding and Deployment (decoding and generation algorithms, decoding acceleration algorithms, model compression algorithms)
  • Chapter 10 Prompt Learning (basic prompting methods, in-context learning, chain-of-thought)
  • Chapter 11 Planning and Agents (complex planning methods, agent construction methods)

Part 5: Evaluation and Applications
  • Chapter 12 Evaluation (evaluation metrics and methods, basic and advanced ability evaluation, evaluation systems)
  • Chapter 13 Applications (overview of research fields and applications in professional domains)


Timeline of large language model development


Evolution diagram of derivative works of the LLaMA series models

In the process of writing this book, we received a large number of revision suggestions from our peers, for which we are sincerely grateful. We hope readers will continue to support and follow our Chinese book on large models; your support and feedback are the greatest driving force for us to keep moving forward. The first edition is only a starting point: we plan to continue updating and improving the content online, and we especially welcome valuable criticism and suggestions from readers. Readers who offer valuable suggestions will be acknowledged on the website. If you have any comments or suggestions, please reach us through the GitHub issue page (https://github.com/LLMBook-zh/LLMBook-zh.github.io/issues) or by email.

To better organize and disseminate the latest progress and technical system of large model technology, we provide the following supporting resources for readers to consult while reading this book.

Large model code library: We have developed LLMBox, a comprehensive code library dedicated to the development and implementation of large language models, built around a unified training pipeline and a comprehensive model evaluation framework. LLMBox is designed as a one-stop solution for training and utilizing large language models; it integrates a large number of practical features, achieving a high degree of flexibility and efficiency in both the training and utilization stages. Code library link: https://github.com/RUCAIBox/LLMBox.
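To make the "one-stop" idea concrete, here is a minimal hypothetical sketch of what a unified train-then-evaluate workflow in such a toolkit could look like. The `llmbox` module and the `Trainer`/`Evaluator` names, arguments, and dataset identifiers below are illustrative assumptions rather than LLMBox's actual API; consult the repository's documentation for real usage.

```python
# Hypothetical sketch of a one-stop train/evaluate workflow, in the spirit of
# the unified pipeline described above. The `llmbox` module, `Trainer`, and
# `Evaluator` below are assumed names for illustration, NOT LLMBox's real API;
# see https://github.com/RUCAIBox/LLMBox for the actual interface.
from llmbox import Trainer, Evaluator  # assumed imports

# Unified training pipeline: one configuration drives model, data, and method.
trainer = Trainer(
    model="llama-2-7b",      # assumed model identifier
    dataset="alpaca",        # assumed instruction dataset
    method="sft",            # supervised instruction fine-tuning
    output_dir="./checkpoints",
)
trainer.train()

# Comprehensive evaluation framework: score the same checkpoint on several
# benchmarks without writing extra glue code.
evaluator = Evaluator(model="./checkpoints", benchmarks=["mmlu", "gsm8k"])
print(evaluator.run())
```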


YuLan large model: The YuLan series models are chat-capable large language models jointly developed by faculty and students of the Gaoling School of Artificial Intelligence at Renmin University of China (the name "YuLan" is taken from the university's school flower, the yulan magnolia). The latest version completes the entire pre-training process from scratch and applies curriculum learning for supervised fine-tuning on bilingual Chinese-English data, including high-quality instructions and human preference data. Model link: https://github.com/RUC-GSAI/YuLan-Chat.
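For readers who want to try a released chat model of this kind, a minimal loading-and-generation sketch with Hugging Face Transformers might look like the following. The checkpoint ID and the chat prompt format in the sketch are assumptions made for illustration; check the YuLan repository for the currently released checkpoints and the recommended prompt template.

```python
# Minimal sketch: load a chat LLM and generate a reply with Hugging Face
# Transformers. The model ID and prompt format below are assumptions for
# illustration; see https://github.com/RUC-GSAI/YuLan-Chat for the actually
# released checkpoints and the recommended chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yulan-team/YuLan-Chat-2-13b-fp16"  # assumed checkpoint ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    device_map="auto",          # spread layers across available devices
)

prompt = "[|Human|]: Please briefly introduce large language models.\n[|AI|]:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```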


The leads and contributors for each chapter of this book are as follows:

  • Chapter 3 was led by Min Yingqian and Yang Chen, with Li Junyi and Zhou Kun contributing;
  • Chapter 4 was led by Zhang Junjie, Hou Yupeng, and Zhou Kun;
  • Chapter 5 was led by Dong Zican, with Tian Zhen and Tang Tianyi contributing;
  • Chapter 6 was led by Tang Tianyi and Chen Yushuo;
  • Chapter 7 was led by Tang Tianyi, with Cheng Xiaoxue contributing;
  • Chapter 8 was led by Li Junyi and Chen Zhipeng;
  • Chapter 9 was led by Chen Yushuo, Liu Peiyu, and Tang Tianyi, with Zhou Kun contributing;
  • Chapter 10 was led by Li Junyi, Tang Xinyu, and Du Yifan;
  • Chapter 11 was led by Ren Ruiyang and Jiang Jinhao, with Li Junyi contributing;
  • Chapter 12 was led by Zhang Beichen and Zhou Kun, with Zhang Gaowei contributing;
  • Chapter 13 was led by Zhou Kun, with Jiang Jinhao, Li Yifan, Liu Zikang, Sun Wenqi, Wang Yuhao, Xu Lanling, Yang Jinxia, and Zheng Bowen contributing (listed in alphabetical order).

We would also like to thank all the other students and teachers who took part in compiling and proofreading this book. The Chinese book can be downloaded from the links above.
