
Research on the application of big language model in the field of bank wealth management

Author: Shanghai Finance and Development Laboratory

Wu Yongfei

Ye Guangnan

Liu Sen

Wang Yanbo

Review of the development of artificial intelligence and AIGC

The development of global artificial intelligence has gone through three major booms. The first wave, from the 1950s to the 1970s, was marked by the 1956 Dartmouth Conference and centered on symbolism and logical reasoning. The second wave, in the 1980s, was driven by the practical application of "expert systems" in specific fields and by the knowledge-base systems and knowledge engineering on which those systems relied. The third wave, from the 1990s to the present, has been based on statistical learning methods and, since the deep learning approach proposed in 2006 by Turing Award winner Geoffrey Hinton, has pushed artificial intelligence beyond human-level performance in many fields. Since the beginning of the 21st century, big data and large-scale computing power have provided stronger support for AI applications, deep learning algorithms represented by the generative adversarial network (GAN) have achieved rapid theoretical and practical breakthroughs, and artificial-intelligence-generated content (AIGC), or "generative AI", has entered a period of explosive growth. AIGC uses AI algorithms to automatically generate, create, modify, and edit digital content such as text, images, audio, video, games, code, and models (or to assist humans in doing so), forming a new mode of digital content production.
Among these products, the ChatGPT conversational agent, an AIGC product built on the GPT (Generative Pre-trained Transformer) framework, is the technology closest so far to artificial general intelligence (AGI); it marks a step from weak toward strong artificial intelligence and will bring profound, far-reaching changes to human society.

The development of large language model technology represented by ChatGPT

The large language model technology behind ChatGPT is giving rise to a new round of artificial intelligence and has set off global competition: technology giants are accelerating their deployments, and the generative AI field is surging. Since 2018, OpenAI has successively released GPT-1, GPT-2, GPT-3, InstructGPT, ChatGPT, and GPT-4. Microsoft, Google, and other technology giants have quickly followed; Microsoft took the lead in applying GPT-4 to the new Bing search engine, which can understand users' queries more accurately and provide relevant real-time information. At the same time, domestic AI leaders represented by Baidu, Huawei, Alibaba, and SenseTime have accelerated the R&D and commercial application of large language models, carrying out model practice in NLP, OCR, computer vision, speech recognition, and other fields and initially forming end-to-end, full-stack large-language-model application capabilities. University research institutions represented by Fudan University and Tsinghua University have open-sourced independently developed large language models and actively promoted ecosystem construction.
Building on years of accumulated strength in cloud computing, traditional cloud vendors are prioritizing the construction of intelligent computing infrastructure based on large language models, which is expected to form a MaaS (Model as a Service) model that empowers industries, drives the digital-intelligence upgrade of enterprises, and comprehensively advances the digital transformation of industry. As a data-intensive industry, banking has always been at the vanguard of advanced technology adoption, and commercial banks represented by the Industrial and Commercial Bank of China, the Agricultural Bank of China, and Huaxia Bank have explored applying large language models across scenarios in the financial field. ChatGPT builds on the powerful language understanding and generation capabilities of the large-scale pre-trained model GPT-3.5 and introduces RLHF (Reinforcement Learning from Human Feedback), which incorporates human feedback into the training process and gives the machine a natural, human-like interactive learning loop: by obtaining feedback from humans, the model learns from a broader perspective and with greater efficiency, absorbs more specialized knowledge, and aligns its value orientation.
Through the generation of single-modal and multimodal content such as text, code, images, and video, ChatGPT-style model technology forms an efficient mode of digital content production, opening a revolution in content creation and greatly improving productivity. By accurately understanding user intent and calling existing software tools, algorithm models, and third-party services to meet users' varied needs, it forms a more efficient mode of human-computer interaction and makes it possible for everyone to have a personal AI assistant. Through efficient information aggregation and knowledge refinement, combined with professional knowledge bases or search engines, it greatly improves the accuracy and timeliness of replies and is expected to form a new way of representing, invoking, and acquiring knowledge, reducing costs and increasing efficiency for information search and knowledge acquisition. At present, ChatGPT has reached a basic human level in knowledge Q&A, language translation, information search, content creation, code generation, simple reasoning, and data analysis. It has also shown broad application prospects in finance, covering scenarios such as risk management, fraud detection, financial planning, marketing automation, intelligent customer service, knowledge graph enhancement, customer-engagement improvement, and legal compliance.

Limitations of ChatGPT

Although ChatGPT has strong language understanding and content generation capabilities, it still has limitations, including but not limited to the following three aspects. First, the output lacks timeliness. ChatGPT is trained on historical data and cannot acquire and process new data in real time, so the knowledge stored in the model is difficult to keep current.
For real-time updates or breaking news, the model may output inaccurate or incorrect information; making the training data include the latest information is very costly in time and compute, and the update cadence will remain far slower than a search engine's. Second, the reliability of the output needs further improvement. ChatGPT's output still contains factual errors: it cannot verify the authenticity of its data sources or cite verifiable references, and it may output fabricated or incorrect information. Moreover, although ChatGPT shows excellent language "creation" ability and seemingly sound "logical reasoning", its reasoning and answer generation rely on statistical probability, so it cannot reliably handle precise logical problems. In addition, without training data from specific professional fields, ChatGPT performs unsatisfactorily in vertical-domain applications. Third, the ethical boundaries of the model remain ambiguous. ChatGPT is pre-trained on real-world language data; if that data contains bias and harmful content, or the annotators themselves are biased, the model may output discriminatory, prejudiced, or otherwise unethical content. Although model developers try to avoid these problems, the model can still be induced or manipulated into producing harmful content.

Thinking on the application modes of artificial intelligence large language models

Classification of application modes. The large language model technology behind ChatGPT will give birth to new business formats and bring new opportunities.
In the cloud computing era, IaaS (Infrastructure as a Service), PaaS (Platform as a Service), and SaaS (Software as a Service) helped enterprises migrate their businesses to the cloud faster and achieve their informatization goals. In the artificial intelligence era, MaaS (Model as a Service) will provide model capabilities to enterprises to support the digital transformation and intelligent upgrading of enterprises and industries. How to quickly apply AI large language models is therefore key to empowering the real economy, facilitating people's lives, and promoting enterprises' digital-intelligent transformation. From the perspective of how enterprises actually apply large language models, two modes can be distinguished: the public cloud mode and the private cloud mode. In the public cloud mode, technology giants build large-language-model infrastructure and provide model capabilities to the market as MaaS, meeting the needs of enterprises and individuals with different development capabilities. This mode mainly includes directly calling the inference service, model fine-tuning services, and model hosting. First, directly calling the inference service: users access the core inference capabilities of a general-purpose large language model through a paid subscription and obtain inference results. Second, the fine-tuning service: users train a customized large language model at relatively low cost on top of a general-purpose model, using a small amount of domain data suited to their own needs. Third, hosting services: users deploy general-purpose large language models, or fine-tuned industry- and domain-specific models, directly to the cloud.
In this way, users only need to invoke the large language model, without worrying about the complexity of deployment and management, while the availability, efficiency, and security of the model are ensured. The private cloud mode arises mainly from the need to protect sensitive information and important data and to meet compliance requirements: enterprises deploy large language models in local private clouds for internal users. Building a large language model from scratch requires extremely high computing power that ordinary enterprises and individuals cannot afford. Considering construction cost and difficulty, enterprises mainly have, but are not limited to, the following three ways to build large language models. First, cooperative deployment: privately deploying a model service provider's general-purpose large language model on premises for internal use. Such models generally have on the order of 100 billion parameters or more and strong general intelligence, but the construction cost is high and enterprises' customization needs are difficult to meet. Second, the "large model + fine-tuning" approach: selecting a medium-scale large language model (tens of billions of parameters), or pruning, quantizing, and distilling an ultra-large-scale model, then fine-tuning it with enterprise private data so that it can serve specific vertical industries, fields, and scenarios. Third, the "pre-training + fine-tuning" approach: enterprises independently build large language models through the full "pre-training + fine-tuning" paradigm, which places high demands on computing-power budgets and mastery of core technology.
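The "directly calling the inference service" mode described above can be sketched as assembling a chat-style request for a hosted endpoint. The model name and payload fields below are illustrative assumptions, not any specific vendor's API:

```python
import json

def build_inference_request(prompt: str, model: str = "general-llm-v1",
                            temperature: float = 0.2, max_tokens: int = 512) -> dict:
    """Assemble a chat-style payload for a hypothetical MaaS inference endpoint.

    The model identifier and field names are illustrative assumptions,
    not a real provider's schema.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,   # low temperature for more factual replies
        "max_tokens": max_tokens,
    }

payload = build_inference_request("List three common fund risk levels.")
body = json.dumps(payload, ensure_ascii=False)  # what an HTTP client would POST
print(body)
```

In practice this payload would be sent over HTTPS with the subscriber's API key; the same request shape also serves fine-tuned or hosted models by swapping the model identifier.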
A comparative analysis of the advantages and disadvantages of these application modes is presented in Table 1.

Table 1 Comparative analysis of application modes of artificial intelligence large language models


For commercial banks, data security protection must comply with national laws and industry regulatory requirements. Calling an external vendor's large-language-model service means relying on the security provided by the model service provider, with the risk that data may be accessed or stolen by third parties. Commercial banks should therefore give priority to applying large language models in the private cloud mode. Within that mode, cooperative deployment (privately deploying a service provider's general-purpose model) has a high construction cost and struggles to meet banks' customization needs. In the "pre-training + fine-tuning" mode, the threshold for development, training, and inference deployment is very high: model training usually requires large-scale computing power, enterprises must bear high hardware costs such as GPUs, the technical difficulty is great, and the implementation and application risks are larger. In the "large model + fine-tuning" mode, construction cost and technical complexity are relatively controllable, so commercial banks can give priority to medium-scale general-purpose large language models for application work. Under the private cloud "large model + fine-tuning" mode, there are two possible implementation paths. The first is to prune, quantize, and distill an existing service provider's general-purpose model (generally hundreds of billions of parameters) into a medium-scale model (tens of billions of parameters) and fine-tune it with local data.
The second is to directly adopt a medium-scale general-purpose large language model and, relying on the emergent general intelligence and generalization abilities of such models, fine-tune and adapt it to the needs of vertical industries and business scenarios, building a vertical-application model that escapes the fragmented, workshop-style development of traditional AI capabilities.

Application of artificial intelligence large language models in commercial banks

Technical development of Fudan University's MOSS large language model. MOSS is the first plug-in-augmented open-source conversational language model in China. The model has about 16 billion parameters, supports both Chinese and English, and supports a variety of plug-ins such as a search engine, a calculator, equation solving, and text-to-image generation. MOSS is pre-trained on about 700 billion Chinese, English, and code tokens, followed by plug-in-augmented, multi-turn supervised fine-tuning on dialogue data, giving it multi-turn dialogue ability, instruction-following ability, and the ability to refuse harmful requests, covering the three dimensions of helpfulness, honesty, and harmlessness and improving dialogue quality and user satisfaction. MOSS development comprises two stages: pre-training of the base natural language model, and training of the dialogue ability to understand human intent. Compared with the 175-billion-parameter GPT-3, the 176-billion-parameter BLOOM, and the 50-billion-parameter BloombergGPT, the MOSS model is notably compact.
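The plug-in augmentation described above can be illustrated with a minimal dispatcher: the model emits a plug-in name plus an argument, and a local router executes the call and feeds the result back into the dialogue. This is our own sketch, not MOSS's actual plug-in protocol:

```python
import re
from typing import Callable, Dict

def calculator(expr: str) -> str:
    """Evaluate simple arithmetic; reject anything beyond digits and + - * / ( ) ."""
    if not re.fullmatch(r"[\d\s+\-*/().]+", expr):
        raise ValueError("unsupported expression")
    return str(eval(expr))  # the restricted character set makes eval tolerable here

# Registry of available plug-ins; search, equation solving, and
# text-to-image would register alongside the calculator.
PLUGINS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
}

def dispatch(plugin: str, argument: str) -> str:
    """Route a model-emitted plug-in call to its local implementation."""
    if plugin not in PLUGINS:
        return f"[no such plugin: {plugin}]"
    return PLUGINS[plugin](argument)

print(dispatch("calculator", "(3 + 4) * 2"))  # → 14
```

In a real deployment the dispatch result would be appended to the conversation context so the model can compose its final answer around the tool output.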
After quantization, the model can be conveniently deployed to end devices: it can run at FP16 precision on a single NVIDIA A100/A800 or two RTX 3090 graphics cards, and at INT4/INT8 precision on a single RTX 3090, achieving low energy consumption and carbon emissions and convenient user interaction. At the same time, it largely avoids problems such as overfitting and memorizing training data when fine-tuned on vertical-domain data. In addition, MOSS can learn multi-source heterogeneous knowledge from real-time financial data (such as stocks and bonds), knowledge graphs (such as industry chains and supply chains), and unstructured text such as research reports and financial statements, enhancing its intelligent dialogue ability in professional financial domains in multiple ways and delivering application value in financial scenarios such as bank wealth management.

Building ChatLONGYING on large language models such as MOSS. Considering data security, customization, technical difficulty, and construction cost, this paper proposes that commercial banks give priority to medium-scale general-purpose large language models with emergent abilities as the basis for private cloud applications, fine-tune them for the requirements of vertical application scenarios, and integrate banks' existing core AI capabilities, such as natural language processing, computer vision, intelligent speech, and knowledge graphs, to build a large-language-model capability system for commercial banks, moving AI from the "handicraft workshop" to the "factory model" so as to carry out model production and services efficiently and empower banks' digital transformation and intelligent development.
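The hardware claims above (FP16 on one A100 or two RTX 3090s, INT4/INT8 on a single RTX 3090) can be sanity-checked with back-of-the-envelope weight-memory arithmetic. This sketch counts only the weights and ignores activations and the KV cache:

```python
def weight_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Rough GPU memory (GiB) needed just for the model weights."""
    return n_params * bits_per_param / 8 / 1024**3

N = 16e9  # ~16 billion parameters, per the MOSS description

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_footprint_gb(N, bits):.1f} GiB")
# FP16 (~30 GiB) exceeds one RTX 3090 (24 GiB) but fits one A100 (40/80 GiB)
# or two 3090s; INT8 (~15 GiB) and INT4 (~7.5 GiB) fit a single 3090.
```

Real deployments need extra headroom for activations and long-context caches, which is why quantization matters even when the raw weight size appears to fit.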
Based mainly on the MOSS large language model, supplemented by other open-source AI models, and targeting wealth-management scenarios in banking, this work feeds in personalized data and fine-tunes the model: SFT (supervised fine-tuning) incorporates the knowledge and experience of human experts into the learning process, improving learning efficiency and performance, while prompt engineering restricts the scope of domain knowledge at inference time and improves generation quality. On this basis, ChatLONGYING, a privatized large language model for commercial banks, was successfully built and applied.

Application of the artificial intelligence large language model in bank wealth management

Pain points of bank wealth management. Commercial banks face several pain points in the wealth-management field. First, from the robo-advisory perspective, wealth-management professionals are in short supply. The threshold of traditional wealth-management services is relatively high; banks have few professionals who can provide customers with expert advice and asset-allocation recommendations, and for operating-cost reasons those professionals tend to serve high-net-worth customers first, making long-tail customers hard to cover. Second, from the intelligent investment-research perspective, traditional data analysis and research are inefficient. Most financial advisers can only provide basic product introductions and recommendations and lack comprehensive, in-depth, flexible, and effective analysis of large-scale, diverse, and rapidly changing financial market data. Third, from the intelligent investment perspective, it is difficult to meet customers' varied risk appetites and differentiated asset-allocation needs.
The complexity and diversity of financial markets increase the difficulty of asset allocation; investment risk analysis and strategy formulation demand deep expertise; and customers differ in risk appetite and investment strategy, which business staff find hard to address effectively.

Applications of ChatLONGYING in bank wealth management. For intelligent investment-advisory scenarios. First, customer profiling. By analyzing a customer's risk appetite, investment objectives, and asset status, ChatLONGYING generates an asset pool matching the customer's needs, and through natural language interaction it can better understand those needs and provide personalized investment advice. For example, a wealth manager can ask ChatLONGYING: "What type of fund products would a 50-year-old male engineer, a new customer earning 20,000 yuan a month, prefer?" In this scenario, ChatLONGYING gives suitability recommendations for that customer group and flags the investment risks. Second, popularizing investment knowledge. ChatLONGYING can answer customers' questions about investing through an intelligent Q&A system and improve customers' understanding of the investment market; natural language interaction lets it grasp the question and reply with simple, easy-to-understand explanations. For example, a customer can ask: "The economy seems to be improving recently and I want to buy into industries sensitive to the economic cycle; can you list them?" In this scenario, ChatLONGYING lists the industries most affected by the economic cycle and, where possible, analyzes how the cycle affects each.
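Questions like the profiling example above would typically be wrapped in a domain-restricting prompt before reaching the model, applying the prompt-scoping technique mentioned earlier. The template wording and profile fields below are our illustrative assumptions, not ChatLONGYING's actual prompt:

```python
def build_advisory_prompt(question: str, customer_profile: dict) -> str:
    """Wrap a wealth-management question in a domain-restricting prompt.

    Template wording and profile fields are illustrative assumptions,
    not the production ChatLONGYING prompt.
    """
    profile = ", ".join(f"{k}: {v}" for k, v in customer_profile.items())
    return (
        "You are a bank wealth-management assistant. Answer ONLY questions "
        "about wealth-management products and politely decline anything else. "
        "Always state the relevant investment risks.\n"
        f"Customer profile: {profile}\n"
        f"Question: {question}"
    )

prompt = build_advisory_prompt(
    "What type of fund products would suit me?",
    {"age": 50, "occupation": "engineer", "monthly income": "20,000 yuan"},
)
print(prompt)
```

Scoping the prompt this way keeps the fine-tuned model anchored to its vertical domain and makes the mandatory risk disclosure part of every reply.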
Third, product maturity reminders. For portfolios that require regular rebalancing, ChatLONGYING can issue intelligent reminders according to information the customer has set, ensuring the portfolio's risk and return stay aligned with the customer's needs and improving investment efficiency and accuracy. For example, a customer can ask: "Please remind me 3 days before my current holding matures so I can prepare for a new product." In this scenario, ChatLONGYING performs the follow-up reminders by invoking the app, SMS, email, and other channels. Fourth, customer service support. ChatLONGYING can answer customers' questions about institutional scoring, ratings, and investment advice through the intelligent Q&A system, providing technical support and personalized service through natural language interaction. For example, a client can ask: "What does the RP1 the institution assigned me mean?" In this scenario, ChatLONGYING explains the institutional rating, displays the investment products corresponding to that risk level, and clarifies the difference between risk willingness and risk capacity.

For intelligent investment-research scenarios. First, data analysis and mining. Through natural language processing, ChatLONGYING automatically mines key information and trends from large amounts of investment data, helping analysts understand market changes and investment opportunities faster and make better investment decisions. For example, a customer asks ChatLONGYING: "Did listed company A's net profit beat expectations?"
In this scenario, ChatLONGYING answers by retrieving the profit forecasts institutions issued before the annual report was published and the actual figures reported afterward. Second, research-report search. ChatLONGYING integrates information-retrieval technology to retrieve investment research reports on demand, automatically consolidating and analyzing large amounts of data and surfacing report content and conclusions, greatly improving research quality and efficiency. For example, a client can ask about a public fund's trading scope, such as "Can 510021 trade Hong Kong stocks?"; ChatLONGYING retrieves the relevant reports and returns the fund's trading scope, the difference between direct and indirect exposure, and the risks of trading the fund. Third, indicator calculation. ChatLONGYING can apply algorithms to analyze and compute over investment data, offer research suggestions based on the given data, and help investors understand the relevant evaluation indicators, completing research more efficiently. For example, a customer can ask for a wealth-management product's maximum drawdown over the past four years instead of computing it manually across the time window. Fourth, intelligence monitoring. By monitoring and analyzing the investment market in real time, ChatLONGYING can automatically surface market risks and opportunities, remind investors to watch for changes, and keep research results matched to real-world conditions. For example, ask: "Has fund 161725 changed its fund manager in the last two years?"
In this scenario, ChatLONGYING returns the manager's tenure changes, whether a change occurred during the customer's holding period, and the current manager's length of tenure. For intelligent investment scenarios. First, asset allocation. According to an investor's risk appetite and goals, ChatLONGYING automatically generates an asset-allocation plan, using machine learning algorithms and large amounts of investment data to propose a portfolio that seeks higher returns at lower risk. For example, ask: "The market outlook is good and I want to be more aggressive; can you recommend a list of stocks in the ChatGPT concept?" In this scenario, ChatLONGYING returns a list of stocks in that concept and notes each company's main business. Second, writing strategy code. ChatLONGYING provides trading strategies via machine learning algorithms and real-time market data, giving buy and sell suggestions based on market trends and the investor's needs. For example, one can ask: "Please help me write a Python strategy that goes long when MA5 exceeds MA10 and annualized volatility is below 15%." In this scenario, ChatLONGYING presents the corresponding code along with the data, strategy logic, backtest performance, and other end-to-end information. Third, real-time investment monitoring. Through real-time market data and portfolio monitoring, ChatLONGYING offers timely decision suggestions, helping investors track market changes and trends and act in time. For example, after supplying a portfolio, an investor can ask: "In my stock portfolio, is it reasonable to reduce my position when the maximum drawdown exceeds 5%?"
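The MA5/MA10 strategy requested above, together with the maximum-drawdown indicator discussed in the investment-research scenarios, can be sketched in plain Python. This is an illustrative sketch, not ChatLONGYING's actual generated code:

```python
import math

def moving_average(prices, window):
    """Simple moving average; None until the window has filled."""
    return [None if i + 1 < window
            else sum(prices[i + 1 - window:i + 1]) / window
            for i in range(len(prices))]

def annualized_volatility(prices, periods_per_year=252):
    """Sample standard deviation of simple returns, annualized."""
    rets = [prices[i] / prices[i - 1] - 1 for i in range(1, len(prices))]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(prices):
    """Largest peak-to-trough decline, as a positive fraction."""
    peak, mdd = prices[0], 0.0
    for p in prices[1:]:
        peak = max(peak, p)
        mdd = max(mdd, (peak - p) / peak)
    return mdd

def crossover_signals(prices, fast=5, slow=10):
    """1 = hold (fast MA above slow MA), 0 = stay flat."""
    ma_f, ma_s = moving_average(prices, fast), moving_average(prices, slow)
    return [1 if ma_s[i] is not None and ma_f[i] > ma_s[i] else 0
            for i in range(len(prices))]

prices = [100, 98, 97, 99, 102, 104, 103, 106, 108, 107, 110, 109]
print(max_drawdown(prices))  # early peak 100, trough 97
print(crossover_signals(prices))
```

A full strategy would take positions only when the crossover signal is on and the annualized-volatility filter (below 15% in the example question) passes, then backtest the resulting equity curve.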
In this scenario, ChatLONGYING explains the negative impact of the maximum drawdown on the portfolio, the positive role of reducing positions, and other investment suggestions. Fourth, risk management. Through automated risk assessment and monitoring, ChatLONGYING helps investors reduce investment risk, proposing risk-management strategies according to their risk appetite and portfolio to protect principal and returns. For example, an investor who wants a long-term return of 8% with a maximum drawdown of no more than 5% can ask ChatLONGYING how to proceed; in this scenario, ChatLONGYING returns operational steps and corresponding investment suggestions based on the inputs.

Conclusion

This paper has reviewed the development history and current progress of artificial intelligence large language models and, after studying their application modes in depth, proposed that commercial banks give priority to medium-scale general-purpose large language models with emergent abilities as the basis for private cloud applications, fine-tune them for the requirements of vertical application scenarios, and integrate banks' existing core AI capabilities, such as natural language processing, computer vision, intelligent speech, and knowledge graphs, to build a large-language-model capability system for commercial banks. Although large language models have powerful language understanding and content generation capabilities, their application still faces security challenges in three main respects. First, ethical risks in science and technology cause controversy.
For example, whether AI-generated content, or AI replacing some human work, violates human ethics, morality, and law remains highly controversial, and there is as yet no regulatory consensus or standard for the technology. Second, the risk of malicious use or misuse. If the use of large language models goes unsupervised, they may be used to generate content that violates laws, regulations, and ethical guidelines, serving online hype, malicious disinformation, malware, or improper commercial marketing, or leaking customers' personal privacy and confidential information. Third, malicious prompt-injection attacks. Hackers may develop jailbreaks and prompt-injection attacks against generative AI systems, using carefully crafted sentences rather than code to exploit system weaknesses, bypass the content filter's security checks, and embed malicious data or instructions into the model, making it generate unethical, discriminatory, misleading, or even illegal speech. Facing these security risks, effective preventive measures should be taken. First, improve the institutional framework for secure generative AI applications: prevent the generation and spread of undesirable and illegal content by establishing effective content review and supervision mechanisms; protect the legitimate rights and interests of all participants through a reasonable intellectual-property protection system; and safeguard customers' data security and privacy through strict data-protection practices. Second, strengthen the technical supervision and review of large-language-model applications.
Given the possibility that large language models will be used for cybercrime, the relevant departments should strengthen supervision and review to prevent abuse, establish feasible testing methods to ensure the models' answers are truthful, reliable, and harmless, and avoid problems such as data leakage, false information, and infringement. Third, explore concrete risk-prevention measures for practical large-language-model applications. Facing possible internal and external malicious attacks, financial institutions should select trustworthy large language models and adopt the private cloud mode according to their own business, and implement the necessary model-application management and network controls, so as to reduce external risk exposure and steadily roll out large-language-model applications for specific business scenarios.

(The authors thank Chai Hongfeng, academician of the Chinese Academy of Engineering and dean of the Institute of Financial Science and Technology of Fudan University, for his guidance on this article. Yuhang Zhou, Qianru Zeng, and Yunhui Gan of Fudan University, and Sheng Chen, Xuan Yang, Li Liu, Xizi Liu, Yuhang Guan, Yiduo Wang, Shaojie Yang, Kuo Yan, Xinkai Gao, Wei Li, Jiefei Liu, Guanglong Li, Hui Hui, Wei Liu, Yuechao Wang, and Shilei Shan also contributed to this article.)

Source: Banker Magazine

