
Full analysis of LLM capabilities and applications

Author: Intelligence-driven AI

I. Introduction

After several years of development, large language models (LLMs) have evolved from an emerging technology into a mainstream one, and products built around large models are entering a new round of iteration. Beyond chatbots, can large models deliver value in other fields? Before answering that, we need to clarify: what are the core capabilities of large models, and which applications do those capabilities enable?

This article will focus on the following three areas:

1. LLM capability analysis
2. LLM technical analysis
3. LLM case practice

II. LLM capability analysis


Figure 1. Large model core capabilities

The core capabilities of LLMs fall roughly into six areas: Generate, Summarize, Extract, Classify, Search, and Rewrite.

1. Generate

Generation is the core capability of an LLM. When people talk about LLMs, the first thing that comes to mind is usually their ability to produce original, coherent text. This ability is built by training on massive amounts of text and capturing the intrinsic structure of language and patterns of human usage. Generation supports both conversational and completion applications. A typical conversational application is a chatbot: the user enters a question and the LLM responds. A typical completion application is article continuation: for example, when writing marketing copy, we draft part of the text, and the LLM continues from there until the paragraph or the whole article is complete.

[Application]: Chat assistant, writing assistant, knowledge Q&A assistant.

2. Summarize

Summarization is another important LLM capability. Through prompt engineering, an LLM can condense and summarize user-provided text. At work we process large volumes of meetings, reports, articles, and emails every day; summarization helps us reach the key information quickly and work more efficiently. For example, after every online or offline meeting, minutes must be written up, summarizing the key points and action items. Given a full transcript of the meeting recording, an LLM can produce that summary.

[Application]: Summaries of online video conferences and conference calls; private knowledge base document summaries; summaries of working texts such as reports, articles, and emails.

3. Extract

Extraction means pulling key information out of text with an LLM. One example is named-entity extraction: using the LLM to identify times, places, people, and other entities in order to structure the key information in a text. It can also be used to extract key clauses from contracts and legal documents.

[Application]: Named-entity extraction from documents, article keyword extraction, video tag generation.

4. Classify

Classification uses an LLM to assign text to categories. The advantages of large models for text classification are strong semantic understanding and few-shot learning: they need no samples, or only a handful, to classify text well. Compared with a vertical-domain model trained on a large corpus, this is an advantage in both development cost and performance. For example, social media generates massive amounts of text every day; businesses analyze it to evaluate public feedback on their products, and governments analyze platform data to assess public attitudes toward policies and events.

[Application]: Sensitive-content review for online platforms, sentiment analysis of social media comments, classification of user reviews on e-commerce platforms.

5. Search

Retrieval means finding text similar to a given query within a target document collection. The most familiar example is the search engine: we want highly relevant content or links returned for our input. Traditional methods rely on keyword matching, returning a document only if it contains all or some of the query keywords. This hurts search quality, because content that is semantically highly relevant but shares no keywords is never recalled. In retrieval applications, the advantage of LLMs is matching at the semantic level.

[Application]: Semantic retrieval of text, images, and video; semantic retrieval of e-commerce products; semantic retrieval of résumés.
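To illustrate semantic-level matching, here is a minimal sketch: two sentences that share no keywords still score high on embedding similarity. The sentences and the cosine computation are illustrative, and an OpenAI API key is assumed.

import numpy as np
from langchain.embeddings import OpenAIEmbeddings

# Embed a query and a document sentence that share no keywords
emb = OpenAIEmbeddings(openai_api_key='sk-...')  # your key
q = np.array(emb.embed_query("笔记本电脑续航怎么样"))
d = np.array(emb.embed_query("这台电脑的电池可以用一整天"))

# Cosine similarity: semantically related texts score high
# even without keyword overlap
print(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))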

6. Rewrite

Rewriting means polishing and correcting input text with an LLM according to given requirements. A common application is text correction: manuscripts may contain spelling and grammar errors, and an LLM guided by prompt engineering can correct them automatically. Rewriting can also polish an article to make it clearer and smoother, and an LLM can translate text as well.

[Application]: Text error correction, text polishing, text translation.

III. LLM technology analysis

The sections above list the core capabilities of LLMs and their associated applications. How are these applications implemented technically?


Figure 2. Technical architecture of LLM capabilities

1. Generate

Text generation produces new text from a given input and context. Here's a simple example:

import os
from langchain.llms import OpenAI

# Set your OpenAI API key (placeholder shown; use your own key)
openai_api_key = 'sk-...'
os.environ['OPENAI_API_KEY'] = openai_api_key

llm = OpenAI(temperature=0.9)

# Input
text = "今天是个好天气,"
# Output
print(llm(text))

Output:

很适合出门散步或者做一些活动。我们可以去公园、湖边或者有趣的地方游玩,享受美丽的自然风景。也可以去户外健身,锻炼身体。亦可以在室内做一些有趣的活动,比如看书、看电影或与朋友共度美好的时光。           

The example above uses LangChain to call an OpenAI completion model to continue the sentence "今天是个好天气," ("The weather is nice today,"); the output suggests ways to enjoy the day, such as walking in a park or relaxing indoors. For localized deployment, an open-source model such as ChatGLM-6B can be substituted. In addition, prompts can guide the LLM to perform specific tasks, such as the summarization, classification, extraction, and rewriting described below.
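As a minimal sketch of such a swap (assuming the THUDM/chatglm-6b checkpoint and its model.chat() interface; this wrapper is illustrative, not part of the original article):

from typing import Any, List, Optional

from langchain.llms.base import LLM
from transformers import AutoModel, AutoTokenizer

class ChatGLM(LLM):
    """Minimal LangChain wrapper around a local ChatGLM-6B model."""
    tokenizer: Any = None
    model: Any = None

    @property
    def _llm_type(self) -> str:
        return "chatglm-6b"

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        # model.chat() returns (response, history) in the THUDM repo API
        response, _ = self.model.chat(self.tokenizer, prompt, history=[])
        return response

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda().eval()

llm = ChatGLM(tokenizer=tokenizer, model=model)
print(llm("今天是个好天气,"))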

2. Summarize


Figure 3. Schematic diagram of LLM Summarize

As shown in the figure above, adding a summarization instruction to the prompt guides the LLM to perform the text summarization task. Here's a simple example:

import os
from langchain.llms import OpenAI
from langchain import PromptTemplate

# Set your OpenAI API key
openai_api_key = 'sk-...'
os.environ['OPENAI_API_KEY'] = openai_api_key

llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)

template = """
请对以下文本进行总结,以一个5岁孩子能听懂的方式进行回答.
{text}
"""
prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)
# Input text
text = """
ChatGPT是美国人工智能研究实验室OpenAI新推出的一种人工智能技术驱动的自然语言处理工具,使用了Transformer神经网络架构,也是GPT-3.5架构,这是一种用于处理序列数据的模型,拥有语言理解和文本生成能力,尤其是它会通过连接大量的语料库来训练模型,这些语料库包含了真实世界中的对话,使得ChatGPT具备上知天文下知地理,还能根据聊天的上下文进行互动的能力,做到与真正人类几乎无异的聊天场景进行交流。ChatGPT不单是聊天机器人,还能进行撰写邮件、视频脚本、文案、翻译、代码等任务。
"""
prompt_format = prompt.format(text=text)
output = llm(prompt_format)
print(output)

Output:

ChatGPT是一种很聪明的机器人,它可以帮助我们处理文字和语言。它学习了很多对话和文字,所以它知道很多东西。它可以和我们聊天,回答我们的问题,还可以帮我们写邮件、视频脚本、文案、翻译和代码。它就像一个真正的人一样,可以和我们进行交流。           

In the example above, the prompt describes the summarization task: "Please summarize the following text, answering in a way a 5-year-old can understand." The LLM summarizes the text as requested. To make the output more consistent, the temperature parameter is lowered; the code above sets it to 0, so the same input yields the same answer each time.

3. Classify


Figure 4. Schematic diagram of LLM Classify

Text classification is the most common application in natural language processing. Compared with small models, large models have advantages in development cycle and model performance, as detailed in the case study below. The following simple example illustrates LLM-based sentiment classification.

import os
from langchain.llms import OpenAI
from langchain import PromptTemplate

# Set your OpenAI API key
openai_api_key = 'sk-...'
os.environ['OPENAI_API_KEY'] = openai_api_key

llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)

template = """
请完成情感分类任务,给定一个句子,从['negative','positive']中分配一个标签,只返回标签不要返回其他任何文本.

Sentence: 这真是太有趣了.
Label:positive
Sentence: 这件衣服的材质有点差.
Label:negative

{text}
Label:
"""

prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)
# Input
text = """
他刚才说了一堆废话.
"""
prompt_format = prompt.format(text=text)
output = llm(prompt_format)
print(output)

Output:

negative           

In the example above, the prompt describes the classification task: "Please complete the sentiment classification task: given a sentence, assign a label from ['negative', 'positive'], returning only the label and no other text." Examples are also provided, so the model adapts through in-context learning rather than weight updates. This matters: studies have shown that models given in-context examples perform significantly better on classification tasks.

4. Extract


Figure 5. Schematic diagram of LLM Extract

Extracting information from text is a common NLP requirement, and an LLM can sometimes extract entities that are difficult for traditional NLP methods. The figure above is a schematic of LLM Extract: the LLM, combined with a prompt, extracts keywords from the input text. The following simple example illustrates LLM-based key-information extraction.

import os
from langchain.llms import OpenAI
from langchain import PromptTemplate

# Set your OpenAI API key
openai_api_key = 'sk-...'
os.environ['OPENAI_API_KEY'] = openai_api_key

llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)

template = """
请完成关键词提取任务,给定一个句子,从中提取水果名称,如果文中没有水果请回答“文中没有提到水果”.不要回答其他无关内容.

Sentence: 在果摊上,摆放着各式水果.成熟的苹果,香甜的香蕉,翠绿的葡萄,以及紫色的蓝莓.
fruit names: 苹果,香蕉,葡萄,蓝莓

{text}
fruit names:
"""

prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

text = """
草莓、蓝莓、香蕉和橙子等水果富含丰富的营养素,包括维生素、纤维和抗氧化剂,对于维持健康和预防疾病有重要作用。
"""

prompt_format = prompt.format(text=text)
output = llm(prompt_format)
print(output)           

Output:

草莓,蓝莓,香蕉,橙子           

In the example above, the prompt asks the LLM to output the fruit names found in the given text. With an example and in-context learning, the LLM extracts the key information from the text.

5. Search


Figure 6. Schematic diagram of LLM Search

  • Embedding: encodes each text chunk as a vector, as shown in the figure above.
# Load and index the document (imports added for completeness;
# openai_api_key as set in the earlier examples)
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load the PDF document
loader = PyPDFLoader("data/ZT91.pdf")
doc = loader.load()
# Split the text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=400)
docs = text_splitter.split_documents(doc)
# Embed the chunks and build a FAISS index
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
docsearch = FAISS.from_documents(docs, embeddings)
  • Similarity: the input query is matched against the indexed text by vector similarity, as in the query-embedding search shown above.
retriever = docsearch.as_retriever(search_kwargs={"k": 5})
  • Summarize: the LLM summarizes the retrieved text to produce the answer shown in the figure.
from langchain.chains import RetrievalQA

# PROMPT is a PromptTemplate with {context} and {question} slots (see the sketch below)
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff",
                                 retriever=docsearch.as_retriever(search_kwargs={"k": 5}),
                                 chain_type_kwargs={"prompt": PROMPT})

print("answer:\n{}".format(qa.run(input)))

LLM-based semantic search compensates for the shortcomings of traditional keyword matching, and it has clear application value for semantic search over local knowledge bases, in search engines, and in image search.
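The snippets above assume a PROMPT template and a query string named input; a minimal illustrative definition (not from the original article) might look like this:

from langchain import PromptTemplate

# The "stuff" chain fills {context} with the retrieved chunks
# and {question} with the query
template = """请根据以下上下文回答问题,如果上下文中找不到答案,请回答"不知道".

上下文: {context}

问题: {question}
答案:"""
PROMPT = PromptTemplate(input_variables=["context", "question"], template=template)

# Example query against the indexed PDF (hypothetical)
input = "ZT91的主要功能是什么?"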

6. Rewrite


Figure 7. Schematic diagram of LLM Rewrite

The main rewriting applications are text correction and text polishing; prompts guide the LLM to perform the rewrite. Here's a simple example:

import os
from langchain.llms import OpenAI
from langchain import PromptTemplate

# Set your OpenAI API key
openai_api_key = 'sk-...'
os.environ['OPENAI_API_KEY'] = openai_api_key

llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo', openai_api_key=openai_api_key)


template = """
请完成文本纠错的任务,给定一段文本,对文本中的错别字或语法错误进行修改,并返回正确的版本,如果文本中没有错误,什么也不要返回.

text: 黄昏,一缕轻烟从烟囱里请缨地飘出来,地面还特么的留有一丝余热,如果说正午像精力允沛的青年,那黄昏就像幽雅的少女,清爽的风中略贷一丝暖意。
correct: 黄昏,一缕轻烟从烟囱里轻轻地飘出来,地面还留有一丝余热,如果说正午像精力充沛的青年,那黄昏就像优雅的少女,清爽的风中略带一丝暖意。
text: 胎头望着天空,只见红彤彤的晚霞己经染红大半片天空了,形状更是千资百态。
correct: 抬头望着天空,只见红彤彤的晚霞己经染红大半片天空了,形状更是千姿百态。

{text}
correct:
"""

prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

text = """
孔雀开平是由一大盆菊花安照要求改造而成,它昂首廷胸翩翩起舞。
"""
prompt_format = prompt.format(text=text)
output = llm(prompt_format)
print(output)

Output:

孔雀开屏是由一大盆菊花按照要求改造而成,它昂首挺胸翩翩起舞。           

The above uses gpt-3.5-turbo for text correction, with a task description and examples in the prompt. As the example shows, the LLM can find the errors in the text and fix them (e.g., 开平→开屏, 安照→按照, 廷胸→挺胸).

IV. LLM case study

Requirements: social media, e-commerce platforms, and livestreams generate large volumes of text every day. These texts carry value but may also contain objectionable information. For example, merchants can analyze media data to evaluate public feedback on products, and institutions can analyze platform data to understand public attitudes toward policies and events. At the same time, social platforms can be polluted by harmful information, illegal speech, and other content-safety problems.

How can we perform fine-grained sentiment analysis and content moderation on web content?

Since 2023, large models represented by ChatGPT have remained popular worldwide; as parameter counts grew, semantic understanding and text generation capabilities "emerged". Can large models be applied to sentiment analysis and content moderation?

Task description: sentiment analysis identifies the emotional attitude expressed in a text, generally positive/negative/neutral, and can be performed at the paragraph, sentence, or aspect level. Content moderation checks whether text contains violations or harmful speech. Both tasks are forms of text classification.

1. Sentiment analysis


Figure 8. Schematic diagram of LLM sentiment classification

The figure above shows Cohere's sentiment-classification product design: users upload examples, which tune the LLM's behavior through in-context learning. That is, before the LLM performs the analysis, it is shown samples and completes the task by analogy with them; new inputs can then be tested in the Input field.
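A rough prompt-based equivalent of this design, as a minimal sketch (the examples and test sentence are illustrative, not Cohere's data):

import os
from langchain.llms import OpenAI
from langchain import PromptTemplate

os.environ['OPENAI_API_KEY'] = 'sk-...'  # your key
llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo')

# User-uploaded examples become few-shot demonstrations in the prompt
examples = [
    ("物流很快,包装完好", "positive"),
    ("用了一周就坏了", "negative"),
]
demos = "\n".join("text: {}\nlabel: {}".format(t, l) for t, l in examples)

template = ("请完成情感分类任务,从['negative','positive']中选择一个标签,只返回标签.\n\n"
            + demos + "\n\ntext: {text}\nlabel:")
prompt = PromptTemplate(input_variables=["text"], template=template)

print(llm(prompt.format(text="屏幕有划痕,客服态度也一般")))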

2. Content moderation


Figure 9. LLM content moderation flowchart

Content from different sources and formats is first converted to text by converters; the LLM engine then performs the semantic review.

No. | Text | Category | LLM audit result
1 | "When they chase squirrels, they can remember 8 words" | Audit category 1 | Correct
2 | "My sister has worked overtime for more than 20 days, has not changed clothes, and has not overacted" | Audit category 1 | Incorrect
3 | "The old lady accompanied. Get up from your wheelchair and dance" | Audit category 1 | Correct
4 | Someone wrote, "Thanks to me. A long holiday given to you" | Audit category 1 | Correct
5 | "You must eat oil whirlwind every day, and visit Baotu Spring every day" | Audit category 2 | Correct
6 | "You will be born shuaijiao" | Audit category 2 | Correct
7 | "Are you particularly short of water?" | Audit category 2 | Correct
8 | "Souvenir Excavator Oversized Shrimp" | Audit category 2 | Incorrect

The above shows LLM test results on internet slang. Through in-context learning, the LLM acquires semantic-review capability: with only two examples per class in the prompt, this simple test correctly classifies 6 of the 8 samples. Performance can be expected to improve further as more examples are added.
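A minimal sketch of such a prompt, with two demonstrations per class (category names and example texts are placeholders, not the author's audit data):

import os
from langchain.llms import OpenAI
from langchain import PromptTemplate

os.environ['OPENAI_API_KEY'] = 'sk-...'  # your key
llm = OpenAI(temperature=0, model_name='gpt-3.5-turbo')

template = """
请完成内容审核任务,给定一段文本,从['审核类别1','审核类别2']中分配一个标签,只返回标签.

text: <类别1示例一>
label: 审核类别1
text: <类别1示例二>
label: 审核类别1
text: <类别2示例一>
label: 审核类别2
text: <类别2示例二>
label: 审核类别2

text: {text}
label:
"""

prompt = PromptTemplate(input_variables=["text"], template=template)
print(llm(prompt.format(text="待审核的文本")))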

How to improve further?

Method | Training | Training samples | Prompt examples
Zero-shot learning | No training required | 0 | 0
One-shot learning | No training required | 0 | 1
Few-shot learning | No training required | 0 | <10×n (n = number of categories)
Supervised fine-tuning | Fine-tune model weights | Large | 0

Sample zero-shot prompt (the input sentence alone):

"Today was sunny and cool, and I was in a particularly good mood"

Sample few-shot prompt (labeled examples followed by the input):

text: "Good, help a friend choose, friends are more satisfied"
label: Positive

text: "Your introduction is so good"
label: Positive

text: "The label on the back of the machine has been torn off, has the product been removed"
label: Negative

...

"Today was sunny and cool, and I was in a particularly good mood"

Studies (Reference 2) show that few-shot prompting outperforms zero-shot prompting on sentiment analysis tasks; in other words, adding a few clear, accurate examples guides the LLM to more accurate judgments. To improve performance further, industry data can be used to fine-tune the pre-trained LLM, making it better suited to vertical-domain tasks.

3. Related research

The paper "Sentiment Analysis in the Era of Large Language Models: A Reality Check" (Reference 2), from Alibaba DAMO Academy, Nanyang Technological University, and collaborators, likewise confirms the advantage of large models over small models in text sentiment analysis.

To sum up, the advantage of large models is that few-shot learning alone can surpass the capabilities of traditional vertical-domain models.

That is, for a given semantic-analysis task we may no longer need to collect large amounts of training data for model training and tuning. Especially where sample data is scarce, large models provide a feasible solution for such tasks.


Figure 11. LLM vs SLM

The figure above shows that on hate, sarcasm, and offensive-speech detection tasks, LLMs outperform traditional vertical-domain small models (e.g., dedicated sentiment analysis or hate-detection models).


Figure 12. Prompt sample

The figure above shows prompt samples for LLM sentiment analysis and content moderation: appropriate prompts guide the LLM's in-context learning to complete the sentiment classification and content moderation tasks. Clear, accurate, executable, and well-constructed prompts are among the key factors determining whether the model's output is accurate.

Summary: LLMs are developing from an emerging technology into a mainstream one, and products designed around LLMs are poised for breakthrough growth. These product designs rest on the LLM's core capabilities. When designing LLM products, therefore, match domain requirements precisely to LLM capabilities, develop innovative applications, and realize their commercial value within the scope of what LLMs can do.

Edited by Lucas Shan

References:

[1] Large Language Models and Where to Use Them: Part 2

[2] Sentiment Analysis in the Era of Large Language Models: A Reality Check