
How to Become an LLM Prompt Master! "The Underlying Mental Model of Large Language Models"

Author: AIGC Frontier

This article draws on a prompt taxonomy summarized by a well-known overseas expert, in the hope of giving everyone a deeper understanding of LLM prompt engineering. The content covers 3 LLM prompt types and 4 LLM capability classifications.

(Video and PPT address attached at the end of the article)

3 types of prompts

  • Reductive
  • Transformational
  • Generative/Expansion

4 ability classifications

  • Bloom's Taxonomy
  • Latent content
  • Emergent Capabilities
  • Hallucination is a feature, not a bug

Let's take a closer look at these categories:

Unleash the potential of large language models

Large Language Models (LLMs) like GPT-4 and Claude have captured the interest of technologists and the general public alike. Their capacity to generate human-like text and hold conversations can feel like a plot out of science fiction. However, as with any new technology, there is still a lot of confusion and controversy about exactly how LLMs work.

In this post, I aim to provide a high-level classification of the key competencies of LLMs to clarify what they can and cannot do. My goal is to explain the current state of LLMs in a way non-experts can understand, while identifying areas for further research and development. Ultimately, I believe LLMs have great potential to augment human intelligence if guided in an ethically responsible direction.

What is an LLM?

First, what is a large language model? At the most basic level, an LLM is a deep neural network trained on large amounts of text data such as books, websites, and social media posts. "Large" means these models have billions of parameters, enabling them to build very complex statistical representations of language.

The key task an LLM is trained on is predicting the next word, or token, in a sequence given the preceding context. So if it sees the text "The cat jumped over the...", it learns to predict that "fence" is a likely next token. Repeating this process over and over gives the LLM implicit knowledge of how language works and of the relationships between words and concepts.
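To make this concrete, here is a minimal sketch of next-token prediction using the open GPT-2 model via the Hugging Face transformers library. GPT-2 is an assumption for illustration (GPT-4 and Claude are not openly available), but the mechanism is the same:

```python
# Minimal next-token prediction sketch, assuming the transformers and
# torch packages are installed. GPT-2 stands in for larger models here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The cat jumped over the", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# The last position's logits score every vocabulary token as a possible
# continuation; print the five most likely ones.
top = torch.topk(logits[0, -1], k=5)
for token_id, score in zip(top.indices, top.values):
    print(repr(tokenizer.decode(int(token_id))), float(score))
```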

This training process, combined with massive datasets, allows models like Claude and GPT-4 to absorb a great deal of world knowledge. It is important to understand, however, that LLMs have no explicit knowledge or hand-coded rules; all their abilities result from recognizing patterns in the training data.


Basic operations of LLMs

At a high level, LLMs have three main modes of operation:

  • Reductive operations: from large to small. >
  • Transformational operations: maintain size and/or meaning. ≈
  • Generative (expansion) operations: from small to large. <

Let's dive in.

Reductive operations

LLMs exhibit strong competence at reductive operations. These involve taking a large chunk of text, or many documents, as input and compressing them into a leaner output. Reductive tasks leverage the language-modeling strengths of LLMs to identify and extract the most important information.

A common reductive operation is summarization. Given a lengthy input text, an LLM can generate a concise summary covering only the key points. It does this by analyzing the document for the topics, events, and facts described, then synthesizing those elements into a short summary that conveys the core essence of the full document. LLMs are generally very good at basic summaries of limited length, removing unnecessary detail while preserving semantic meaning.

A task related to summarization is refinement. This goes beyond merely shortening the document: it extracts and purifies its underlying principles, findings, or facts. Refinement aims to filter noise and redundancy out of the input, distilling the core knowledge or claims. For a scientific document, an LLM might identify and synthesize the key hypotheses, results, and conclusions of the experiments. Refinement requires a deeper understanding in order to distinguish peripheral content from core assertions.

Extraction is another reductive technique. It involves scanning text and pulling out specific pieces of information. For example, an LLM can read a document and extract names, dates, figures, or other specific data. Extraction is the foundation of question answering, and LLMs excel in this area: when asked to extract specific details from a piece of text, they can usually retrieve the requested information accurately.

Overall, reductive operations play directly to the strengths of large language models. Their statistical learning lets them identify and convey the most important parts of lengthy input text. As LLMs continue to develop, techniques such as summarization, refinement, and extraction will only become more powerful.

Here are a few ways to apply reductive operations:

  • Summarization: say the same thing in fewer words, e.g. lists, notes, executive summaries.
  • Refinement: purify the underlying principles or facts by removing all noise, e.g. extracting axioms or foundations.
  • Extraction: retrieve specific kinds of information, e.g. question answering, listing names, pulling out dates.
  • Description: describe what the text contains, e.g. characterizing the text as a whole or within a specific topic.
  • Analysis: find patterns or evaluate against a framework, e.g. structural analysis, rhetorical analysis.
  • Evaluation: measure, grade, or judge the content, e.g. grading essays, evaluating against ethical standards.
  • Critique: give feedback on the text, e.g. suggestions for improvement.
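
To make this concrete, here is a minimal sketch of a reductive prompt (summarization plus extraction) sent through the OpenAI Python client. The client setup, model name, and prompt wording are illustrative assumptions; any chat-style LLM API works the same way:

```python
# A reductive prompt: compress a long document and pull out specifics.
# Assumes the openai package (v1 API) and an OPENAI_API_KEY environment
# variable; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
document = open("report.txt").read()  # the lengthy input to reduce

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a careful summarizer."},
        {"role": "user", "content": (
            "Summarize the following document in three bullet points, "
            "then extract every name and date it mentions:\n\n" + document
        )},
    ],
)
print(response.choices[0].message.content)
```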

Transformational operations

In contrast to reductive tasks, transformational operations aim to reshape or restructure the input text without significantly shrinking or expanding it. LLMs demonstrate a strong ability to re-present content in new formats and styles while retaining the overall meaning.

A common transformation technique is reformatting: changing how the text is presented without changing its informational content. For example, an LLM can easily turn prose into movie-script dialogue, convert a blog post into a Twitter thread, or rewrite a passage from active to passive voice. Reformatting leverages the model's structural understanding of genres and language conventions.

Translation between natural languages is another key transformational capability. Given input text in one language, LLMs can rewrite it in another, preserving semantic consistency while replacing the vocabulary and grammar. Translation quality varies between language pairs, but improves as more multilingual training data is used.

Paraphrasing also falls under transformational operations. Here, the LLM aims to restate the input text using entirely different vocabulary and phrasing while conveying the same underlying meaning. This tests the model's ability to generate multiple logically equivalent syntactic variants. Paraphrasing has applications in plagiarism detection and in improving clarity.

Finally, restructuring content for better fluency and organization also fits the definition of a transformation. LLMs can rearrange paragraphs, strengthen logical connections, present ideas in a better order, or improve the readability of text. Their training gives the models the ability to construct coherent narratives and arguments.

In summary, transformational capabilities enable LLMs to remix and re-present text to meet different needs. These techniques are very helpful for tasks such as tailoring content to specific audiences and overcoming language barriers. LLMs already excel at many transformations and will only become more proficient.

Here are a few ways to apply transformational operations:

  • Reformatting: change only the presentation, e.g. prose to movie script, XML to JSON.
  • Refactoring: achieve the same result more efficiently, e.g. saying the same thing but expressing it differently.
  • Language conversion: translate between languages, e.g. English to Russian, C++ to Python.
  • Restructuring: optimize the structure for logical flow, e.g. changing the order, adding or removing structure.
  • Revision: rewrite the text with a different intent, e.g. changing the tone, formality, diplomacy, or style.
  • Clarification: make something easier to understand, e.g. elaboration or disambiguation.
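
As a concrete illustration, here is a hedged sketch of a transformational prompt that reformats prose into JSON without changing its information content (same assumed client setup as the earlier sketch; the sample text is invented):

```python
# A transformational prompt: same information, new presentation.
# Assumes the openai package and an API key; model name is a placeholder.
from openai import OpenAI

client = OpenAI()
prose = "Alice, 34, is a cardiologist in Boston. Bob, 41, teaches math in Denver."

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            "Reformat this text as a JSON array of objects with keys "
            "name, age, occupation, and city. Output JSON only.\n\n" + prose
        ),
    }],
)
print(response.choices[0].message.content)
```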

Generative (expansion) operations

Unlike reductive and transformational tasks, which manipulate existing text, generative operations synthesize entirely new content from scratch. This places higher demands on the creativity of LLMs, and the results are more variable and context-dependent. Still, their skill at generative writing is growing rapidly.

A major generative application is drafting original documents from high-level prompts, such as stories, articles, code, or legal documents. Given an initial instruction, the LLM works to expand that seed into a coherent draft with thematic, logical, and stylistic fluency. The results may be a bit rough, but a model like GPT-4 can produce impressive first drafts suitable for human refinement.

Given a set of parameters or design goals, LLMs are also good at generating plans and the steps to achieve them. This planning ability rests on their capacity, picked up from training data, to infer causal sequences of actions that lead to desired outcomes. Planning also draws on their latent knowledge of how the world works. LLMs can come up with plans for everything from kitchen recipes to software workflows.

LLMs' more open-ended generative capabilities include brainstorming and ideation. Given a prompt or creative brief, the model can produce a list of possibilities, concepts, and imagined solutions for humans to curate. When brainstorming, it can branch in unexpected directions based on the statistical associations between words. The most promising ideas an LLM generates can then be developed further.

Finally, LLMs demonstrate strong generative capabilities in expanding or elaborating existing text. Given a seed paragraph or document, they excel at growing the content with additional relevant details and explanations. This allows concise text to be extended organically by drawing on the model's latent knowledge, enriching a skeleton text in creative ways.

Together, these methods show that, given appropriate context, LLMs can synthesize large amounts of new text from a small amount of input. Generative writing does not come as naturally as reductive or transformational tasks, but it is an area of active research with promising results. Guiding LLMs to generate responsible and rich content will be an ongoing design challenge.

Here are a few ways to apply generative (expansion) operations:

  • Drafting: generate a draft of some type of document, e.g. code, fiction, legal text, knowledge bases, science writing, narrative.
  • Planning: given parameters, produce a plan, e.g. actions, projects, goals, tasks, constraints, context.
  • Brainstorming: imagine possibilities, e.g. ideation, exploring options, problem solving, hypothesizing.
  • Expansion: articulate and explain something further, e.g. expanding, elaborating, riffing on an idea.
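
Here is a hedged sketch of a generative prompt that combines drafting, planning, and brainstorming from a one-line seed (same assumed client setup; the seed and wording are only examples):

```python
# A generative (expansion) prompt: a short seed grows into a draft, a
# plan, and brainstormed ideas. Assumes the openai package and API key;
# the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
seed = "A weekend workshop teaching retirees to use smartphones."

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            "For the event below, draft a one-paragraph description, "
            "then a five-step plan to run it, then brainstorm three "
            "possible names for it.\n\n" + seed
        ),
    }],
)
print(response.choices[0].message.content)
```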

Bloom's taxonomy of cognitive levels

Bloom's taxonomy is a classic educational framework outlining six levels of cognitive skill involved in learning. First proposed in the 1950s, it provides a hierarchy of competencies from basic to advanced: remembering, understanding, applying, analyzing, evaluating, and creating. Examining LLMs through Bloom's taxonomy highlights their multifaceted abilities.

At the most basic level, LLMs excel at memorizing and retrieving factual knowledge contained in their training datasets. Models like GPT-4 have "read" far more text than anyone can digest in a lifetime. This enables them to repeat information on almost any topic when prompted. Their statistical learning acts as a large-scale knowledge base that can be queried.

LLMs also demonstrate a strong ability to understand concepts, making connections between words and meanings. Their contextual learning gives them a powerful capacity to comprehend everything from abstract philosophy to advanced physics. Even for complex topics not directly covered in the training data, LLMs can often infer meaning through context and interpretation.

Applying knowledge to new situations is also within the reach of LLMs. After all, their entire purpose is to generate useful applications of language, be it writing, translation, or conversation. Properly using and adapting to context is essential for the model to perform any task accurately; without skillful application, they would be of no practical use.


Within Bloom's taxonomy, analyzing information by making connections represents another strength of LLMs. Tools like Claude can already analyze text along multiple dimensions, such as structure, style, and the coherence of arguments. With the right framework, an LLM can critically analyze almost any passage using its learned cognitive abilities.

LLMs are also good at evaluating and judging content when given appropriate criteria. Models can readily assess text against frameworks such as reading level, target audience, grammar, and reasoning. More advanced LLMs may even critically evaluate the ethics of a proposed action.

Finally, LLMs demonstrate the highest level of Bloom's taxonomy: creating original content. While generation requires more careful prompting than reductive tasks, models can produce stories, articles, dialogues, and other creative works. Their emergent capabilities give LLMs great generative potential under the right circumstances.

In summary, modern LLMs exhibit some degree of competence at every level of Bloom's taxonomy. Their vast latent knowledge and learned cognitive skills combine to support everything from recalling facts to imaginative creation. As LLMs continue to evolve, we can expect their abilities across Bloom's levels to become stronger and more multifaceted.

Latent content

One of the most compelling aspects of large language models is their ability to demonstrate knowledge and reasoning that was never explicitly programmed. This stems from the extensive latent knowledge accumulated in the parameters of a model like GPT-4 through the predictive training process.

The latent knowledge embedded in LLMs can be loosely divided into three categories:

  1. Training data: the vast amount of text consumed during training gives the model factual knowledge about countless topics. For example, Claude latently encodes information about history, science, literature, current events, and more from its training corpus. This acts as a massive knowledge base that can be queried with the right prompts.
  2. World knowledge: beyond concrete facts, LLMs accumulate more general world knowledge about how things work. Their exposure to diverse contexts lets models absorb unstated assumptions about culture, physics, cause and effect, and human behavior. This makes intuitive reasoning about everyday situations possible.
  3. Learned cognitive skills: prediction-based training also gives the model latent capabilities such as summarization, translation, and open-domain question answering. These abilities emerge indirectly from the self-supervised objective rather than from hard-coded rules.

This reserve of latent knowledge is a game-changer for AI. However, channeling and extracting it remains challenging: the right prompts or techniques are often required to activate the relevant parts of the model. Figuratively, latent knowledge forms a dense forest that takes skillful navigation.
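
As a small illustration of this "navigation," the same question can surface shallow or deep latent knowledge depending on how the prompt frames it. A hedged sketch (the framing and wording are only examples, under the same assumed client setup as earlier):

```python
# The same question, asked bare and then with an expert framing that
# steers the model toward the relevant region of its latent knowledge.
# Assumes the openai package and API key; model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("Why does bread rise?"))  # often a brief, generic answer

print(ask(
    "You are a food chemist. Explain step by step the fermentation and "
    "gas-trapping processes that make bread rise, naming the key reactions."
))
```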

While promising, the reliance on latent knowledge also highlights the limitations of current LLMs. Their reasoning often depends on human intuition to determine which knowledge is required. More advanced techniques will be needed before models can better index and activate their own latent knowledge.

Overall, the range of knowledge and skills implicit in a model like GPT-4 is impressive. As LLMs continue to evolve, interpreting, organizing, and selectively utilizing this reserve will become an important area of research. Latent knowledge unlocks possibilities that go far beyond hard-coded rules.

Emergent capabilities

The largest language models developed to date exhibit emergent capabilities that go beyond anything explicitly contained in their training data. These advanced capabilities arise from the interaction of large model scale, extensive data, and prediction-based learning.

Four examples of emergent capabilities:

  1. Theory of mind: LLMs demonstrate some ability to recognize that different perspectives exist between themselves and others. Models like Claude can adapt tone and style to the conversational context, recognize confusion, and distinguish their own knowledge from the human's. These signs of a "theory of mind" may have emerged from modeling countless social conversations. Arguably, Reddit helped: LLMs learned to understand the human mind by reading the comment sections, so what could go wrong...
  2. Implicit cognition: the way LLMs appear to "think" before generating each token implies latent cognitive abilities not directly present in the training data. When predicting the next word, models seem to perform dynamic reasoning, abstraction, and inference. Accurately modeling causal chains requires cognitive processes such as induction, deduction, and analogy.
  3. Logical reasoning: LLMs also demonstrate some skill at deductive and inductive reasoning, making inferences from the information provided. Their statistical learning lets them connect concepts and generalize abstract principles. While limited, goal-directed reasoning seems to be an emergent byproduct of modeling causal chains in text.
  4. In-context learning: large models demonstrate an ability to absorb new information and skills by incorporating the prompt context into their predictions. Without explicit training, they can use information and instructions never seen in the original training data. This rapid, in-context acquisition of knowledge and competence is not directly built in; in humans we would call it "improvisation," a hallmark of high intelligence. A minimal few-shot sketch follows this list.
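
Here is that minimal few-shot sketch of in-context learning: the "training" consists entirely of examples placed in the prompt, with no change to the model's weights (same assumed client setup; the labeling task is invented for illustration):

```python
# In-context (few-shot) learning: the model infers the task from two
# examples in the prompt alone. Assumes the openai package and API key;
# the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

few_shot_prompt = """Convert each sentence into a terse status label.

Sentence: The deployment finished without errors.
Label: SUCCESS

Sentence: The build failed on step 3.
Label: FAILURE

Sentence: Tests are still running on the CI server.
Label:"""

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": few_shot_prompt}],
)
print(response.choices[0].message.content)  # e.g. "PENDING" or similar
```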

These emergent abilities stem from recognizing complex patterns in human discourse, not from hand-coded rules. They hint at the possibility that future LLMs may move from pattern recognition toward deeper reasoning, imagination, and causal understanding. However, significant limitations remain that require further research and development.

Creativity and hallucination

An LLM's ability to make up plausible-sounding statements may seem like a flaw, but it actually reflects a core feature of intelligence. Just as humans evolved imagination and creativity alongside risks such as hallucination, AI systems must develop generative capabilities while taking precautions.

Humans exhibit a continuum between creativity and hallucination; both stem from the same neural source: spontaneous pattern generation. The earliest cave art exercised this faculty by combining animal features into novel creatures. Unconstrained, the same faculty can manifest as psychological disorder when imagination overrides reality. LLMs exhibit a similar spectrum of speculative generation, and it is necessary for intelligence.

Completely suppressing unpredictable "hallucinations" would also eliminate creative potential. The ideal is not elimination but responsible channeling. Research into alignment, ethics, and societal benefit will allow AI creativity to flourish.

Reducing risk starts with staying connected to facts and reality, for example by injecting real-world data from out-of-band systems. LLMs can be gently encouraged to verify their statements in order to stay connected to the truth. Checking claims against references or data, rather than allowing unrestricted speculation, provides a key constraint.

It is also important to transparently convey the level of confidence in generated text. When imagination goes beyond observation, the LLM should signal uncertainty in order to maintain trust. Techniques such as verifiability scoring could help quantify the gap between speculation and factual knowledge.
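
One rough, prompt-level stand-in for such scoring is simply instructing the model to tag its own claims with confidence labels. This is a hedged sketch, not the verifiability-scoring research the text alludes to (same assumed client setup; the tagging convention is invented):

```python
# Surfacing uncertainty at the prompt level: ask the model to tag each
# factual claim with a confidence label. Assumes the openai package and
# API key; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
question = "When was the first transatlantic telegraph cable completed?"

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": (
            "Answer the question. After each factual claim, append "
            "[high], [medium], or [low] to indicate your confidence "
            "in that claim.\n\n" + question
        ),
    }],
)
print(response.choices[0].message.content)
```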

Overall, responsible AI development should embrace, not reject, capabilities like speculation and metaphor. Handled with care, these abilities can enhance human creativity and problem-solving rather than spread misinformation. The solution is to recognize the continuum between creativity and hallucination, then carefully nurture the former while minimizing harmful hallucinations.

Conclusion

Large language models represent a technological breakthrough that opens new avenues that promise to augment human intelligence. However, to reach their full potential, it is necessary to delve into how they actually work and take ethical precautions.

This article has aimed to provide an accessible classification of the basic capabilities and limitations of large language models. Reductive, transformational, and generative operations play to different strengths of current models. Latent knowledge is powerful but depends heavily on prompting to activate. And emergent properties like reasoning show the potential of future systems.

Undoubtedly, the risks associated with misuse of large language models should not be underestimated. But rather than shutting them out, deeming them too dangerous or uncontrollable, the responsible way forward is to direct research and development toward beneficial applications that truly enhance human capabilities and creativity. If carefully guided in the coming years, large language models could help humanity break new ground.

Video introduction: https://youtu.be/aq7fnqzeaPc?si=yFvCfCfIPWi0p21o

LLM prompt taxonomy PPT: https://github.com/daveshap/YouTube_Slide_Decks/blob/main/Business%20and%20Product/LLM%20Prompt%20Taxonomy.pdf
