When a large model predicts, why must the next token be text?

Author: QbitAI (量子位)

By Ming Min and Jin Lei, reporting from Aofeisi

QbitAI (量子位) | WeChat official account: QbitAI

Too fast, too fast...

The generative abilities of large models have reached a level that ordinary people can no longer follow!

A model can generate a user's physical examination reports for one, two, and three years ahead, based on that user's reports from the past five years.

Notice how closely this generation process resembles ChatGPT, which predicts the next word from the words that came before.

It can look at how a unit's sub-components have run over the past 7 days and generate hourly sub-component reports for the next 3 days.

It can also take historical hydrological data, together with meteorological data for the next 7 days, and generate hourly precipitation analysis reports for day 1, day 2, and so on through day 7, including detailed precipitation amounts and precipitation distribution.

What large models generate today is no longer limited to text, images, and video.

The reports described above draw on a great deal of professional knowledge, and ordinary people can hardly judge from their own knowledge whether they are reasonable or correct.

At most, one can only say: "I don't understand it, but it sounds impressive!"

"AI seems to be generating everything."

LLM + industry data: the wrong path?

Put simply, a large model does one thing: Predict the Next "X". ChatGPT is Predict the Next "Word".

But what industry needs is often not a prediction of the next word.

Health management planning for patients with chronic diseases, for example, has to rest on a series of physiological indicators and on data predictions made from a medical perspective. To use an imperfect analogy, this is more like solving a math problem.

Feeding a large volume of professional medical text into a large language model is more like approaching that math problem with the methods of a language class. The model may understand the terminology and the indicators, but the forecasts it gives are likely to be inaccurate, because the problem itself lies outside the scope of "language" and cannot be solved by linguistic methods.

If the modality of "X" changes from "Word" to "Physical Examination Report", the model can predict the next physical examination report from historical physical examination report data. That is a large health management model.

Its logic is closer to the proverb "plant melons and you get melons; plant beans and you get beans". That is, feed in "X" and get "X" out.

The "X" here may include different forms of professional data such as hydrological data, health reports, equipment monitoring values, design deductions, etc.
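The "enter X, output X" idea can be sketched as one autoregressive loop that does not care what "X" is. The snippet below is only a toy illustration under stated assumptions: the function names, the lookup table, the indicator names, and the "repeat the last change" rule are all hypothetical stand-ins, not how any real industry model (including the one discussed in this article) makes predictions.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

X = TypeVar("X")


def rollout(history: Sequence[X],
            predict_next: Callable[[Sequence[X]], X],
            steps: int) -> List[X]:
    """Autoregressive loop: predict one item, append it, predict again."""
    seq = list(history)
    for _ in range(steps):
        seq.append(predict_next(seq))
    return seq[len(history):]


# Case 1: X = "Word". A toy lookup table stands in for a language model.
NEXT_WORD = {"sow": "melons", "melons": "get", "get": "melons"}


def next_word(words: Sequence[str]) -> str:
    return NEXT_WORD.get(words[-1], "<unk>")


# Case 2: X = "report", a dict mapping indicator name -> value.
Report = Dict[str, int]


def next_report(reports: Sequence[Report]) -> Report:
    # Toy rule (assumption): repeat the most recent change in each indicator.
    prev, last = reports[-2], reports[-1]
    return {name: last[name] + (last[name] - prev[name]) for name in last}


words = rollout(["sow"], next_word, steps=3)
reports = rollout(
    [{"systolic": 118, "weight_kg": 70}, {"systolic": 120, "weight_kg": 71}],
    next_report,
    steps=3,
)
print(words)
print(reports)
```

The same `rollout` function serves both cases; only the predictor and the data type change. In a real system the predictor would be a model trained on that modality's historical data rather than a hand-written rule.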

For example, given a concert hall's geometry and room data, such a model can emit 5000 Hz rays from the sound source, generate a ray distribution map, and find the sound source placement that gives the best listening experience.

How to predict "X"?

So how do you build large industry models that can predict the next X?

With the newly released Prophet AIOS 5.0. Its core capability is building industry foundation models from the X-modality data of each industry scenario.

It addresses a limitation of current industry models, which can only feed industry text data into a large language model and generate the next word, and in doing so lets large models reach a wider range of fields.

Prophet is the core product of 4Paradigm, an AI company. AIOS 1.0 was first released in 2015 and improved model accuracy through a high-dimensional, real-time, self-learning framework. In 2017, AIOS 2.0 lowered the threshold for model development with the automated modeling tool HyperCycle. In 2020, AIOS 3.0 standardized AI data governance and deployment to production. In 2022, AIOS 4.0 introduced the North Star metric to maximize the value of AI applications.

AIOS 5.0 proposes a new approach to industry models from the perspective of generative AI + industry.

In what is widely regarded as the first year of large-model applications, industry large models are bound to develop and exert influence at several times their earlier pace. This broadening trend is also shaping the next paradigm of AIGC.

One More Thing: Is AIGC moving toward a new paradigm?

From images, text, and video to health, water conservancy, and more, it is not hard to see that AIGC is racing toward AI generating everything.

Broadly speaking, progress in any field seems to be driven by paradigms. New paradigms do not replace old ones; rather, they complement each other and make the field deeper and more comprehensive.

Take the four paradigms of scientific research: experimental induction, theoretical deduction, computer simulation, and data-intensive scientific discovery. They complement each other and together advance scientific research.

Viewed through this lens, AIGC seems to be developing four similar paradigms of its own.

The first paradigm of AIGC centers on text generation. Through applications such as intelligent customer service and content continuation, it demonstrates AI's ability to understand and generate natural language. This stage laid the foundation for everything that followed by enabling machines to communicate and interact effectively with humans.

The second paradigm of AIGC extends the field of application to image generation.

Generative adversarial networks (GANs) and variational autoencoders (VAEs), for example, can learn mappings that turn random noise into realistic images. Their output can be used in art creation, image enhancement, virtual scene generation, and other fields. This paradigm further demonstrates AI's imagination.

The third paradigm of AIGC focuses on video generation, with examples such as Gen-2 and Sora.

To a certain extent, video generation reflects AI's understanding of the world. Ever since Sora appeared, people have debated whether it can understand the world and whether it is a world simulator. If Sora is judged to understand the world, that would mean the door to AGI has officially opened.

The fourth paradigm of AIGC is industry-oriented: the technology will penetrate fully into every industry.

The core task of this phase is to integrate AI technology deeply with industry knowledge. In this first year of large-model applications, AIGC technology has already begun to play an important role in key fields such as healthcare, education, and finance.

How can AIGC be brought into industry faster? Building on large language models and training industry models directly are different routes, each with its own underlying logic, and it is too early to say which is better.

But one thing is certain:

As AI comes to generate everything, the individuals and industries that take the lead in using AI technology will enjoy its dividends earlier. They will have the opportunity to lead the transformation of their industries and shape the social and economic landscape of the future.

Moreover, only when AIGC enters the fourth paradigm will the flywheel from technological innovation to commercial ventures be complete, with generative AI setting off a new-quality productivity revolution.
