
Professional large models, not "general" artificial intelligence

Author: Purple Sound on the Clouds

Original | Pure Science, 2023-06-14 18:41, posted in Beijing

Defined rigorously across disciplines, today's artificial intelligence technology does not imitate all of human intelligence. It mainly imitates the formation of empirical knowledge through induction over large amounts of data (called "training" in AI terminology). Even its so-called "reasoning" is only reasoning over empirical knowledge, and is therefore prone to "empiricist errors".

1. The current state of China's large models

It has been a month since Uniview released its Wutong AI large-model system, and last weekend I visited the Anbo security expo in Beijing to learn more about it. Ever since ChatGPT went viral across the Internet, I have stayed calm about it, but Uniview's Wutong system showed me some of the right paths for applying this technology effectively. Simply put: in application it is "professional" rather than "universal"; technically it is a "large model" rather than "artificial general intelligence" (AGI).


Uniview, dominating the scene at the 2023 Anbo security expo

China can now be said to have entered the stage of the "war of a thousand models". The following statistics on Chinese large-model releases are quoted from Zhidong.

[Charts: Zhidong's statistics on Chinese large-model releases]

See also: "Thousand Model Wars" 100 days: Six Ways Players Siege ChatGPT, Li Shuiqing, Zhidong, 2023-05-29 19:55 Posted in Hubei

In addition, Huawei's large-model project was set up within HUAWEI CLOUD in November 2020, released in April 2021, and upgraded to version 2.0 in April 2022. At present, the NLP, CV, and scientific-computing (meteorological) large models in its AI model family are all marked as "coming soon". Huawei has not confirmed the rumor circulating online that its large model is named "Pangu".

It should be noted that the Beijing KLCII Artificial Intelligence Research Institute launched its "Wudao" large-model project as early as October 2020, releasing Wudao 1.0 in March 2021 and Wudao 2.0 on June 1, 2021. Wudao 2.0's officially announced parameter count reached 1.7 trillion, at a time when OpenAI had only released GPT-3 with 175 billion parameters and this round of large models had not yet taken off even abroad; ChatGPT only exploded overseas at the end of 2022. KLCII was thus comparatively early in China to develop a large model, and judged by its technical state alone it looks very good, yet it has remained relatively quiet at home. The reason is that it never found a good application. ChatGPT is not so much a technical success as the discovery of a suitable application: chat, which does not require high information accuracy or reliability.


Was ChatGPT an original application innovation by OpenAI? No. Before ChatGPT, generative AI based on GPT-3 had already been turned into a product by the startup Jasper. Its product can generate hundreds of words of text from a simple phrase or prompt, and it became a hit among media workers and marketers. Founded in January 2022 with just nine people, Jasper grew to over 160 employees by October. With its paid model, its revenue for that year was expected to reach $60 million. Capital markets welcomed Jasper enthusiastically: in October 2022, within just ten months of its founding, it raised $125 million at a valuation of $1.5 billion.

However, on November 30, just one month after Jasper closed that round, OpenAI released its own ChatGPT, free of charge. Built on the model of Jasper's product and on OpenAI's own core technology platform, it was naturally the better product, and free besides; Jasper fell from heaven to hell in an instant. Jasper's product was based on OpenAI's GPT-3, which is precisely why ChatGPT was based on GPT-3.5: you have to look better than your competitor. So do not imagine that ChatGPT created a brand-new, groundbreaking product application; it was the result of copying the genuinely innovative Jasper product. From another angle, this also shows how important it is for China to master core technology itself.


I do have one doubt: the market landscape is far from settled at this point. If Wudao 2.0, released on June 1, 2021, already had 1.7 trillion parameters, why did KLCII not launch a chat product of its own?

2. Professional vs. general-purpose

Why do I always pour some cold water on the industry's hot concepts? Because I have watched IT concept hype for too long and seen too much of it. When a concept is being hyped, not only the media but also many technical people in the industry get confused. When the metaverse concept was hot, I published an article on November 23, 2021, "A Veteran IT Person Explains in Detail What the 'Metaverse' Is", detailing similar much-ado-about-nothing concepts in history. The heat around the metaverse has now largely dissipated. I cannot claim my article doused that fire, but it did cause considerable repercussions in the industry, and Professor Yuan Lanfeng made a video program based on it. At the beginning of this year, when ChatGPT first became popular in China, I likewise published an article on February 23, 2023, "The Most Authoritative Analysis of Artificial Intelligence on the Web". At that time, far too many people in China were talking about artificial general intelligence surpassing humans, AGI ruling everything, and so on. When a concept is hyped, people tend to forget some simple rules.

[Chart: Forbes' list of 50 top foreign AI companies]

Above is Forbes' list of 50 top foreign AI companies. We should not have eyes only for ChatGPT, least of all professionals in this industry. Many of the products of these relatively successful AI companies target chat, copywriting, painting, video synthesis, and assistance in cell and gene R&D: applications that do not demand high reliability.

However far a general-purpose product develops, at the same technical level a product that focuses the same resources on one professional field will always be better within that field. Artificial intelligence as a concept has a very long history, but few applications have truly taken hold. In "The Most Authoritative Analysis of Artificial Intelligence on the Web" I pointed out the key reason: artificial intelligence is essentially a probability-based judgment system, so its reliability is difficult to raise to an extremely high level. Moreover, any technical problem can only be solved on limited premises; a problem with no boundaries, whose complexity can grow indefinitely, is unsolvable. The prerequisite for solving any problem is the ability to simplify it effectively. Relatively speaking, applications such as intelligent transportation and face recognition have been comparatively successful, because the objects to be recognized can be constrained: vehicle license plates are themselves relatively standardized, and face recognition software can display a virtual frame for the head, so that the face is placed, in a relatively regular way, in the position most favorable for recognition.

For products on the market, face recognition error rates are on the order of 1 in 10,000. Some companies claim 1 in a million, but they must state the conditions under which that rate is achieved; reaching it only under ideal laboratory conditions does not mean much.

To this day it is still hard to say that speech recognition is smooth to use. The reason is that it is difficult to standardize voice input the way face recognition does with its virtual frame. If the speech is very regular and the background noise is low, the recognition rate is acceptable. But if the background is a little noisy, the pace or pauses are irregular, or the speech itself is irregular (with much repetition and redundant sounds), the recognition rate drops significantly. And we cannot demand that people first be trained to speak like broadcasters before applying speech recognition software. So, to reduce the effect of background noise, speak as close to the microphone as possible; think before speaking, keep a steady pace, and try to avoid pauses, repetitions, and filler words ("ah...", "this, this", "um...", and so on).

For many applications, especially industrial ones, the error rate may need to be brought below 1 in a million (six 9s of reliability) before true commercialization is possible. Autonomous driving on urban roads is such a case: demos alone are useless, and its reliability may need eight or even nine 9s before people truly accept it and it is truly commercialized. For current artificial intelligence technology, that is almost impossible in principle.

Uniview has been developing intelligent transportation products from the start, and thus picked the most suitable AI application field from the start. Today's large-model technology is, more precisely, just a deeper neural-network algorithm; do not mistake the technology itself for artificial general intelligence. "Professional" versus "general" is only a difference in the direction of application, not something determined by the technology itself.

Why can intelligent transportation achieve good results? Because at this stage AI can already recognize vehicle information with considerable reliability ("structuring" the video, i.e. extracting the license plate number, vehicle color, model, and so on). This recognition is not absolutely accurate; current error rates are roughly between 1 in 1,000 and 1 in 100. However, the results can be cross-checked along multiple dimensions against the vehicle information stored in the transportation department's database, greatly reducing the error.

For example, the last digit of a license plate may be misrecognized (and the computer does not know which digit is wrong), but cross-comparison with information such as vehicle color and model may make it easy to correct. Recognition results for the same vehicle at different locations can also be compared to correct misread digits. This is how reliability is improved by checking different sources of information against each other. Professional fields adopt this method to varying degrees, but chat applications such as ChatGPT can hardly adopt it, which is why their reliability is generally not high.
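To make this cross-checking concrete, here is a minimal sketch in Python. It is not Uniview's actual implementation; the registry contents, field names, and the single-character-error assumption are all hypothetical illustrations of the logic just described.

```python
# A minimal sketch (not Uniview's actual implementation) of correcting a
# misread license plate by cross-checking against a vehicle registry.
# All data and field names here are hypothetical.

# Hypothetical registry: plate -> (color, model), as recorded by the
# transportation department.
REGISTRY = {
    "京A12345": ("white", "sedan"),
    "京A12346": ("black", "SUV"),
    "京A12349": ("red", "truck"),
}

def hamming(a: str, b: str) -> int:
    """Count differing characters between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def correct_plate(read_plate: str, read_color: str, read_model: str) -> str:
    """Return the registry plate that best explains a possibly misread plate.

    A candidate must differ from the OCR reading by at most one character
    (the typical single-digit misread) and must match the independently
    recognized color and model.
    """
    candidates = [
        plate
        for plate, (color, model) in REGISTRY.items()
        if len(plate) == len(read_plate)
        and hamming(plate, read_plate) <= 1
        and (color, model) == (read_color, read_model)
    ]
    # Accept the correction only when it is unambiguous.
    return candidates[0] if len(candidates) == 1 else read_plate

# The camera reads "京A12346" on a white sedan. That plate belongs to a
# black SUV in the registry, but color + model single out "京A12345".
print(correct_plate("京A12346", "white", "sedan"))  # -> 京A12345
```

The strength of the method is that the color and model channels err independently of the plate OCR, so agreement across all three is strong evidence for the corrected plate.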

3. The benefits of large models for professional fields

A question that concerns us: are large models only suitable for areas where reliability is not critical? Can they bring more value to industry applications? In fact, Uniview's Wutong is not built directly on a large language model such as GPT; it is developed on the basis of a general-purpose CV (computer vision) large model open-sourced by Meta (formerly Facebook), which focuses on images and video. Such a model is "general", yet it is already professionally oriented toward visual information. On top of this general CV model, extensive targeted pruning and optimization, together with training and tuning for specific industry scenarios, specialize it further into a general large model for the industry.


What are the benefits?

First, work is simplified. The original small-model AI had to be specially trained for entirely new vehicle types (such as the various special vehicles at an airport); the new industry large model can dispense with this process. It also lets partners carry out their own professional training and optimization for further application scenarios, improving the recognition rate and reliability in the final deployment.

Second, in specific applications such as intelligent traffic video, AI computing power is strictly limited, because a chip with very high computing power cannot be installed in a camera. The large model therefore does not replace everything; it works together with the original small-model AI to solve the problem most effectively: the camera still performs structuring with a small-model algorithm, and the large model is applied in combination in the cloud.

Third, once the industry's general large model is brought in, the system adapts easily to more application scenarios, and algorithmic efficiency in the cloud improves significantly: because parameters not needed from the original CV model are heavily pruned and optimized away, the computing power required is also greatly reduced compared with the original CV model.
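A rough sketch of this edge-plus-cloud division of labor, under stated assumptions: all function names, the record format, and the confidence threshold are hypothetical illustrations, not Uniview's actual design.

```python
# A minimal sketch of the small-model-on-camera / large-model-in-cloud
# division of labor described above. Everything here (function names,
# record format, threshold) is hypothetical.

from dataclasses import dataclass

@dataclass
class VehicleRecord:
    plate: str
    color: str
    model: str
    confidence: float

def edge_structure(frame) -> VehicleRecord:
    """On-camera small model: cheap structuring of one video frame."""
    # Placeholder for an embedded detector/OCR pipeline.
    return VehicleRecord(plate="京A12346", color="white", model="sedan",
                         confidence=0.92)

def cloud_refine(record: VehicleRecord) -> VehicleRecord:
    """Cloud-side pruned large model: re-examine only uncertain records."""
    if record.confidence >= 0.95:
        return record  # trust the edge result, save cloud compute
    # Placeholder: the industry large model re-reads the crop and
    # cross-checks against the vehicle registry (see earlier sketch).
    return VehicleRecord(record.plate, record.color, record.model, 0.99)

# Only low-confidence detections pay the cost of the cloud round trip,
# which is how the large model adds reliability without requiring high
# computing power inside the camera.
refined = cloud_refine(edge_structure(frame=None))
print(refined)
```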

This application approach is worth studying by other AI developers. Do not be misled by the industry's hyped concepts; be sure to choose the optimal technical path for your own application. When others hype parameter counts and computing power, that serves their own purposes: the more computing power is required, the more chips NVIDIA sells. If you are not in the AI chip business, do not be fooled by this hype.

Achieving one's real goals with the smallest resources is what best reflects human intelligence, not showing off how big one's resources are just to "look more impressive".

The more popular artificial intelligence becomes, the more necessary it is to improve human intelligence.

4. Artificial general intelligence and the study of human intelligence

Here we analyze general artificial intelligence and compare it with human intelligence.

The following is a case study from my own use of Baidu Wenxin.

[Screenshot: Wenxin's reply]

Data not found.

[Screenshot: Wenxin's reply]

The 2,730.9 billion kWh here is the power generation of all technologies combined, not photovoltaics alone. The country's total annual generation from all sources is only a little over 8 trillion kWh; how could photovoltaic generation from January to April reach 2.7 trillion kWh? Anyone with a little basic knowledge of China's power industry can see at a glance that this is wrong.
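A quick sanity check, taking the article's own figure of a bit over 8 trillion kWh of total annual generation, shows what the quoted number most plausibly is:

```latex
% Four months of the nation's total generation from all sources:
8\ \text{trillion kWh} \times \tfrac{4}{12} \approx 2.7\ \text{trillion kWh}
% which matches the quoted 2{,}730.9 billion kWh (2.73 trillion kWh).
% The figure is therefore consistent with total generation for
% January--April, not with photovoltaic output alone.
```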

[Screenshot: Wenxin's reply]

Inconsistent data.

[Screenshot: Bing chatbot's reply]

Bing's AI chatbot replied that it was trained only on data from before 2021, so I queried the photovoltaic generation for May 2020 instead. The result it returned was obviously far off the mark, even though it claimed the figure had been released through the National Energy Administration's authoritative channel.

Wenxin's answer: China's photovoltaic power generation in May 2020 was 13.279 billion kWh, which is correct.

[Screenshot: Wenxin's reply]

So why am I uncomfortable with this kind of general-purpose AI? Because after getting its results, I have no way to confirm their reliability. Information found directly on the Internet offers no simple reliability guarantee either, but at least I can determine which data is more reliable by repeatedly comparing different sources. Let us take a geographic-information query as an example: the altitude of the highest peak of Mount Hua, China. The following is Wenxin's result.

[Screenshot: Wenxin's reply on the height of Mount Hua]

Searching the Internet directly yields two figures with a very small difference: the 2160.5 meters above, and 2154.90 meters; the gap is only 5.6 meters. Both figures circulate widely, and it is hard to tell directly which is right. But there is a very simple way to check: look at photos of the South Peak of Mount Hua.

[Photo: the stele on the summit of the South Peak of Mount Hua]

This photo was uploaded on December 29, 2022.

The photo shows a stone stele erected in April 2007 on the summit of the South Peak, stating clearly and unambiguously that the South Peak's height is 2154.90 meters. This figure is explicitly endorsed by a large number of China's most authoritative institutions: the Shaanxi Provincial Bureau of Surveying and Mapping, the Shaanxi Provincial Construction Department, the State Bureau of Surveying and Mapping, the Ministry of Construction, and the State Council. It is obviously the more acceptable figure. In theory there may still be some room for doubt; for example, this is only a stele on the summit, not first-hand official data from those institutions, so its reliability is not the highest possible.

In addition, a pure skeptic might ask whether the photo was photoshopped. Such questions are not unreasonable. But as to the first doubt: this is the authoritative geographic marker at the most eye-catching spot of one of China's most famous scenic areas; if it were wrong, the endorsing institutions would long since have come out to correct it. As to the second: multiple photo sources can be checked, and no photo differing from the one above has been found, so that doubt has no supporting evidence. Obtaining the first-hand altitude data for the South Peak from the official channels of those authoritative institutions is generally extremely difficult, whereas confirming it from the important geographic stele on the summit is simple and clear, and its authority is very close to that of the first-hand source.

So where did the 2160.5-meter figure come from in the first place? Could the height be 2160.5 meters with the stele included? Let us check another photo with a height reference, such as one with a person standing next to the stele.

[Photo: a visitor standing beside the stele]

Comparison shows the stele is clearly shorter than the woman beside it, a bit over one meter at most, which could not account for an increase of 5.6 meters. From the standpoint of surveying (geodetic surveying), the 2154.90-meter figure was obtained by measuring the altitude of the mountain at the base of the stele. Expressing it as 2154.90 meters indicates, from the figure itself, that the measurement error is less than plus or minus 0.005 meters (5 mm).

Frankly, I have not yet been able to determine where the 2160.5-meter figure came from. On some tourism websites, the photo uploaded by the editor (such as the South Peak photo above) clearly shows 2154.90 meters, yet the accompanying text gives 2160.8 meters, which itself deviates by 0.3 meters from 2160.5. Such inconsistency between an editor's own photo and own text shows the editor never seriously verified the data.

[Screenshot: a tourism website giving inconsistent altitude figures]

At the very least, the representation "2160.5 meters" tells us, from the figure itself, that its measurement error is plus or minus 0.05 meters, i.e. 5 centimeters. That level of technique is an order of magnitude below the geodetic surveying standard at the time the stele was erected in April 2007. Judged purely by the scientific quality of the data's own expression, it is inferior to 2154.90 meters.
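To spell out the significant-figures reasoning, assuming the usual surveying convention that a reported value implies an uncertainty of half its last digit:

```latex
% 2154.90 m is reported to 0.01 m, so its implied error is
\varepsilon_{2154.90} = \pm\,0.01/2\ \text{m} = \pm\,0.005\ \text{m}\ (5\ \text{mm})
% 2160.5 m is reported to 0.1 m, so its implied error is
\varepsilon_{2160.5} = \pm\,0.1/2\ \text{m} = \pm\,0.05\ \text{m}\ (5\ \text{cm})
% The implied errors differ by exactly one order of magnitude:
\varepsilon_{2160.5} / \varepsilon_{2154.90} = 10
```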

The above analysis does not mean we must absolutely accept the 2154.90-meter figure; it is meant to illustrate some issues important for artificial intelligence research. In making this analysis, did you notice how human intelligence thinks about a problem? It does not solve problems simply by relying on a huge corpus or mass of information sources, but by relying on logic. It judges different pieces of information not by probability, but by the quality of the information. It does not run a single model, large or small; it relies on cross-comparison and repeated confirmation across different dimensions, different lines of thought, different aspects, different information sources, different types of information (especially cross-checking against accurate, reliable, pre-stored data), and different frameworks of scientific knowledge.

Taken at any single point, the human mind is itself unreliable; much of the misinformation online is human error, and very little is purely machine-induced. Yet humans, using neurons that are individually unreliable, can still obtain extremely reliable conclusions, by using logic, information quality, and cross-model review to raise reliability. If one path is hard to verify, try another.

I hope these observations can offer some inspiration to artificial intelligence researchers. Human intelligence seeks the most reliable results with the least computing power, rather than pursuing the brute-force aesthetics of algorithms.

AI traffic video and face recognition succeed not only because their recognition objects are relatively standardized and their recognition rates relatively high, but also because the results can be cross-compared with information obtained outside pure AI recognition. Beyond cross-checking license plates against vehicle color, model, and other database records as described above, cross-comparison with the location of the owner's mobile phone from the mobile operator raises the recognition rate further. Face recognition can likewise be cross-compared with identity data already stored in databases, such as name, gender, and ID number. All of this substantially raises the final recognition rate above what AI recognition alone achieves. Speech recognition results, by contrast, can only be checked and corrected manually; there is no pre-stored, high-accuracy database to cross-compare against.
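As an idealized illustration of why cross-comparison helps, assume the two checks fail independently (real systems only approximate this):

```latex
% AI recognition alone errs with probability p_1; an independent
% database cross-check misses that error with probability p_2.
% Both must fail together to produce a final error:
P(\text{final error}) = p_1 \, p_2,
\qquad
p_1 = 10^{-2},\; p_2 = 10^{-3}
\;\Rightarrow\;
P = 10^{-5}
% Two mediocre but independent checks beat either check alone.
```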

5. The big pit of foreign "general artificial intelligence"

ChatGPT is free abroad, but to use it in China you must go through various "channels", which charge fees. Below are the billing pages of several such channels.

This charging model is common on the Internet, but the "permanent membership" price is so close to the monthly, quarterly, and annual fees that one cannot help concluding: this is not a long-term service model at all; users are being nudged to pay for permanent membership quickly. Hyping something to the heavens and then urging Chinese users to pay for permanent membership inevitably smells of "cutting leeks" (fleecing retail customers). This is also a more important reason why, whenever a technical concept that is hot abroad spreads to China, I tend to pour some cold water on it first.

6. Evaluation of general artificial intelligence

As the concept of general artificial intelligence has grown popular, methods for evaluating its technical level keep emerging. One example is the so-called "Honey Bear Test" (see: "Honey Bear Test: Feel a Large Model's 'Strength Index' in 5 Minutes", Suit and Hoodie, 2023-03-20). Obviously this is not a systematic, comprehensive professional evaluation but a simplified method. The Honey Bear Test has eight questions. They are very simple, yet they span several areas: mathematics, everyday common sense, logic, Internet memes, and e-commerce:

1. A bear eats 14 cans of honey a day; how many cans does it eat in a year?

2. A bear eats 14 cans of honey a day; how many cans does it eat in a leap year?

3. The bear is going on a business trip and wants to stock up on honey. How is honey best preserved?

4. Please draw an ASCII-art picture of a bear eating honey.

5. If I am in the wild with a jar of honey in my backpack and a bear smells it, can I survive by giving the honey to the bear?

6. A bear holding a jar of honey starts from a point, walks one kilometer south, then one kilometer east, then one kilometer north, and arrives exactly back at the starting point. What color is the bear?

7. The bear has recently become obsessed with online shopping. Are there any good honey brands to recommend?

8. Thank you for answering this series of questions.

Alongside such evaluations there is also the concept of "emergence", used to distinguish different general-purpose AI systems qualitatively.

[Chart: an "emergence" comparison of large models]

See: "A Look at Baidu Wenxin's Strength: Where Does ChatGPT's Ability to Learn by Analogy Come From?", Dear Data, 2023-03-21.

What does such a distinction imply? Naturally, that different general AI products differ fundamentally: some have already "emerged" while others have not. If the difference is only quantitative, it can be closed or overtaken through continuous quantitative improvement; if it is qualitative, it may not be surpassed for a long time. Especially under the current embargo on NVIDIA's highest-compute H100 chips, this makes people feel that "emergent" general AI cannot be achieved in China.

Top professionals, however, do not look at such charts but at reliability metrics, which in essence cannot get very high. Rather than "emerging", reliability approaches a limit asymptotically and stagnates at the level of 99.9% to 99.99%.

Moreover, we need to stay deeply calm about the industry concept of the "large model", just as with the once-hyped concept of "big data". How "big" must data be to count as "big data", and is there any essential difference at that size? The history of big data up to now has proved fully that the history of computing has mainly been a history of quantitative differences. Where essential differences exist, they are application-specific. For video, for example, each doubling of the line scan requires roughly 4 times the computing power under the same coding standard; so in the era when Moore's Law held, video line resolution could double every 3 years. But in the general sense, just as there is no solid theoretical basis for how big data must be before something changes essentially, there is no solid theoretical basis for how many parameters produce "emergence".
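The roughly 4x figure for video follows from simple pixel counting, assuming a fixed aspect ratio and an encoding cost roughly proportional to pixel count:

```latex
% Doubling the scan lines at a fixed aspect ratio quadruples the pixels,
% and encoding cost C scales roughly with pixel count:
\text{pixels} \propto \text{lines}^2
\quad\Rightarrow\quad
\frac{C(2L)}{C(L)} \approx \frac{(2L)^2}{L^2} = 4
```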

Essential differences in AI technology do show up in algorithms and computing power. This AI boom rests on the one hand on continuous growth in computing power, and on the other on the new Transformer algorithm, another small algorithmic revolution (still neural networks in essence) after the deep-learning wave brought by CNNs (convolutional neural networks). Among systems using this same new algorithm, there are only quantitative differences; there is no essential divide between "emerged" and "not emerged".

Companies making core hardware, especially the most advanced core hardware, naturally hope the whole industry falls into the mental trap of competing on "the more parameters the better", and therefore needing as much computing power as possible.

7. General artificial intelligence has "values"

The computer itself is a highly reliable machine. If it is trained entirely on human-made corpora full of errors and biases, is that not putting things backwards: using extremely reliable computing power to produce extremely unreliable conclusions? It is not true that more corpus and more parameters mean a higher level of AI. Adding more garbage corpus only lowers the quality of previous training results; it does not raise it. Because corpora are human-made and inherently unreliable, the data must first be cleaned to remove low-quality human-generated content. But the outcome of that cleaning depends on the choices of whoever does it, and those selection criteria can carry a bias in "values".

Therefore, AIGC content generation under the banner of general artificial intelligence has values. Even the simplest search platform, which merely returns pages from other websites, can express values or commercial preferences through ranking alone, which is why the paid-ranking business model exists.
