Doubts, sell-offs, and price wars: has AI been "encircled and suppressed" by humans? | Titanium Media AGI

Titanium Media APP

2024-05-26 10:07 Posted from Jiangsu on the official Titanium Media App account

(Image source: edited and photographed by Titanium Media App)

Generative AI products and the industry behind them have recently drawn a wave of doubts.

On May 25, Beijing time, a Purdue University study came to light in which researchers analyzed ChatGPT's answers to 517 programming questions on Stack Overflow. It showed that 52% of the AI's answers contained misinformation, 77% were more verbose than human answers, and 78% were inconsistent with the human answers.

ChatGPT still has significant flaws, the researchers said: it often produces completely wrong answers out of thin air, and its error rate is alarmingly high.

At the same time, "AI Overview", the new AI search product Google released a few days ago, was found by netizens to make frequent mistakes in its answers, including suggestions such as "smearing glue on pizza" and "eating stones is good for your body". It even told users that everything on the Internet is 100% real, and it has been widely criticized as a result. In addition, a number of AI companies, including Adept (valued at more than $1 billion), Humane (valued at $750 million), and the AI unicorn Stability AI, have been reported to be exploring mergers, acquisitions, or outright sales.

Summary of price cuts in China's large-model "price war" in May (source: compiled by Titanium Media App)

Add to this the large-model API "price war" that Chinese companies have collectively joined over the past month, and it is clear that, as the new AI boom enters "deep water", AI has been "encircled and suppressed" by humans before it has had a chance to change the world.

AI products under question: ChatGPT's 52% error rate and Google's AI "crash"

First, consider the 52% error rate in ChatGPT's answers to programming questions.

According to Futurism, a study presented this month at an ACM computing conference by researchers at Purdue University in the United States showed that 52% of the programming answers generated by ChatGPT contained misinformation and 77% were too verbose.

In the 17-page paper, the researchers analyzed ChatGPT's answers to 517 programming questions on Stack Overflow in depth, examining their correctness, consistency, comprehensiveness, and conciseness. Although the answers often contained misinformation, study participants still preferred ChatGPT's responses 35% of the time because they were comprehensive and clearly expressed.

The authors' methodology had three parts: manual analysis, linguistic analysis, and a user study.

Manual analysis: ChatGPT's responses were manually compared with the accepted answers provided by human programmers on Stack Overflow.

Linguistic analysis: answers to 2,000 randomly selected Stack Overflow questions were examined using the LIWC tool and sentiment analysis.

User study: 12 programmers took part in a study of their preferences between ChatGPT and Stack Overflow answers.

In the end, the study found that ChatGPT's answers performed very well in many cases, but also contained frequent errors and were unnecessarily long.

At the same time, ChatGPT's responses have richer linguistic features, which leads some users to prefer them over human responses and sometimes to overlook their basic errors and inconsistencies. The data shows that participants overlooked the misinformation in ChatGPT's answers 39% of the time. This suggests a need to counter misinformation in ChatGPT's answers and to raise awareness of the risks of answers that merely look correct.

In addition, users can often correctly distinguish ChatGPT responses from human ones, relying on cues such as formal language, structured writing, response length, or unusual errors to determine the source of an answer. The paper also discusses the challenges and risks of using ChatGPT in programming, and proposes opportunities to design new interactions and computational methods to counter the misleading information ChatGPT generates.

Some analysts believe the paper's findings are significant for understanding how ChatGPT is applied in programming and where its potential problems lie, and that they offer guidance for future research and practice.

Second, Google's AI "crash" has attracted attention.

Google recently announced at its I/O developer conference that it had added a new feature to its search product called "AI Overview", which displays a short answer directly at the top of the search results. For example, if you search for "how to clean leather boots", you immediately see a multi-step cleaning guide that combines information from the web. The feature lets you get information quickly without having to go through individual pages.

However, after less than two weeks in use, AI Overview has drawn heavy criticism from netizens for its frequent "outrageous" wrong answers, and users still have no option to turn the feature off.

For example, when a user searched for "cheese not sticking to pizza", AI Overview suggested adding "about 1/8 cup of non-toxic glue" to the sauce. Some netizens found that this suggestion appears to originate from an obscure Reddit comment posted 11 years ago.

In another example, a user asked, "Is all the information you see online true?" AI Overview replied, "Yes. Everything on the internet is 100% real."

Sometimes AI Overview presents incorrect information in a way that misleads users into believing it is the opinion of an authoritative medical expert or scientist.

For example, when a user asked "how many rocks should I eat per day", the tool incorrectly quoted a UC Berkeley geologist as saying, "People should eat at least one small rock a day", and went on to list the benefits of eating rocks, such as that rocks are a great source of vitamins and minerals for the human body and that eating them is good for digestive health.

When asked whether Google search violated antitrust laws, AI Overview replied: "Yes, the U.S. Department of Justice and 11 states are suing Google for allegedly violating antitrust laws."

In response, a Google spokesperson said, "The vast majority of AI Overviews provide high-quality information and provide more links to web pages for users to learn more. We've noticed that many of the examples are uncommon queries, and some have been tampered with or can't be reproduced. This feature was extensively tested before launch, and the company takes quick action to ensure the accuracy of the content where necessary."

AI Overview is not the only case. Gemini, an image generation tool that Google launched in February this year, also made frequent errors, which led to the tool being suspended. For example, a user shared on social media that when he asked about Google's founders, the tool showed an image of an Asian man.

Google said in a statement at the time that it was working to fix Gemini's image generation issues, acknowledging that the tool was "not up to par". Soon after, the company announced that it would immediately "pause the generation of human images" and "re-release an improved version soon", but the feature has not yet been rolled out again.

Finally, over the past two days the U.S. AI market has entered a new round of reshuffling, with several unicorns reported to be putting themselves up for sale.

According to reports, Adept, a large-model company founded by Transformer authors Ashish Vaswani and Niki Parmar together with David Luan, is up for sale. The company is valued at $1 billion and has previously raised $400 million in funding, with participation from Frontiers Capital, Microsoft, Nvidia and others. Adept is reported to have held talks with Meta.

Two of the company's co-founders, Ashish Vaswani and Niki Parmar, have since founded another company, Essential AI, which focuses on AI office automation.

At the same time, Humane, the company behind the much-hyped wearable AI Pin, is also reported to be working with a financial adviser to find a potential buyer, with a target price of $750 million to $1 billion. Previously, the company received hundreds of millions of yuan in financing support from Microsoft, Qualcomm, and OpenAI CEO Altman.

In addition, Stability AI, a pioneer in AI image generation and the creator of Stable Diffusion, was reported to have considered a merger, though the specific progress is unknown. AI search leader Perplexity was the subject of similar reports in January. However, after it officially announced a $73.6 million Series B round, the acquisition plan appears to have been shelved, and it was recently revealed to be seeking a new round of financing that may reach $250 million.

Whatever the reason, the large-model field has clearly entered a new round of reshuffling. According to PitchBook, about 26,000 startups around the world have raised a total of $330 billion in the past three years.

According to market analysts, investment in the generative AI industry is shifting direction: investment and financing at the model layer show an obvious "snowball effect", with resources concentrating among the leading players, while prospective capital is focusing on the application layer.

Sequoia Capital said at its recent AI Ascent 2024 event that in 2023 AI companies spent $50 billion on Nvidia GPUs, but generated only $3 billion in revenue.

Demis Hassabis, CEO of Google DeepMind, said bluntly that AI has been overhyped, valuations are too high, and the "bubble" needs a soft landing.

Zhu Xiaohu, managing partner of GSR Ventures, once pointed out that the business model of large models is too poor, even if the technology itself is not bad: every generation of the technology requires fresh investment. It may now cost tens of millions of dollars to build a version 3.5 model, hundreds of millions of dollars to iterate to version 4, and billions of dollars to reach version 5, while the monetization window for each generation may be only two or three years, "which is worse than a power plant."

American economist Tyler Cowen believes that the AI hype has subsided, but the revolution continues.

China's large-model price war intensifies, and the market faces a new round of reshuffling

While AI companies abroad face doubts and forced sales, competition in China's AI field has intensified: Alibaba, Tencent, ByteDance, Baidu, iFLYTEK, Zhipu, and DeepSeek have collectively joined a price war.

May 6: DeepSeek, a large-model startup incubated by the private fund High-Flyer Quant, kicked off the price cuts, setting the input price of DeepSeek-V2, a model benchmarked against GPT-4, at 1 yuan per million tokens.

May 12 and 13: Zhipu AI (with its GLM-3-Turbo model) and ModelBest (Mianbi Intelligence) traded price cuts, and the latter announced a "0 yuan purchase" offer.

May 15: Volcano Engine, ByteDance's cloud unit, announced that the main model of its Doubao large model family (formerly known as Skylark) is priced 99.3% below the industry level. The API input price of the Doubao model is 0.0008 yuan per 1,000 tokens, meaning that 1 yuan buys 1.25 million tokens.

May 21, morning: Alibaba Cloud announced price cuts for 9 Tongyi large models. Among them, the API input price of the main model Qwen-Long, whose performance is benchmarked against GPT-4, dropped from 0.02 yuan per thousand tokens to 0.0005 yuan per thousand tokens, a direct drop of 97%. That means 1 yuan buys 2 million tokens, roughly the amount of text in 5 copies of the "Xinhua Dictionary". The API input price of the recently released Qwen-Max, the "extra-large" model in the Tongyi Qianwen family, also dropped by 67%, to as low as 0.02 yuan per thousand tokens. On the open source side, the input prices of five open source models, including Qwen 1.5-72B and Qwen 1.5-110B, each dropped by more than 75%.

May 21, afternoon: Baidu announced that two of its large models, ERNIE Speed and ERNIE Lite, are now free.

May 22: Tencent announced a new large-model upgrade plan. One of its main models, hunyuan-lite, had its context length upgraded from 4K to 256K and was also made free of charge; the prices of its other models dropped significantly as well.

May 22, noon: iFLYTEK announced that the iFLYTEK Xinghuo API is officially open for free. The iFLYTEK Xinghuo Lite API is permanently free, and the iFLYTEK Xinghuo Pro/Max API is priced as low as 0.21 yuan per 10,000 tokens.
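The prices in this timeline are quoted in different units (per thousand, per ten thousand, and per million tokens), which makes them hard to compare at a glance. The following is a minimal Python sketch, not part of the original report, that normalizes the quoted input prices to "tokens per yuan" and recomputes the Qwen-Long cut. The function names are illustrative, and the prices are simply the figures cited above.

```python
from fractions import Fraction


def tokens_per_yuan(price_yuan: str, per_tokens: int) -> Fraction:
    """Tokens that 1 yuan buys when `price_yuan` is charged per `per_tokens` tokens."""
    return Fraction(per_tokens) / Fraction(price_yuan)


def percent_drop(old_price: str, new_price: str) -> Fraction:
    """Price cut expressed as a percentage of the old price."""
    old, new = Fraction(old_price), Fraction(new_price)
    return (old - new) / old * 100


# Input prices as quoted in the article: (price in yuan, tokens per pricing unit).
quoted_prices = {
    "DeepSeek-V2": ("1", 1_000_000),
    "Doubao": ("0.0008", 1_000),
    "Qwen-Long": ("0.0005", 1_000),
    "iFLYTEK Xinghuo Pro/Max": ("0.21", 10_000),
}

for model, (price, unit) in quoted_prices.items():
    # int() truncates, so fractional tokens are dropped for display.
    print(f"{model}: about {int(tokens_per_yuan(price, unit)):,} tokens per yuan")

print(f"Qwen-Long cut: {float(percent_drop('0.02', '0.0005'))}%")
```

By this arithmetic the Doubao and Qwen-Long figures match the 1.25 million and 2 million tokens per yuan stated above, and the Qwen-Long cut works out to 97.5%, which the announcement reports as 97%.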

Industry views on the price war are divided: some believe it is good for the development of the large-model market, while others believe the AI "bubble" is about to burst and a new round of reshuffling is coming.

Liu Weiguang, senior vice president of Alibaba Cloud Intelligence Group and president of its public cloud business, said that the purpose of the price cut "must be to benefit the market" and "to really accelerate the early outbreak of the market".

Tan Dai, president of Volcano Engine, said the main reason for the price cut is that the capabilities of the industry's large models have improved greatly this year, so building applications has become very important; in other words, the ecosystem must flourish. Tan said that many of the customers he is currently in contact with are experimenting with large models, but the risk of innovation is very high, especially in AI, so costs must come down to promote wider use. From this point of view, both large enterprises and individuals need large models that cost less and perform better.

But Bloomberg analysts Robert Lea and Jasmine Lyu argue in their latest report that "China's AI sector has a long road ahead to profitability. An industry reshuffle could drive the sector to profitability, but in an industry with excess capital, this seems unlikely to happen anytime soon."

Kai-Fu Lee, CEO of 01.AI, told Titanium Media App that the cost of inference will fall tenfold every year, but that reckless price-cutting is lose-lose.

"Because API-based model usage is still very low today, if inference costs fall tenfold a year, many more people can use it, which is very good news." On the other hand, Kai-Fu Lee believes that in the current domestic market, with POCs (proofs of concept) worth hundreds of thousands and single deals worth millions of dollars, every order taken is an order that loses money. "In the early days of the AI 1.0 era, we saw too much of this and invested a lot in it. (Now) we resolutely will not do (money-losing business)."

Wang Xiaochuan, founder and CEO of Baichuan Intelligence, told Titanium Media App that a free price does not necessarily make a product competitive. The large-model price war will accelerate the bubble cycle and directly push some companies off the track.

"Let's not mix this up with the consumer (To C) side," Wang Xiaochuan said bluntly, adding that this kind of price war has nothing to do with consumer products. At the same time, once models are free, the entire To B market will prosper faster, because everyone is more willing to try them and there is room to create value, but companies at the tail end will exit the track.

Wang Xiaochuan emphasized, "After the tide rises and falls, pearls will be left behind, but there is bound to be a bubble here. This will accelerate the bubble cycle and make the market more prosperous, which is inevitable in business."

On the whole, the product doubts, the companies putting themselves up for sale, and this round of price wars among China's large AI models all show that the industry is facing a new round of reshuffling and a cooling of the boom. However, the rapid change that AI technology brings remains important for every industry.

According to the latest data from market research firm IDC, global generative AI spending will reach 40.3 billion US dollars in 2024, of which generative AI infrastructure, models and platforms, applications, and services will account for 45.41%, 11.66%, 15.63%, and 27.30% respectively. By 2027, global annual generative AI spending will reach 151 billion US dollars, accounting for 29% of global AI spending.
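To make the IDC percentages concrete, the sketch below converts the 2024 shares into dollar amounts and derives the total AI spending that the 2027 figure implies. It is an illustrative calculation based only on the numbers quoted above; the variable and function names are assumptions, and the implied 2027 total is derived arithmetic, not a figure stated by IDC.

```python
from decimal import Decimal

# Global generative AI spending in 2024, in billions of US dollars (as quoted).
TOTAL_2024 = Decimal("40.3")

# Share of 2024 spending by segment, in percent (as quoted).
SHARES_2024 = {
    "infrastructure": Decimal("45.41"),
    "models and platforms": Decimal("11.66"),
    "applications": Decimal("15.63"),
    "services": Decimal("27.30"),
}


def segment_spend(total, shares):
    """Split a total into per-segment amounts using percentage shares."""
    return {name: total * pct / 100 for name, pct in shares.items()}


def implied_total(part, share_pct):
    """Total implied when `part` is `share_pct` percent of it."""
    return part * 100 / share_pct


spend = segment_spend(TOTAL_2024, SHARES_2024)
for name, amount in spend.items():
    print(f"{name}: about ${amount:.1f} billion")

# 151 billion USD is quoted as 29% of global AI spending in 2027.
total_ai_2027 = implied_total(Decimal("151"), Decimal("29"))
print(f"implied global AI spending in 2027: about ${total_ai_2027:.0f} billion")
```

On these figures, infrastructure alone accounts for roughly 18.3 billion dollars of 2024 spending, and the 2027 numbers imply total AI spending of a little over 520 billion dollars.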

IDC's analysis holds that in 2024 the number of foundation large models in China will shrink, and competition will gradually shift toward industry-specific large models aimed at real-world deployment. Open source and closed source will coexist across model frameworks, developer tools, foundation large models, and deployment and inference tools. At the same time, as Apple, Xiaomi, Honor and other manufacturers have successively released chips or models that support on-device AI inference, bringing AI to the device has become a trend among terminal manufacturers; on-device inference can deliver higher processing efficiency, better privacy protection, and a new kind of user experience. IDC expects that through 2025 the opportunities in generative AI will remain in infrastructure, that the shift to generative AI platforms and solutions will come in 2025-2026, and that opportunities in generative AI services will take off fully after 2026.

Xiao Youdan, a researcher at the Institute of Science and Technology Strategy Consulting of the Chinese Academy of Sciences, said that the large-model companies that survive this shakeout will see a new round of rapid development opportunities.

Demis Hassabis, co-founder of Google DeepMind, predicts that artificial general intelligence (AGI) is on track to arrive by 2030.

(This article was first published on Titanium Media App, author | Lin Zhijia, editor | Hu Runfeng)
