
Five large models tackle college entrance examination math: Alibaba's Tongyi Qianwen and 360 Wisdom Brain get all 10 questions wrong and score 0

Produced | Sohu Technology, Sohu Education

Operations Editor | Zheng Wenqi

Every year during college entrance examination season, the exam questions draw public attention, and they have also become a touchstone for AI ability. How solid is the mathematical foundation of large AI models? Are they "smarter" than humans? Sohu Technology tested five large AI models on the same questions from the 2023 Shanghai college entrance examination mathematics paper.

For the test, Sohu Technology selected the first 10 fill-in-the-blank questions of the examination paper and asked Baidu Wenxin Yiyan, Alibaba's Tongyi Qianwen, the iFLYTEK Xinghuo Cognitive Model, 360 Wisdom Brain, and ChatGPT to answer them.

The results show that the five large models differ significantly in their ability to answer math problems.

iFLYTEK Xinghuo, the "smartest" of the five, answered 5 questions correctly, an accuracy of 50%. Baidu Wenxin Yiyan and ChatGPT followed closely behind, each answering 4 questions correctly, an accuracy of 40%.

360 Wisdom Brain and Tongyi Qianwen were "wiped out": neither answered a single question correctly, the equivalent of handing in a blank paper.
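
The accuracy figures above follow directly from the number of correct answers out of 10. As a minimal sketch (the variable and function names are illustrative and not part of the original test), the tally can be reproduced like this:

```python
# Number of correct answers out of 10, as reported in the article.
TOTAL_QUESTIONS = 10
correct = {
    "iFLYTEK Xinghuo": 5,
    "Baidu Wenxin Yiyan": 4,
    "ChatGPT": 4,
    "360 Wisdom Brain": 0,
    "Tongyi Qianwen": 0,
}


def accuracy(n_correct, total=TOTAL_QUESTIONS):
    """Return the share of correct answers as a percentage."""
    return 100 * n_correct / total


# Accuracy per model, keyed by model name.
rates = {model: accuracy(n) for model, n in correct.items()}

# Print the models from highest to lowest accuracy.
for model, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {correct[model]}/{TOTAL_QUESTIONS} correct ({rate:.0f}%)")
```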

It is worth mentioning that yesterday Sohu Technology also had five large language model products write the essay from the college entrance examination National Paper A, and invited five well-known Chinese-language teachers to score the results.

The scores show that ChatGPT wrote the highest-scoring college entrance examination essay. Wenxin Yiyan and iFLYTEK Xinghuo scored slightly lower but were at the same level as ChatGPT. 360 Wisdom Brain and Tongyi Qianwen scored the lowest.

Chinese composition and mathematical calculation test different dimensions of a large model's ability. Coincidentally, though, 360 Wisdom Brain and Tongyi Qianwen, which did poorly at essay writing, do not seem to be good at math problems either.

Taken together, these two college entrance examination tests also indicate that the abilities of the large models are indeed uneven. If ChatGPT, Wenxin Yiyan, and iFLYTEK Xinghuo are the "top students", then 360 Wisdom Brain and Tongyi Qianwen are firmly at the "bottom of the class".

So how many of these questions can you get right? You are welcome to leave your answers in the comments; the answer key for the paper is available by private message.

Attached are the Shanghai college entrance examination math questions used in the test:

1. The solution set of the inequality |x-2|<1 is __

2. If a=(2,3), b=(-1,2), then a·b=__

3. For the geometric sequence with first term 3 and common ratio 2, the sum of the first six terms S6=__

4. If tanA=3, then tan2A=__

5. The range of f(x)={2^x, x>0; 1, x≤0} is __

6. If the complex number z=1-i, then |1+iz|=__

7. If the area of the circle x^2+y^2-4y-m=0 is π, then m=__

8. If the lengths of the three sides of a triangle are a=4, b=5, c=6, then sinA=__

9. A certain place reports its GDP (in 100 million yuan) for the four quarters of a year. The first-quarter GDP is 232 and the fourth-quarter GDP is 241. GDP increases quarter by quarter, and the median and the mean of the four quarterly figures are equal. The annual GDP of the place is __

10. (1+2023x)^100+(2023-x)^100=a0+a1x+a2x^2+...+a100x^100. If a_k<0, the maximum value of the positive integer k is __
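
Readers who want to check their own answers to the purely computational questions could use a short script along the following lines. This is only a sketch using the Python standard library: it covers Questions 2, 3, 4, 6, 8 and 10 as printed above, it reads the condition in Question 10 as a_k<0 for the coefficient a_k (an assumption about the intended notation), and all variable names are illustrative.

```python
from fractions import Fraction
from math import comb, sqrt

# Q2: dot product of a = (2, 3) and b = (-1, 2).
a, b = (2, 3), (-1, 2)
q2 = sum(x * y for x, y in zip(a, b))

# Q3: sum of the first six terms of the geometric sequence 3, 6, 12, ...
q3 = sum(3 * 2**n for n in range(6))

# Q4: double-angle formula tan 2A = 2 tan A / (1 - tan^2 A), with tan A = 3.
t = Fraction(3)
q4 = 2 * t / (1 - t**2)

# Q6: modulus of 1 + i*z for z = 1 - i.
z = complex(1, -1)
q6 = abs(1 + 1j * z)

# Q8: the law of cosines gives cos A; sin A is positive for a triangle angle.
side_a, side_b, side_c = 4, 5, 6
cos_a = Fraction(side_b**2 + side_c**2 - side_a**2, 2 * side_b * side_c)
q8 = sqrt(float(1 - cos_a**2))


# Q10: by the binomial theorem, the coefficient of x^k is
# a_k = C(100, k) * (2023^k + (-1)^k * 2023^(100 - k)).
def coeff(k, n=100, c=2023):
    return comb(n, k) * (c**k + (-1) ** k * c ** (n - k))


# Largest index k whose coefficient is negative (exact integer arithmetic).
q10 = max(k for k in range(101) if coeff(k) < 0)

for name, value in [("Q2", q2), ("Q3", q3), ("Q4", q4),
                    ("Q6", q6), ("Q8", q8), ("Q10", q10)]:
    print(name, value)
```

Questions 6 and 8 are printed as floating-point approximations; the exact answers are expected in radical form.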
