Wen Xin Yiyan in a hands-on comparison: how far is it from the top AI models?

Author: Colorful peaks

Wen Xin Yiyan is a large language model released by Baidu in March this year, and is said to be "first in Asia by parameter count". Three months later, the author finally obtained a coveted closed-beta slot and took the time to run a fairly comprehensive test.

Compared against: GPT-3.5 (the free version on the official website) and New Bing (reportedly running on an early GPT-4 core, but with some features cut).

First, a popular-science logic puzzle:

[Screenshot: GPT's answer]

[Screenshot: Wen Xin's answer]

[Screenshot: New Bing's answer]

All three models gave good answers. Wen Xin's was a little convoluted and came close to talking itself in circles.

Second, a moderately difficult math problem: a cubic equation in one variable.

(Tested separately: all three models can solve quadratic equations in one variable.)

GPT's approach was sound and it started out with a correct solution, but the later calculation seems to go wrong:

[Screenshot: GPT's answer]

When asked in Chinese, New Bing did not seem to understand what I meant, no matter how I corrected it afterwards:

[Screenshot: New Bing's answer]

When asked in English, New Bing gave the correct answer:

[Screenshot: New Bing's answer]

Wen Xin's answer was wrong, so wrong that I did not know how to follow up:

[Screenshot: Wen Xin's answer]
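
A model's claimed roots for an equation like this are easy to check by substituting each one back in. Below is a minimal Python sketch of that check. The cubic x^3 - 6x^2 + 11x - 6 = 0, the claimed roots, and the function names are illustrative assumptions, not the equation or answers from this test.

```python
def eval_poly(coeffs, x):
    """Evaluate a polynomial with Horner's rule.

    coeffs are ordered from the highest-degree term down to the constant.
    """
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def check_roots(coeffs, claimed_roots, tol=1e-9):
    """Map each claimed root to True if the polynomial is (nearly) zero there."""
    return {r: abs(eval_poly(coeffs, r)) <= tol for r in claimed_roots}


# Hypothetical example (NOT the equation used in the article):
# x^3 - 6x^2 + 11x - 6 = 0
coeffs = [1, -6, 11, -6]
claimed = [1, 2, 4]  # pretend a chatbot returned these values

for root, ok in check_roots(coeffs, claimed).items():
    print(f"x = {root}: {'is a root' if ok else 'is NOT a root'}")
```

A check like this only confirms or rejects the values a model reports; it does not say where in the derivation the mistake happened.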

Third and last, a scheduling problem, to see how well each of them does as a butler:

[Screenshot: Wen Xin's answer]

[Screenshot: New Bing's answer]

[Screenshot: GPT's answer]

New Bing performed best. GPT lost points because it cannot go online for up-to-date information. As for Wen Xin Yiyan... it did not seem to understand some of my premises. The foreign AIs have a better grasp of Chinese than the Chinese AI.

Fourth, the conclusion:

Wen Xin Yiyan does lag behind the top AI models in many respects. It is perhaps closer to a "plus plus" edition of Siri, Xiaodu, or XiaoAi with a larger knowledge base than it is to GPT.

[Tip for using AI: talk to it as you would to a person, rather than asking questions mechanically. When an answer is wrong or the model has misunderstood you, correct it promptly and have it answer again; this often leads to more accurate conclusions.]
