Wen Xin Yiyan is a large language model released by Baidu in March this year, and is said to be "the largest in Asia by parameter count". After three months, the author finally got a coveted closed-beta slot and took the time to run a fairly comprehensive test.
Compared against: GPT-3.5 (the free, unpaid version on the official website) and New Bing (reportedly given early access to the GPT-4 core, but with some features cut).
First, a popular-science logic puzzle:
GPT
Wen Xin
new bing
All three models gave good answers. Wen Xin's was a little convoluted and came close to talking itself in circles.
Second, a moderately difficult math problem, a cubic equation in one variable:
(Tested beforehand: all three models can solve quadratic equations in one variable.)
GPT's approach was sound and it started toward the correct solution, but it seems to have miscalculated later on:
When asked in Chinese, New Bing did not seem to understand what I meant, no matter how I corrected it afterwards:
new bing
When asked in English, New Bing gave the correct answer:
new bing
Wen Xin's answer was wrong, so wrong that I did not know how to follow up:
Wen Xin
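Since the three models disagreed here, one simple way to check any claimed answer is to substitute it back into the equation. Below is a minimal sketch of that check. The cubic used is a hypothetical stand-in (x^3 - 6x^2 + 11x - 6 = 0), not the equation from my test, and the helper names `cubic` and `is_root` are made up for illustration.

```python
def cubic(coeffs, x):
    """Evaluate a*x^3 + b*x^2 + c*x + d using Horner's rule."""
    a, b, c, d = coeffs
    return ((a * x + b) * x + c) * x + d


def is_root(coeffs, x, tol=1e-9):
    """Return True if x satisfies the cubic to within the tolerance."""
    return abs(cubic(coeffs, x)) <= tol


# Hypothetical example equation: x^3 - 6x^2 + 11x - 6 = 0
EXAMPLE = (1, -6, 11, -6)

# Pretend a model claimed these three roots; the last one is deliberately wrong.
claimed = [1, 2, 4]
results = {x: is_root(EXAMPLE, x) for x in claimed}
print(results)
```

Any value that comes back False is worth pasting back to the model with a request to redo the step where it went wrong.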
Third, and finally, a scheduling problem, to see how good they are as personal assistants:
Wen Xin
new bing
GPT
New Bing performed best. GPT lost points because it cannot look up current information online. As for Wen Xin... it seems unable to grasp some of my premises. Apparently the foreign AIs understand Chinese better than the Chinese one.
Fourth, the conclusion:
Overall, Wen Xin Yiyan does lag behind the top AI models in many respects. It is perhaps closer to a knowledge-base-enhanced version of Siri, Xiaodu, or Xiaoai than to GPT.
[Tip for using AI: talk to it as you would to a person rather than asking questions mechanically. When an answer is wrong or the model misunderstands you, correct it promptly and have it answer again; this often leads to more accurate conclusions.]