Xiao Xiao and Yuyang, reporting from Aofei Temple
QbitAI | WeChat official account QbitAI
Alibaba officially joins the ChatGPT battle!
Just now, Alibaba suddenly announced that its version of ChatGPT is officially open to enterprises for invitation-only testing.
It is called Tongyi Qianwen and was developed by DAMO Academy.
Well, it has the flavor of a large-model version of "One Hundred Thousand Whys".
In fact, since the beginning of this month there had been many reports that Alibaba was about to launch a ChatGPT-like product, but it was generally expected around the 11th.
And the Tmall Genie "Niaoniao Fenniao" talk-show version of GPT that leaked a few days ago, a "compressed version" based on a large model, whetted netizens' appetites with its stunning performance and turned everyone's attention to Alibaba.
Now that the "main dish" has been served ahead of schedule, it has naturally set off a wave of public attention.
So, how strong is this Alibaba version of ChatGPT, Tongyi Qianwen?
As it happens, QbitAI received one of the first batch of test invitations. The short version of our conclusion: the real competition among Chinese large models has begun.
Let's see the real thing.
A record of teasing Alibaba's version of ChatGPT
Let's take a look at the main features of Tongyi Qianwen.
As a large language model, its capabilities center on text generation; that is, like ChatGPT, it can answer whatever it is asked:
Here we try the official essay-writing example, and it seems to understand even the "general-specific-general" structure that Chinese-language teachers often use:
△ Another Chinese homework artifact (doge)
In addition to dialogue, it also has a "treasure bag" function, which is equivalent to a toolbox that can quickly generate various specified types of copy:
Without further ado, let's test the model's dialogue ability from the four directions of language ability, context understanding ability, code ability and mathematical ability.
1. Dialogue ability
Language skills
When it comes to a domestic large model, the first thing to look at is its Chinese.
Let's start with the basics: what does "wear as much as you can" mean?
Yes, it explained the phrase fairly clearly, and also offered its own opinion on the sentence along the way:
In continuation-style creative writing, it can not only imitate the tone but even build suspense, which is rather impressive~
Next up is the new-generation AI benchmark of deliberately silly questions: how can I withdraw the money in my dream to my bank card?
"Withdrawing money in a dream is an illusion or imagination." Thank you, voice of reason.
"If you often dream about money, it is recommended that you review your finances before going to bed to see what you fall short of."
Although our friends at Alibaba said its main strength is Chinese, let's be a little rebellious and see how this Chinese language model handles other languages.
Let's take an unusual route: test its Japanese by having it write a love letter to Kana Hanazawa.
Judging from the Chinese machine translation, it looks good, so sending it to Hanazawa should be fine, right?
Now English. As it happens, some netizens recently pitted Bard against GPT-4 in a rap battle, so let's look at Tongyi Qianwen's rap ability too.
Top, stop, and stake: the rhymes are there! We just don't know whether Eminem would approve.
Let's raise the difficulty a little: how about a homophonic pun that mixes Chinese and English?
This time, Tongyi Qianwen's answer seemed a little too serious.
When we gave the answer, Tongyi Qianwen recognized that it was a joke but did not find it funny, and pointed out to us very earnestly: this does not mean that Panasonic is afraid of Sony.
However, we found that ChatGPT is just as earnest... Are all AIs this serious these days...
Okay, next let's test contextual understanding.
Contextual understanding
Put bluntly, this part tests multi-turn dialogue: whether Tongyi Qianwen can accurately extract information from earlier turns to answer the next question.
First question: is it true that every county in Shandong has its own liquor?
What if we leave out "Shandong"? Let it list a few classic liquor brands in "this province":
Not bad. Now a follow-up on its latest answer:
Wait, "according to the history books" isn't made-up information, is it? Follow-up question × 2:
Interestingly, we did find the source on the official website of Confucius Jiao:
Code capability
Next, let's try its coding ability. How about a complete calculator program?
Not bad, it implemented simple arithmetic with the four basic operations. The full code is as follows; interested readers can try it:
#include <iostream>
#include <cmath>
using namespace std;

int main()
{
    double num1, num2;
    char op;  // "operator" is a reserved keyword in C++, so the variable is named op

    // Read the two operands and the operator from standard input
    cout << "Enter first number: ";
    cin >> num1;
    cout << "Enter second number: ";
    cin >> num2;
    cout << "Enter operator (+,-,*,/,%,...) : ";
    cin >> op;

    switch (op)
    {
    case '+':
        cout << num1 + num2 << endl;
        break;
    case '-':
        cout << num1 - num2 << endl;
        break;
    case '*':
        cout << num1 * num2 << endl;
        break;
    case '/':
        if (num2 == 0)
        {
            cout << "Error! Cannot divide by 0." << endl;
        }
        else
        {
            cout << num1 / num2 << endl;
        }
        break;
    case '%':
        // The % operator is not defined for double operands; fmod gives the floating-point remainder
        cout << fmod(num1, num2) << endl;
        break;
    default:
        cout << "Error! Invalid operator." << endl;
        break;
    }
    return 0;
}
Next, let's have it rewrite this code in Python:
Then have it explain each piece of the code it wrote:
Basic programming, it seems, is not much of a problem?
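For readers who want to try the rewrite themselves, a Python port of the calculator might look roughly like the sketch below. This is our own illustration, not Tongyi Qianwen's actual output; the function name `calculate` and the choice to raise exceptions instead of printing error messages are assumptions made for the example.

```python
import math


def calculate(num1: float, num2: float, op: str) -> float:
    """Apply one arithmetic operator to two numbers, mirroring the C++ switch."""
    if op == "+":
        return num1 + num2
    if op == "-":
        return num1 - num2
    if op == "*":
        return num1 * num2
    if op in ("/", "%"):
        if num2 == 0:
            raise ZeroDivisionError("Cannot divide by 0.")
        # math.fmod follows C's fmod (the result takes the sign of num1),
        # which differs from Python's own % operator for negative operands.
        return num1 / num2 if op == "/" else math.fmod(num1, num2)
    raise ValueError("Invalid operator.")


print(calculate(6, 3, "/"))    # 2.0
print(calculate(7.5, 2, "%"))  # 1.5
```

Keeping the arithmetic in a function with no console input makes it easy to test separately from whatever prompt loop is wrapped around it.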
However, ask Tongyi Qianwen to turn the explanation into comments, and a rather magical bug appears.
It does add comments to the "Python" code, but wait, isn't this the original C++ version of the code?!
(This is not an NTR)
Mathematical ability
Finally, let's look at math problems. The classic chickens-and-rabbits-in-one-cage problem goes fine:
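As a reference point for checking such answers, the chickens-and-rabbits problem reduces to two linear equations: chickens + rabbits = heads, and 2 × chickens + 4 × rabbits = legs. The sketch below solves them directly. The numbers in the usage line (35 heads, 94 legs) are the textbook version of the puzzle and are an assumption; they are not necessarily the figures used in our test.

```python
def chickens_and_rabbits(heads: int, legs: int) -> tuple[int, int]:
    """Return (chickens, rabbits) for the given head and leg counts."""
    # If every animal were a chicken there would be 2 * heads legs;
    # each rabbit contributes 2 extra legs beyond that.
    extra_legs = legs - 2 * heads
    if extra_legs < 0 or extra_legs % 2 != 0:
        raise ValueError("no whole-number solution")
    rabbits = extra_legs // 2
    chickens = heads - rabbits
    if chickens < 0:
        raise ValueError("no whole-number solution")
    return chickens, rabbits


print(chickens_and_rabbits(35, 94))  # (23, 12)
```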
There is no problem with ordinary calculation problems, and it can be accurate to a few decimal places:
However, it does not do well on advanced mathematics: although it recognized that this problem requires taking a derivative, its solution method was wrong...
However, Tongyi Qianwen also made it clear that there is no guarantee that the correct answer will be given in all cases:
Well... like the GPT models, large models still have relatively rudimentary mathematical skills.
With dialogue ability mostly covered, let's take a look at its "scenario capabilities".
2. Scenario capabilities
Although the "treasure bag" offers many functions, things like writing an outline or describing a product are quite ordinary, so we picked three more interesting ones to try: recipe generation, a "rainbow fart" (over-the-top flattery) generator, and free love-letter ghostwriting.
Recipes that let loose
As we all know, writing recipes is skilled work: it tests contextual ability (every ingredient that has been listed must be used) and the AI's understanding of the dish name, and the cooking steps cannot be too outlandish.
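The "every listed ingredient must be used" criterion can be checked mechanically. The sketch below is one rough way to do it, assuming a simple case-insensitive substring match; the helper name `unused_ingredients` and the sample recipe are hypothetical and made up for illustration, not taken from any model output.

```python
def unused_ingredients(ingredients: list[str], steps: list[str]) -> list[str]:
    """Return the listed ingredients that no cooking step mentions."""
    text = " ".join(steps).lower()
    return [item for item in ingredients if item.lower() not in text]


# Hypothetical recipe, invented for this example only.
ingredients = ["sea bass", "ginger", "scallion", "soy sauce"]
steps = [
    "Score the sea bass and stuff it with ginger slices.",
    "Steam for eight minutes, then top with scallion.",
]
print(unused_ingredients(ingredients, steps))  # ['soy sauce']
```

A substring match misses synonyms and inflected forms, so a real evaluation would still need a human to read the recipe.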
The "steamed sea bass" example is obviously too easy for an AI. So why not try some odd dish names from video games?
Let's start with a satiety gel from Genshin Impact.
Wow, it even thought of using real-world konjac powder to imitate the satiety gel, which is a good idea. (But what on earth are calorie powder and protein powder?)
Now have ChatGPT make the same dish. Which one do you think sounds tastier?
Then one more question for Tongyi Qianwen: have it try the Strange Bun made with a Void Egg from Stardew Valley?
Wait, it really put the Void Egg into the recipe? And really made a bun! We just don't know how it tastes...
At this rate, Tongyi Qianwen can recreate in-game recipes, breaking right through the dimensional wall.
Rainbow fart generator
Next, let's have it generate a "rainbow fart".
It talked up the oil stains on someone's clothes as works of art...
Well, the big "kuakua" praise groups might consider bringing one in.
Write love letters for free
Finally, our test ended with a love letter to the Beast's ancestors.
How are you feeling?
Well, after reading so many weird and wonderful tests, are you also a little curious about where Tongyi Qianwen came from?
Where does Tongyi Qianwen come from?
Alibaba DAMO Academy has not officially disclosed the technical details of Tongyi Qianwen.
When we asked Tongyi Qianwen itself, this is how it answered:
Training materials from Alibaba DAMO Academy as of February 2023. The training materials include a large amount of language and text data, including Chinese, English, Japanese, French, Spanish, and multilingual text data.
It also said that it is a large language model that can connect to the internet.
However, our testing found that Tongyi Qianwen was only bluffing, pretending that it could go online.
In fact, when you ask it individually what the weather is like today, Tongyi Qianwen will admit that it doesn't have access to real-time data.
But if you give it the URL of a weather site, it will pretend to have read the page and then talk about it in all seriousness.
A shout-out to Alibaba's programmers: your large model really wants to go online.
Although the official messaging is low-key, just as ChatGPT grew out of OpenAI's GPT series and Baidu's Wenxin was developed from the ERNIE large model, Alibaba is also one of the earliest technology companies in China to develop large models.
Public information shows that Alibaba began research and development on Chinese large models in 2019. At that time, Alibaba's language model StructBERT surpassed Google, Microsoft, and Facebook and topped the CLUE list.
In 2021, Alibaba successively released M6, the first domestic multimodal large model with more than 10 billion parameters, and the large language model PLUG, known as the "Chinese version of GPT-3".
Among them, M6 has achieved a parameter scale of 10 trillion after multiple iterations, and M6 has been commercialized in China by combining the business needs of Alipay and Taobao.
The parameter scale of PLUG is 27 billion, which is based on two self-developed models of DAMO Academy, StructBERT, a language understanding model, and PALM, a language generation model.
When this large model debuted, it set a new record for the authoritative Chinese language understanding benchmark CLUE classification task list with 80.614 points.
At last year's WAIC (World Artificial Intelligence Conference), Ali also released the Tongyi large model series. The core models have been open sourced.
In the era of large models, the race among Chinese players accelerates
So, how would you rate this Ali version of ChatGPT?
It should be admitted that compared with the current industry benchmark ChatGPT (GPT-4), there is still a lot of room for improvement. Ali also revealed that according to internal testing feedback, this large model is being iterated rapidly.
Earlier, Microsoft was reported to have spent hundreds of millions of dollars building a dedicated supercomputer for ChatGPT out of tens of thousands of NVIDIA A100 GPUs. According to various reports, only a few domestic companies currently have high-performance GPUs on that order of magnitude, and Alibaba is one of them.
In the era of big models, an industry consensus has been formed that AI and cloud computing are indispensable to create large models.
Alibaba, for its part, is one of the few companies in the world with a leading position in both algorithms and computing power.
In addition to its long-term technical accumulation in artificial intelligence and large models, Alibaba also has a natural advantage in computing power, backed by a cloud business that ranks first in China and third in Asia.
With ChatGPT still red-hot, domestic demand for sufficiently competitive large models keeps growing.
The potential of ChatGPT-like products to improve productivity has been repeatedly proven. But at the same time, ChatGPT accounts were banned on a large scale, with Asia hit hardest, and then OpenAI paused sales of ChatGPT Plus because of computing-power constraints...
These uncertainties once again highlight the value of developing technology in-house.
Fortunately, this time, our starting line was not so far apart.
The game doesn't end overnight, and now, the race really begins.