Editor: Editorial Department
The startup Mistral AI has once again dropped a magnet link: a 281GB file containing its latest 8x22B MoE model.
With a single magnet link, Mistral AI has quietly made waves again.
The 281.24GB file turns out to contain a brand-new 8x22B MoE model!
The new MoE model has 56 layers in total, 48 attention heads, and 8 experts, 2 of which are active per token.
Moreover, the context length is 65k.
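Concretely, "8 experts, 2 active" means a router picks the top 2 of 8 expert networks for each token and mixes their outputs by the normalized router scores, so only roughly a quarter of the expert parameters do work per token. A minimal sketch of that top-2 gating, with toy dimensions and random weights rather than Mistral's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, TOP_K, D = 8, 2, 16   # 8 experts, 2 active per token; toy hidden size

# Toy parameters: a linear router and 8 tiny "experts" (one matrix each).
router_w = rng.normal(size=(D, N_EXPERTS))
experts = [rng.normal(size=(D, D)) for _ in range(N_EXPERTS)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector x (shape [D]) through its top-2 experts."""
    logits = x @ router_w                 # one router score per expert
    top = np.argsort(logits)[-TOP_K:]     # indices of the 2 highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over just the chosen 2
    # Weighted sum of the selected experts' outputs; the other 6 experts are
    # skipped entirely, which is what makes only 2 of 8 "active".
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D)
out = moe_layer(token)
print(out.shape)  # (16,)
```

The router and expert shapes here are invented for illustration; the real model applies this kind of routing inside each transformer block's feed-forward layer.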
Netizens remarked that Mistral AI has, as always, set the AI community abuzz with nothing but a magnet link.
Jia Yangqing also said he can't wait to see detailed comparisons against other SOTA models!
Taking the AI community by storm with a single magnet link
Last December, Mistral AI's first magnet-link release, the 8x7B MoE model, drew widespread praise.
In benchmarks, the eight small 7-billion-parameter experts together outperformed Llama 2, which has up to 70 billion parameters.
It handles 32k contexts well, supports English, French, Italian, German, and Spanish, and shows strong performance in code generation.
In February this year, Mistral AI launched its flagship model, Mistral Large, whose performance is directly comparable to GPT-4.
However, this version of the model is not open source.
Mistral Large has excellent logical reasoning capabilities and is capable of handling complex multilingual tasks including text understanding, conversion, and code generation.
Then, just half a month ago, at a Cerebral Valley hackathon, Mistral AI open-sourced the Mistral 7B v0.2 base model.
This model supports a 32k context, uses no sliding-window attention, and sets Rope Theta = 1e6.
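Those three settings map onto familiar model-config fields. A hypothetical fragment, with field names assumed from the usual Hugging Face `config.json` conventions and values taken from the announcement:

```python
# Hypothetical config fragment for Mistral 7B v0.2.
# Field names follow common Hugging Face config.json conventions (an
# assumption, not the official file); values are from the announcement.
mistral_7b_v02_config = {
    "max_position_embeddings": 32768,  # 32k context window
    "sliding_window": None,            # sliding-window attention disabled
    "rope_theta": 1e6,                 # RoPE base frequency
}

print(mistral_7b_v02_config["rope_theta"])  # 1000000.0
```

Raising `rope_theta` from the earlier default stretches the rotary position embeddings, which is one common way to support a longer context without sliding-window attention.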
Now, the latest 8x22B MoE model is also available on the Hugging Face platform, and community members can build their own applications on top of it.