
Alibaba Cloud Tongyi Qianwen 14B model open source! Performance surpasses Llama2

Author: Lu Songsong

On September 25, Alibaba Cloud open-sourced Qwen-14B, a 14-billion-parameter model, together with its dialogue model Qwen-14B-Chat, both free for commercial use. Qwen-14B surpasses models of the same scale on multiple authoritative benchmarks, and on some metrics approaches Llama2-70B. Alibaba Cloud previously open-sourced the 7-billion-parameter Qwen-7B and related models, which were downloaded more than 1 million times in just over a month and earned a strong reputation in the open-source community.

Qwen-14B is a high-performance open-source model with multilingual support. It was trained on more high-quality data than comparable models, over 3 trillion tokens in total, giving it stronger reasoning, cognition, planning, and memory capabilities. Qwen-14B supports a context window of up to 8k tokens.


Figure 1: Qwen-14B outperforms SOTA models of the same scale across twelve authoritative benchmarks

Qwen-14B-Chat is a dialogue model obtained by supervised fine-tuning (SFT) of the base model. Building on the strong performance of that base model, Qwen-14B-Chat generates content with much higher accuracy, aligns better with human preferences, and is markedly more imaginative and richer in creative writing.

Qwen has strong tool-invocation capabilities, which lets developers build Qwen-based agents more quickly. With simple instructions, developers can teach Qwen to use complex tools, such as a Code Interpreter tool that executes Python code for mathematical calculations, data analysis, and chart drawing. They can also build "advanced digital assistants" with multi-document Q&A, long-form writing, and other capabilities.
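To make the tool-invocation idea concrete, here is a minimal sketch of the dispatch half of an agent loop: the model emits a textual tool call, and the host program parses it and runs the named tool. The "Action:/Action Input:" format mirrors the ReAct-style convention used in Qwen's published examples, but the regex and the toy `code_interpreter` tool below are illustrative assumptions, not Qwen's actual implementation.

```python
import re

def code_interpreter(source: str) -> str:
    """Toy 'Code Interpreter': evaluate a Python expression and return the result."""
    return str(eval(source, {"__builtins__": {}}, {}))

# Registry of tools the agent is allowed to call.
TOOLS = {"code_interpreter": code_interpreter}

def dispatch(model_output: str) -> str:
    """Parse the tool name and input out of the model's text and run the tool."""
    action = re.search(r"Action:\s*(\w+)", model_output)
    arg = re.search(r"Action Input:\s*(.+)", model_output)
    if not (action and arg and action.group(1) in TOOLS):
        return "no tool call found"
    return TOOLS[action.group(1)](arg.group(1).strip())

# e.g. dispatch("Thought: need math\nAction: code_interpreter\nAction Input: 2**10")
# returns "1024"
```

In a full agent, the tool's result would be appended to the conversation and fed back to the model for the next reasoning step; this sketch shows only the parse-and-execute stage.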

Qwen-14B raises the performance ceiling for small models, breaking out of a crowded field of similarly sized competitors: it achieves the best results among same-scale SOTA (state-of-the-art) models on 12 authoritative benchmarks, including MMLU, C-Eval, GSM8K, MATH, and GaoKao-Bench. It also beats Llama-2-13B across the board and holds its own against the 34B and 70B Llama 2 models. Qwen-7B has been upgraded at the same time, with core metrics improved by up to 22.5%.


Figure 2: Qwen-14B outperforms models of the same size

Users can download the models directly from the ModelScope community, or access and call Qwen-14B and Qwen-14B-Chat through Alibaba Cloud's Lingji (DashScope) platform. Alibaba Cloud provides a full range of supporting services, including model training, inference, deployment, and fine-tuning.
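For readers who want to try the open-sourced weights, the sketch below shows one common way to load Qwen-14B-Chat with Hugging Face `transformers`. The model id and the custom `chat()` interface follow the usage published on Qwen's model cards; treat the exact names as assumptions and check the model card before relying on them. Running it for real requires roughly 30 GB of weights and a suitable GPU, so the heavy calls are wrapped in functions rather than executed at import time.

```python
MODEL_ID = "Qwen/Qwen-14B-Chat"  # model id as published on Hugging Face / ModelScope

def load_qwen(model_id: str = MODEL_ID):
    """Download and load the chat model (large download; GPU strongly recommended)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", trust_remote_code=True
    ).eval()
    return model, tokenizer

def ask(model, tokenizer, prompt: str, history=None):
    """Single turn of dialogue; Qwen's chat() returns (reply, updated_history)."""
    response, history = model.chat(tokenizer, prompt, history=history)
    return response, history

# Usage (not run here; requires the weights to be downloaded first):
#   model, tok = load_qwen()
#   reply, hist = ask(model, tok, "Summarize GSM8K in one sentence.")
```

The `trust_remote_code=True` flag is needed because Qwen ships custom modeling code with its checkpoint; `device_map="auto"` lets `accelerate` place layers across available devices.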

In August, Alibaba Cloud open-sourced the 7-billion-parameter base model Qwen-7B, which quickly climbed the trending lists on Hugging Face and GitHub. In just over a month it was downloaded more than 1 million times. More than 50 Qwen-derived models have appeared in the open-source community, and several well-known community tools and frameworks have integrated Qwen.


A large number of small and medium-sized enterprises, research institutions, and individual developers are building their own large models or applications on Tongyi Qianwen, including Alibaba's Taobao, DingTalk, and Future Genie, as well as external research institutions and startups.

Based on Qwen-7B, Zhejiang University, together with Higher Education Press, has developed the Zhihai-Sanle education vertical model, which has been deployed in 12 universities across the country, providing intelligent Q&A, test-question generation, learning navigation, teaching evaluation, and other capabilities. Zhejiang Youlu Robot Technology Co., Ltd. has integrated Qwen-7B into its road-cleaning robots, enabling them to interact with users in natural language in real time: a robot understands the user's request, breaks down high-level instructions, performs logical analysis and task planning, and completes the cleaning task.

Zhou Jingren, CTO of Alibaba Cloud, said that Alibaba Cloud will continue to embrace open source and promote the construction of China's large-model ecosystem. Alibaba Cloud firmly believes in the power of open source and has taken the lead in open-sourcing its self-developed large models, hoping to bring large-model technology to small and medium-sized enterprises and individual developers faster.

Alibaba Cloud also led the creation of ModelScope, China's largest open-source AI model community, uniting the industry to promote the accessibility and adoption of large-model technology. In the past two months, model downloads in the ModelScope community have soared from 45 million to 85 million, an increase of nearly 100%.
