Meta AI releases Llama 3, its most powerful open-source large model, in 8B and 70B versions

Meta AI launches Llama 3, raising the bar for open-source large models again

In artificial intelligence, open-source models have long been an important force driving the technology forward. After years of work, open-source large models now approach, and in some cases surpass, some commercial models, giving developers and researchers valuable resources. Meta AI recently released its latest open-source large language model, Llama 3, which improves on its predecessor in several respects and has attracted wide attention in the industry.

Key features: Llama 3 is Meta's next generation of large models after Llama 2, available in two sizes: 8B and 70B parameters. It uses a new tokenizer with a 128K-token vocabulary that encodes language more efficiently, which helps model performance. Both sizes adopt Grouped Query Attention (GQA), which improves inference efficiency. Llama 3 is also pre-trained on 15 trillion tokens, a dataset seven times larger than the one used for Llama 2 and containing four times as much code, which is expected to strengthen the model's programming capabilities.

Comparative advantages: Meta fine-tuned Llama 3 extensively, and the instruction-tuned models show marked gains in reasoning, code generation, and instruction following. The 8B version outperforms other well-known open-source models such as Mistral 7B and Google Gemma 7B on several benchmarks. The 70B version is stronger still and, on some tests, competes with commercial models such as Google Gemini Pro 1.5 and Anthropic Claude 3. Llama 3 raises open-source models to a new level.

An in-depth look at the technological innovations of Llama 3

New tokenizer and vocabulary: Llama 3 uses a new tokenizer with a 128K-token vocabulary, four times the 32K vocabulary of Llama 2. A larger vocabulary lets the tokenizer represent the same text with fewer tokens, so more content fits in a given context window and each generated token covers more text. The new tokenizer is also reported to handle code more efficiently, which helps with programming-related tasks.
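The effect of vocabulary size on token count can be illustrated with a toy example. The sketch below is not the Llama 3 tokenizer: it uses a simple greedy longest-match scheme rather than byte-pair encoding, and both vocabularies (`small_vocab`, `large_vocab`) are hypothetical, invented only to show that adding longer entries shortens the token sequence for the same text.

```python
def tokenize(text, vocab):
    """Greedy longest-match segmentation (a toy stand-in, not BPE).

    At each position, take the longest vocabulary entry that matches;
    if nothing matches, fall back to a single character.
    """
    max_len = max(len(entry) for entry in vocab)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabularies: the larger one adds longer, code-oriented entries.
small_vocab = {"def", " ", "re", "turn", "(", ")", ":"}
large_vocab = small_vocab | {"def ", "return"}

text = "def f(x): return x"
small_tokens = tokenize(text, small_vocab)
large_tokens = tokenize(text, large_vocab)

print(len(small_tokens), small_tokens)
print(len(large_tokens), large_tokens)
```

With these made-up vocabularies the larger one produces the shorter sequence, which is the same effect, at a much smaller scale, that a 128K vocabulary is meant to achieve relative to a 32K one.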

Attention mechanism optimization: Both the 8B and 70B versions of Llama 3 use Grouped Query Attention (GQA). In standard multi-head attention, every query head has its own key and value heads, and the cost of attention grows quadratically, not exponentially, with sequence length. GQA divides the query heads into groups and lets each group share a single key/value head. This shrinks the key/value cache that must be stored and read during generation, so the model handles long sequences with less memory and decodes faster.
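A minimal pure-Python sketch of grouped-query attention for a single decoding step follows. It is an illustration under stated assumptions, not Meta's implementation: the function names are made up for this example, and the configuration used in the cache-size comparison (32 layers, 32 query heads, 8 key/value heads, head dimension 128, 8,192-token context, 16-bit values) is an assumed 8B-class setup that the article itself does not state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def grouped_query_attention(queries, keys, values):
    """queries: [n_q_heads][head_dim] for the current token.
    keys, values: [n_kv_heads][seq_len][head_dim] from the cache.
    Consecutive query heads share one key/value head.
    """
    n_q, n_kv = len(queries), len(keys)
    if n_q % n_kv != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q // n_kv
    return [
        attend(q, keys[head // group_size], values[head // group_size])
        for head, q in enumerate(queries)
    ]


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Cache size for one sequence; the factor 2 covers keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed configuration (not from the article), compared with and without GQA.
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=8192)
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=8192)
print(f"multi-head: {mha / 2**30:.1f} GiB, grouped-query: {gqa / 2**30:.1f} GiB")
# prints "multi-head: 4.0 GiB, grouped-query: 1.0 GiB"
```

Setting the number of key/value heads equal to the number of query heads recovers ordinary multi-head attention, and setting it to one gives multi-query attention. GQA sits between the two.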

Large-scale training datasets: Data is one of the key factors in training large models. Llama 3 is pre-trained on 15 trillion tokens, seven times more data than Llama 2. The dataset also contains four times as much code, which is expected to improve the model's performance on programming tasks. A dataset of this size gives Llama 3 broader coverage of knowledge across many fields.
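To put these figures in proportion, the short calculation below derives two numbers from the ones quoted above: the Llama 2 dataset size implied by the "seven times" ratio, and the number of training tokens per model parameter. The parameter counts are the nominal 8 billion and 70 billion taken from the model names, which is an approximation, and the helper names are made up for this sketch.

```python
TRAINING_TOKENS = 15e12   # 15 trillion tokens, as quoted in the article
LLAMA2_RATIO = 7          # "seven times more data than Llama 2"
NOMINAL_PARAMS = {"8B": 8e9, "70B": 70e9}  # nominal sizes from the model names


def implied_llama2_tokens(tokens=TRAINING_TOKENS, ratio=LLAMA2_RATIO):
    """Dataset size Llama 2 would have had if the quoted ratio is exact."""
    return tokens / ratio


def tokens_per_parameter(tokens, params):
    """How many training tokens the model sees per parameter."""
    return tokens / params


print(f"Implied Llama 2 dataset: {implied_llama2_tokens() / 1e12:.2f} trillion tokens")
# prints "Implied Llama 2 dataset: 2.14 trillion tokens"

for name, params in NOMINAL_PARAMS.items():
    ratio = tokens_per_parameter(TRAINING_TOKENS, params)
    print(f"{name}: about {ratio:.0f} training tokens per parameter")
# prints "8B: about 1875 training tokens per parameter"
# prints "70B: about 214 training tokens per parameter"
```

The implied figure of roughly 2 trillion tokens matches the dataset size Meta reported for Llama 2, so the two quoted numbers are mutually consistent.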

Model performance and benchmarking: After large-scale training, Llama 3 performs strongly on multiple benchmarks. The instruction-tuned versions do well not only on established tasks such as reasoning and code generation but also on newer instruction-following tasks, which shows their versatility.

Specifically, the 8B version outperforms other well-known open-source models such as Mistral 7B and Google Gemma 7B on several benchmarks. On the HumanEval code generation benchmark, for example, Meta reports 62.2% for the instruction-tuned Llama 3 8B, compared with 30.5% for Gemma 7B. The 70B version is stronger still and competes with commercial models such as Google Gemini Pro 1.5 and Anthropic Claude 3 Sonnet: on MMLU, Meta reports 82.0% for the instruction-tuned Llama 3 70B, compared with 81.9% for Gemini Pro 1.5 and 79.0% for Claude 3 Sonnet.

A new level for open-source AI: Llama 3 leads the new trend in large models

Open-source AI vision: As a strong supporter of open-source AI, Meta has been pushing the boundaries of this space. By open-sourcing the Llama 3 models and making them available on multiple cloud platforms, Meta hopes to give developers and researchers more powerful tools, inspire innovation, and speed up progress in AI technology.

Broad application prospects: As a general-purpose large model, Llama 3 can be applied in many fields. It handles traditional natural language processing tasks such as question answering, summarization, and machine translation, and it also performs well at generating and understanding code, which could substantially improve software development. Its strong reasoning and instruction-following capabilities make it promising for newer scenarios such as intelligent assistants and decision support.

A more open future: Llama 3 is currently a text-only model, but Meta is already working on more ambitious plans. The company is training a model with more than 400B parameters and plans to add multimodal inputs such as images and video, multilingual support, and longer context windows. If these plans succeed, the Llama series could become one of the most capable families of open-source multimodal large models.

New trend in open-source large models: The arrival of Llama 3 marks a new stage of development for open-source large models. In the past, open-source models were mostly small or medium-sized and trailed commercial models in performance. Now open-source models have reached tens of billions of parameters and are comparable to top commercial models in performance, and even better in some respects. This change gives developers new opportunities and will help democratize AI technology.

Open-source large models are set to become an important driving force in the development of artificial intelligence. Llama 3 is the latest example of this trend, and its release should further stimulate innovation and attract more talent and resources to the field. There is good reason to expect open-source models to appear in more and more scenarios and to have a far-reaching impact on society.
