
Llama 3 Arrives, Rivaling GPT-4: Are Open-Source Models Catching Up to Closed-Source?

Author: Wallstreetcn (Wall Street News)

On April 18th, the AI world received another piece of blockbuster news: Meta released Llama 3, billed as "the most powerful open-source model ever."

Meta has open-sourced Llama 3 in two sizes, 8B and 70B, free for external developers to use. In the coming months, Meta will launch a series of new models featuring multimodality, multilingual conversation, longer context windows, and more. Among them, the largest version of Llama 3 will have more than 400 billion parameters and is expected to compete with Claude 3.


At the same time, Meta CEO Mark Zuckerberg announced that the Meta AI assistant, built on the latest Llama 3 models, is now integrated across Meta's apps, including Instagram, WhatsApp, and Facebook. It also has a standalone website and an image generator that produces pictures from natural-language prompts.


Llama 3 is positioned as a direct challenger to OpenAI's GPT-4, while taking a completely different path from OpenAI, which famously is not "open" despite its name.

According to people familiar with the matter, researchers have not yet begun fine-tuning the largest Llama 3 model and have not yet decided whether it will be multimodal. The full version of Llama 3 is reportedly expected to launch in July of this year.

Yann LeCun, Meta's chief AI scientist and a Turing Award winner, promoted the Llama 3 release and announced that more versions will follow in the coming months. He said Llama 3 8B and Llama 3 70B are currently the best-performing open-source models of their size, and that Llama 3 8B even outperforms Llama 2 70B on some benchmarks.


Even Elon Musk appeared in the comments, expressing his approval and anticipation for Llama 3 with a concise "Not bad."


Jim Fan, a senior research scientist at NVIDIA, believes the significance of Llama 3's launch goes beyond technical progress: it symbolizes open-source models drawing level with the top closed-source models.

From the benchmarks Jim Fan shared, Llama 3 400B's strength is nearly on par with Claude 3 Opus and the new version of GPT-4 Turbo. He called it a "watershed moment," believing it will unleash enormous research potential, drive the development of the entire ecosystem, and give the open-source community access to GPT-4-class models.


The announcement happened to fall on the birthday of Andrew Ng, Stanford University professor and leading AI expert. Ng said bluntly that the release of Llama 3 was the best birthday gift he had ever received: "Thank you, Meta!"


Andrej Karpathy, a founding member of OpenAI and former director of AI at Tesla, also praised Llama 3. As one of the pioneers in the field of large language models, Karpathy believes Llama 3's performance is close to GPT-4 level:

Llama 3 looks like a very powerful model release from Meta. Stick to the fundamentals, spend a lot of quality time on solid systems and data work, and explore the limits of long-duration model training. I'm also very excited about the 400B model, which could be the first open-source model at GPT-4 level. I think many people will ask for longer context length.

I'd also like to see models smaller than 8B, ideally in the 0.1B to 1B range, for educational work, (unit) testing, embedded applications, and so on.


According to Cameron R. Wolfe, director of AI at Rebuy and a Ph.D. in deep learning, Llama 3 proves that the key to training a good large language model is data quality. He analyzed in detail what Llama 3 did on the data front, including:

1) 15 trillion tokens of pre-training data: roughly 7 times more than Llama 2, and more than DBRX's 12 trillion;

2) More code data: the pre-training mix contains more code data, which improves the model's reasoning ability;

3) A more efficient tokenizer: a larger vocabulary (128K tokens) improves the model's efficiency and performance.
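The tokenizer point can be illustrated with a toy sketch (this is a hypothetical greedy longest-match tokenizer, not the real Llama tokenizers): the larger a tokenizer's vocabulary, the fewer tokens it needs to encode the same text, so each forward pass covers more text.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization over a fixed vocabulary.

    Any single character is always a valid fallback token, so the
    loop is guaranteed to make progress.
    """
    max_len = max(map(len, vocab))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for length in range(min(len(text) - i, max_len), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens

# Toy vocabularies: the "large" one merges whole subwords, the "small"
# one knows only a few character pairs.
small_vocab = {"th", "er", "in", "an"}
large_vocab = small_vocab | {"token", "izer", "large", "model"}

text = "largetokenizer"
print(len(tokenize(text, small_vocab)))  # 13 tokens with the small vocab
print(len(tokenize(text, large_vocab)))  # 3 tokens with the larger vocab
```

The same trade-off drives Llama 3's reported move to a 128K-token vocabulary: more vocabulary entries mean a bigger embedding table, but fewer tokens per sentence and therefore less compute per unit of text.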


After the release of Llama 3, Zuckerberg told the media: "Our goal is not to compete with open-source models, but to surpass everyone and build the most advanced artificial intelligence." The Meta team will later publish a technical report on Llama 3 revealing more details about the model.

The debate over open source versus closed source is far from settled. With GPT-4.5/5 quietly poised for takeoff and possibly arriving this summer, the battle among large models in the AI field rages on.

This article is from Wall Street News.
