
Microsoft reportedly to spend $100 billion on AI chips over the next 3 years

Author: Titanium Media App

(Image source: Microsoft)

A series of blockbuster announcements has landed in the field of artificial general intelligence (AGI).

Titanium Media App learned that on April 19, Meta announced its long-awaited multimodal open-source large model series, Llama 3. At the same time, Meta launched its first AI chatbot built on the open-source Llama 3 models, the Meta AI assistant, which is benchmarked directly against GPT-4.

Meanwhile, layoff news continues across the tech industry. Google announced a new round of layoffs that began on March 10 and is expected to last until August, affecting business services and other functions, while British AI startup Stability AI recently announced a 10% workforce cut.

In addition, Microsoft reportedly plans to stockpile 1.8 million AI chips by the end of 2024, which would triple the number of graphics processing units (GPUs) the company plans to hold this year. Microsoft expects to invest about $100 billion in GPUs and data centers through fiscal 2027.

On April 18, Yang Yuanqing, chairman and CEO of Lenovo Group, said at the 10th Lenovo Innovation and Technology Conference (2024 Lenovo Tech World) that the second half of the artificial intelligence (AI) race must move from technological breakthroughs to real-world applications, and that the path to deployment runs through three pillars: smart devices, smart infrastructure, and smart solutions and services.

The open-source large model Llama 3 launches, with Baidu Intelligent Cloud supporting training and inference for the full lineup

In the early morning of April 19, Beijing time, Meta announced the long-awaited multimodal open-source model series Llama 3, which currently includes pretrained and instruction-tuned versions at two parameter scales: 8 billion and 70 billion. At the same time, Meta launched its first AI chatbot built on the open-source Llama 3 models, the Meta AI assistant, which is benchmarked directly against GPT-4.

Technically, Llama 3 uses a relatively standard decoder-only Transformer architecture and was pretrained on two custom 24K-GPU clusters using a corpus of more than 15 trillion tokens; the training dataset is seven times larger than Llama 2's and contains four times as much code. Llama 3 also supports an 8K context length, double that of Llama 2.

The Llama 3 70B model is a significant improvement over the 8B version on many types of complex tasks, and outperforms Google's Gemini 1.5 Pro and Anthropic's Claude 3 Sonnet on several benchmarks.


Meta said the pretrained and instruction-fine-tuned models are currently the best available at the 8B and 70B parameter scales.

Llama 3 will soon be available on all major platforms, including AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and is supported by hardware platforms from AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm. In addition, users can try it through Meta AI, Meta's official assistant.
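
For developers pulling the instruct-tuned weights from one of these platforms, prompts must follow the chat format Meta documents for Llama 3. The sketch below builds such a prompt by hand; the special-token names come from Meta's published model card, while the helper function itself is our own illustration, not part of any official SDK:

```python
# Sketch: rendering a chat as a Llama 3 instruct prompt.
# Special tokens follow Meta's model card for the Llama 3 instruct
# models; the helper function name is illustrative only.

def format_llama3_prompt(messages):
    """Render a list of {role, content} dicts as a Llama 3 instruct prompt."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize Llama 3 in one sentence."},
])
print(prompt)
```

In practice, libraries such as Hugging Face transformers apply this template automatically from the tokenizer's chat template, so hand-rolling it is only needed when calling the model at a lower level.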

Meta also said the 8B and 70B models mark only the beginning of the Llama 3 rollout. The largest Llama 3 models will exceed 400 billion parameters, though they are still in training.

It is worth noting that at noon on April 19, following Meta's release of the Llama 3 8B and 70B models on the 18th, Baidu Intelligent Cloud's Qianfan large model platform announced that it was the first in China to launch training and inference solutions for the full range of Llama 3 versions, making it easier for developers to retrain and build their own large models; the service is now open for invited testing.

At present, ModelBuilder, the model customization tool on the Qianfan platform, offers the most comprehensive set of preset large models in China, supporting 79 mainstream third-party models from home and abroad, the largest number of any domestic development platform.

Earlier, Robin Li, founder, chairman and CEO of Baidu, said that because the foundation model Wenxin 4.0 can be tailored into smaller models suited to various scenarios, balancing quality, response speed, and inference cost, and supports fine-tuning and post-pretraining, models tailored in this way significantly outperform models built directly on open-source alternatives, and cost significantly less at the same level of quality. "So open-source models will fall further and further behind," he said.

Microsoft will stock up on 1.8 million GPUs this year, with a $100 billion investment plan

According to a document obtained by Business Insider, Microsoft plans to stockpile 1.8 million AI chips by the end of 2024, tripling the number of graphics processing units (GPUs) the company plans to hold this year.

According to two people familiar with the matter, Microsoft expects to invest about $100 billion in GPUs and data centers from the current fiscal year to fiscal year 2027.


The $100 billion investment is reminiscent of an earlier report. On March 30 this year, Microsoft and OpenAI announced plans to invest $100 billion to build the "Stargate" AI supercomputer. OpenAI's next major AI upgrade is expected to land early next year. Microsoft executives hope to release the Stargate AI supercomputer as early as 2028. In addition, Microsoft and OpenAI plan to develop data center projects for AI.

Microsoft spokesman Frank Shaw declined to comment on the news, saying that public companies must comply with the prescribed quiet period before reporting earnings. The company plans to release results in the coming days.

DA Davidson analysts estimate that Microsoft spent $4.5 billion on Nvidia chips in 2023; a Microsoft executive indicated this figure is roughly in line with actual spending.

Other tech giants are also building huge GPU reserves. Meta CEO Mark Zuckerberg said earlier this year that Meta will buy about 350,000 Nvidia H100 GPUs in 2024 and will hold compute equivalent to about 600,000 H100s by the end of the year.

Lenovo Yang Yuanqing: AI is not a substitute for human intelligence

On April 18, it was reported that at the 10th Lenovo Innovation and Technology Conference (2024 Lenovo Tech World), Yang Yuanqing, chairman and CEO of Lenovo Group, delivered a keynote speech on "AI for all, let the world be full of AI".


Yang Yuanqing first pointed out that AI is augmented intelligence rather than a replacement for human intelligence, and that hybrid AI will be the bridge connecting augmented intelligence to households and industries everywhere: the only way to make AI truly inclusive.

In Yang Yuanqing's view, hybrid AI is a blend of public models, personal models, and enterprise models.

"The large language models we are familiar with today are mainly public large models running on the public cloud. They are undoubtedly the result of rapid progress in AI's underlying core technologies, and a catalyst and accelerator for AI's popularization. In practice, however, their wider adoption is limited by network speed, cloud efficiency, and cost. Studies have shown that each query to a public model costs ten times more than a traditional search, and the high energy consumption behind that is even harder to tolerate from an environmental standpoint," Yang Yuanqing said.

"Another fatal obstacle to the adoption of public large language models is their inherent weakness on data security and privacy," Yang said. In his view, getting an accurate answer from the cloud requires revealing real personal data and intentions and allowing that data to become part of public information, which raises personal privacy and corporate security concerns and makes most people reluctant to turn it into an everyday tool. The public large model, good as it is, is essentially a public intelligence; for it to be used more widely, it must be complemented by other AI technologies.

Yang Yuanqing said that the future of AI rests on personal models, enterprise models, and the personal and enterprise agents built on top of them.

Turning to personal and enterprise agents, Yang Yuanqing explained in detail how they work.

A personal agent runs on a personal smart terminal or edge device through model compression technology, receives instructions through natural interaction, and reasons and acts on information stored on the device, such as personal travel records and shopping preferences. It can even predict the next task from the user's thought patterns and behavioral habits, make proactive suggestions, and find solutions autonomously; most importantly, it never shares or sends personal data to the public cloud unless the user authorizes it.
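
As a purely hypothetical illustration of the consent gate described in this vision (the class and method names are ours, not Lenovo's), a personal agent could answer queries from on-device data while refusing any cloud upload that the user has not explicitly authorized:

```python
# Hypothetical sketch of an on-device personal agent with an
# explicit cloud-authorization gate; names are illustrative only.

class PersonalAgent:
    def __init__(self, local_data):
        self.local_data = local_data      # records stay on the device
        self.cloud_authorized = False     # no cloud sharing by default

    def authorize_cloud(self):
        """User explicitly opts in to cloud sharing."""
        self.cloud_authorized = True

    def suggest(self, keyword):
        """Answer locally from on-device records; nothing leaves the device."""
        return [item for item in self.local_data if keyword in item]

    def sync_to_cloud(self):
        """Refuse to upload personal data without explicit user consent."""
        if not self.cloud_authorized:
            raise PermissionError("cloud sharing not authorized by user")
        return list(self.local_data)

agent = PersonalAgent(["trip: Beijing 2023", "purchase: laptop"])
print(agent.suggest("trip"))  # local inference works without any cloud access
```

The design point is simply that local inference requires no network round-trip at all, while any data egress is a separate, opt-in operation.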

"When such personal agents are embedded in a variety of smart devices, the smart devices of the future will become a digital extension of each of us," Yang Yuanqing said. Smart devices will contain agents, and agents will contain large models. Our "memory" will be greatly expanded; our "computing power", the speed at which we reason and form answers, opinions, and conclusions, will be greatly accelerated; and the quality of our reasoning and decision-making will improve significantly.

In Yang's view, enterprises will benefit from similar enterprise agents. Distributed across a company's many terminal devices and infrastructure, an enterprise agent will be able to learn and reason over large amounts of enterprise data, support operational decision-making, improve management efficiency, and raise productivity while safeguarding information security.

"In the future, personal models and enterprise models, in the form of built-in personal and enterprise agents, will coexist with and complement public models, forming hybrid AI. We believe this is the general direction in which AI can be popularized," Yang Yuanqing said.

In addition, Yang Yuanqing addressed how hybrid AI can be put into practice.

He pointed out that implementation is inseparable from coverage across smart devices, smart infrastructure, and smart solutions and services, the three pillars of Lenovo Group's push for AI inclusion. For individual users, Yang Yuanqing believes, the most convenient way to experience a personal agent is to have one built into their own computers, mobile phones, and other personal computing devices.

"For enterprise customers, the implementation of AI must rely on intelligent infrastructure. Every business generates a huge amount of data every day, whether from terminal sensors and devices or from the edge and the cloud. Like oil, this data is an important strategic resource for enterprises and needs to be extracted, transported, and refined. Enterprises therefore need servers, networks, and storage devices to make the best use of massive amounts of data, and then realize enterprise data intelligence through AI models and algorithms. In the future, AI will run not only on the public cloud but also in local data centers, private clouds, hybrid clouds, and even at the edge, with computing power more evenly distributed among cloud, edge, and device. This means that enterprise intelligence requires a hybrid infrastructure in which 'end-edge-cloud-network-intelligence' are all indispensable, and such a technical architecture is the new IT architecture Lenovo strives to build," Yang Yuanqing said.

Yang Yuanqing emphasized that hybrid AI represents "new quality productive forces" and will greatly accelerate the intelligent transformation of all walks of life. From devices to infrastructure to solutions and services, these are the three pillars of Lenovo's hybrid AI.

British AI startup Stability AI announces 10% layoffs

Titanium Media App reported on April 19 that following the departure of the controversial former CEO Emad Mostaque, British AI startup Stability AI recently announced a 10% layoff.



After a period of unsustainable growth, Stability AI laid off more than 20 employees to right-size the business, according to an internal memo.

It is reported that Stability AI has about 200 employees, which means that the proportion of layoffs is about 10%.

According to a person familiar with the matter, most of the affected employees are in the area of business operations, and they have been informed of the layoffs.

Stability AI's newly appointed co-CEOs, Shan Shan Wong and Christian Laforte, said in an email that the company needed to restructure parts of its business, which meant saying goodbye to some colleagues: "We have notified the affected employees and will support them through the layoffs."

On March 23, local time, Stability AI announced that CEO Emad Mostaque would leave the company to pursue decentralized AI, with Wong and Laforte stepping in to fill the vacancy.

Currently, Stability AI is still looking for a suitable CEO to fill the top leadership position. The company said it will continue to operate as usual and is still releasing new products.

Google has announced a new round of layoffs and will restructure its finance team

Titanium Media App reported on April 18 that Google, a subsidiary of Alphabet, said on Wednesday that it is carrying out a new round of layoffs, restructuring some teams, and moving some positions overseas. The current round of layoffs began on March 10 and is expected to last until August.

According to two current employees, they have been notified of layoffs this week, and the affected teams include teams such as Google's business services, and "the scale of this restructuring is quite large."

Ruth Porat, Google's CFO, also said in a memo on Wednesday that the company's move was to dedicate more resources to supporting AI investments. The restructuring will impact Google's finance teams in the U.S. and other countries, including Asia Pacific and EMEA.

A Google spokesperson said the restructuring is part of normal business processes: affected employees can reapply for other jobs within Google, a small portion of positions will move to overseas centers Google is investing in, and the company will provide severance compensation and re-employment services.

"We are responsibly investing in our top priorities and in the big opportunities ahead. To meet these opportunities from the best possible position, over the second half of 2023 and into 2024, some of our teams made adjustments to increase efficiency, work better, reduce hierarchy, and realign resources to the highest product priorities," a Google spokesperson said.

(This article was first published on the Titanium Media App; authors: Ren Yingwen, Lin Zhijia; editor: Lin Zhijia)
