From data to chips, developing AI is getting more and more expensive, and only tech giants can "afford it"?
CBN
2024-06-02 19:07 · Posted on the official account of Shanghai Yicai
More training data, larger models, more chips, and more data centers are the "infrastructure" behind advances in artificial intelligence (AI), and they are driving up costs for technology companies.
In May, OpenAI signed a five-year content licensing agreement worth more than $250 million with News Corp, allowing the former to use the latter's news publication content to answer user queries and train AI. Previously, image provider Shutterstock signed deals of $25 million to $50 million with big tech companies such as Apple, Meta, Google, Amazon, and others to provide its vast library of images and videos for AI training.
Irene Tunkel, chief U.S. equities strategist at BCA Research, a global economic analysis firm, told CBN that technology companies have done a great deal of work in AI, but unless they sell AI-related "tools and equipment" or cloud storage, they are still spending more on AI capital expenditures than they are earning from AI.
However, the massive capital expenditure requirements will inevitably leave behind companies that cannot afford the costs, and the players able to compete in this game will continue to be the tech giants we already know well.

"Infrastructure" is expensive, capital expenditures are high
In the generative AI ecosystem, "infrastructure companies" that provide products and services such as chips and computer hardware, cloud platforms and services, databases, networking, and analytics support the smooth development and deployment of models, according to Tunkel. And as James Betker, a researcher at OpenAI, has argued, the data used to train models is the key to increasingly complex and powerful AI systems.
But where does the data come from? According to reports, generative AI models are trained primarily on images, text, audio, video, and other data scraped from public web pages, some of it copyrighted. OpenAI, for example, transcribed more than a million hours of video to train its flagship model, GPT-4, without permission from the video platform or its creators. Meta has likewise been training its models on images and videos from its own Instagram platform, and only allows EU users to opt out of that mechanism.
As legal proceedings mounted, AI companies began to pay for data. The online community Reddit, for example, says it has earned hundreds of millions of dollars by licensing data to organizations such as Google and OpenAI. According to reports, the market for AI training data is expected to grow from about $2.5 billion today to nearly $30 billion within a decade.
Model training isn't cheap either. OpenAI CEO Sam Altman has said that training GPT-4 cost more than $100 million. Dario Amodei, CEO of AI startup Anthropic, also put the training cost of current AI models at about $100 million. "The models being trained now, and the models that will come out later this year or early next year, cost close to $1 billion," he said. "I think in 2025 and 2026, our costs will be close to $5 billion or $10 billion."
Chip spending is an even bigger line item. According to reports, Nvidia's H100 graphics chip sells for around $30,000. Meta CEO Mark Zuckerberg has said the company plans to purchase 350,000 H100 chips by the end of this year to support its AI research. In addition, Amazon's cloud computing division leases large clusters of workhorse processors manufactured by Intel to customers for about $6 per hour.
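The scale of that purchase is easy to put in perspective. A back-of-envelope sketch, assuming the reported ~$30,000 price applies to every unit (actual negotiated bulk prices are not public):

```python
# Rough estimate of Meta's implied H100 outlay, using the figures cited above.
# The $30,000 unit price is the reported list price; bulk discounts are not
# public, so treat the result as an upper-bound sketch, not a confirmed figure.
H100_UNIT_PRICE_USD = 30_000   # reported price per Nvidia H100
PLANNED_CHIPS = 350_000        # Meta's stated purchase target for the year

total_usd = H100_UNIT_PRICE_USD * PLANNED_CHIPS
print(f"Implied outlay: ${total_usd / 1e9:.1f} billion")  # Implied outlay: $10.5 billion
```

On these assumptions, the hardware alone would run to roughly $10.5 billion, comparable to the data-center commitments discussed below.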
When it comes to cloud service centers, the cost of each data center runs into the billions. Microsoft and G42, a UAE AI company, for example, announced a joint $1 billion investment to build a data center in Kenya, and Microsoft has pledged €4 billion to build AI data centers and cloud infrastructure in France. Over the past two years, Amazon has also pledged to spend $148 billion building and operating data centers around the world to meet surging demand for AI applications and other digital services.
Overall, Microsoft said in April that capital expenditures for its most recent quarter were $14 billion, up 79% from the same period last year, and that these costs had "increased significantly" due to AI infrastructure investments. Google's parent company, Alphabet, likewise reported spending $12 billion in the latest quarter, up 91% year on year, and expects spending to "meet or exceed" that level in the second half of the year. Meta, meanwhile, has raised its investment forecast for this year and now expects capital spending of $35 billion to $40 billion, with the upper end of the range up 42%.
What do the antitrust authorities think?
Kyle Lo, a senior applied research scientist at the Allen Institute for Artificial Intelligence (AI2) in the United States, believes the high cost of training will exclude small companies from "developing or researching AI models."
Lo said the growing emphasis on large, high-quality training datasets will concentrate AI development in the small number of companies with multibillion-dollar budgets that can afford such data. Major innovations in synthetic data or infrastructure could disrupt the status quo, but neither appears imminent.
"Overall, entities that manage potentially useful content for AI development have an incentive to lock down their material," Lo said. "With data access shut off, we're basically giving a green light to a few early movers in data acquisition and pulling up the ladder so that no one else can get the data to catch up."
At present, antitrust authorities in Europe, the United States, and the United Kingdom have also set their sights on the tech giants' position in the AI field.
For example, the UK's Competition and Markets Authority (CMA) said in a report published in April this year that partnerships among the major players in the AI foundation model market could entrench monopolies along the value chain. In May, the U.S. Department of Justice (DOJ) also announced increased scrutiny of competition in the AI space. Jonathan Kanter, head of the DOJ's Antitrust Division, recently said that the antitrust policies of the past 40 years have failed to effectively protect the public interest, leaving a small number of companies in control of markets and information flows. He highlighted the high fees that content creators and developers face in the current market environment, especially as large companies tighten their control over content creation and distribution.
When it comes to Big Tech's acquisitions of AI startups, Ninette Dodoo, head of Freshfields Bruckhaus Deringer …
Wu Han, a partner at King & Wood Mallesons, told Yicai that China, the United States, and Europe share considerable common ground in AI digital governance, such as a focus on the transparency and disclosure of AI systems, training data governance, intellectual property protection, content safety, and ethics.
(This article is from Yicai)