Tianfeng Electronics: The expansion trend of professional cloud vendors is gradually clear, and it is recommended to pay attention to cloud AI-related enterprises and computing power chip-related enterprises

1. NVIDIA and Microsoft invested in CoreWeave, whose cloud services support their computing power layout. CoreWeave is a specialized cloud provider built for large-scale GPU-accelerated workloads. 1) Products: CoreWeave has seven product modules and has enhanced its products and expanded through the acquisition of Conductor Technologies. 2) Core advantages: Unlike traditional cloud providers, CoreWeave runs a Kubernetes-native cloud computing platform, managing and scheduling container and GPU resources through Kubernetes APIs and the control plane. According to the company, its compute is up to 80% cheaper than competitors' products (no infrastructure overhead; the NVIDIA HGX H100 GPU component costs $4.76 per hour) and up to 35 times faster. 3) Cooperation: NVIDIA (a participant in the company's Series B financing) and Microsoft (investing billions of dollars in cloud computing infrastructure) have partnered with CoreWeave to cope with the growing demand for computing power and its high cost. In addition, the company has partnerships with Tarteel AI, Anlatan (creator of NovelAI), Stability AI (creator of Stable Diffusion), EleutherAI (an open-source AI and machine learning project), and the visual effects and animation companies Spire Animation and PureWeb.

2. AIGC has accelerated significantly, and demand for computing power and cloud services is elastic. The development of AIGC toward large models, multimodality, and commercialization keeps expanding computing power demand. With the data scale and algorithm models of AIGC large models compounding, demand for computing power will keep rising. International technology giants are pushing the commercialization of AI models, further stimulating that demand. According to the China Academy of Information and Communications Technology, global computing power is expected to reach 56 ZFLOPS by 2030. The advent of the AIGC large-model era has made intelligent computing power a common demand, reshaping the mode and pattern of cloud computing services. According to the China Business Industry Research Institute, the global cloud computing market reached US$356.6 billion in 2022 and is expected to exceed US$400 billion in 2023.

3. The AIGC infrastructure layer is mature, with cloud services and chips as core resources. Of all the layers, the infrastructure layer is generally considered the most mature, stable, and commercially established. It centers on two core resources: 1) Cloud service providers: they dominate the infrastructure layer by providing hyperscale and purpose-specific computing, storage, and networking; because AI demands huge machine computing power, many companies turn to cloud service infrastructure to meet it. 2) Chips: AI computing power chips are the cornerstone of ChatGPT-like models. We believe that GPUs, with their large computing power and versatility, may be the first choice for large computing power applications in the short term, and that the boundary between GPUs and ASICs may blur considerably in the future.

Risk warning: AI development and commercialization falling short of expectations; intensifying competition in the AI industry; policy uncertainty.

1. NVIDIA and Microsoft invested in CoreWeave, whose cloud services support their computing power layout

CoreWeave is a specialized cloud provider built for large-scale GPU-accelerated workloads. Founded in 2017 by Michael Intrator, Brian Venturo, and Brannin McBee, CoreWeave began as an Ethereum mining company and transformed into a cloud computing platform company. CTO Brian Venturo was an Ethereum mining enthusiast who chose NVIDIA hardware for the company's mining operation (NVIDIA later became an investor in CoreWeave). CoreWeave's rapid growth is also backed by investors including Magnetar Capital, NVIDIA, former GitHub executive Nat Friedman, and former Apple executive Daniel Gross.

1.1. Seven product modules, with M&A enabling product enhancement and expansion

CoreWeave has seven major product modules: NVIDIA HGX H100, GPU Compute, CPU Compute, Kubernetes, Virtual Servers, Storage, and Networking:

1) NVIDIA HGX H100: Designed for large-scale HPC and AI workloads. Compared with the NVIDIA HGX A100, high-performance computing (HPC) applications run up to 7x more efficiently, AI training of the largest models is up to 9x faster, and AI inference is up to 30x faster. CoreWeave's HGX H100 distributed training clusters use a rail-optimized design on NVIDIA Quantum-2 InfiniBand networking, support in-network collectives via NVIDIA SHARP, and provide 3.2 Tbps of GPUDirect bandwidth per node. Optimized for NVIDIA GPU-accelerated workloads, CoreWeave makes it easy to run existing workloads with little or no change, and its fast, flexible infrastructure helps achieve optimal performance.
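As a sanity check on the bandwidth figure above, the minimal sketch below reproduces the 3.2 Tbps per-node number under one common assumption (ours, not stated in the report): eight GPUs per HGX H100 node, each paired with a 400 Gb/s NDR InfiniBand rail.

```python
# Back-of-envelope check of the 3.2 Tbps per-node GPUDirect figure.
# Assumption (not from the report): 8 GPUs per HGX H100 node, each with
# one 400 Gb/s NDR InfiniBand link in the rail-optimized fabric.
gpus_per_node = 8
link_gbps = 400  # per-GPU InfiniBand rail, in Gb/s

node_bandwidth_tbps = gpus_per_node * link_gbps / 1_000
print(f"Per-node GPUDirect bandwidth: {node_bandwidth_tbps} Tbps")  # 3.2 Tbps
```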

2) GPU Compute: CoreWeave is a cloud provider built first for GPU-accelerated workloads, with NVIDIA GPUs of various classes at its core. As a specialized cloud provider, CoreWeave matches compute to the complexity of the workload on infrastructure that lets users scale. With more than ten NVIDIA GPU models designed for compute-intensive use cases, CoreWeave enables customers to tune the performance and cost of their workloads. CoreWeave is built for scalable, just-in-time consumption: GPUs are available when customers need them, with configurable instances, transparent pricing, and intuitive billing.

3) CPU Compute: CoreWeave also operates a standalone CPU server fleet. Whether for final-frame rendering, data analysis, or video transcoding, CoreWeave's CPU-only instances provide the scale, scope, and flexibility needed for general-purpose computing, and they can substantially parallelize such work. Users can scale their applications and schedule and manage CPU workloads from the same control plane. CoreWeave's CPU compute portfolio offers cost-tuned performance options for any use case, spinning up tens of thousands of CPU cores on demand to meet tight render deadlines or run data analytics at scale.

4) Kubernetes: Compared with traditional VM-based deployments, CoreWeave's customers gain higher portability, less overhead, and lower management complexity by managing all resources through a single orchestration layer. Thanks to container image caching and a dedicated scheduler, CoreWeave workloads can be up and running in as little as 5 seconds. CoreWeave provides instant access to large pools of resources in the same cluster, with workloads requesting only the CPU cores and RAM they need, plus an optional number of GPUs. CoreWeave handles all control plane infrastructure, cluster operations, and platform integration, so customers can spend more time building products.
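To make the resource-request model concrete, here is a minimal sketch using the official Kubernetes Python client. It shows the generic Kubernetes pattern of a pod asking for exactly the CPU cores and RAM it needs plus an optional GPU count; the image name and namespace are hypothetical, and nothing here is a CoreWeave-specific API.

```python
# Minimal sketch (generic Kubernetes, not a CoreWeave-specific API):
# a pod requesting only the CPU cores and RAM it needs, plus GPUs.
from kubernetes import client, config

config.load_kube_config()  # load credentials from the local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[client.V1Container(
            name="trainer",
            image="example.com/trainer:latest",  # hypothetical image
            resources=client.V1ResourceRequirements(
                requests={"cpu": "8", "memory": "32Gi"},
                limits={"nvidia.com/gpu": "2"},  # optional GPU count
            ),
        )],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```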

5) Virtual Servers: CoreWeave's virtual servers are built on top of Kubernetes, using the open-source project KubeVirt to handle workloads that are not easily containerized. Virtual servers launch in seconds from the UI or via the CoreWeave Kubernetes API. CoreWeave passes dedicated GPUs through to the virtual machine via PCI passthrough, with no GPU virtualization or shared resources, so bare-metal performance is preserved. Like everything at CoreWeave, virtual servers are customizable: customers can match their workloads to NVIDIA GPU types, switch types in seconds, and run fully supported Linux and Windows virtual servers.
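Since the virtual servers are described as KubeVirt-based, a hedged sketch of the general KubeVirt pattern may help: a VirtualMachine custom resource with a GPU passed through to the guest. The disk image and the GPU deviceName below are hypothetical placeholders (the real deviceName depends on the cluster's device plugin), and this is generic KubeVirt usage, not CoreWeave's exact API.

```python
# Hedged sketch of generic KubeVirt usage (not CoreWeave's exact API):
# a VirtualMachine with a dedicated GPU passed through via PCI.
from kubernetes import client, config

config.load_kube_config()

vm = {
    "apiVersion": "kubevirt.io/v1",
    "kind": "VirtualMachine",
    "metadata": {"name": "gpu-vm"},
    "spec": {
        "running": True,
        "template": {"spec": {
            "domain": {
                "devices": {
                    "disks": [{"name": "root", "disk": {"bus": "virtio"}}],
                    # deviceName is a hypothetical device-plugin resource.
                    "gpus": [{"name": "gpu0",
                              "deviceName": "nvidia.com/EXAMPLE_GPU"}],
                },
                "resources": {"requests": {"cpu": "8", "memory": "32Gi"}},
            },
            "volumes": [{"name": "root", "containerDisk": {
                "image": "quay.io/containerdisks/fedora:latest"}}],
        }},
    },
}
client.CustomObjectsApi().create_namespaced_custom_object(
    group="kubevirt.io", version="v1", namespace="default",
    plural="virtualmachines", body=vm,
)
```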

6) Storage: CoreWeave offers a range of storage options depending on the workload. CoreWeave Cloud Storage Volumes are built on Ceph, a software-defined, scale-out, enterprise-class storage platform designed to provide highly available, high-performance storage for cloud-native workloads. CoreWeave uses triple replication spread across multiple servers and data center racks, built for high availability. All storage volumes can be mounted by containerized workloads and virtual servers, giving customers the flexibility to change the underlying compute resources or deployment methods and to scale quickly from 1 GB to petabytes (1 PB = 1,000 TB).
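Because the volumes mount through standard Kubernetes interfaces, requesting storage reduces to an ordinary PersistentVolumeClaim. The sketch below shows that generic pattern; the storage class name is a hypothetical placeholder for whatever class a provider exposes for its Ceph-backed volumes.

```python
# Generic Kubernetes PVC sketch; "block-nvme" is a hypothetical
# storage class standing in for a provider's Ceph-backed class.
from kubernetes import client, config

config.load_kube_config()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="training-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="block-nvme",  # hypothetical class name
        resources=client.V1ResourceRequirements(
            requests={"storage": "1Gi"}),  # start small, resize later
    ),
)
client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc,
)
```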

7) Networking: CoreWeave's Kubernetes-native network design pushes functionality into the network fabric, so customers spend less time managing IPs and VLANs while getting the performance and security they need. Through regionally optimized transit providers, CoreWeave's public connectivity offers low-latency access to more than 51 million people in the United States. Customers can manage firewalls with Kubernetes network policies or deploy VPC networks for Layer 2-native environments. CoreWeave deploys load balancer services in front of customers' applications to provide highly available, scalable entry points at no additional cost, and CoreWeave Virtual Private Cloud (VPC) returns network control to users when a Layer 2 environment is needed.
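The load-balancer workflow described above is, in Kubernetes terms, just a Service of type LoadBalancer. A minimal, generic sketch follows; the names and ports are hypothetical.

```python
# Generic Kubernetes LoadBalancer Service sketch; the app label and
# ports are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()

svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="inference-api"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "inference"},  # pods this service fronts
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
client.CoreV1Api().create_namespaced_service(namespace="default", body=svc)
```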

CoreWeave is aggressively enhancing its offerings and expanding through mergers and acquisitions. In January 2023, CoreWeave announced the acquisition of Conductor Technologies, a developer of cloud-based task management services. The acquisition strengthens CoreWeave's products for the media and entertainment industry and helps VFX and animation studios move workloads to the cloud easily. The deal also drove a rapid increase in CoreWeave's headcount, to more than 90 as of January 25, 2023, and Conductor CEO Mac Moore now runs CoreWeave's media and entertainment division.

1.2. Three advantages differentiate CoreWeave in the generative AI market

Compared with traditional cloud service providers, CoreWeave focuses on generative AI and deep GPU-acceleration technology, and holds a price advantage:

1) Focus on generative AI: Traditional hyperscale cloud providers such as AWS, Microsoft, and Google Cloud have built broad portfolios of large-scale cloud services and huge data centers to address almost every potential customer need. CoreWeave takes the opposite approach, focusing on providing a platform for generative AI at a very competitive price. Its roster of generative AI collaborations is impressive: CoreWeave works with the well-known generative AI companies Tarteel AI and Anlatan (creator of NovelAI), with Stability AI (creator of Stable Diffusion), and with EleutherAI (a machine learning and open-source AI research collective). Meanwhile, visual effects (VFX) and animation companies such as Spire Animation and PureWeb have also partnered with CoreWeave.

2) Deep GPU-acceleration technology: The CoreWeave cloud is a Kubernetes-native architecture built for large-scale GPU-accelerated workloads. Kubernetes is a container orchestration engine that supports automated container deployment, large-scale elastic scaling, and unified management of containerized applications. Managing GPU resources uniformly within Kubernetes improves deployment efficiency, tenant isolation, and unified resource scheduling and management. This focus on GPU acceleration lets CoreWeave outperform other cloud providers on more specialized use cases, especially AI-specific needs. Generative AI systems such as the ChatGPT chatbot and the Stable Diffusion image generator must run vast numbers of nearly identical tasks at scale, and because GPUs excel at exactly this, speed and throughput improve greatly.
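A toy illustration of why such workloads suit GPUs: thousands of near-identical tasks collapse into one batched tensor operation that massively parallel hardware executes in lockstep. The sketch below uses NumPy on CPU purely for brevity; the same batched call is what GPU frameworks parallelize.

```python
# Toy illustration: many near-identical tasks become one batched op.
import numpy as np

batch, d_in, d_out = 4096, 512, 512
x = np.random.randn(batch, d_in).astype(np.float32)  # 4096 similar tasks
w = np.random.randn(d_in, d_out).astype(np.float32)  # shared weights

# One batched matrix multiply replaces 4096 independent row products;
# GPUs excel precisely at this kind of uniform, data-parallel work.
y = x @ w
print(y.shape)  # (4096, 512)
```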

3) Price advantage: According to the company's website, its cloud infrastructure serves a wide range of use cases, including visual effects rendering, machine learning and AI, large-scale batch processing, and pixel streaming, with up to 35x faster processing and 80% lower cost than general-purpose alternatives. On one hand, CoreWeave's Kubernetes-native cloud enables portability: workloads can take full advantage of hybrid cloud and deploy to any cloud provider, helping customers reduce infrastructure build-out costs. On the other hand, CoreWeave uses resource-based pricing: customers pay only for the resources they actually use, while they use them. Beyond this, CoreWeave claims the lowest on-demand prices of any large cloud provider and the industry's widest range of NVIDIA GPUs. Take CoreWeave's GPU cloud pricing as an example: pricing is à la carte, with the total instance cost combining the GPU component, the number of vCPUs, and the amount of RAM allocated. For simplicity, CPU and RAM costs are the same for each base unit; the only variable is the GPU selected for the customer's workload or virtual server.
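An illustrative sketch of that component-based pricing follows. The $4.76/hour HGX H100 GPU-component rate is the one cited earlier in this report; the per-vCPU and per-GB RAM rates are hypothetical placeholders, since the report does not quote them.

```python
# Sketch of component ("a la carte") pricing: total hourly cost =
# GPU component + vCPUs + RAM. The GPU rate of $4.76/hr is from the
# report; the vCPU and RAM rates below are hypothetical placeholders.
def instance_cost_per_hour(gpu_rate: float, n_gpus: int,
                           n_vcpus: int, ram_gb: int,
                           vcpu_rate: float = 0.01,
                           ram_gb_rate: float = 0.005) -> float:
    return gpu_rate * n_gpus + vcpu_rate * n_vcpus + ram_gb_rate * ram_gb

# Example: an 8-GPU HGX H100 instance with 96 vCPUs and 960 GB RAM.
print(f"${instance_cost_per_hour(4.76, 8, 96, 960):.2f}/hr")  # $43.84/hr
```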

1.3. NVIDIA and Microsoft shine in AIGC and hold strategic cooperation with CoreWeave

NVIDIA, Microsoft, and other giants have performed strongly in AIGC. NVIDIA holds the lifeblood of AI computing power: Jensen Huang describes the NVIDIA H100 as "the world's first computer chip designed for AIGC," helping AI systems output smooth, natural text, images, and other content faster. Demand for AI is currently high and competition on the AI computing chip track is fierce, but NVIDIA holds a stable advantage through versatility and ease of use. In the first quarter of 2023, NVIDIA's total revenue reached $7.19 billion, and on May 30 (US Eastern time) NVIDIA became the world's first chip company with a market value above $1 trillion. Microsoft released a full suite of AI products this quarter; demand for AI computing power attracted new customers, and the rise of the Bing search engine is expected to squeeze some of Google's market share and revenue. Microsoft's first-quarter revenue was $52.86 billion, driven mainly by cloud services, with Azure and other cloud services revenue up 27%.

NVIDIA and Microsoft cooperate with CoreWeave to expand computing power reserves. AIGC is sought after by investors: according to PitchBook, AIGC startups closed 46 deals totaling about $1.7 billion in the first quarter of 2023, with another $10.68 billion announced during the quarter. As of May 31, 2023, CoreWeave's total financing reached $576.5 million, with NVIDIA participating in the Series B round. According to CB Insights, CoreWeave's valuation reached $2-2.221 billion in April 2023. A new generation of cloud service providers, represented by CoreWeave, can compete with traditional hyperscale providers by customizing hardware and offering lower prices for interchangeable AI workloads. According to CNBC, Microsoft will invest billions of dollars in CoreWeave's GPU-powered cloud computing infrastructure over the next few years to ensure that OpenAI has enough computing power to operate, reflecting how tech giants combine infrastructure, models, and applications to cope with growing demand for computing power and its high cost.

2. AIGC has significantly accelerated, and the demand for computing power and cloud services is elastic

2.1. AIGC large models, multimodality, and commercialization keep expanding computing power demand

AIGC large models drive the demand for computing power. Computing power is the core productive force of the digital economy era and one of the key supports and drivers of AI technology. Taking the AIGC large model ChatGPT as an example, computing power demand falls into two categories, training and inference, which in practice break down into three stages: pre-training, fine-tuning, and daily operation. According to the OpenAI paper, the GPT-3 model has about 175 billion parameters and was pre-trained on 45 TB of data, equivalent to roughly 300 billion tokens in the training set, putting the training-stage computing requirement at about 3.15×10⁸ PFLOPs. Beyond training, strong computing power support is also needed for inference. With data scale and algorithm models compounding, demand for computing power will keep growing.
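The training-compute figure can be reproduced with the standard rule of thumb of roughly 6 FLOPs per parameter per training token for transformer models (the rule of thumb is our assumption; the parameter and token counts are from the report):

```python
# Reproducing ~3.15e8 PFLOPs with the ~6 * params * tokens rule of thumb.
params = 175e9   # GPT-3 parameters (per the OpenAI paper)
tokens = 300e9   # pre-training tokens (per the report)

total_flops = 6 * params * tokens   # ~3.15e23 FLOPs
pflops = total_flops / 1e15         # 1 PFLOP = 1e15 FLOPs
print(f"{pflops:.3g} PFLOPs")       # ~3.15e8 PFLOPs
```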

Multimodal AIGC may become a new driver of computing power demand. After 2021, artificial intelligence gradually shifted from single-modal to multimodal AI. AIGC, one of the most watched directions in AI, integrates AI at its core with multimodal interaction and other technologies. As algorithms iterate, AIGC can generate content in the form of text, images, audio, and video. In March, OpenAI released GPT-4, a model that accepts both text and image inputs. Google's PaLM-E multimodal model, with up to 562 billion parameters, must integrate multiple models to embed its information flow, making the overall model larger and further increasing the demand for computing resources.

Computing power underpins the commercialization of AIGC. International tech giants are pushing AI models toward commercialization. Microsoft announced Microsoft Copilot, which combines LLMs, including GPT-4, with business data from Microsoft 365 applications and the Microsoft Graph, bringing users a whole new way of working. Meanwhile, GPT-4 reaches more commercial partners through open APIs, creating more commercial applications. Google's PaLM 2, Meta's LLaMA, and Amazon's Bedrock all reflect the commercialization of AIGC. As AIGC commercializes, enterprises with strong computing power resources will have more business possibilities, which we believe will further stimulate the demand for computing power. According to the China Academy of Information and Communications Technology, global computing power is expected to reach 56 ZFLOPS by 2030.

2.2. AIGC models and computing power demand grow; the pattern and scale of cloud computing services are expected to improve

The development of AIGC stimulates demand for cloud computing services. With the advent of the AIGC large-model era, intelligent computing power has become a common demand, further reshaping the mode and pattern of cloud computing services. AI cloud services provide platform support for AIGC development: pre-training model development demands cloud services heavily, and AI cloud services supply AI development modules that reduce developers' costs and shorten product development cycles through diversified service models, empowering model development with AI. As AIGC large models mature, the cloud computing landscape will shift from IaaS, a platform based on computing power, toward MaaS (Model as a Service), a platform based on model capability. The gradual expansion of cloud AI capabilities will also stimulate demand for cloud computing services. According to the China Business Industry Research Institute, the global cloud computing market reached US$356.6 billion in 2022 and is expected to exceed US$400 billion in 2023.

3. The AIGC infrastructure layer is mature, with cloud services and chips as core resources

The infrastructure layer leads the AIGC technology stack in maturity. The generative AI technology stack consists of three layers: the infrastructure layer, the model layer, and the application layer. The infrastructure layer comprises hyperscale computing and chips, which serve as AIGC's infrastructure and hardware foundation, respectively. The leading enterprises at this layer mainly provide computing power, network, storage, and middleware infrastructure, while chip makers offer chips specifically optimized for AI workloads. The model layer reaches the application layer mainly in two ways: vertical integration of a base model, or application development on top of a base model plus fine-tuned models, effectively forming AIGC's platform. Of all the layers, the infrastructure layer is generally considered the most mature, stable, and commercially established.

This layer focuses on two core resources, cloud services and chips:

1) Cloud service providers: Cloud service providers dominate the infrastructure layer by providing hyperscale and purpose-specific compute, storage, and networking technologies. Their business model has proven effective: scalable computing resources combined with consumption-based pricing. To stabilize AIGC workloads, cloud providers have signed commitments with model providers to secure future business. AI demands huge machine computing power, and many companies turn to cloud service infrastructure to solve the problem. By market share, Amazon currently leads the cloud service market, with Microsoft, IBM, Google, and Alibaba Cloud also holding significant shares. Notably, Azure has formed an important partnership with OpenAI, Google with Anthropic, and AWS with Stability AI.

2) Chips: The other rapidly developing key layer in infrastructure is chips. AI computing power chips are the cornerstone of ChatGPT-like models, which require them in large numbers, with strong demand for GPUs, FPGAs, and ASICs. Here NVIDIA and AMD are the industry leaders. NVIDIA's Ampere and Hopper GPU series, designed for training and inference workloads, together with its Selene supercomputing cluster, can accelerate training times. Meanwhile, AMD's CDNA2 architecture is purpose-built for supercomputing in machine learning applications, driving competition in the high-performance computing market. We believe that in the short term GPUs, with their large computing power and versatility, may be the first choice for large computing power applications, and that the boundary between GPUs and ASICs may blur considerably in the future, forming substitutional competition.

4. Investment Advice

It is recommended to pay attention to cloud AI-related enterprises: Cambricon, Haiguang Information (covered by the Tianfeng Computer Team), Loongson Zhongke, Unigroup Guowei, Fudan Microelectronics, Anlu Technology, Amazon, Microsoft (covered by the Tianfeng Overseas Team), Google, Oracle, etc.;

It is recommended to pay attention to computing power chip-related enterprises: NVIDIA (covered by the Tianfeng Overseas Team), AMD, Intel, Jingjiawei (jointly covered by the Tianfeng Computer Team), etc.

5. Risk Warning

AI development and commercialization falling short of expectations: AIGC technology iteration and the commercialization process may be affected by software and hardware R&D progress and market feedback.

Intensified competition in the AI industry: domestic and foreign technology companies building out the AIGC industry chain may rapidly increase supply in the AI industry, pushing competition beyond expectations.

Policy uncertainty: The AIGC industry may be subject to regulatory restrictions on data security and copyright in the future.

This article is excerpted from a brokerage research report.