
20230703 Google made a big move to develop a new generation of artificial intelligence chip TPU v4

author:Courtyard 39

TPU v4, the fourth version of Google's Tensor Processing Unit, is Google's fifth-generation domain-specific architecture (DSA), designed to perform machine learning tasks. Compared with the previous-generation TPU v3, TPU v4 improves performance by an average of 2.7 times at a scale of 64 chips. A pod consisting of 4096 TPU v4 chips can reach about 1 exaflop of computing power, which is equivalent to 10 million laptops combined. By accelerating machine learning workloads, TPU v4 improves the training efficiency of models.
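As a quick sanity check on these figures, the short Python sketch below divides the quoted pod throughput by the quoted chip count and by the quoted laptop count. It uses only the numbers stated above and assumes the throughput is shared evenly, so the results are rough implied averages rather than official per-chip specifications.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the text.
POD_FLOPS = 1e18          # 1 exaflop, as quoted for a full pod
CHIPS_PER_POD = 4096      # chips in one pod
LAPTOPS = 10_000_000      # "10 million laptops"


def per_unit_flops(total_flops: float, units: int) -> float:
    """Divide an aggregate throughput evenly across identical units."""
    return total_flops / units


per_chip = per_unit_flops(POD_FLOPS, CHIPS_PER_POD)
per_laptop = per_unit_flops(POD_FLOPS, LAPTOPS)

print(f"per chip:   {per_chip / 1e12:.0f} TFLOPS")    # about 244 TFLOPS
print(f"per laptop: {per_laptop / 1e9:.0f} GFLOPS")   # about 100 GFLOPS
```

In other words, the comparison implies roughly 244 teraflops per chip and about 100 gigaflops per laptop.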

TPU v4 is built around basic mathematical operations such as matrix multiplication and vector addition. It improves performance by keeping a large amount of data in local on-chip memory, which reduces accesses to external memory. In addition, TPU v4 uses several optimization techniques, such as hardware-optimized convolution algorithms, the Winograd algorithm for accelerated matrix multiplication, and compression to reduce the amount of data transferred.
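The benefit of keeping data in local memory can be illustrated with a small, purely software sketch. The code below is not a model of the TPU v4 hardware; it assumes a simplified cost model in which every operand fetched from "external memory" counts as one read, and it compares a naive matrix multiplication with a tiled version that first copies small blocks into a local buffer. The function names and the tile size are illustrative choices.

```python
def matmul_naive(a, b):
    """Multiply square matrices, counting one external read per operand use."""
    n = len(a)
    reads = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
                reads += 2  # a[i][k] and b[k][j] both come from external memory
    return c, reads


def matmul_tiled(a, b, tile):
    """Same product, but operands are staged through a small local buffer."""
    n = len(a)
    assert n % tile == 0, "this sketch assumes the tile size divides n"
    reads = 0
    c = [[0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # Copy one tile of each operand into "local memory" once.
                a_loc = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_loc = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                reads += 2 * tile * tile
                # All multiply-accumulates below reuse the local copies.
                for i in range(tile):
                    for j in range(tile):
                        acc = 0
                        for k in range(tile):
                            acc += a_loc[i][k] * b_loc[k][j]
                        c[i0 + i][j0 + j] += acc
    return c, reads


n = 8
a = [[i + j for j in range(n)] for i in range(n)]
b = [[(i * j) % 5 for j in range(n)] for i in range(n)]
c1, r1 = matmul_naive(a, b)
c2, r2 = matmul_tiled(a, b, tile=4)
print(c1 == c2, r1, r2)  # True 1024 256
```

Under this cost model the tiled version produces the same result with a quarter of the external reads, and the saving grows with the tile size that fits in local memory.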

Specifically, TPU v4 includes the following components:

  1. Matrix multiplication unit: TPU v4's matrix multiplication unit uses a Winograd-based algorithm that reduces memory access and data transfer by transforming the input data into smaller matrices. The smaller matrices fit the chip's local memory better, without sacrificing performance.
  2. Vector addition unit: The vector addition unit of TPU v4 has also been optimized. It adds multiple vectors using chained operations, which improves computational efficiency.
  3. Compression technology: TPU v4 compresses input data and weights to reduce memory access and data transfer. Specifically, TPU v4 supports two compression formats: reciprocal k-bit (k=2 or 3) compression and k-bit (k=4, 6, or 8) compression.
  4. Hardware-optimized convolution algorithms: TPU v4 also adopts hardware-optimized convolution algorithms, such as local sampling and depthwise grouped convolution, which make better use of the chip's computing power and memory bandwidth without degrading performance.
  5. Chip-to-chip communication: TPU v4 chips exchange data through dedicated communication channels. These channels enable high-speed data transmission and synchronous operation, which supports the performance and stability of the entire system.
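To make the idea of k-bit compression concrete, here is a minimal Python sketch of plain uniform k-bit quantization. It is a generic textbook scheme chosen for illustration and is an assumption, not the actual TPU v4 format; the "reciprocal k-bit" variant mentioned above is not modeled. The helper names `quantize` and `dequantize` are hypothetical.

```python
def quantize(values, k):
    """Map floats to unsigned k-bit integer codes plus (scale, offset)."""
    lo, hi = min(values), max(values)
    levels = (1 << k) - 1                      # largest k-bit code
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
for k in (4, 6, 8):
    codes, scale, lo = quantize(weights, k)
    restored = dequantize(codes, scale, lo)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    # Storage shrinks to k/32 of a 32-bit float; the error is at most scale/2.
    print(f"k={k}: size ratio {k / 32:.3f}, max error {worst:.5f}")
```

Fewer bits mean less data to move but a larger rounding error, which is the trade-off any such compression format has to balance.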

Combining the above optimization techniques, TPU v4 can efficiently handle machine learning tasks, thereby improving the training efficiency of models.
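The Winograd idea named among the components can be illustrated with its smallest standard case, usually written F(2,3): two outputs of a 3-tap filter are computed with 4 multiplications instead of 6. Winograd's minimal filtering is normally described for convolution, and the sketch below shows only that textbook formulation in plain Python; how (or whether) TPU v4 applies it in hardware is not something this example claims to reproduce.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter computed directly: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """Same two outputs with 4 multiplications between data and filter terms."""
    # Filter-side terms: these can be precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Four products on transformed data.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


d = [1.0, 2.0, 3.0, 4.0]   # four input samples
g = [0.5, -1.0, 2.0]       # three filter taps
print(direct_f23(d, g))    # [4.5, 6.0]
print(winograd_f23(d, g))  # [4.5, 6.0]
```

The saving comes from trading multiplications for cheaper additions, at the cost of extra transforms on the inputs and outputs.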

At the same time, TPU v4 integrates optical interconnect switches, which is one of its most significant features. By using optical interconnect switches, TPU v4 can achieve higher speed data transmission and lower latency, thereby improving the efficiency of communication between chips.

Specifically, optical interconnect switches connect multiple chips so that they communicate using optical signals. This method can transmit data faster than traditional electrical signals and supports high-speed communication over longer distances. In addition, optical interconnect switches can provide greater stability because they are not affected by electromagnetic interference.

By integrating optical interconnect switches, TPU v4 enables more efficient data transmission and communication, thereby improving the performance and stability of the entire system. This technology is one of the important directions for the development of future chips and can play an important role in various fields.

In addition to the optical interconnect switch, TPU v4 has the following notable features:

  1. High performance: TPU v4 combines a variety of optimization techniques, such as hardware-accelerated matrix multiplication and vector addition units, compression, and optical interconnect switching, which give it very high performance on machine learning tasks.
  2. High energy efficiency: TPU v4 uses a custom optical switch that connects multiple chips together to form a supercomputer. This custom optical switch not only increases computing speed but also reduces energy consumption. As a result, TPU v4 has a very high energy efficiency ratio, which can help users save on energy costs.
  3. Wide application: TPU v4 is widely used for artificial intelligence training, including speech recognition, image processing, natural language processing and other fields. Because of its high performance and energy efficiency, TPU v4 is the preferred choice for many companies and research institutes.
  4. Flexibility: TPU v4 is suitable for different types of machine learning tasks and can be configured and scaled according to different needs. For example, multiple TPU v4 chips can be joined together to form a larger supercomputer for larger machine learning tasks.
  5. Security: TPU v4 offers a high level of security to protect users' data and privacy. For example, it employs hardware-level encryption to prevent data leaks and attacks.

In short, TPU v4 is a high-performance, energy-efficient, flexible and secure domain-specific architecture that is widely used for artificial intelligence training.

The working principle of the TPU chip

The TPU chip is designed to process large amounts of image, sound, language and other types of data. Its working principle can be summarized as follows:

  1. Data input: The TPU chip first receives input data from input devices such as cameras, microphones, etc. Input data can be in the form of images, audio, text, and so on.
  2. Preprocessing: Before the input data is fed to the TPU chip, preprocessing is required, including data format conversion, data normalization, feature extraction, etc. The purpose of preprocessing is to make the data more suitable for being sent to the TPU chip for processing.
  3. Computing: The TPU chip contains a large number of computing units, which can perform efficient parallel computing, including matrix operations, convolution operations, etc. These compute units can be used to execute deep learning algorithms and other machine learning algorithms to process input data.
  4. Memory access: During the calculation, the TPU chip needs to access the internal memory to store intermediate calculation results and weights. TPU chips typically use on-chip memory to reduce data access latency and power consumption.
  5. Output: After computation, the TPU chip returns the result to an output device (such as a display or speaker). The output can be in the form of images, audio, text, and so on.

In general, the TPU chip works by running input data through preprocessing and its compute units, accessing memory to store intermediate results and weights along the way, and finally outputting the results. Efficient computing and low power consumption make TPU chips particularly suitable for processing large amounts of image, sound, language, and other types of data.
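The five steps above can be sketched as a toy software pipeline. The Python example below is only an analogy under stated assumptions: the function names, the tiny weight matrix, and the "dark"/"bright" labels are all hypothetical, the weights stand in for values held in on-chip memory, and the single matrix-vector product stands in for the chip's parallel compute units.

```python
def preprocess(raw):
    """Step 2: normalize raw input values to the range [0, 1]."""
    lo, hi = min(raw), max(raw)
    span = (hi - lo) or 1
    return [(x - lo) / span for x in raw]


def compute(features, weights):
    """Steps 3 and 4: matrix-vector product against stored weights."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def postprocess(scores, labels):
    """Step 5: turn raw scores into an output by picking the top label."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


def run_pipeline(raw, weights, labels):
    """Chain input, preprocessing, computation and output."""
    return postprocess(compute(preprocess(raw), weights), labels)


raw = [0, 128, 255]                              # step 1: e.g. pixel intensities
weights = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]   # stand-in for on-chip weights
labels = ["dark", "bright"]
print(run_pipeline(raw, weights, labels))        # bright
```

A real accelerator runs many such multiply-accumulate operations in parallel, but the flow of data from input to output follows the same order.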

TPU v4 can quickly process large amounts of image, sound, language and other types of data, and some companies in the A-share market also process such data at scale. They include the following:

  1. Digital China Group (000034): Digital China is a digital service provider with businesses covering digital products, digital solutions, digital infrastructure and digital services. Its digital solutions can handle large amounts of data, including images, sounds, languages, and more.
  2. iFLYTEK (002230): iFLYTEK is an enterprise focusing on the R&D and application of intelligent voice technology. Its core technologies such as speech recognition, speech synthesis, and natural language processing can process large amounts of speech data. In addition, the company provides artificial intelligence services such as image processing, video processing, etc.
  3. Talkweb Information (002261): Talkweb Information is an enterprise focusing on digital services and software technology research and development. Its business covers digital cities, digital education, digital agriculture and other fields, and it can process large amounts of image, sound, language and other data.
  4. Beixinyuan (300352): Beixinyuan is an enterprise focusing on the development and application of information security products. Its cybersecurity technology can protect the security of processing large amounts of data, including images, sound, language, and more.
  5. Huali Chuangtong (300045): Huali Chuangtong is an enterprise focusing on simulation application technology and product research and development, and its business covers aviation, aerospace, navigation and other fields. The company can process large amounts of data such as images, sounds, languages, etc., and provides solutions for simulation applications.
  6. SumaVision (300079): SumaVision is a company focusing on digital television and video. Its business covers digital TV software and system integration, video transmission, video security and other fields, and it can process large amounts of image, sound and other data.
  7. Yinjiang Co., Ltd. (300020): Yinjiang Co., Ltd. is an enterprise focusing on the field of urban intelligence and transportation intelligence. Its business covers intelligent transportation, intelligent medical care, intelligent buildings and other fields, and can process a large amount of images, sounds and other data.
  8. Huaping Co., Ltd. (300074): Huaping is an enterprise focusing on smart cities and smart healthcare. Its business covers smart city solutions, Internet medical care, remote video conferencing and other fields, and it can process large amounts of image, sound and other data.
  9. Xuanji Information (300222): Xuanji Information is an enterprise focusing on information technology, and its business covers embedded systems, intelligent networking and industrial applications. The company can process large amounts of data such as images and sounds, and provides related solutions.
  10. Huace Navigation (300627): Huace Navigation is an enterprise focusing on the research, development and application of satellite navigation and positioning technology. Its business covers satellite navigation and positioning equipment, satellite navigation and positioning services and other fields, and it can process large amounts of location data.
  11. ZTE (000063): ZTE is a world-renowned communications equipment manufacturer and communication solution provider, whose business covers communication network equipment, terminal equipment and other fields. The company can process large amounts of voice, image, video, and other data and provide related solutions.
  12. Inspur Information (000977): Inspur Information is an enterprise focusing on the research and development of computer hardware and software, and its business covers servers, storage and other fields. The company can process large amounts of data, including images, sounds, etc., and provide related solutions.
  13. Huayu Software (300271): Huayu Software is an enterprise focusing on digital solutions, and its business covers e-government, justice, enterprise informatization and other fields. The company can process large amounts of data such as images, sounds, etc., and provide related solutions.
  14. Yinzhijie (300085): Yinzhijie is a company focusing on the digital transformation of the financial industry, and its business covers the fields of financial software and solutions. The company can process large amounts of data such as images, sounds, etc., and provide related solutions.
  15. NSFOCUS (300369): NSFOCUS is an enterprise focusing on network security and information security, and its business covers security product development and security solutions. The company can process large amounts of data, including images, sounds, etc., and provide related solutions.
  16. Newland (000997): Newland is an enterprise focusing on information identification and information processing, and its business covers mobile payment, Internet of Things, intelligent identification and other fields. The company can process large amounts of data such as images, sounds, etc., and provide related solutions.