Tsinghua University released an innovative AI optical chip to achieve 160 TOPS/W of general-purpose intelligent computing


This article is synthesized by The Semiconductor Industry (ID: ICVIEWS).

Under the wave of artificial intelligence, the development of optical chips is accelerating.

Alongside data and algorithms, computing power is one of the three pillars of artificial intelligence, and it is the key to both training AI models and running inference tasks.

In the early morning of April 12, the latest issue of Science published new results from a Tsinghua University research team: the first distributed, wide intelligent optical computing architecture, realized in "Taichi", the world's first large-scale interference-diffraction heterogeneous integrated chip, which achieves general-purpose intelligent computing at 160 TOPS/W.

According to reports, in developing the "Taichi" architecture the team drew inspiration from the classic Zhouyi (Book of Changes). Guided by the line "The Yi has Taiji, which gives birth to the two modes," the team members established a new computing model that unleashes the full performance of optical computing.

Optical computing, as the name suggests, replaces electrons with light as the computing carrier, performing calculations as light propagates through the chip. With its ultra-high parallelism and speed, it is considered one of the strongest candidates for a future disruptive computing architecture.

Optical chips offer high-speed, highly parallel computing, and are expected to support advanced AI applications such as large models.

According to Xu Zhiwu, the paper's first author and a Ph.D. student in the Department of Electronics, the "Taichi" architecture uses a top-down encoding-splitting and decoding-reconstruction mechanism that simplifies complex intelligent tasks by splitting them into multiple highly parallel sub-tasks.

The paper reports that the "Taichi" optical chip achieves an area efficiency of 879 T MACS/mm² and an energy efficiency of 160 TOPS/W. For the first time, optical computing is empowered to handle complex AI tasks such as thousand-category object recognition in natural scenes and cross-modal content generation.

The "Taichi" optical chip is expected to provide computing-power support for large-model training and inference, artificial general intelligence, and autonomous intelligent unmanned systems.

Artificial intelligence requires photonic circuits

AI often relies on artificial neural networks for applications such as analyzing medical scans and generating images. In these systems, circuit components called neurons (loosely analogous to neurons in the human brain) are fed data and work together to solve problems, such as recognizing faces. A neural network with many layers of these neurons is said to be deep.

As neural networks grow in size and capability, they become increasingly energy-hungry when run on traditional electronics. For example, according to a 2022 study in the journal Nature, training OpenAI's state-of-the-art neural network GPT-3 cost $4.6 million, running 9,200 GPUs for two weeks.

The shortcomings of electronic computing have led some researchers to explore optical computing as a promising basis for the next generation of artificial intelligence. This photonic approach uses light to perform calculations faster and at lower power than its electronic counterpart.

Tsinghua University has led the development of a photonic microchip, Taichi, which performs advanced AI tasks as well as electronic devices do, while proving to be more energy-efficient.

"Optical neural networks are no longer toy models," said Lu Fang, an associate professor of electronic engineering at Tsinghua University. "They can now be applied to real-world tasks."

How do optical neural networks work?

There are two main strategies for building optical neural networks: 1) scattering light in a specific pattern within the microchip, and 2) allowing light waves to interfere with each other in a precise way inside the device. When these optical neural networks are fed data in the form of light, the output light encodes the results of the complex operations performed inside the device.

Fang explained that the two photonic computing methods have distinct advantages and disadvantages. Optical neural networks that rely on scattering or diffraction can pack many neurons tightly together and consume almost no energy: a diffraction-based network computes as beams of light scatter through optical layers that encode the network's operations. Their disadvantage is that they cannot be reconfigured; each set of operations can essentially serve only one specific task.
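The diffraction-based approach can be illustrated with a minimal numpy sketch (entirely hypothetical, not the actual Taichi implementation): each layer is a fixed phase mask applied to the optical field, and free-space propagation between layers is crudely modeled as a unitary Fourier transform. A detector at the end reads out only intensity. Because the masks are fixed, the "network" cannot be reprogrammed, which mirrors the drawback described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffractive_layer(field, phase_mask):
    """One diffractive layer: the field passes through a fixed phase mask,
    then propagates (modeled here, simplistically, as a unitary FFT)."""
    modulated = field * np.exp(1j * phase_mask)    # phase-only modulation
    return np.fft.fft(modulated, norm="ortho")     # free-space propagation proxy

# A 3-layer diffractive network: each layer is a fixed (non-reconfigurable) mask.
n = 64
masks = [rng.uniform(0, 2 * np.pi, n) for _ in range(3)]

field = np.ones(n, dtype=complex) / np.sqrt(n)     # input light (encoded data)
for mask in masks:
    field = diffractive_layer(field, mask)

# A photodetector reads intensity, not the complex field.
intensity = np.abs(field) ** 2
print(intensity.sum())  # energy is conserved: ~1.0
```

Note that every operation in the sketch is unitary, which is why total optical energy is preserved; this passivity is what makes diffractive layers nearly energy-free.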

In contrast, optical neural networks that rely on interference are easily reconfigured. An interference-based network sends multiple beams through a grid of channels, and the way the beams interfere at the channel intersections performs the device's operations. Their drawback is that interferometers are bulky, which limits how far such networks can scale, and they consume a lot of energy.
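The building block of the interference-based approach is the Mach-Zehnder interferometer (MZI), which the following hedged numpy sketch models as a tunable 2x2 unitary: two beam splitters with a phase shifter between them. Adjusting the phases (theta, phi) reconfigures the operation, which is exactly the flexibility the text describes.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary of a Mach-Zehnder interferometer: two beams interfere
    at 50:50 beam splitters; tuning (theta, phi) reprograms the transform."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # 50:50 beam splitter
    shift = np.diag([np.exp(1j * theta), 1.0])       # internal phase shifter
    ext = np.diag([np.exp(1j * phi), 1.0])           # external phase shifter
    return ext @ bs @ shift @ bs

U = mzi(theta=0.7, phi=1.3)
# An ideal MZI is lossless, so its transfer matrix is unitary: U†U = I.
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True
```

Meshes of such 2x2 blocks can realize arbitrary unitary matrices (the Reck and Clements decompositions), but each MZI occupies significant chip area and needs active phase control, which is the bulk and power cost noted above.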

In addition, today's photonic chips suffer unavoidable errors. Trying to scale an optical neural network by stacking more layers of neurons typically only compounds the noise exponentially. Until now, this has restricted optical neural networks to basic AI tasks such as simple pattern recognition; in other words, they have generally been unsuitable for advanced applications.

In contrast, the researchers say, Taichi is a hybrid design that combines the diffraction and interference approaches. It contains clusters of diffractive elements that compress data into a compact space for large-scale input and output, alongside arrays of interferometers for reconfigurable computation. According to Fang, the encoding protocol developed for Taichi divides challenging tasks and large network models into sub-models that can be distributed across different modules.

How does Taichi fuse the two approaches?

Previous studies have often tried to expand the capacity of optical neural networks the way their electronic counterparts do: by adding more layers of neurons. Taichi's architecture instead scales by distributing computation across multiple chiplets running in parallel, so it avoids the exponential accumulation of errors that occurs when an optical neural network stacks many layers of neurons on top of each other.

"This 'shallow depth, wide width' architecture guarantees network scale," Fang said.
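The "shallow depth, wide width" idea can be illustrated with a toy numpy sketch (a hypothetical stand-in, not the paper's actual encoding protocol): the same input is broadcast to several independent, shallow sub-models, each responsible only for its own slice of the output classes, and a decoder concatenates the partial results. No sub-model is deep, so no error has many layers to compound through.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes: 4 parallel "chiplets" share a 12-class task.
n_features, n_classes, n_chiplets = 16, 12, 4
classes_per_chiplet = n_classes // n_chiplets

# Each sub-model here is a single linear layer standing in for one
# shallow photonic chiplet (weights are random placeholders).
sub_models = [rng.standard_normal((classes_per_chiplet, n_features))
              for _ in range(n_chiplets)]

def predict(x):
    # Encode/split: the same input is broadcast to every chiplet,
    # which all run in parallel rather than in a deep stack.
    partial_scores = [W @ x for W in sub_models]
    # Decode/reconstruct: concatenate per-chiplet scores into one output.
    return np.concatenate(partial_scores)

x = rng.standard_normal(n_features)
scores = predict(x)
print(scores.shape)  # (12,)
```

Growing the task here means adding more chiplets in parallel (more width), not more layers in series (more depth), which is the scaling property the quote refers to.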
