
The complexity of heterogeneous chips has increased more than 100-fold, and the unknowns number in the hundreds of millions

According to Leifeng.com, heterogeneous integration is the direction the chip industry is heading, but heterogeneity has increased design complexity roughly a hundredfold. At the same time, advanced packaging of chiplets faces unknown challenges, and software complexity at the upper layers must be solved as well. Coping with this exponential rise in complexity will require closer collaboration across the whole industry chain, including EDA vendors, chip design companies, IP suppliers, and foundries. It is a challenge the chip industry will have to deal with throughout the current decade.

Integrating more kinds of processors and memories on a single chip, or together in one package, sharply increases chip design complexity.

There are good reasons to integrate more functions into SoCs or advanced packages. Doing so increases the chip's functionality and can greatly improve performance and reduce power consumption, gains that are difficult to achieve through transistor scaling alone. However, no matter how small the individual components are, they all take up space. In fact, it is not uncommon for state-of-the-art planar chips to exceed the reticle size limit, with different dies "stitched" together to provide more area.


Image courtesy of HPCwire

Heterogeneous chip complexity increases exponentially

But packaging components with various functions together also greatly increases device complexity. Working through the added inter-chip complexity and the problems that come with larger die areas or packages is becoming a huge challenge.

In the past, chips included a processor, on-chip and off-chip memory, and I/O. Now an SoC may include multi-core CPUs, GPUs, FPGAs or eFPGAs, and other dedicated accelerators, along with MCUs, DSPs, and NPUs. There may also be various types of memory and storage, such as DRAM, MRAM, SRAM, and flash, as well as a variety of I/Os, some for short-range communications and some for medium- and long-range communications, each with different frequency and signal-isolation requirements. Every pair of integrated block types is a potential interaction to verify, as the sketch below illustrates.
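To get a feel for why this mix drives complexity up so sharply: a minimal sketch counting pairwise interactions, with a purely hypothetical block list (illustrative, not from the article's data):

```python
from itertools import combinations

# Hypothetical block list for a modern heterogeneous SoC (illustrative only).
blocks = ["CPU", "GPU", "FPGA", "NPU", "DSP", "MCU",
          "DRAM", "MRAM", "SRAM", "Flash", "SerDes-IO", "LP-IO"]

# Each pair of blocks is a potential interaction to verify (interfaces,
# clock domains, power domains), so the count grows quadratically with
# the number of integrated block types.
pairs = list(combinations(blocks, 2))
print(f"{len(blocks)} block types -> {len(pairs)} pairwise interactions")
# 12 block types -> 66 pairwise interactions
```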

To make matters worse, these designs are tailored to specific markets and applications. A few years ago, the vast majority of chips were designed for computers or smartphones, and engineering teams could work out the bugs in each device and resolve most of the unknowns in those designs. Today the situation is different. Advanced chips are designed for larger systems, such as cars or specific cloud computing operations, where the new ways components interact are not yet fully understood.

All the big EDA vendors quantify these issues differently, but the trends are similar. However they slice the data, each approach shows a sharp rise in complexity, and with it more potential problems.

For example, Ansys, a provider of engineering simulation software and services, focuses on unknowns: from about 700,000 unknowns on a 0.1 mm² die in 2000, to 9.5 million in 2020, to 102 million unknowns on a 30 mm (1.18 inch) die this year (see Figure 1).
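As a quick back-of-the-envelope check on those figures (note the die sizes differ, so this tracks the trend rather than a like-for-like comparison):

```python
# Back-of-the-envelope growth check using the Ansys figures quoted above.
unknowns_2000 = 0.7e6    # ~700,000 unknowns (0.1 mm^2 die, 2000)
unknowns_2020 = 9.5e6    # 9.5 million unknowns (2020)
unknowns_now  = 102e6    # 102 million unknowns (30 mm die, "this year")

growth_20yr = unknowns_2020 / unknowns_2000          # ~13.6x over two decades
cagr = growth_20yr ** (1 / 20) - 1                   # ~14% compound annual growth
jump_to_large_die = unknowns_now / unknowns_2020     # ~10.7x more on the larger die

print(f"2000->2020: {growth_20yr:.1f}x total, ~{cagr:.0%}/year")
print(f"Large-die jump: {jump_to_large_die:.1f}x on top of that")
```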

"IC designs are best designed according to the system environment in which they work," said Rich Goldman, director of Ansys. "What we've been doing is designing chips and then building systems around it. But systems companies design the system first, and then design the chip. Therefore, it is now more necessary to simulate chips in the entire system environment. ”


Figure 1: Unknowns increase over time, and with size and complexity. Source: Ansys

Synopsys uses different data to point to similar problems. It highlights the complexity of heterogeneous computing designs, which has grown more than 100-fold over the past few years (see Figure 2).

"When you think about the source of the device, you get a CV (capacitance-voltage) curve, an IV (current-voltage) curve, and a model that can make predictions about the device." Aveek Sarkar, vice president of engineering at Synopsys, said, "Modeling all of these parameters is becoming increasingly complex. A customer asks us, 'Do you really need to use this model?' Or can it be adjusted because each model has too much protection built in? In the past, we could have left room, but now we can't. So, can you use some of the data used to create the model upstream and start there? '”


Figure 2: Increased complexity due to heterogeneous computing. Source: Synopsys

From a variability perspective, Siemens EDA points to a similar trend, driven especially by analog circuits (see Figure 3). This is notable because the amount of analog/mixed-signal content in chips keeps increasing, particularly for power electronics and sensors.

Harry Foster, chief scientist for verification at Siemens EDA, said: "What is happening is that the industry keeps moving to advanced semiconductor nodes, where variability is difficult to model. On top of that, these models evolve as the processes evolve, and there are a lot of process corners to verify. The more interesting trend, though, is that with the growth of complex mixed-signal designs, chip companies are trying to optimize chip area, including the analog devices, regardless of the node used."
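Those process corners multiply quickly, as a small enumeration shows. The corner names and values below are typical textbook examples, not any specific foundry's corner set:

```python
from itertools import product

# Illustrative corner sweep: the verification burden grows multiplicatively
# with each independent axis. Values are generic examples.
process = ["SS", "TT", "FF", "SF", "FS"]   # device corners
voltage = [0.72, 0.80, 0.88]               # supply rails (V)
temp    = [-40, 25, 125]                   # junction temperatures (C)

corners = list(product(process, voltage, temp))
print(f"{len(process)} x {len(voltage)} x {len(temp)} = {len(corners)} corners")
# 5 x 3 x 3 = 45 PVT corners to verify, before mixed-signal modes are added
```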


Figure 3: A large spike over time due to analog. Source: Wilson Research Group/Siemens EDA

Expanding into three dimensions adds another level of complexity. Architectures have changed to pack more computing functionality into a package rather than onto a single die, but this adds complexity of its own (see Figure 4).

Rather than integrating all functions into one die, multiple dies can be packaged together and connected using an interposer or some type of bridge, which can be faster. In the past, this approach came with performance and power penalties, but fatter pipes in a 3D floor plan can shorten the distance signals must travel, reducing the drive current required.
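A first-order sketch shows why shortening interconnect helps so much: distributed-RC wire delay grows with the square of length, while the charge a driver must supply grows only linearly. The constants below are illustrative assumptions, not real process data:

```python
# First-order model of a resistive on-chip/package wire.
# Constants below are illustrative assumptions, not a real process.
r_per_mm = 1_000.0     # ohms per mm
c_per_mm = 0.2e-12     # farads per mm

def wire_delay(length_mm):
    # Distributed-RC (Elmore) delay: 0.5 * R_total * C_total
    return 0.5 * (r_per_mm * length_mm) * (c_per_mm * length_mm)

long_2d, short_3d = 10.0, 1.0   # mm: planar route vs. through an interposer
print(f"Delay ratio: {wire_delay(long_2d) / wire_delay(short_3d):.0f}x")   # 100x
print(f"Charge (drive) ratio: {long_2d / short_3d:.0f}x")                  # 10x
```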

"The time beyond Moore's Law means more tools are needed in the chip process." John Park, head of product management at Cadence Custom IC & PCB Group, said, "In particular, top-level planning requires multiple system-level (multi-chip) analysis tools. These tools are new to SoC designers, and the process is more complex than ever. ”


Figure 4: Verification challenges in advanced packaging. Source: Cadence

How do you solve complexity problems?

For advanced chips or packages tailored to a specific application or market, dealing with this complexity is largely a one-off effort. What has changed is that many of these designs no longer ship a billion units, and even a derivative chip can look very different from the original architecture.

For the system vendors developing these chips, costs are spread across development of the whole system and, in some cases, can be amortized into operating costs. For a large cloud operator, better performance and lower power consumption can reduce the number of server racks required, which in turn affects data center real estate and the cost of powering and cooling those machines.

For automotive designs, an advanced AI chip can be used, at least in theory, across multiple product lines and in multiple versions.

However, the pressure to streamline development and reduce overall chip cost persists, with a single advanced chip costing up to hundreds of millions of dollars to develop. To that end, EDA tool providers have been working to identify problems common across different vertical markets and real-world uses. Much of this work revolves around existing standards and new standards under development.

"There are several aspects to consider, such as making sure that customers are using the correct version of ip." Arteris IP Chairman and CEO K. Charles Janac said, "The parameters are forced to be set by IP-XACT so that the IP module can enter the SoC, as well as the supply management aspect. Many companies have different suppliers, including layout companies, design companies, and foundries. If the entire supply chain is IP-XACT, then it will go very smoothly. At the same time, the chip contains chips with leading processes and mature processes. Therefore, with NoC-compatible chip-to-chip connectivity, and IP-XACT configuring egress ports, it is possible to simplify the use of chiplets in a system-in-package. ”

The challenge is how to fuse all of these pieces together into a high-level abstraction, analyze them at that level, and still be able to drill down. This is a problem the large EDA companies have focused on over the past few years. EDA vendors have been increasing the speed and capacity of their tools and hardware, including leveraging heterogeneous platforms to accelerate processing, sometimes combined with machine learning.

In addition, all the major EDA tool vendors are leveraging the cloud where extreme compute power is required, such as during verification or debugging. The result is more capacity for simulation, emulation, and prototyping than in the past, and tighter integration between point tools and higher-level platforms.

How can data formats be standardized to promote collaboration across the industry chain?

A new challenge in this increasingly complex design process is differing data formats. Multi-chip and system integration generates more data throughout design and manufacturing, but not all of that data can be understood by every tool. Being able to unify this data would make the process far simpler.

"The data format needs to be standardized so that information can be exchanged between simulators, allowing the use of a common interface to analyze the data format." Roland Jancke, head of design methodology at Fraunhofer IIS Adaptive Systems Engineering, said. "If all parts use standardized interfaces, then they have a better chance of collaborating, which is beneficial to both the development itself and the development process." Before designing a product, we have to build models from the parts, and if these models can be combined together and there is an opportunity for the models of those parts to be used together, then we can be sure that the system can also be used. ”

However, using a consistent data format to raise the level of abstraction is a challenge that requires collaboration across the supply chain. Previously, the expertise needed was concentrated on inspecting, testing, and ensuring adequate throughput. Designing a complex chip now requires experts in electrical engineering, verification, test, power, mechanical engineering, and software, and in some cases machine learning, deep learning, and AI.

Hany Elhak, director of product management and marketing at Synopsys, said: "In the past, these teams didn't talk to each other. They used different tools and different processes; now they have to talk. For EDA, we need to recognize this and provide converged workflows that let these teams work together. We are trying to solve two problems. Circuits are now larger, more complex, and operate at higher frequencies than traditional circuits, and they have more parasitics. That is a problem of scale, which we are trying to address with faster, higher-capacity simulation. At the same time, many different types of circuits are being integrated into larger systems, so they need to be designed together."

The second challenge involves incorporating AI/machine learning into more and more devices. AI relies on good data in a consistent format to achieve accuracy sufficient for its task.

Rob Aitken, an Arm fellow and director of technology, said: "Accuracy is challenging in itself. The accuracy obtained on some standard benchmark or dataset does not necessarily indicate how a system will behave in practice. It may correctly recognize 95% of images, but if your application depends on the 5% it gets wrong, that is the problem that has to be solved."

In a multifunctional system, predicting accuracy is even more complex.

"If a system has a given precision and another system has another precision, then their overall accuracy depends on how independent the two methods are from each other. It also depends on the mechanism by which the two are used in combination. Aitken said. In applications such as image recognition, it is easier to understand. However, in automotive applications where radar data and camera data are fused, it is difficult. They are actually independent of each other, but their accuracy also depends on external factors that must be known. It's possible that radar thinks it's a cat and the camera says there's nothing there. The reality is that the radar may be correct due to dark. But if it's raining, maybe the radar is also wrong. ”

Unknown challenges posed by heterogeneous systems

Chips or advanced packages now need to work within a larger system environment, even though the chipmaker may know little about that larger system. A unique chip or chiplet design has to fit the environment of one or more unique systems, which forces EDA tool and IP vendors to look at things differently.

Essentially, they either need to take a top-down approach to all potential problems, or they need to find solutions that work across multiple vertical markets.

Consider, for example, AI chips and systems, where designs are almost always unique.

"For example, when we build a PHY, we want to sell as much as we can." Steven Woo, inventor of Rambus, said, "We built it in numerous use cases. Part of the reason is that building, designing, and developing PHYs is really expensive, so it has to be sold in large quantities. As far as AI is concerned, what we're facing now is actually a very specific use case. That doesn't mean they can't be used in a range of applications, but some of its software properties allow you to fine-tune specific types of applications more than in the semiconductor industry. We're trying to make it very generic, and that's another way. ”

However, focusing on system design brings a whole new set of challenges. For example, instead of just variation within a chip, there is now the possibility of additive variation across the system. In short, variation in a multi-chip package may be the sum of the variation of different chips, some of which may be built on completely different processes at different geometries, or even at different foundries.
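A worked example of this additive variation: if the per-die variation sources are independent, variances add, so sigmas combine root-sum-square rather than linearly. The sigma values below are illustrative assumptions, not foundry data:

```python
import math

# Independent per-die variation sources combine root-sum-square (RSS),
# because variances add. Values are illustrative assumptions.
sigma = {
    "logic_die_5nm":   0.030,   # 3.0% parameter spread
    "sram_die_7nm":    0.020,
    "analog_die_28nm": 0.045,
    "interposer":      0.010,
}

rss_total = math.sqrt(sum(s**2 for s in sigma.values()))
linear_stack = sum(sigma.values())   # naive worst-case sum, overly pessimistic
print(f"RSS total sigma: {rss_total:.3f}, linear stack: {linear_stack:.3f}")
# ~0.059 vs. 0.105: treating correlated/uncorrelated terms wrongly
# either over- or under-margins the package-level design.
```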

Andy Heinig, head of the advanced system integration group and the efficient electronics department at Fraunhofer IIS/EAS, said: "The variation we have seen in standard chips is well understood, and there are ways to deal with it. But on the packaging side we expect new problems to arise. Until now they are unknown, and only testing will reveal that a system is going to fail and expose the new problems. At that point there are steps that can be taken to resolve them. It may be a combination of effects we have not encountered together before, even though some of them are individually known and understood."

More options for heterogeneous chips

All of this is well beyond the capabilities of any single vendor. Supply chains are complex and global, and not all technologies mature at the same rate. In heterogeneous designs involving multiple vendors, the choices can vary greatly from one design to the next.

Douglas Mitchell, vice president of the RAM business unit at Infineon, said: "You find that logic processes keep evolving toward very advanced nodes, using 5nm or 7nm technology. But memory technology may not evolve as fast as logic. So a memory technology with decades of production history may be a very good fit, even though it will not scale to 7nm or below. Using separate chips makes it possible to optimize the tradeoffs between reliability, performance, and cost."

"Especially in the edge computing environment, we're going to see different combinations." Mitchell said.

"If you have a processor, data logging memory, code storage and real-time processing expansion memory, these different characteristics of the chip need to optimize different indicators. You may want to have some kind of data record memory with a very high service life, such as real-time data in 20 years, which requires that it must have certain characteristics. Flash memory may have to store code and implement security features in harsh environments. Therefore, there will be different combinations in these edge network devices. And, if some machine learning capabilities can be embedded on edge nodes, a lot of real-time processing and decision-making can be made at the edge, and deciding what data needs to be sent to the cloud as needed, it's a complex problem that needs to consider multiple factors. ”

Complexity also adds to the problem of tracking all the IP used in these designs. "We are definitely seeing more traction around this among semiconductor IP vendors," said Simon Rance, head of marketing at ClioSoft. "They have been worrying about it for 10 years, and the concern keeps growing and escalating. It starts with how the IP is used, and especially with the legal agreements. For the larger IP companies, high-end IP is expensive, and many companies buy licenses. The problem is that the IP provider cannot police it. The agreement is legally binding, but they don't know whether their IP has been used in multiple designs. Larger companies don't want to buy IP and then violate those legal agreements, but chip designers often don't know whether the company bought a single-design license. We see a lot of IP sitting on file servers, and we have been addressing that lack of management."
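The bookkeeping gap Rance describes, in the simplest possible hypothetical form (all names and license terms are invented for illustration):

```python
# Compare where licensed IP blocks actually appear against what the
# license allows. All names and terms are made up for illustration.
licenses = {"fast_serdes_phy": {"max_designs": 1}}     # single-design license
usage = {
    "fast_serdes_phy": ["soc_alpha", "soc_beta"],      # found in two tapeouts
}

for ip, designs in usage.items():
    allowed = licenses.get(ip, {}).get("max_designs", 0)
    if len(designs) > allowed:
        print(f"{ip}: used in {len(designs)} designs, licensed for {allowed}")
```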

Conclusion

Chip complexity has been increasing for some time, but for the most part it was kept in check by the economics of Moore's Law. As the cost of the most advanced nodes changes, chip architects are creating more options to dramatically improve performance and optimize performance per watt. While this is creative and has spawned many new options, the amount of customization and the growing size and complexity of chips also make today's EDA tools more challenging to use.

Goldman of Ansys said: "We have been following Moore's Law for more than 50 years, and it has all been about the semiconductors. But to design a chip, you need EDA tools that support it. Today there is a great deal of innovation, and as we see exponential innovation, the number of unknowns increases."

Addressing these exponential changes will be a defining challenge of the current decade, shaping how advanced chips are designed, manufactured, and tested, and how they perform over their expected lifetimes.

Compiled by Leifeng.com. Original link: https://semiengineering.com/steep-spike-for-chip-complexity-and-unknowns/