
Multi-chip designs push complexity to the limit

Author: 半导体产业纵横 (ICVIEWS)

This article was compiled by 半导体产业纵横 (ID: ICVIEWS) from semiengineering.

Continued scaling using advanced packaging technologies will require changes across the semiconductor ecosystem.


Multi-chip designs challenge the ability to manage design complexity, drive up the cost per transistor, compress the market window, and are sending the entire chip industry scrambling for new tools and methods.

The entire semiconductor design ecosystem—from EDA and IP providers to fabs and equipment manufacturers—has evolved over the decades, based on the assumption that more functionality can be integrated into chips and packages while improving power, performance, and area/cost equations. But as it has become more difficult to integrate all of these functions into a single chip or package, the complexity of developing these devices has increased dramatically.

With advanced packages expected to contain an estimated 1 trillion transistors in the near future, keeping tight control of power, performance, and area/cost (PPA/C) will require major shifts at every step of the design-through-manufacturing flow.

"The industry isn't ready yet, but we're moving in that direction," said Sutirtha Kabir, R&D engineering senior architect at Synopsys. "What steps do we think there are between today and that point, whether it's 2030 or earlier? Say you take an SoC and fold it [a simple 3D-IC analogy]. Everything you do goes into two chips that have the same functionality, but nothing else changes. Your transistor count hasn't changed, but in the process you've added an interface between those two chips, whether it's a bump or a hybrid bonded interconnect (HBI)."
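To make Kabir's folding thought experiment concrete, the sketch below (in Python, with entirely hypothetical signal counts, pitches, and energy-per-bit figures) tallies the overhead that appears purely because an interface now sits between the two halves of the design. It is an illustration of the reasoning, not data from the article.

```python
# Illustrative sketch of the "fold the SoC into two dies" thought experiment.
# All numbers are hypothetical placeholders, not figures from the article.

def interface_area_mm2(num_signals: int, pitch_um: float) -> float:
    """Area needed for a square grid of die-to-die connections at a given pitch."""
    pitch_mm = pitch_um / 1000.0
    return num_signals * (pitch_mm ** 2)

def added_interface_cost(num_signals: int, pitch_um: float, pj_per_bit: float,
                         bits_per_signal_per_s: float) -> dict:
    """Summarize the overhead created purely by splitting one die into two."""
    area = interface_area_mm2(num_signals, pitch_um)
    # Power added by driving every cross-die signal at its average toggle rate.
    watts = num_signals * bits_per_signal_per_s * pj_per_bit * 1e-12
    return {"area_mm2": round(area, 2), "interface_power_w": round(watts, 2)}

# Same partition, two packaging choices: micro-bumps vs. hybrid bonding (HBI).
cross_die_signals = 200_000                      # hypothetical cut size
print("micro-bump: ", added_interface_cost(cross_die_signals, pitch_um=40.0,
                                           pj_per_bit=0.5, bits_per_signal_per_s=1e8))
print("hybrid bond:", added_interface_cost(cross_die_signals, pitch_um=9.0,
                                           pj_per_bit=0.05, bits_per_signal_per_s=1e8))
```

Even in this toy model, the choice between micro-bumps and hybrid bonding changes the interface area and power by roughly an order of magnitude, which is why the interconnect style suddenly matters to the chip designer rather than only to the packaging team.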

Designs that previously fit on a single chip become more complex because their functions are now distributed across multiple dies or chiplets. "Basically, the tasks that were done before have become harder," said Ron Press, senior director of technology enablement for Tessent silicon lifecycle solutions at Siemens EDA. "Remember Bill Gates' famous 1981 quote, '640K of RAM ought to be enough'? It applied at the time. Complexity is the driving force behind EDA. Once it becomes too difficult to perform a task using traditional methods, some kind of abstraction and automation is necessary. From the early days of electronics, this drove everything from programming language compilation to silicon design, as well as many EDA tools. So the definition of complexity is always relative to the current state of the art."

This, in turn, adds the complexity that comes with higher data rates. "If you look at data rates versus time for 2G, 2.5G, 3G, 4G, and 5G, they track roughly the same growth as Moore's Law, which also reflects the growing complexity," noted Chris Mueth, director of new markets management at Keysight. "A long time ago, 2G phones were made up of a bunch of components: transistors, small modules, and discrete parts. Back then, phones were crammed with electronic components, and there was little room for extra functionality. Now everything is integrated. A module is almost as big as an IC chip was back then, and it contains everything. And 3D-IC will take that to the next level."

This also significantly increases the verification challenge. "In the 2.5G era, there may have been 130 specifications for a phone. A 5G phone has 1,500 specifications to verify," Mueth said. "There are so many different frequency bands, different modes of operation, different voltages, digital controls, and so on, and you have to verify everything before you ship, because the last thing you want is to find a problem when the phone is already on the market."

All of this has led to a huge increase in complexity and is wreaking havoc on long-standing approaches to chip design.

"Previously, a single-chip designer might have been concerned about these issues, but that's more of a packaging issue. Kabir of Synopsys said. "Let the encapsulators worry. The chip design team only needs to work to the pins. Something always happens with RDL bump connections. But now, because the connection between the signal is done through the bumps between these chips, chip designers have to worry about that. What we've seen this year is that we started with millions of bumps, and now the number of bumps has quickly increased to about 10 million, and it's expected that in two or three years, a multi-chip design will have 50 million HBIs connections. ”

Others share that view. "In my years in the industry, I always felt like we were solving the most complex problems of the time," said Arif Khan, senior product marketing group director for design IP at Cadence. "Moore's Law applied to monolithic systems until reticle limits and process limitations were encountered. Transistor density no longer increases linearly with advances in process technology, yet our appetite for increasingly complex designs continues unabated, pushing us to the physical limits of the lithographic reticle. NVIDIA's GH100 design is estimated at roughly 80 billion transistors on an 814 mm² die in a 4nm process."


Figure 1: A generic complex design flow. Source: Cadence

Scaling in multiple dimensions

As advanced process technologies become more complex, wafer costs are outpacing historical norms. Combined with the diminishing transistor scaling delivered by each new process generation, the cost per transistor at each successive leading-edge node is now higher than in the previous generation.

"This poses a dilemma for the design, as it is much more expensive to design and manufacture at newer process nodes," says Khan. "Larger designs naturally produce fewer wafers. When random defects are taken into account, when the wafer size is larger, the loss of yield is greater and a portion of the smaller denominator becomes unusable unless these wafers can be repaired. As process technology moves beyond 5 nanometers, extreme ultraviolet technology has reached the limits of single-layer lithography. High numerical aperture EUV technology now comes into play, doubling the magnification and allowing for smaller spacing, but reducing the mask size by half. As a result, today's increasingly complex and larger designs have no choice but to disassemble, and chipset technology is the holy grail. ”

At the same time, there is a greater focus on adding new features to designs whose main physical limitation is the reticle size. This adds a whole new layer of complexity.

"In the Belle Époque of IBM mainframes and Intel/AMD x86 servers, it was all about clock speed and performance," observed Ashish Darbari, CEO of Axiomise. "Since the late 90s, power consumption has been the dominant driver in the industry since the late 90s, with performance versus power and area (PPA) determining the quotient of design complexity as chips are compressed into smaller form factors such as mobile phones, watches, and tiny sensors. According to a 2022 report by Wilson Research, 72% of ASIC power management is reported to be proactive, and power management validation is a growing challenge. However, with the rapid adoption of silicon in automotive and the Internet of Things, functional safety and design complexity dominate. You can't design a chip without considering power, performance, and area (PPA) – as well as security and/or confidentiality.

According to Harry Foster's Wilson Research report, 71% of FPGA projects and 75% of ASIC projects consider both safety and security. With the disclosure of Meltdown and Spectre in 2018, followed by a continuing series of chip security vulnerabilities, including GoFetch in 2024, security issues are proving to be a direct consequence of design complexity. To make matters worse, security vulnerabilities often stem from performance-enhancing optimizations such as speculative prefetching and branch prediction.

"To achieve low-power optimization, designers have used selective state holding, clock gating, clock dividers, hot and cold resets, and power islands, which present verification challenges in terms of clock and reset verification," Darbari said. "Multi-speed clocks introduce challenges regarding glitches, clock domain crossovers, and reset domain crossovers. ”

While computing performance has always dominated the design landscape, it is now just one of many factors, alongside moving and accessing the ever-increasing volumes of data generated by sensors and AI/ML. "HBM is one of the cornerstones of AI/ML chips, and that's where our industry is headed," Darbari said. "If you look at the broader scope of design complexity, beyond PPA, safety, and security, we should note that in the era of hundreds of cores and AI/ML on a single chip, we are revisiting the challenges of high-performance computing with a minimal power footprint, as well as optimizing arithmetic (fixed-point/floating-point) data formats and correctness. Moving data faster at low power using high-performance NoCs introduces deadlock and livelock challenges for designers. The RISC-V architecture opens the door for anyone to design a processor, which has led to clever designs that can serve as both CPUs and GPUs, but the baseline of design complexity around PPA, safety, security, deadlocks, livelocks, and compute- and memory-intensive optimizations will be as relevant for RISC-V as it was before the RISC-V era. Over the past six years, a significant amount of work has gone into checking RISC-V microarchitecture implementations for compliance with the RISC-V instruction set architecture (ISA), using simulation for bring-up testing and formal methods to mathematically prove compliance. RISC-V verification, especially for low-power, multi-core processors, will open a Pandora's box of verification challenges, because not many design houses have the verification capabilities of the more established companies. The Wilson Research report indicates that for ASICs, 74% of surveyed designs have one or more processor cores, 52% have two or more, and 15% have eight or more, and we've seen more of this in our own experience deploying formal verification."
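The NoC deadlock risk mentioned here is commonly reasoned about with a channel dependency graph: if the graph of "this channel waits on that channel" contains a cycle, routing-level deadlock is possible. A minimal cycle check, with an invented four-channel ring topology, looks like this:

```python
# Minimal deadlock check: a cycle in the channel dependency graph means
# routing can deadlock. The four-channel ring below is invented for illustration.
from collections import defaultdict

def has_cycle(deps: dict) -> bool:
    """DFS-based cycle detection over {channel: [channels it waits on]}."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)

    def visit(node) -> bool:
        color[node] = GRAY
        for nxt in deps.get(node, []):
            if color[nxt] == GRAY:          # back edge -> cycle
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in deps)

ring = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": ["c0"]}      # unrestricted ring
dateline = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": []}      # cycle broken

print("ring can deadlock:    ", has_cycle(ring))       # True
print("dateline can deadlock:", has_cycle(dateline))   # False
```

Breaking the dependency cycle (the "dateline" variant above) is the classic fix in ring and torus networks; formal tools perform an analogous, but far more thorough, analysis on real NoC protocols.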

Ways to solve complexity challenges

Complexity challenges are being addressed with automation and abstraction that continually build on previous-generation capabilities.

"Over time, more and more trade-offs and optimizations are embedded into EDA tools, so users can provide fewer complex 'intent' commands and let the tools do the hard and tedious work," says Siemens' Press. "Innovation is necessary to deal with some of the complexities, such as how to communicate between devices and sort data. In the testing community, scanning is a way to translate a design into shift registers and combinatorial logic. Scans make automated test pattern generation possible, so EDA tools can generate high-quality test patterns without the need for someone to understand the functional design. As data and test times become too large, embedded compression is used to improve efficiency. ”

Darbari agreed. "Test and verification has evolved from the architectural verification suites of the '70s and '80s to constrained-random, formal verification, and simulation. Each verification technique deals with designs at a different level of abstraction, and when used correctly they are complementary. While simulation can exercise functionality and performance at the full-chip level, constrained-random and formal are good techniques at the RTL level, and formal verification is the only technique that can establish proofs of bug absence. We've seen increasing use of formal verification for architectural verification, as well as for finding deadlocks, livelocks, and logic-related errors."

There are other kinds of complexity as well. "You can define complexity based on the application domain and where it occurs in the flow," said Frank Schirrmeister, vice president of solutions and business development at Arteris. "You can define complexity in terms of the system you're going to build. Obviously, when you think about the system, you can go back to the good old chevron diagram, which gives you a sense of the complexity. You can then define complexity based on technology nodes and process data. And there are the very traditional definitions of complexity, which are addressed by raising the level of abstraction. But what happens next?"


Figure 2: Complexity growth in SoCs (left) and NoCs (right). Source: Arteris

Chiplets

The answer is chiplets, but as chiplets and other advanced packaging methods become more widespread, designers have to deal with many issues.

"Chiplets provide a modular solution to this increasing complexity," said Cadence's Khan. "For example, a complex SoC designed at the 'N' process node has many subsystems – compute, memory, I/O, etc. Moving to the next node (N+1) to add additional performance/features does not necessarily provide significant benefits, given limited scaling improvements with other factors (development time, cost, yield, etc.). If the original design was modular, then only those subsystems that benefited from process scaling would need to be migrated to the advanced node, while the other chiplets would remain at the older process node. Breaking down the design to match each subsystem to its ideal process node addresses a critical aspect of development complexity. In the first round, there was overhead in designing for the disaggregated architecture, but subsequent generations reaped significant benefits in terms of reduced development costs and increased SKU generation options. Leading processor companies such as Intel (Ponte Vecchio) and AMD (MI300) have taken this approach. ”

Customizing chiplets to achieve the ideal power, performance, and area/cost is especially important for managing cost and time to market. "New features can be added without redesigning the entire chip, allowing the design to hit its market window while maintaining a product refresh cadence that would otherwise be slowed by the development and productization time required at advanced nodes," Khan said. "The nirvana envisioned by companies such as Arm is a chiplet marketplace, with a proposed chiplet system architecture to standardize chiplet types and partitioning choices (within its ecosystem). SoC designers still need to custom design their secret sauce, which provides the differentiation in their implementations. Automation will be a key lever for reducing complexity here. Over the past few years, the complexity of die-to-die communication has largely been mitigated through standards such as UCIe. However, there are additional implementation complexities that designers must overcome when moving from 2.5D to 3D-IC flows. How do you logically partition across individual chiplets to provide the optimal split for direct die-to-die connectivity with stacked dies? The next frontier is shifting this problem from user-driven partitioning to automated, AI-driven design partitioning. One can imagine the AI processors of one generation becoming the workhorses that design the next generation of chiplet-based processors."
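The automated partitioning step Khan anticipates is, at its core, a graph-partitioning problem: group blocks into chiplets so that heavily communicating pairs stay together and cross-chiplet (die-to-die) traffic is minimized. The brute-force sketch below works for a handful of blocks with invented bandwidth figures; production tools would use heuristic or ML-guided partitioners rather than enumeration.

```python
# Partition a small block-level connectivity graph into two chiplets so that
# cross-chiplet bandwidth is minimized. Traffic weights (GB/s) are invented.
from itertools import combinations

blocks = ["cpu", "gpu", "npu", "mem_ctrl", "io", "modem"]
traffic = {("cpu", "mem_ctrl"): 200, ("gpu", "mem_ctrl"): 400, ("npu", "mem_ctrl"): 300,
           ("cpu", "gpu"): 50, ("cpu", "io"): 20, ("io", "modem"): 30, ("cpu", "npu"): 40}

def cut_bandwidth(group_a):
    """Total traffic on edges that cross between the two chiplets."""
    group_a = set(group_a)
    return sum(bw for (u, v), bw in traffic.items() if (u in group_a) != (v in group_a))

best = None
for size in range(1, len(blocks) // 2 + 1):
    for group_a in combinations(blocks, size):
        cut = cut_bandwidth(group_a)
        if best is None or cut < best[0]:
            best = (cut, set(group_a), set(blocks) - set(group_a))

print(f"best split: {best[1]} | {best[2]}  (cross-chiplet traffic: {best[0]} GB/s)")
```

Minimizing the cut directly minimizes the bandwidth that has to traverse UCIe or HBI links, which is why partitioning quality feeds straight into the interface overhead discussed earlier.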

At the same time, chiplets introduce a new dimension of verification: verifying die-to-die communication based on protocols such as UCIe, while also understanding the complexity of latency and thermal issues.

In other words, chiplets are another step in the evolution of design growth and scaling, said Siemens' Press. "As with many previous technologies, it's important to establish standards for more plug-and-play approaches. Instead of dealing with increasingly complex trade-offs, designers should adopt approaches that eliminate the difficult trade-offs. In scan test, packetized scan delivery can eliminate an entire layer of complexity, allowing chiplet designers to optimize test and patterns for the chiplet design itself. There is a plug-and-play interface and self-optimizing pattern delivery, so users don't need to worry about core or chiplet embedding or I/O pins to get scan data to the chiplet. The idea is to simplify the problem with a plug-and-play approach and automatic optimization."

How best to manage complexity

Given the many considerations and challenges involved in multi-chip designs, complexity is not easy to manage. However, there are approaches that help.

Axiomise's Darbari points out that shifting verification left by deliberately using more advanced techniques, such as formal verification, has a huge impact on outcomes. "Using formal verification early in the DV process ensures that we catch bugs faster, find corner-case bugs, establish proofs that bugs do not exist, establish freedom from deadlock and livelock, and obtain coverage to find unreachable code. Constrained-random simulation should only be used where formal verification cannot be used."
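The contrast Darbari draws can be seen even on a toy block: constrained-random simulation samples the input space and can miss a rare corner case, while an exhaustive check over the same small space (standing in here for a formal proof) either finds the bug or demonstrates its absence. The buggy saturating adder below is invented for the illustration.

```python
# Toy comparison of constrained-random sampling vs. exhaustive (formal-style) checking.
import random

WIDTH = 8
MAX = (1 << WIDTH) - 1

def buggy_sat_add(a, b):
    """Invented 8-bit saturating adder with a single planted corner-case bug."""
    if a == MAX and b == MAX:
        return 0                    # the bug: wraps instead of saturating
    return min(a + b, MAX)

def property_holds(a, b):
    """The property to verify: the result always equals min(a + b, MAX)."""
    return buggy_sat_add(a, b) == min(a + b, MAX)

# Constrained-random: sample the space; a 1-in-65,536 corner is usually missed.
no_failure_seen = all(property_holds(random.randint(0, MAX), random.randint(0, MAX))
                      for _ in range(1_000))
print("constrained-random (1,000 tests) saw no failure:", no_failure_seen)

# Exhaustive check over the tiny input space: find the bug or prove its absence.
counterexamples = [(a, b) for a in range(MAX + 1) for b in range(MAX + 1)
                   if not property_holds(a, b)]
print("exhaustive check counterexamples:", counterexamples)
```

Real formal tools do not enumerate inputs; they reason symbolically over state spaces far too large to enumerate, but the payoff is the same: a proof of absence rather than a sample of presence.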

But there is another side to this. In many cases, a complex problem cannot be solved across the entire multi-chip assembly at once. "You have to break it down into pieces," said Synopsys' Kabir. "Solve the small problems, but make sure you're still solving the bigger problem. In multi-chip design, this is the biggest challenge. We still see, 'It's a thermal problem. No, it's a power problem. But you designed the same chip yesterday.' Sometimes when the chip comes back in the lab, they find the timing is off because the thermal or power effects on timing were not properly accounted for. The models and standard libraries don't predict this, and it can have a significant impact. So there's a lot of margin in the design, and how can we compress that? It also means we need to consider multiple physical effects, along with timing and physical construction."
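Kabir's point about thermal and power effects eating into timing can be illustrated with a first-order derating model. The coefficients, supply levels, and path delay below are invented; real signoff relies on characterized libraries and multi-corner, multi-mode analysis.

```python
# First-order sketch of how temperature and IR drop erode timing margin.
# Coefficients and the example path are invented for illustration.

def path_delay_ps(nominal_ps, temp_c, vdd, vdd_nom=0.75,
                  temp_coeff=0.0015, volt_sens=2.0):
    """Delay grows with temperature and with supply droop (very rough linear model)."""
    temp_derate = 1.0 + temp_coeff * (temp_c - 25.0)
    volt_derate = 1.0 + volt_sens * (vdd_nom - vdd) / vdd_nom
    return nominal_ps * temp_derate * volt_derate

clock_period_ps = 500.0
nominal_path_ps = 450.0                         # signed off with 50 ps of slack

for temp_c, vdd in [(25, 0.75), (85, 0.75), (85, 0.72), (105, 0.70)]:
    d = path_delay_ps(nominal_path_ps, temp_c, vdd)
    slack = clock_period_ps - d
    print(f"{temp_c:3d} C, {vdd:.2f} V: delay {d:6.1f} ps, slack {slack:+6.1f} ps")
```

Even with these crude linear assumptions, a path that closed with 50 ps of slack at nominal conditions goes negative once a hot neighbor die and a modest supply droop are stacked on top of each other, which is exactly the lab surprise Kabir describes.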

Breaking down complex problems into manageable pieces is something chip design engineers are still grappling with. "It's a new conundrum, and I'm seeing a lot of people struggle with it, and it's just one of the complexity challenges, without even getting down to the atomic level," Kabir said. "What does the design flow look like? What comes first, what comes later? Which problem do you solve first? And not only that, how do you make sure the problem stays solved throughout the flow and that all the different chips come together? No single company knows how to do that; we have to solve it together. Everyone will offer a different solution, and this is where AI/ML tools and the like have a lot to offer."

Keysight's Mueth agrees. "It's definitely a multidisciplinary challenge. Your digital designer has to talk to your RF designer, who has to talk to your analog designer. The chip designer has to talk to the package designer, and to thermal analysis and vibration analysis. It's a multidisciplinary world, because now you have your system and your system of systems. You have the underlying components. It's really complicated. There are four different dimensions, and then you have to look at it across the whole project lifecycle. Sometimes it's amazing that people are able to accomplish anything at all."

That may be an understatement. While complexity has grown exponentially, headcount has not increased commensurately. "The average tenure of an engineer in the U.S. is 4.5 years. In Silicon Valley, that number is 2.5 years," Mueth added. "When they leave, they take all the design know-how, the tribal knowledge, the company knowledge, and you're left with nothing. So you really want a way to digitize your processes, lock them down, and lock down the intellectual property you've developed. You have to find a way to scale, or to bridge the gap between the workforce and the complexity, and that includes finding new automated processes. We've seen a lot of people pinning their hopes on large platforms. But we already know that big platforms don't cover everything. They can't. There are too many variations and too many applications. The solution is a combination of application-specific workflows, management of the peripheral engineering activities, and the peripheral processes, because engineers don't spend 100% of their time on simulation, or even on design. They spend most of their time on peripheral processes that, sadly, are not automated."

