Microprocessor Report (MPR)

Interview: NetSpeed CEO Rethinks SoC Design

Sundari Mitra Discusses Automating the Design of Complex Chips

July 6, 2015

By Linley Gwennap

One of the difficulties posed by Moore’s Law is that raising the number of transistors on a chip can increase the design time. To combat this effect, designers often reuse intellectual-property (IP) blocks, such as a CPU or DSP core. These IP blocks may be in-house designs proven in an earlier product, or they may be licensed from an IP vendor that has pretested the component; in either case, however, they need not be redesigned or revalidated.

Problems, however, often crop up when connecting these proven blocks to form a complete system-on-a-chip (SoC), particularly as the number of IP cores per chip continues to grow. An underprovisioned interconnect can cause bottlenecks and performance loss, whereas an overprovisioned interconnect consumes too much power and die area, adding cost. A poorly designed interconnect is often undersized in some places and oversized in others, and it may have functional errors as well. Finding these errors can take months, extending the time before tapeout. Errors found after tapeout can cost millions of dollars for a new mask set. These challenges are slowing the pace of innovation in SoC design.

NetSpeed founder and CEO Sundari Mitra. (Photo courtesy of Intel)

Sundari Mitra cofounded NetSpeed Systems in 2011 to help solve this problem. She has more than 25 years of experience in chip design, starting at Intel in 1988 before spending 12 years at Sun Microsystems. She founded Prism Circuits, a mixed-signal company, in 2006; MoSys acquired the company three years later. Her experience led her to realize that the SoC design process is fundamentally broken. Along with cofounder and CTO Sailesh Kumar, she determined that the solution is to give the designer more visibility as early in the process as possible. In this exclusive interview, she explains where she got the idea and how it improves the design process.

SoC Design Creates Complexity

What are the major challenges that you see with chip design today?

Sundari: I’ve been working a long time in the industry: I started on the NMOS 286. I’ve seen the industry transform from completely manual design to full EDA tools. When I was an engineer, a project team would consist of one architect; about 10 RTL designers; for circuit designers, which is what I was, there would be about 30 of us; and then there were lots of mask designers. But when synthesis came in, the pyramid started reversing a little bit—at least the number of mask designers went down, because that process was automated, regularized, and correct by construction. This change really helped time to market, because the integration time improved. So that’s one good transformation, and it was back in the 1980s.

Since then, EDA has continued to evolve. The EDA vendors continue to give us more tools, because technology on the back end is getting more complex. Today, you have tools to help with signal integrity and making sure that you can actually manufacture the chip (DFM)—a lot of innovation.

But chip design has really changed; now we make a system on a chip. What used to be an individual chip has become an entity on an SoC. An SoC comprises several IP blocks, and each has a protocol and has standardized how it interfaces with everything else. Now, IP as an industry is maturing, standardizing around ARM CPUs, and then you have a few guys offering memory controllers, a few with video, a few with DSP cores. Once you standardize the interfaces, it is possible to design SoCs using algorithms and mathematics rather than plopping them together manually or using a spreadsheet to architect them.

When did you realize this was a big problem?

Sundari: My first startup [Prism] was a serdes company, so for five years I had no tapeouts. Then I became the VP of Engineering for MoSys. When I had to do another tapeout, all the painful memories came back! I was sitting in this war room for five months, banging my head against the wall at the bickering and finger-pointing between the back-end team and the architecture team. In 30 years, we still hadn’t solved the layout and timing issues. 

I don’t think it’s fair to blame the architects; they have no tools to help. When they are architecting a chip, they have only a vague idea of what is actually going to get implemented, and they have no metrics that they can roll into their analysis. And what has the EDA industry given to this new era of SoC-based design? Really, not much. Most performance simulation is RTL based, but that doesn’t help with up-front design. The architect still uses his or her spreadsheet and has no idea of what the floor plan might be or how to factor it in. You need better placement information and better timing convergence while the architect is still designing.

But at the same time, I realize you can’t give an architect an EDA tool. You have to give them an IP. You’ve got to give them something that they can actually build, analyze, and compare with the RTL simulation so they can benchmark it. No architect will accept something they cannot benchmark.

So, this IP has to use SoC-level metrics that architects define. Architects care about power, performance, and area. What is performance? It’s bandwidth, it’s latency; it’s what type of IP blocks they have, how are they connected, how many masters, how many slaves, the quality of service. They don’t want to build the whole thing, get RTL back, then find a deadlock. If you are lucky, you find the deadlock a week before tapeout.

If you’re not lucky, you find it after tapeout.

Sundari: Yes. So these are the kind of issues that NetSpeed addresses (see MPR 12/1/14, “Opportunity NoCs, NetSpeed Answers”). We are using correct-by-construction synthesis concepts to solve NP-hard problems. We use networking algorithms to come up with a dependency graph of the whole SoC on the basis of the architect’s inputs and their optimization criteria. We give them lots of knobs to customize what they are doing. What is the value above and beyond getting rid of the tapeout pain points? You are getting system-level analysis, right from day one, so you don’t have to wait for RTL. We have actually benchmarked our solution against post-RTL tools and found a very close match.
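
As a rough illustration of why a dependency graph helps with deadlock, the sketch below applies the textbook rule that a set of channel dependencies with no cycle cannot deadlock. It is not NetSpeed's algorithm, which the interview does not describe; the function name, the channel names, and the two example graphs are assumptions made for illustration.

```python
from typing import Dict, List, Optional

def find_dependency_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes, or None if the graph is acyclic.

    graph maps each channel to the channels it may wait on (hypothetical model).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    for targets in graph.values():
        for target in targets:
            color.setdefault(target, WHITE)
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            if color[nxt] == GRAY:
                # nxt is already on the current path, so we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(color):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Hypothetical example: three channels that wait on each other in a ring.
ring = {"ch0": ["ch1"], "ch1": ["ch2"], "ch2": ["ch0"]}
# Hypothetical example: the same channels with the wraparound dependency removed.
chain = {"ch0": ["ch1"], "ch1": ["ch2"], "ch2": []}

ring_cycle = find_dependency_cycle(ring)
chain_cycle = find_dependency_cycle(chain)
```

Here `ring_cycle` names the channels that form the loop, while `chain_cycle` is `None`. A check of this kind needs only the connectivity description, which is why it can run before any RTL exists.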

Could you have taken this approach as part of a large EDA company?

Sundari: Absolutely not. The only way to do something radical like this was to come out on our own. You have a blank sheet of paper. You have no legacy products to worry about. You can think outside the box and create something. That is the reason I decided to do it as a startup, rather than inside one of the large companies, where I had enough power to make things happen. But something like this just doesn’t get embraced when you do it there.

Fitting Into the Design Flow

You are talking about changing the way SoCs are designed. Don’t you have to work within the existing design flows and figure out how to change them? You can’t just walk in and say you are going to change the whole design flow.

Sundari: No, you augment it. We do not change any of the back-end flows at all. We just give them all the views. Our tools generate the RTL, the synthesis scripts. We provide the constraints for their timing, for their placement tools. We give them all that information, but we don’t do the flow ourselves.

Most designers use some kind of C-level simulation, right? But the SystemC model isn’t timing accurate. It doesn’t get down to the level needed to flush out deadlock conditions.

Sundari: Deadlock conditions, QoS conditions, and some of the things that they didn’t have to worry about 10 years ago, when the physical design constraints were pretty loose. At this point, if you talk to an architect and he doesn’t understand what his power domains are, he doesn’t understand what his clocking strategy is and how it impacts his architecture, you are talking to a guy who isn’t going to last very long. So that’s the level they need to go to, but the C-level model doesn’t get there.

Then you find the disconnect between the RTL and the C model. Traditionally, the RTL model gets built toward the end of the process and then you try to match it to the C model, whereas we give you the RTL in minutes, although it is only for the connectivity and the NoC. But guess where the silicon bugs occur? The IPs are already proven.

I went to TSMC and asked one of the executives, “How is it going for the guys in China in terms of silicon success?” He says, “Man, are we in trouble. Because they tell us that all the IPs are silicon proven, but we slap it together and the chip doesn’t work!” Why doesn’t it work? Because the bugs happen in the interfaces. So if you can build it algorithmically, look at the market that becomes available. But to teach people what we have is not easy.

It is a pretty big change. How do you sell this idea?

Sundari: At some of the large accounts we’ve won, it’s only a few days before the architect says, “I must have this! I’ve never seen anything like it before!” The other end of the spectrum is an architect who likes to control everything, who likes to manually massage everything that they are doing. Those guys say, “Oh my God, this is like voodoo!” Or they say, “It’s a black-box approach; I don’t like it.”

It can be hard to change an architect’s mind.

Sundari: Yes, which is why we present it saying, “This is a different way of doing things. What do you think?” That is also something that is fundamentally different about us in that we go to the architect. We work purely at the architect’s level and say, “Do you buy into this solution?” If you talk to many architects, you will find that there are some discrepancies that they’ve managed to live with, but they don’t need to anymore.

Not Just Another NoC

People think of NetSpeed as a network-on-a-chip (NoC) company, but what you provide sounds different from basic NoC IP.

Sundari: Our tool is actually meant for SoC architecture-level analysis. One of the IPs that we provide at this point happens to be a NoC, but we have plans for doing more. Once you have this platform, you can tackle other global pieces of an SoC because you control the framework of how the SoC comes together. For example, security is one part of it. Also, very good power management, because you can provide much more fine-grain control.

Since the platform works with third-party IP, there’s no need for you to design everything under the sun.

Sundari: No, we don’t need to. Things associated with the NoC, yes. What about global things? We can integrate other people’s power management, but we can also come up with our own state machines that allow much easier control. I’m looking at the pain points of the industry, where people go wrong when they are designing SoCs. You find so many things in this industry that people don’t want to change. Why do they not want to change it? Because it’s too expensive to try. The minute they try something new, the silicon bombs, and TSMC gets another $3 million from them [for a new mask set]. That is the problem I’m trying to address: you don’t need to be worried if it’s organized, algorithmic, and correct by construction. You can prove before you tape out that it is functionally correct.

But you don’t want to tie people to particular IP.

Sundari: That’s right. When I look inside some of the larger semiconductor companies, on average, they have five or six different NoC designs. One piece of the company is using Arteris, another is using Sonics, and then they have five others that are homegrown. Why? One is optimized for latency. Another is optimized for bandwidth. A third is optimized for power. A fourth is optimized for mobile applications.

And the fifth guy didn’t know the other ones existed, so he made his own.

Sundari: Our vision is to make it uniform, without controlling it. Because we are giving everything to the customer, we are not pushing a protocol. We are protocol agnostic. We don’t want to push our customers one way or the other. Even though Imagination is one of our investors, we have no interest in saying, “You have to use Imagination graphics, don’t use Mali.” We say you can use whatever you want. We have to coexist.

Those are some of the reasons why our business model is different. That’s the hope. We’ll let the story play out in the next couple of years and see how we do.

A Smarter Design Methodology

A decade ago, most semiconductors performed one or two functions, but the majority today embody all of the functions of a complete system on a single chip. According to a McKinsey study, the average complexity of a chip design has increased 12% per year over the past decade, whereas design productivity is improving only 7% per year. Most chip companies have been increasing the size of their design teams to compensate for this gap, but larger teams are difficult to coordinate, reducing efficiency.
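
To see how quickly that gap compounds, the short calculation below assumes both rates hold constant and compound annually for ten years. That assumption, and the function name, are illustrative and do not come from the McKinsey study.

```python
def compounded_gap(complexity_growth: float, productivity_growth: float, years: int) -> float:
    """Ratio of design complexity to design productivity after `years`,
    assuming both start at 1.0 and grow at constant annual rates."""
    return ((1 + complexity_growth) / (1 + productivity_growth)) ** years

# 12% annual complexity growth versus 7% annual productivity growth.
gap_after_decade = compounded_gap(0.12, 0.07, 10)
```

Under these assumptions, `gap_after_decade` comes out to roughly 1.58. If headcount were the only lever, an average design would need a team more than half again as large as it did a decade earlier.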

NetSpeed offers a different approach: work smarter instead of bigger. Use proven IP wherever possible. Add value by designing a critical IP block or simply by specifying the right combination of blocks for the end application. Sew it together with a correct-by-construction interconnect. Without this approach, smaller chipmakers will be unable to keep pace with the industry giants.

