At a recent Linley Tech seminar in San Jose, a panel of IP experts
discussed trends in system-on-a-chip (SoC) design using third-party
CPU cores. The seminar highlighted some key trends, particularly
in the use of multiple CPU cores. Speakers from Freescale, IBM,
MIPS, Tensilica, and ARM participated.
More, Faster CPUs
The trend toward multicore designs was evident in results of a
survey of the seminar attendees, who were mainly designers of networking
and communications SoCs. About 60% of the attendees were designing
a chip with more than one CPU core, and nearly half of those were
using four or more cores. Multicore designs are well suited to
packet processing, particularly in the data plane. And once software
is ported to run on multiple cores, the step from two CPUs to four
is relatively easy.
There was some debate among the panelists about the pace of this trend.
IBM’s Harry Linzer reported that only a small percentage
of his customers are involved in multicore designs. But Tensilica’s
Sumit Gupta sees many customers doing multicore designs, particularly
in the data plane. This difference may reflect the two companies’
products: IBM’s cores are more powerful, but Tensilica’s
are small enough that a chip can easily include several of them.
The speed of licensed CPUs continues to increase as well. Whereas
most designers were implementing 200–300MHz CPUs a couple
of years ago, 77% of the attendees surveyed said their current
designs use CPUs at 300MHz or above. In fact, some expect their
licensed CPU to exceed 600MHz, a mark that few designs achieve
today. But the newest CPUs, such as ARM’s Cortex, combined
with increasing usage of 90nm and even 65nm technology, are driving
clock speeds to new heights.
Multithreading vs. Multicore
Darren Jones of MIPS discussed the advantages of multithreading.
The MIPS 34K is the only commercially available multithreaded CPU
core, although other vendors are using this technology internally.
Jones noted that multithreading adds only a small amount of die
area to the 34K CPU while improving performance by 60% on certain
EEMBC benchmarks.
Gupta pointed out that Tensilica’s Diamond 570T CPU core,
at 0.5mm², is so small that four of these CPUs can fit in the same
space as a single 34K. The 570T runs at about half the clock speed
of the MIPS 34K, but on highly parallel applications, a four-core
configuration could deliver better performance than a single 34K.
Most applications don’t scale linearly with multiple CPUs,
however, so a four-core design won’t deliver four times the
single-core performance. Jones also pointed out that a multithreaded
design, such as the 34K, generates better single-thread performance
than a multicore design. For example, if a high-priority thread
needs maximum performance, the 34K can devote all its CPU cycles
to that thread, whereas the Tensilica design would be limited to
the speed of a single, slower CPU.
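The trade-off between four half-speed cores and one full-speed core can be roughed out with Amdahl's law. The sketch below is only an illustration: the panelists did not cite Amdahl's law, the function names are hypothetical, and the model ignores multithreading gains, memory contention, and interconnect overhead. It takes the article's figures (four cores, each at about half the clock speed) and asks how parallel the workload must be before the four-core configuration pulls ahead.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core of the same speed (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(parallel_fraction: float,
                        cores: int = 4,
                        clock_ratio: float = 0.5) -> float:
    """Throughput of `cores` slower CPUs relative to one full-speed CPU.

    clock_ratio is the slower core's clock divided by the faster core's
    clock (assumed 0.5 here, per the "about half" figure in the text).
    A result above 1.0 means the multicore configuration wins.
    """
    return clock_ratio * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Sweep a few workloads from half-parallel to fully parallel.
    for p in (0.5, 2 / 3, 0.8, 0.95, 1.0):
        print(f"parallel fraction {p:.2f}: "
              f"relative throughput {relative_throughput(p):.2f}")
```

Under these simplified assumptions, the four-core configuration breaks even when about two-thirds of the work is parallel, and it reaches at most twice the single-core throughput on a perfectly parallel workload. For a single high-priority thread (parallel fraction 0), it delivers only half, which is the point Jones makes about single-thread performance.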
According to Jones, the 34K supports up to nine thread contexts to allow
programmers to “park” critical routines in
the CPU, avoiding the need to fetch thread state. For example,
placing an interrupt handler in one of the thread contexts will
greatly reduce interrupt response time.
Dealing With Complexity
Moving from a single-core to a multicore model requires major software
work. The hardware impact is debatable. Toby Foster of Freescale
noted that simply combining two CPUs that have already been validated
together is not a difficult challenge, particularly since modern
SoCs have extensive custom logic outside of the CPU that is usually
the design bottleneck. In fact, Gupta argued that a multicore design
is simpler than a design containing several special-purpose hardware
engines; these custom engines can be replaced with off-the-shelf
CPUs plus software.
Another complexity of multicore designs is the need for a high-bandwidth interconnect
for the CPUs. Even an SoC with only one CPU may have
high-speed memory and I/O controllers, or fixed-function accelerators,
that must be efficiently interconnected. ARM’s Dave Steer
noted a trend from buses to fabrics in order to achieve the necessary
bandwidth while supporting multiple, nonblocking transfers. Many
large vendors have an in-house fabric for this purpose, but attaching
a licensed CPU to this custom interconnect often requires significant
design effort.
Support from the CPU vendor is critical in a multicore design. Vendors such as ARM
provide MP-validated CPUs with system-level
simulation tools that can identify inter-CPU problems before fabrication.
Freescale will instead design and validate a semicustom chip to
meet a customer’s specifications, taking the validation work off the customer’s hands.
To achieve a successful design, any multicore designer must take
these issues into account.
Originally published in Nikkei Electronics Asia, March 2007
© 2002-2007 The Linley Group