
A Guide to Multicore Processors

Fourth Edition

To be Published August 2017

Authors: Jag Bolaria and Tom R. Halfhill

Single License: $4,495 (single copy, one user)
Corporate License: $5,995

Ordering Information

Get the Facts Quickly

"A Guide to Multicore Processors" (4th Edition) provides an in-depth look at 32- and 64-bit high-speed embedded processors with four or more CPU cores. This completely revised report from The Linley Group contains 190+ pages of information on high-end processors from AMD, Baikal, Broadcom, Cavium, Intel, Kalray, Macom (AppliedMicro), Mellanox (Tilera/EZchip), and NXP.

The report focuses on general-purpose RISC and x86 processors that have four or more CPU cores running at 1.0GHz or more, excluding specialized architectures (e.g. DSPs, NPUs). This report covers processors for embedded applications, focusing on networking, communications, storage, and security; it excludes multicore products designed for servers or for mobile devices. (We cover these processors, as well as embedded processors with four or fewer CPU cores, in other reports.)

"A Guide to Multicore Processors" has detailed coverage of AMD's Opteron family; Broadcom's XLP II and Stingray family; Cavium's Octeon TX and Octeon III families; Intel's embedded Xeon and Xeon-D lines; Kalray’s Bostan and Coolidge processors; Macom's Helix family; Mellanox's BlueField family; and NXP's QorIQ LS1 series, LS2 series, and T4 series.

This handy guide, packed with valuable information, brings you up-to-date on the newest developments in this important market and gives you the analysis you need to help choose a supplier or partner. The report also provides market-share and market-size data for the embedded and multicore markets.

"A Guide to Multicore Processors" begins with tutorials on the key technologies implemented by these products, background on the embedded market, and a discussion of the newest technology and market trends. Following these introductory chapters, the report delivers thorough coverage of all announced products in this area. For each major vendor, the report examines the performance, features, and architecture of each product, highlighting strengths and weaknesses in a consistent, easy-to-compare fashion. The report concludes with our own comparisons of these products and conclusions about which will fare best.

What's New in This Edition

Since publishing the previous edition of this report in 2016, we have updated the coverage to include many new announcements, including:

  • Broadcom’s Stingray products
  • Cavium's ARMv8-compatible Octeon TX processors
  • NXP's newest ARM-based LS1- and LS2-series processors
  • Intel's new Xeon (Skylake) processors and Xeon D processors
  • Mellanox's new ARM-based BlueField processors
  • Final 2016 market size and vendor share
  • Embedded-processor forecasts to 2021

Multicore processors offer the best performance and flexibility for applications that are divisible into many small tasks, called threads. In embedded systems, the most common application for these products is networking, because each data packet can usually have its own thread. Packet processing is common in a wide range of networking and communications equipment, including routers, security appliances, storage subsystems, broadband infrastructure, and cellular base stations.
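As a rough illustration of the per-packet threading described above, the following minimal sketch hands each packet to a pool of workers as an independent task. It is not taken from the report: the names `process_packet` and `process_all` and the simple checksum are hypothetical stand-ins for real packet processing. In CPython, threads do not run CPU-bound code in parallel, so the sketch shows only how the work divides, not the speedup a multicore data plane would deliver.

```python
from concurrent.futures import ThreadPoolExecutor


def process_packet(packet: bytes) -> int:
    """Hypothetical per-packet work: a 16-bit additive checksum."""
    return sum(packet) & 0xFFFF


def process_all(packets, workers=4):
    """Treat each packet as an independent task on a pool of workers.

    pool.map returns results in the same order as the input packets.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_packet, packets))


if __name__ == "__main__":
    # Eight dummy 64-byte packets, assumed for illustration only.
    packets = [bytes([i] * 64) for i in range(8)]
    print(process_all(packets))
```

Because no packet depends on another in this sketch, the number of workers can grow with the number of available cores without changing the per-packet code.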

To ease programming, these multicore processors employ general-purpose instruction sets, such as x86, the Power Architecture (PowerPC), MIPS, and ARM. This characteristic distinguishes them from dedicated network processors (NPUs), which use custom instruction sets that are more difficult to program — and from packet-processing ASICs, which aren’t programmable at all. Most multicore embedded processors also include specialized hardware that accelerates packet-processing tasks. Thus, they’re widely favored for complex networking applications that require programmability, customization, and high performance. In addition, these devices are useful for a broad range of embedded systems that require general-purpose programmability.

We estimate the total revenue from general-purpose embedded processors grew 3.3% in 2016, reaching a new high of $4.3 billion. But some market segments, such as wireless communications, declined in 2016. This shrinkage was largely due to China’s slowdown in wireless-base-station deployments and a trend toward using more custom ASICs instead of merchant silicon. Other segments grew: security, Internet gateways, automotive, industrial, and storage.

Intel still leads the embedded-processor market by revenue. Despite their relatively high power consumption and relatively poor feature integration, Intel’s products offer the industry’s best single-thread performance — a big advantage in control-plane processing. The acquisition of Altera, the second-largest FPGA vendor, creates opportunities for future products that integrate embedded processors with programmable logic. In 2015 and 2016, Intel also became the leading supplier of multicore processors for communications — a position held for years by Freescale (now NXP), which suffered from the wireless slowdown.

The wave of industry consolidation that began in 2015 continued in 2016 and 2017. Avago acquired Broadcom and now operates as Broadcom Ltd. Mellanox purchased EZchip, which had previously purchased Tilera. Macom acquired AppliedMicro and immediately began seeking a buyer for the processor part of the business. The biggest merger of 2017, however, is still pending: Qualcomm’s bid for NXP, which only recently absorbed Freescale. If that deal succeeds as expected by the end of the year, only Intel will be a larger semiconductor company. Qualcomm is entering the server market with its new ARMv8-compatible Centriq processor, which it could adapt for high-end embedded applications in 2018 or 2019. The Qualcomm-NXP deal has the most potential to rearrange the embedded-processor market.

Broadcom held its position as the third-largest embedded-processor supplier in 2016. It gained share during the year, largely on the success of its ARM-based StrataGX family. The MIPS-compatible XLP family is fading away, however, and the company sold its next-generation ARM-compatible Vulcan processor to Cavium. Instead, Broadcom is introducing the new BCM58800 family, which has up to eight powerful ARM CPUs. Although it’s still less powerful than the high-end XLP chips, it supports 100Gbps networking and other up-to-date features.

Cavium, the fourth-largest embedded-processor supplier, enjoyed another year of healthy growth in 2016. The MIPS-compatible Octeon chips remain the cash cow, but the company is moving quickly to the ARM architecture. The ARMv8-compatible Octeon TX family addresses the midrange high-performance market, and we expect the Vulcan processor acquired from Broadcom to bolster Cavium’s product line in 2018. Although Cavium’s relatively simple CPUs lag in single-thread performance, their small size enables large multicore designs. Consequently, the company focuses on the data plane, where its many small CPUs and wealth of hardware accelerators are ideal.

AMD entered the ARM-based embedded-processor market in 2015 with its Opteron A1100 family, but it lacks a roadmap for future ARM products. Instead, it’s refocusing on the x86 market. In 2018, we expect to see the first high-performance embedded processors based on the new Zen CPU core, which has rejuvenated the company’s server- and PC-processor business. Zen-based chips should compete strongly with Intel’s midrange embedded Xeon processors.

After absorbing EZchip and Tilera, Mellanox plans to introduce its new ARM-based BlueField processor in 2018. The flagship 16-core chip will compete strongly for smart-NIC designs and storage arrays. It integrates a subsystem that’s virtually a Mellanox ConnectX-5 Ethernet adapter on a chip, and its dual 100 Gigabit Ethernet (100GbE) ports target 200Gbps networking. It also implements new standards such as Non-Volatile Memory Express Over Fabrics (NVMe-oF) for networked SSD storage arrays. BlueField replaces the 100-core ARM-based chip that Tilera was designing before the acquisitions.

Another competitor is French startup Kalray. Using a proprietary architecture that’s programmable with industry-standard tools, Kalray’s 256-core chips target massively parallel processing and real-time applications. A newer, smaller design should be easier to program and adds acceleration for security and machine learning.

Overall, the embedded-processor industry remains vibrant, and the transition to ARM is gaining momentum. Nevertheless, we expect Intel and the x86 architecture to rule the market for years to come.

List of Figures
List of Tables
About the Authors
About the Publisher
Executive Summary
1 Processor Technology
Processor Basics
Central Processing Unit (CPU)
MMUs and TLBs
Bus Bandwidth
CPU Microarchitecture
Scalar and Superscalar
Instruction Reordering
Pipelining and Penalties
Branch Prediction
Multicore Processors
Main Memory
DRAM Basics
DDR Versions
Memory Subsystems
I/O and Network Interfaces
Ethernet Interfaces
PCI and PCI Express
2 Multicore Applications
Networking and Communications Equipment
Control Plane vs. Data Plane
Control-Plane Processing
Data-Plane Applications
Services Cards
Networked Storage and RAID Controllers
Broadband Infrastructure
Cellular Base Stations
Common Form Factors
3 Standard Instruction Sets
Architecture Comparison
Market Positions
x86 Instruction Set
Initial Instruction Set
Modern Extensions
ARM Instruction Set
Initial Instruction Set
Later Extensions
ARMv8 Architecture
Scalable Vector Extensions
ARM Cortex-A57
ARM Cortex-A53
ARM Cortex-A72
MIPS Instruction Set
Initial Instruction Set
Later Extensions
PowerPC Instruction Set
Instruction Set
4 Multicore Processors
What Is an Embedded Multicore Processor?
What Is Not an Embedded Multicore Processor
Common Characteristics
Standalone vs. Integrated Processors
Multicore Processors
Encryption Engines
RAID and Other Storage Engines
Packet-Processing Accelerators
Acceleration Software
Data-Plane Development Kit (DPDK)
OpenDataPlane (ODP)
Benchmark Software
CPU Benchmarks
Security Performance
5 Technology and Market Trends
Technology Trends
Integration Trends
Software-Defined Functions
CPU Complexity Tradeoffs
Memory Access
Managing Power
Market Overview
Market Size by Vendor
Market Share by Application
Revenue Market Share by Instruction-Set Architecture
Market Forecast
6 Broadcom
Company Background
Key Features and Performance
Internal Architecture
System Design
Development Tools
Product Roadmap
7 Cavium
Company Background
Key Features and Performance
Octeon III CN78xx- and CN77xx-Series
Octeon III CN73xx- and 72xx-Series
Octeon TX Processors
Internal Architecture
Octeon III CPU
Custom MIPS64 Extensions
Octeon III Caches
Octeon III Accelerators
Octeon TX Architecture
System Design
Development Tools
Product Roadmap
8 Intel
Company Background
Product Overview
Key Features and Performance
Xeon Scalable Processors
Xeon Platinum and Gold Processors
Xeon Silver Processors
Skylake-SP vs. Broadwell-EP
Xeon D Processors
Internal Architecture
Broadwell Microarchitecture
Skylake Microarchitecture
Skylake-SP Microarchitecture
System Design
Xeon Scalable Processors
Xeon E5v4 Processors
Xeon D Processors
Development Tools
Product Roadmap
9 Kalray
Company Background
Key Features and Performance
Bostan-1 and Bostan-2 (MPPA2 and MPPA2.2)
Coolidge (MPPA3)
Internal Architecture
Andey and Bostan (MPPA-256 v1, v2, and v2.2)
Coolidge (MPPA v3)
Development Tools
Product Roadmap
10 Macom (AppliedMicro)
Company Background
Key Features and Performance
Internal Architecture
System Design
Product Roadmap
11 Mellanox
Company Background
Key Features and Performance
Internal Architecture
System Design
Development Tools
Product Roadmap
12 NXP
Company Background
Key Features and Performance
QorIQ LS1-Series Processors
QorIQ LS2-Series Processors
QorIQ T4-Series Processors
Internal Architecture
Power e6500 CPU
Acceleration Engines
Quicc Engine
QorIQ Packet-Processing Acceleration (DPAA)
DPAA2 Packet Acceleration
QorIQ Layerscape Secure Platform
System Design
System Interfaces
Application Examples
Development Tools
Product Roadmap
13 Other Vendors
Company Background
Key Features and Performance
Product Roadmap
14 Comparisons
Sub-30W Processors
30-50W Processors
50-100W Processors
Processors Consuming More Than 100W
15 Conclusions
Market and Technology Trends
Vendor Outlook
Other Multicore-Processor Vendors
Appendix: Further Reading
Figure 1‑1. Basic processor design.
Figure 1‑2. Simple superscalar processor design.
Figure 1‑3. CPU pipelining examples.
Figure 1‑4. Generic multicore processor.
Figure 1‑5. Interleaved tasks on a multithreaded CPU.
Figure 1‑6. DRAM evolution.
Figure 2‑1. The control plane and the data plane.
Figure 4‑1. Standalone and integrated general-purpose processors.
Figure 4‑2. Networking-software interfaces.
Figure 4‑3. Typical curve of IPSec performance versus packet size.
Figure 5‑1. Worldwide revenue market share of embedded microprocessors, 2015–2016.
Figure 5‑2. Worldwide revenue market share of the top vendors of embedded processors for communications.
Figure 5‑3. Worldwide revenue market share of the top vendors of embedded processors for storage.
Figure 5‑4. Worldwide revenue market share of the top vendors of embedded processors for other applications.
Figure 5‑5. Worldwide revenue market share of embedded processors by CPU architecture.
Figure 5‑6. Worldwide revenue of embedded processors by application, 2016–2021.
Figure 5‑7. Worldwide revenue of embedded processors by communications segment, 2016–2021.
Figure 6‑1. Block diagram of Broadcom BCM58808H.
Figure 6‑2. Block diagram of the BCM58808H in a storage appliance.
Figure 7‑1. Cavium Octeon III and Octeon TX families.
Figure 7‑2. Block diagram of Cavium Octeon III CN7890.
Figure 7‑3. Block diagram of Cavium Octeon TX CN8370.
Figure 7‑4. Block diagram of ParPro card using the Octeon III CN7890.
Figure 7‑5. Octeon TX CN8370 in a storage array.
Figure 8‑1. Intel’s nomenclature for Xeon Scalable processors.
Figure 8‑2. Block diagram of Intel Broadwell embedded Xeon E5-2650v4.
Figure 8‑3. Block diagram of Skylake microarchitecture.
Figure 8‑4. Comparison of Intel cache hierarchies.
Figure 8‑5. Block diagram of Intel’s Purley platform.
Figure 8‑6. Dual-socket system design based on Intel Xeon E5v4.
Figure 8‑7. Block diagram of Intel Xeon D.
Figure 9‑1. Block diagram of the MPPA-256 VLIW CPU.
Figure 10‑1. Block diagram of Macom Potenza CPU.
Figure 10‑2. Block diagram of Macom Helix 1 processor.
Figure 10‑3. Block diagram of a gateway based on Macom Helix 1.
Figure 11‑1. Block diagram of Mellanox BlueField processor.
Figure 11‑2. Block diagram of Mellanox ConnectX-5 subsystem.
Figure 11‑3. Block diagram of BlueField flash-array controller.
Figure 12‑1. NXP QorIQ T- and LS-series processors.
Figure 12‑2. Block diagram of NXP QorIQ LS1088A.
Figure 12‑3. Second-generation Data Path Acceleration Architecture.
Figure 12‑4. QorIQ Layerscape Secure Platform.
Figure 12‑5. NXP VortiQa Network Security Suite.
Table 2‑1. Some common single-board-computer standards.
Table 5‑1. Worldwide revenue of the top vendors of embedded processors.
Table 5‑2. Worldwide revenue of the top vendors of embedded processors for communications.
Table 5‑3. Worldwide revenue of the top vendors of embedded processors for storage.
Table 5‑4. Worldwide revenue of the top vendors of embedded processors for other applications.
Table 5‑5. Worldwide revenue of embedded processors by application, 2016–2021.
Table 5‑6. Worldwide revenue of embedded processors by communications segment, 2016–2021.
Table 6‑1. Key parameters for Broadcom BCM588xx processors.
Table 7‑1. Key parameters for Cavium Octeon III CN78xx processors.
Table 7‑2. Key parameters for Cavium Octeon III CN77xx processors.
Table 7‑3. Key parameters for Cavium Octeon III CN73xx and CN72xx.
Table 7‑4. Selected Cavium Octeon TX embedded processors.
Table 8‑1. Intel code-names and product numbers.
Table 8‑2. Intel Xeon embedded multicore processors.
Table 8‑3. Key parameters for Intel Xeon Platinum and Gold processors.
Table 8‑4. Key parameters for Intel Xeon Silver embedded processors.
Table 8‑5. Xeon Scalable (Skylake-SP) versus Xeon E5v4 (Broadwell-EP).
Table 8‑6. Intel Xeon D embedded processors.
Table 8‑7. Key parameters for selected Intel Xeon D embedded processors.
Table 8‑8. Key parameters for Intel C62x (Lewisburg) south-bridge chips.
Table 8‑9. Key parameters for Intel DH89xx Coleto Creek chips.
Table 9‑1. Key parameters for Kalray MPPA processors.
Table 10‑1. Key parameters for Macom Helix 1 processors.
Table 11‑1. Key parameters for Mellanox BlueField processors.
Table 12‑1. Key parameters for NXP QorIQ LS1 quad- and octa-core chips.
Table 12‑2. Key parameters for NXP QorIQ LS2 processors with Cortex-A57.
Table 12‑3. Key parameters for NXP QorIQ LS2 processors with Cortex-A72.
Table 12‑4. Key parameters for NXP QorIQ T4 processors.
Table 12‑5. Performance of SEC 5.3 security engine.
Table 13‑1. Key parameters for AMD Opteron A1100 SoCs.
Table 14‑1. Comparison of sub-30W multicore processors.
Table 14‑2. Comparison of 30–50W multicore processors.
Table 14‑3. Comparison of 50–100W multicore processors.
Table 14‑4. Comparison of multicore processors using more than 100W.

