
A Guide To Server Processors

Second Edition

Published November 2012

Authors: Jag Bolaria and Bob Wheeler

Single License: $3,495 (single copy, one user)
Corporate License: $5,000

Examining the Processors Powering Scalable Computing

The market for server processors is changing, creating openings for new vendors. With the emergence of mega data centers and cloud computing, server economics no longer focus on capital expenses alone. Demand for ultimate performance from a single processor has been replaced by a balanced view of capital and operating costs. Performance per watt and performance per watt per dollar are the new metrics driving purchasing decisions in large data centers. Physical density is also growing in importance, driving greater scalability and new form factors such as microservers that pack more nodes into precious rack space.

The market is moving to a new era where backward compatibility is less important than before and innovation takes the front seat. Intel and AMD, the incumbent vendors, continue to innovate and advance their Xeon and Opteron designs, respectively. Integration, microarchitecture advances, and process technology are the primary factors when evolving these x86 processors. But new entrants are eyeing cloud-computing environments as an opening for radically different architectures or more power-efficient CPU designs.

Product Information Tempered By In-Depth Analysis

This report covers processors designed specifically for servers. We provide detailed coverage of Intel’s Xeon product lines across Westmere, Sandy Bridge, and Ivy Bridge generations. We cover AMD’s Opteron family and details of the company’s Bulldozer microarchitecture. The new vendors include Tilera, which is shipping 64-bit processors today. Because other vendors entering this market are using ARM-based designs, we provide coverage of ARM’s intellectual-property cores including the Cortex-A15. Vendors of ARM-based server processors include AppliedMicro, Calxeda, Cavium, and Marvell. Startup Calxeda and larger Marvell are shipping 32-bit ARM processors, whereas AppliedMicro and Cavium are developing 64-bit ARMv8 chips as their first server offerings.

New to this edition is coverage of coprocessors (or accelerators) for high-performance computing (HPC). Our initial coverage includes Intel’s new Xeon Phi (Knights Corner) and Nvidia’s Tesla GK110 (Kepler). These board-level products promise to break the double-precision teraflops barrier for floating-point performance.

This report analyzes each vendor and each product, probing their strengths and weaknesses and presenting key details in a consistent, easy-to-compare fashion. We examine processor performance, integration, power dissipation, and overall system design. Where possible, we also look at each vendor's roadmap.

Make Informed Decisions

As the leading vendor of technology analysis for microprocessors, The Linley Group has the expertise to deliver a comprehensive look at these technologies. Authors Jag Bolaria and Bob Wheeler use their broad experience to deliver the technical and strategic information you need to make informed business decisions. And in case you are not familiar with all of the concepts involved in processor and server designs, the report includes several introductory chapters that define and describe terms such as superscalar, multithreading, pipelines, and virtualization.

This report is written for:

  • OEMs that need to make strategic vendor selections
  • ODMs supplying cloud-computing and HPC customers
  • Data-center architects looking at alternative platforms
  • Marketing and engineering staff at companies that sell other server components
  • Financial analysts who desire a detailed analysis and comparison of both incumbent and new vendors

What's New in This Edition

“A Guide to Server Processors” has been extensively updated to include the latest vendor disclosures.

Here are some of the many changes you will find:

  • Coverage of many new products from Intel, including Xeon E5 (Sandy Bridge), Xeon E3-1200 v2 (Ivy Bridge), and Xeon Phi (Knights Corner)
  • Coverage of AMD’s Bulldozer-based Opteron 4200 (Valencia) and 6200 (Interlagos) processors
  • Coverage of Calxeda’s first ARM-based server SoC, the ECX-1000
  • Coverage of AppliedMicro’s X-Gene processor, which should be the industry’s first 64-bit (ARMv8) product
  • Coverage of Cavium’s Project Thunder, a multicore ARMv8 design built on the company’s successful Octeon architecture
  • Coverage of ARM’s new ARMv8 cores and fabric IP
  • New coverage of Nvidia’s Tesla accelerators for high-performance computing, focusing on the new Kepler generation
  • Extensive updates to company-background information, roadmaps, and analysis
  • Forecast for merchant server processors through 2016
  • Revised and updated tutorials

The market for server processors is changing, creating openings for new vendors. With the emergence of mega data centers and cloud computing, server economics no longer focuses on capital expenses alone. Demand for ultimate performance from a single processor has been replaced by a balanced view of capital and operating costs. Performance per watt and performance per watt per dollar are the new metrics driving purchasing decisions in large data centers. Physical density is also growing in importance, driving greater scalability and new form factors such as microservers that pack more nodes into precious rack space.
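
As a rough illustration of how these two metrics are computed, the following Python sketch compares two hypothetical server nodes. The node names, benchmark scores, power figures, and prices are invented for the example and are not taken from the report; only the metric definitions come from the text above.

```python
from dataclasses import dataclass


@dataclass
class ServerNode:
    """A hypothetical server node; all figures are made-up example values."""
    name: str
    throughput: float  # benchmark score in arbitrary units
    watts: float       # power drawn under load
    price_usd: float   # purchase price

    @property
    def perf_per_watt(self) -> float:
        # Performance per watt: throughput delivered for each watt consumed.
        return self.throughput / self.watts

    @property
    def perf_per_watt_per_dollar(self) -> float:
        # Performance per watt per dollar: folds purchase cost into the metric.
        return self.perf_per_watt / self.price_usd


# Illustrative numbers only (assumptions, not measurements).
nodes = [
    ServerNode("big-node", throughput=1000.0, watts=400.0, price_usd=5000.0),
    ServerNode("micro-node", throughput=150.0, watts=30.0, price_usd=500.0),
]

# Pick the node with the best efficiency; raw throughput alone would rank
# the big node first.
best = max(nodes, key=lambda n: n.perf_per_watt_per_dollar)

for node in nodes:
    print(f"{node.name}: {node.perf_per_watt:.2f} perf/W, "
          f"{node.perf_per_watt_per_dollar:.5f} perf/W/$")
print("Most efficient:", best.name)
```

With these invented figures, the slower node wins on both efficiency metrics even though the larger node has far higher absolute throughput, which is the shift in purchasing criteria the paragraph describes.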

The market is moving to a new era where backward compatibility is less important than before and innovation takes the front seat. Intel and AMD—the incumbent vendors—continue to innovate and advance their Xeon and Opteron designs, respectively. Integration, microarchitecture advances, and process technology are the primary factors when evolving these x86 processors. But new entrants are eyeing cloud-computing environments as an opening for radically different architectures and more-power-efficient CPU designs. With the merchant server-processor market exceeding $7 billion, success requires taking only a few percentage points of share from Intel.

Having reached practical power limits, server-processor designers are increasing performance primarily by adding cores rather than increasing clock speeds. Mainstream x86 processors now offer 16 cores per chip, while startup Tilera is already shipping 36-core processors. Mainstream server processors are currently using 32nm technology, although Intel is shipping its first processors using the 22nm node. By moving to a finer geometry process, vendors get more transistors in the same die area and power envelope. This additional transistor count can be used to add CPUs or to increase cache sizes. Larger caches increase performance by absorbing DRAM latency, which is not decreasing as rapidly as processor performance is growing.
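
The two effects described above can be sketched with simple arithmetic. The Python example below assumes idealized area scaling (transistor density grows with the square of the linear shrink, which real processes only approximate) and uses the standard average-memory-access-time formula. The function names, cache latencies, and miss rates are hypothetical values chosen for illustration.

```python
def ideal_density_gain(old_nm: float, new_nm: float) -> float:
    """Idealized transistor-density gain from a process shrink.

    Assumes both dimensions scale with the node name, which is a
    simplification: node names do not map exactly to physical dimensions.
    """
    return (old_nm / new_nm) ** 2


def average_access_time(hit_ns: float, miss_rate: float,
                        miss_penalty_ns: float) -> float:
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_ns + miss_rate * miss_penalty_ns


# Moving from 32nm to 22nm under the ideal-scaling assumption.
gain = ideal_density_gain(32, 22)
print(f"Ideal density gain, 32nm to 22nm: {gain:.2f}x")

# Hypothetical cache: 4 ns hit time, 80 ns DRAM penalty on a miss.
# A larger cache is modeled here simply as a lower miss rate.
small_cache = average_access_time(hit_ns=4.0, miss_rate=0.10, miss_penalty_ns=80.0)
large_cache = average_access_time(hit_ns=4.0, miss_rate=0.05, miss_penalty_ns=80.0)
print(f"Average access time: {small_cache:.1f} ns -> {large_cache:.1f} ns")
```

Under these assumptions, the shrink roughly doubles the transistor budget, and halving the miss rate removes a third of the average access time. In practice a larger cache may also have a longer hit time, which this sketch ignores.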

Following the earlier integration of memory controllers, processors are now integrating PCI Express (PCIe) controllers. This step eliminates one system-logic component, the “north bridge,” reducing the chipset to a single “south bridge” chip. For desktop and workstation applications, processors are also integrating the graphics controller. New entrants like Calxeda, Marvell, and Tilera offer system-on-a-chip (SoC) designs that integrate Ethernet controllers in addition to memory and PCIe controllers. For high-density server form factors, this additional level of integration provides differentiation.

Intel offers the broadest line of server processors, which are built using its industry-leading process technology. The company’s current two-socket design is the Xeon E5-2600 platform (Romley) based on the 32nm Sandy Bridge architecture. New to Intel’s Xeon lineup is a four-socket version of this platform, the E5-4600. This new platform offers a lower-cost alternative to the Westmere-based Xeon E7 line, which is designed for scale-up servers with four or more sockets (4P and above). By adding reliability, availability, and serviceability (RAS) features to the E7 line, Xeon processors now serve mission-critical designs that formerly required Itanium (IA-64) processors.

Using its tick-tock development approach, Intel shrank Sandy Bridge to produce the 22nm Ivy Bridge architecture. The first server processors to use Ivy Bridge comprise the Xeon E3-1200 v2 line, which serves uniprocessor designs. By the end of 2012, Intel plans to introduce a server processor based on its low-power Atom architecture. At six watts, this chip will compete with new entrants fielding ARM-based designs.

AMD’s Opteron line of server processors offers a good alternative to Intel for high-volume two- and four-socket platforms. With its 16-core Opteron 6200 (Interlagos), AMD continues to lead in core count for x86 processors. The 32nm Opteron 6200 processors are based on a novel microarchitecture called Bulldozer that improves density. The company also offers Opteron 4200 and 3200 processors based on this design. In 4Q12, AMD introduced the Opteron 6300 Series (Abu Dhabi), which uses an update to Bulldozer called Piledriver. AMD has also licensed ARM’s Cortex-A57 CPU and plans to introduce processors using this core in 2014.

New entrants fielding non-x86 server processors are targeting cloud computing, where compatibility requirements are more manageable compared with traditional enterprise applications. Most of these new vendors implement the ARM instruction set using either licensed or custom CPU designs. Marvell was first to market with a four-core ARM processor suitable for servers. Startup Calxeda added unique server features to its quad-core ARM processor. Other vendors are skipping 32-bit ARM designs and instead using the new 64-bit ARMv8 instruction set. AppliedMicro and Cavium are designing custom 64-bit CPUs for their respective server processors, whereas ARM is licensing its new Cortex-A57 (Atlas) core to vendors that prefer an off-the-shelf CPU design.

One vendor, Tilera, relies solely on open-source Linux distributions and open-source or customer-ported applications. The company is shipping 64-bit server processors with up to 36 proprietary CPUs connected using a unique on-chip mesh network. The startup’s chips deliver a several-fold improvement in performance per watt compared with x86 processors.

Table of Contents
About the Authors
About the Publisher
Preface
Executive Summary
1 Processor Technology
Processor Basics
Central Processing Unit (CPU)
Caches
MMUs and TLBs
Bus Bandwidth
CPU Microarchitecture
RISC Versus CISC
Endianness
Scalar and Superscalar
Instruction Reordering
Pipelining and Penalties
Branch Prediction
Server Processors and Technologies
What Is a Server Processor?
Multicore
Multithreading
System Buses
Memory Subsystem
PCI Express
Server Benchmarks
SPEC Benchmarks
TPC Benchmarks
VMmark
HPL
ApacheBench
2 Instruction Sets
x86 Instruction Set
Background
Initial Instruction Set
ISA Extensions
ARM Instruction Set
Background
Initial Instruction Set
ARMv7
ARMv8
3 Server System Technology
Basic Server Architecture
Main Memory
System-Logic Chipset
Baseboard-Management Controller
Storage
RAID
Storage Interfaces
High-Performance Computing
InfiniBand
RDMA Over Ethernet
MPI and OFED
Networking
Storage Networking
Form Factors
Operating Systems
Windows Server
Linux Server
Virtualization
Hypervisor Software
4 Technology and Market Trends
Technology Trends
x86 Versus ARM
SoC Integration
The Main-Memory Bottleneck
Microservers
Cloud-Computing Workloads
High-Performance Computing
Market Outlook
Cloud Computing
Open Compute
Market Forecast and Segmentation
Market Share
5 Intel
Company Background
Key Features and Performance
Sandy Bridge-Based Xeon Processors
Ivy Bridge-Based Xeon Processors
Westmere-Based Xeon Processors
Itanium Processors
Internal Architecture
System Design
Product Roadmap
Ivy Bridge
Haswell
Itanium
Atom-Based Processors: Centerton
Conclusions
6 AMD
Company Background
Key Features and Performance
Internal Architecture
System Design
Product Roadmap
Conclusions
7 Tilera
Company Background
Key Features and Performance
Internal Architecture
System Design
Development Tools
Product Roadmap
Conclusions
8 ARM
Company Background
Key Features and Performance
Internal Architecture
System-on-a-Chip Design
Development Tools
Product Roadmap
Conclusions
9 AppliedMicro
Company Background
Key Features and Performance
Design Details
Conclusions
10 Calxeda
Company Background
Key Features and Performance
Design Details
Product Roadmap
Conclusions
11 Cavium
Company Background
Key Features and Performance
Conclusions
12 Marvell
Company Background
Key Features and Performance
Internal Architecture
System Design
Development Tools
Product Roadmap
Conclusions
13 HPC Coprocessor Vendors
Intel Xeon Phi
Company Background
Key Features and Performance
Internal Architecture
Programming Model and Tools
Conclusions
Nvidia Tesla
Company Background
Key Features and Performance
Design Details
Product Roadmap
Conclusions
14 Processor Comparisons
Microserver Processors
Performance
Integration
Uniprocessor Platforms
Performance
Integration
Two-Socket Platforms
Performance
Integration
Four-Socket Platforms
Performance
Integration
Conclusions
15 Conclusions
Vendor Outlook
Intel
AMD
Tilera
ARM
Calxeda
Marvell
AppliedMicro and Cavium
HPC Coprocessors
Closing Thoughts
Appendix: Further Reading
Index

List of Figures
Figure 1‑1. Basic processor design.
Figure 1‑2. Simple superscalar processor design.
Figure 1‑3. CPU pipelining examples.
Figure 1‑4. Block diagram of a typical server processor.
Figure 1‑5. Interleaved tasks on a multithreaded CPU.
Figure 3‑1. Typical server architecture.
Figure 3‑2. Rack-mount servers and a standard-size rack.
Figure 3‑3. IBM's BladeCenter H.
Figure 3‑4. Typical blade-server architecture.
Figure 4‑1. Dell's PowerEdge C5000 12-bay microserver.
Figure 4‑2. Server-processor shipment forecast and segmentation.
Figure 4‑3. Server-processor unit share, 2009-2011.
Figure 5‑1. Intel server-processor roadmap.
Figure 5‑2. Block diagram of Intel Sandy Bridge microarchitecture.
Figure 5‑3. Block diagram of Intel Xeon E5-2600.
Figure 5‑4. Server design based on Intel Xeon E3-1200 v2.
Figure 5‑5. Dual-socket server design based on Intel Xeon E5-2600.
Figure 5‑6. Four-socket server design based on Intel Xeon E7.
Figure 6‑1. Diagram of Bulldozer CPU module.
Figure 6‑2. Block diagram of Bulldozer microarchitecture.
Figure 6‑3. Opteron 4200 system design.
Figure 6‑4. Opteron 6200 system design.
Figure 7‑1. Block diagram of Tilera Tile-Gx3036.
Figure 7‑2. Tilera Tile-Gx3036 server design.
Figure 8‑1. Block diagram of Cortex-A15 microarchitecture.
Figure 8‑2. Block diagram of ARM Cortex-A15 in an SoC.
Figure 9‑1. Block diagram of X-Gene CPU.
Figure 9‑2. Block diagram of X-Gene SoC.
Figure 10‑1. Block diagram of Calxeda ECX-1000.
Figure 10‑2. Block diagram of Calxeda four-node system.
Figure 11‑1. Conceptual block diagram of Cavium Thunder.
Figure 12‑1. Microarchitecture of Marvell PJ4B.
Figure 12‑2. Block diagram of Armada XP.
Figure 13‑1. Xeon Phi coprocessor card.
Figure 13‑2. Microarchitecture of Intel Xeon Phi core.
Figure 13‑3. Simplified block diagram of Xeon Phi coprocessor.
Figure 13‑4. Tesla GK110 SMX array.

List of Tables
Table 1‑1. Selected SPEC benchmarks.
Table 5‑1. Product lines and selected versions of Intel Xeon processors.
Table 5‑2. Key parameters for selected Intel Xeon E5-series processors.
Table 5‑3. Key parameters for selected Intel Itanium 9300-series processors.
Table 6‑1. Key parameters for selected Opteron processors.
Table 6‑2. Key parameters for Bulldozer-based Opteron processors.
Table 6‑3. Key parameters for AMD SR56x0 north-bridge chips.
Table 6‑4. Key parameters for AMD SP5100 south-bridge chip.
Table 7‑1. Key parameters for Tilera TilePro64 and Tile-Gx3036 processors.
Table 8‑1. Key parameters for ARM Cortex-A9 and Cortex-A15.
Table 10‑1. Key parameters for Calxeda ECX-1000 processor.
Table 12‑1. Key features for Armada XP processors.
Table 13‑1. Key parameters for Xeon Phi coprocessor cards.
Table 13‑2. Key parameters for Nvidia Tesla coprocessor cards.
Table 14‑1. Comparison of microserver processors.
Table 14‑2. Comparison of high-performance single-socket processors.
Table 14‑3. Comparison of processors for dual-socket servers.
Table 14‑4. Comparison of processors for four-socket servers.
