Oracle Extends SPARC Commitment
September 19, 2011
Author: Bob Wheeler
Squashing any lingering concerns about its commitment to the SPARC architecture, Oracle has begun beta testing servers using its next-generation Sparc T4 processors. At Hot Chips, the company disclosed details of the new CPU core used in this processor as well as future Oracle processors. Formerly code-named Yosemite Falls, Sparc T4 is the successor to Sun’s Rainbow Falls, which Oracle branded Sparc T3. Reversing Sun’s prior direction of increasing the number of threads per processor, Sparc T4 instead prioritizes greater single-thread performance over throughput. The result is a processor with eight CPUs operating at 3.0GHz compared with the T3’s 16 CPUs operating at a lowly 1.65GHz.
Dubbed the S3, Oracle’s new CPU core is far more sophisticated than the relatively simple T1/T2/T3 CPUs from the Sun Niagara lineage. Designed from scratch, the S3 is a dual-issue out-of-order design with extensive branch prediction. Like the T3 CPU, the S3 supports eight threads, yielding 64 threads in the eight-core T4 implementation. Like Sparc T3, Sparc T4 includes two memory controllers, two PCI Express v2.0 ports, and two 10G Ethernet ports, as well as coherent-bus interfaces to support system designs with up to four sockets (4P).
Oracle’s internal OLTP benchmark results show Sparc T4 delivers about the same transaction-processing throughput as Sparc T3 despite using half as many CPUs and threads. Single-thread performance, however, improves dramatically: the T4’s integer performance is about 5× that of the T3 as measured by SPECint2006, and its SPECfp2006 floating-point score is about 7× that of the T3. Unfortunately, Oracle disclosed only relative performance rather than actual scores, making it impossible to compare Sparc T4 with other vendors’ server processors. In any case, the company appears to have preserved the throughput performance of its processors while making a major improvement in what had been poor single-thread performance.
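The thread-count arithmetic behind these comparisons can be sketched in a few lines of Python. The core counts, threads per core, and clock speeds come from the figures above; the `Chip` class and the choice to treat "about the same" OLTP throughput as exactly equal are illustrative assumptions, not Oracle data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Hypothetical container for the per-chip figures quoted in the article."""
    name: str
    cores: int
    threads_per_core: int
    clock_ghz: float

    @property
    def threads(self) -> int:
        # Total hardware threads per chip.
        return self.cores * self.threads_per_core


t3 = Chip("Sparc T3", cores=16, threads_per_core=8, clock_ghz=1.65)
t4 = Chip("Sparc T4", cores=8, threads_per_core=8, clock_ghz=3.0)

# Assumption: "about the same" OLTP throughput is treated as a ratio of 1.0.
relative_oltp_throughput = 1.0

# Equal chip throughput on half the threads implies each T4 thread
# does roughly twice the OLTP work of a T3 thread.
per_thread_gain = relative_oltp_throughput * t3.threads / t4.threads

# Clock-speed ratio alone, for comparison with the reported speedups.
clock_ratio = t4.clock_ghz / t3.clock_ghz

print(f"{t3.name}: {t3.threads} threads, {t4.name}: {t4.threads} threads")
print(f"Implied per-thread OLTP gain: {per_thread_gain:.1f}x")
print(f"Clock-speed ratio: {clock_ratio:.2f}x")
```

Under these assumptions, the clock-speed ratio is well below the reported 5× single-thread integer gain, which suggests that most of the improvement comes from the new core design rather than frequency alone.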
The new Sparc T4 design illustrates the ongoing struggle of processor designers to balance single-thread performance against throughput on highly threaded code. Not all workloads scale with cores and threads, and response times can suffer if individual threads run too slowly. For Oracle, the T4 is a relatively low-risk implementation that should yield more application-agnostic server designs. It also sets the stage for the 16-core Sparc T5 design, which is already under development in 28nm.