Cerebras Breaks the Reticle Barrier

September 3, 2019

Author: Mike Demler

Cerebras’s new neural-network processor is by far the largest integrated circuit ever manufactured. Working with TSMC, the startup developed technology that lets it integrate 400,000 neural-network processing elements (PEs) comprising more than 1.2 trillion 16nm transistors on one silicon substrate. It calls the design a wafer-scale engine (WSE), owing to its use of almost an entire 300mm (12-inch) wafer to fabricate 84 connected die. Normally, scribe lanes isolate each die from its neighbors, but the Cerebras technology enables metal deposition across those barriers, allowing connection of the entire array through a massive 2D mesh network called Swarm. The WSE measures 215mm on each side (46,225mm²), the largest square that can be cut from a 300mm wafer.
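The headline figures above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: the constants come from the numbers quoted in this article, while the function names are hypothetical and the even split of PEs and area across the 84 die is an assumption that Cerebras hasn't confirmed here. It also computes the side of a square strictly inscribed in a 300mm circle, which comes out to about 212mm; the quoted 215mm is slightly larger, so that figure is best read as approximate (the article doesn't say how the corners are handled).

```python
import math

# Figures quoted in the article
WAFER_DIAMETER_MM = 300.0
WSE_SIDE_MM = 215.0
NUM_PES = 400_000
NUM_DIE = 84
NUM_TRANSISTORS = 1.2e12  # "more than 1.2 trillion"; treated as a lower bound


def wse_area_mm2(side_mm: float = WSE_SIDE_MM) -> float:
    """Area of a square device with the given side length."""
    return side_mm * side_mm


def inscribed_square_side_mm(diameter_mm: float = WAFER_DIAMETER_MM) -> float:
    """Side of the largest square that fits strictly inside a circle.

    The square's diagonal equals the circle's diameter, so side = d / sqrt(2).
    """
    return diameter_mm / math.sqrt(2)


def average_per_die(total: float, num_die: int = NUM_DIE) -> float:
    """Average share of a wafer-wide total per die (assumes an even split)."""
    return total / num_die


if __name__ == "__main__":
    print(f"WSE area:               {wse_area_mm2():,.0f} mm^2")
    print(f"Inscribed-square side:  {inscribed_square_side_mm():.1f} mm")
    print(f"Average area per die:   {average_per_die(wse_area_mm2()):.1f} mm^2")
    print(f"Average PEs per die:    {average_per_die(NUM_PES):.0f}")
    print(f"Transistors per PE:     {NUM_TRANSISTORS / NUM_PES:,.0f}")
```

Under these assumptions, each die averages roughly 550mm² and about 4,760 PEs, and each PE accounts for about 3 million transistors, a figure that also includes its share of SRAM and mesh routing.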

The device targets neural-network training in data centers, where it seeks to compete with powerful Nvidia GPUs. It distributes 18GB of SRAM to the PEs, eliminating external DRAM. The mesh fabric delivers 10¹⁷ bits per second (100 petabits per second, or 100Pb/s) of memory bandwidth, enabling the WSE to replace a high-performance compute cluster. A host processor must manage operations, but by combining its high-bandwidth interconnect and massive compute parallelism, a single WSE can train multiple instances of the largest neural networks in common use.
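To put the memory figures in per-PE terms, the sketch below divides the quoted totals evenly across the 400,000 PEs. It's a back-of-the-envelope estimate, not a description of the actual architecture: it assumes decimal gigabytes (18 × 10⁹ bytes), an even distribution of SRAM and bandwidth, and hypothetical function names.

```python
# Figures quoted in the article
NUM_PES = 400_000
TOTAL_SRAM_BYTES = 18 * 10**9         # 18GB, assuming decimal gigabytes
TOTAL_BANDWIDTH_BITS_PER_S = 10**17   # 100 petabits per second


def sram_per_pe_bytes() -> float:
    """Average SRAM per PE, assuming an even split across all PEs."""
    return TOTAL_SRAM_BYTES / NUM_PES


def bandwidth_per_pe_bits_per_s() -> float:
    """Average memory bandwidth per PE, assuming an even split."""
    return TOTAL_BANDWIDTH_BITS_PER_S / NUM_PES


def full_sram_sweep_seconds() -> float:
    """Idealized time to move every bit of SRAM once at the aggregate rate."""
    return (TOTAL_SRAM_BYTES * 8) / TOTAL_BANDWIDTH_BITS_PER_S


if __name__ == "__main__":
    print(f"SRAM per PE:      {sram_per_pe_bytes() / 1e3:.0f} KB")
    print(f"Bandwidth per PE: {bandwidth_per_pe_bits_per_s() / 1e9:.0f} Gb/s")
    print(f"Aggregate:        {TOTAL_BANDWIDTH_BITS_PER_S / 8 / 1e15:.1f} PB/s")
    print(f"Full SRAM sweep:  {full_sram_sweep_seconds() * 1e6:.2f} microseconds")
```

On these assumptions, each PE holds about 45KB of local SRAM and sees roughly 250Gb/s of bandwidth, and the aggregate rate could in principle sweep the entire 18GB in about 1.4 microseconds. If the 18GB figure is binary rather than decimal, the per-PE amount is closer to 48,000 bytes.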

Because of the complex system design, the company won’t offer the WSE as a separate component. Instead, it’ll deliver a 15U rack-mounted cabinet, with one rack holding the device and the remainder housing custom power and cooling components. Several potential customers are evaluating the system, but no production schedule is yet public.

