
Linley Newsletter

Arm Joins Bfloat Parade

October 15, 2019

Author: Linley Gwennap

Arm will support the Bfloat16 (BF16) data format for neural-network training, joining Google and Intel. In a recent blog post, the company said it has added instructions to its Armv8.6-A instruction-set architecture (ISA) to handle BF16, which doubles throughput for the matrix-multiplication operations at the heart of most AI models. The first processor with the new instructions will be the Neoverse N2 “Zeus” CPU, which is due next year. We also expect Arm to eventually add BF16 to its GPUs and deep-learning accelerators (DLAs).

Google’s TPUs, which support BF16, are widely used through its cloud services, but the company doesn’t sell chips. Intel added BF16 capability to its newest server processors, and it plans to add the format to future products, including its Nervana AI accelerators (Spring Crest and Spring Hill) and next-generation Agilex FPGAs. Most training-accelerator startups, including Cerebras and Graphcore, don’t support the format. Nvidia GPUs are by far the most popular for training, and the company hasn’t endorsed the new format.

Rather than extend its existing Neon instructions to operate on BF16 values, Arm created a few new instructions that specifically target neural networks. For example, BFDOT and BFMMLA respectively multiply 1x2 and 2x2 matrices of BF16 values preloaded into Neon registers. The output values employ FP32 format to avoid overflow, but they’re easily converted (truncated) to BF16 using BFCVT. The BFMLAL instruction multiplies pairs of adjacent BF16 values and adds the products to an FP32 accumulator. Arm is also planning similar instructions for CPUs that implement its long-awaited Scalable Vector Extension (SVE), but not for Cortex-M, whose dot-product instructions focus on inference tasks.
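
To make the data flow concrete, the following Python sketch models the two ideas described above: converting FP32 to BF16 by truncation, and multiplying a pair of BF16 values while accumulating in FP32. It is an illustration only, not Arm's implementation. The function names are hypothetical, the model ignores NaN handling and any rounding modes the real instructions may apply, and it assumes BF16 occupies the upper 16 bits of the FP32 encoding (1 sign bit, 8 exponent bits, 7 mantissa bits).

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Truncate an FP32 value to BF16 by keeping its upper 16 bits.

    Assumption: plain truncation, as the text describes; real hardware
    may round instead.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bf16_bits_to_fp32(b: int) -> float:
    """Widen a BF16 bit pattern to FP32 by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


def round_to_fp32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    (y,) = struct.unpack("<f", struct.pack("<f", x))
    return y


def pair_dot_accumulate(acc: float, a_pair, b_pair) -> float:
    """Hypothetical model of a two-element BF16 dot product.

    Each input pair holds BF16 bit patterns; the products are added to
    an FP32 accumulator, so the result keeps FP32 range and precision.
    """
    total = round_to_fp32(acc)
    for a, b in zip(a_pair, b_pair):
        product = bf16_bits_to_fp32(a) * bf16_bits_to_fp32(b)
        total = round_to_fp32(total + product)
    return total


if __name__ == "__main__":
    a = [fp32_to_bf16_bits(v) for v in (1.0, 2.0)]
    b = [fp32_to_bf16_bits(v) for v in (3.0, 0.5)]
    print(pair_dot_accumulate(10.0, a, b))
    print(bf16_bits_to_fp32(fp32_to_bf16_bits(3.14159)))
```

Because a BF16 value is half the width of an FP32 value, a fixed-width vector register holds twice as many of them, while the FP32 accumulator preserves precision across many additions. Truncation costs mantissa bits but keeps the full FP32 exponent range.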

Subscribers can view the full article in the Microprocessor Report.
