Cadence Q6 DSP Merges AI and Vision

April 24, 2018

Author: Mike Demler

Although the trend in AI processors is to offload CPUs, DSPs, and GPUs by using special-purpose neural-network hardware, Cadence tenaciously holds to the fully programmable approach. The company’s new Vision Q6 DSP core builds on previous generations by implementing an ISA that combines computer-vision, image-processing, and neural-network functions.

The new Q6 is backward compatible with the Vision P6 and P5. It integrates the same 256-MAC complement as the P6, but a modified microarchitecture delivers, on average, 50% more performance across a variety of imaging kernels. Most of that gain comes from a longer 10/13-stage pipeline, which supports a 1.5GHz maximum clock frequency compared with 1.1GHz for the 10-stage P6 in the same 16nm process. The base scalar pipeline is 10 stages, but after the fifth-stage instruction decoder, vector operations dispatch to a separate 8-stage execution pipeline, for a total vector depth of 13 stages.
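The pipeline and clock figures above can be checked with some quick arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the split of the 50% gain assumes that kernel performance scales linearly with clock frequency and that the remaining gain multiplies on top of it. The article itself makes neither assumption.

```python
# Figures quoted in the article.
P6_CLOCK_GHZ = 1.1          # maximum clock of the 10-stage P6 in 16nm
Q6_CLOCK_GHZ = 1.5          # maximum clock of the Q6 in the same process
AVERAGE_SPEEDUP = 1.5       # "50% more" performance on imaging kernels
DECODE_STAGE = 5            # instruction decoder is the fifth stage
VECTOR_EXEC_STAGES = 8      # separate vector execution pipeline


def vector_pipeline_depth(decode_stage: int, exec_stages: int) -> int:
    """Stages a vector operation traverses: front end through decode,
    then the dedicated vector execution pipeline."""
    return decode_stage + exec_stages


def clock_speedup(new_ghz: float, old_ghz: float) -> float:
    """Speedup attributable to frequency alone (assumes linear scaling)."""
    return new_ghz / old_ghz


def residual_speedup(total: float, from_clock: float) -> float:
    """Gain left over for ISA and microarchitecture changes, assuming
    the two factors combine multiplicatively."""
    return total / from_clock


if __name__ == "__main__":
    depth = vector_pipeline_depth(DECODE_STAGE, VECTOR_EXEC_STAGES)
    clock = clock_speedup(Q6_CLOCK_GHZ, P6_CLOCK_GHZ)
    rest = residual_speedup(AVERAGE_SPEEDUP, clock)
    print(f"vector pipeline depth: {depth} stages")
    print(f"speedup from clock:    {clock:.2f}x")
    print(f"remaining speedup:     {rest:.2f}x")
```

Under these assumptions, the higher clock accounts for roughly 1.36x of the 1.5x average gain, leaving about 1.1x for the other changes, which is consistent with the statement that most of the gain comes from the longer pipeline.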

The Q6 ISA improves AI, computer-vision, and image-processing operations, adding new pack/squeeze functions, enhanced multiply and subtract operations, and modifications that double histogram performance relative to the P6. It adds more gather registers for handling pixel data, along with slotting enhancements for better VLIW scheduling. Despite the design upgrades, Cadence estimates the Q6 can deliver its performance boost in the same die area as its predecessor, although the company withheld the actual area figure.

For high-performance DNN acceleration, designers will probably choose a special-purpose accelerator. But unlike more-specialized cores, the Q6 supports programs written in a variety of coding frameworks, and it can run a broad range of neural-network models. It balances programmability and DNN acceleration, appealing to designers who require the flexibility to run both image-processing and computer-vision routines.

