Synopsys has announced that it has enhanced the convolutional neural network (CNN) engine in its DesignWare EV6x Vision Processors to address the increasing video resolution and frame rate requirements of high-performance embedded vision applications. The CNN engine delivers up to 4.5 TeraMACs per second (TMAC/s) when implemented in 16-nanometer (nm) FinFET process technologies under typical conditions, four times the performance of Synopsys' previous-generation CNN engine.
It also supports both coefficient and feature map compression/decompression to reduce data bandwidth requirements and decrease power consumption. The vision CPU scales from one to four vector DSPs and operates in parallel with the CNN engine, delivering maximum throughput for a broad range of high-performance embedded vision applications such as advanced driver assistance systems (ADAS), video surveillance, augmented and virtual reality, and simultaneous localisation and mapping (SLAM).
"The technological demands at the heart of embedded vision applications, particularly in the neural network, require solutions that combine high precision and performance with low power and area," said Toshi Torihara, vice president at Morpho US, Inc.
"The unique combination of the vector DSPs and programmable CNN engine in the DesignWare EV6x Vision Processor enables developers to implement vision functionality in their embedded devices with much higher performance efficiency than CPU- and GPU-based alternatives."
The DesignWare EV6x Processor family integrates scalar, vector DSP and CNN processing units for highly accurate and fast vision processing. The EV6x supports any convolutional neural network, including popular networks such as AlexNet, VGG16, GoogLeNet, YOLO, Faster R-CNN, SqueezeNet and ResNet.
Designers can run CNN graphs originally trained for 32-bit floating point hardware on the EV6x's 12-bit CNN engine, significantly reducing the power and area of their designs while maintaining the same levels of detection accuracy.
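Mapping 32-bit floating-point weights onto a narrower fixed-point engine is, in essence, a quantization step. The announcement does not describe the EV6x tooling's actual scheme, so the sketch below shows only a generic symmetric fixed-point quantization to 12 bits, with hypothetical helper names, to illustrate why detection accuracy can be largely preserved:

```python
import numpy as np

def quantize_fixed_point(weights, bits=12):
    """Generic symmetric quantization of float32 weights to signed
    fixed-point values of the given bit width (a sketch, not the
    EV6x tool's actual algorithm)."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 2047 for 12 bits
    scale = np.max(np.abs(weights)) / qmax           # map the largest weight to qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int16)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values to check quantization error."""
    return q.astype(np.float32) * scale

w = np.array([0.75, -0.2, 0.01], dtype=np.float32)
q, scale = quantize_fixed_point(w)
w_hat = dequantize(q, scale)
# With 12 bits the worst-case error per weight is about scale/2,
# which is typically far below the noise floor of a trained network.
```

The same arithmetic with `bits=8` shows why 8-bit graphs (mentioned below) trade a coarser step size for lower memory bandwidth.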
The engine delivers power efficiency of up to 2,000 GMACs/sec/W when implemented in 16-nm FinFET process technologies (worst-case conditions). The EV6x's CNN hardware also supports neural networks trained for 8-bit precision to take advantage of the lower memory bandwidth and power requirements of these graph types.
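As a back-of-the-envelope check on the units, throughput divided by power efficiency yields power draw. Note that the quoted 4.5 TMAC/s (typical conditions) and 2,000 GMACs/sec/W (worst-case conditions) are characterised at different process corners, so dividing one by the other mixes conditions; the figure below is illustrative arithmetic only:

```python
# Unit arithmetic: GMAC/s divided by GMAC/s/W gives watts.
throughput_gmacs = 4500.0           # 4.5 TMAC/s expressed in GMAC/s
efficiency_gmacs_per_w = 2000.0     # announced power efficiency

estimated_power_w = throughput_gmacs / efficiency_gmacs_per_w
print(f"Illustrative power at full throughput: {estimated_power_w:.2f} W")
# → 2.25 W (subject to the mixed-corner caveat above)
```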
To simplify software application development, the EV6x processors are supported by a comprehensive suite of tools and software. The latest release of the DesignWare ARC MetaWare EV Development Toolkit includes a CNN mapping tool that analyses neural networks trained using popular frameworks like Caffe and TensorFlow, and automatically generates the executable for the programmable CNN engine.
For maximum flexibility and future-proofing, the tool can also distribute computations between the vision CPU and CNN resources to support new and emerging neural network algorithms as well as customer-specific CNN layers.
Combined with software development tools based on OpenVX, OpenCV and OpenCL C embedded vision standards, the MetaWare EV Development Toolkit offers a full suite of tools needed to accelerate embedded software development.
"As high-performance neural networks become more prevalent in artificial intelligence applications, designers require both the hardware technology and software tools to deliver their vision-enabled SoCs on schedule," said John Koeter, vice president of marketing for IP at Synopsys.
"With the performance and feature enhancements to the silicon-proven EV6x Vision Processors, designers can more efficiently design and deploy machine learning-based applications with the performance and power efficiency necessary to differentiate in their markets."
The DesignWare EV61, EV62 and EV64 processors with enhanced optional CNN engine are scheduled to be available in August 2017. The MetaWare EV Development Toolkit is available now. Support for the TensorFlow framework in the Toolkit's CNN mapping tool is scheduled to be available in October 2017.