AI acceleration with the Xilinx Versal platform
Xilinx showed a range of new products aimed at AI applications at an event in Frankfurt, Germany. Speakers at the event included Ivo Bolsens, Xilinx CTO, and Kirk Saban, Xilinx VP of Product and Technical Marketing.
Xilinx’s new products are being driven by a set of new applications that are characterised by the amount of data associated with them. Data is now a dominating factor in how we use technology. Data can be structured in different ways, and future technologies will focus on architectural innovations to handle it.
Xilinx analysed the needs of modern developers: they need to be highly flexible in their developments, and they need adaptable hardware, performance across a diverse range of applications, software programmability, and the adaptability to keep pace with rapid innovation. This brings new challenges for both the hardware and the software.
Following the development from FPGAs to RFSoCs, Xilinx announced the ACAP (Adaptive Compute Acceleration Platform) family. The designer is offered flexibility on both the software and the hardware side. Xilinx provides not only the silicon but the whole platform: hardware, software, and libraries, so that the developer can accelerate compute workloads. The architecture brings together different styles of programming. One platform contains scalar engines (traditional CPU architectures and real-time processors) that run control-oriented, decision-making and complex algorithms. They are combined with a set of adaptable engines and programmable logic.
Together, the new products form the Versal platform. It combines the three types of engine (scalar, adaptable and intelligent) in a parallel architecture, which makes the product general-purpose enough to address a wide set of problems. The architecture can be applied to a range of power-sensitive and high-performance infrastructure applications. It combines the ease of use of software programmability with hardware adaptability. The ACAP products are heterogeneous, scalable and software programmable.
The NoC (Network-on-Chip) contributes to this ease of use: it is software programmable and available at boot, with no place-and-route required. The NoC delivers high bandwidth and low latency, with multi-terabit/s throughput and guaranteed QoS. Its power efficiency is eight times better than that of soft implementations. This adaptability across heterogeneous engines is essential.
Another technological breakthrough is the AI engines, an array of hundreds of highly efficient processors that will allow the most challenging machine learning applications to be realised. They can be programmed in C++ and are tightly integrated, delivering high throughput, low latency and power efficiency. They are well suited to AI inference and advanced signal processing. Within Versal, the NoC connects the AI engines to the rest of the device.
To go with the new architecture, the software has been developed to provide a highly adaptable environment for the flexible hardware. Alongside the relevant libraries in different domains, Xilinx offers more high-level programming frameworks in the domain of machine learning and other applications. Programming is possible in C, C++ and Python.
Versal is a high-performance, scalable platform that improves the capabilities of data centres. It also targets tasks at the edge, where it offers powerful, low-cost computation suited to the constraints of edge applications. The NoC supports these edge applications. Versal lets any developer combine the heterogeneous platform (scalar engines, adaptable engines, intelligent engines, operating systems, embedded run-time and custom hardware) with programmable software through the new unified software development system. The platform adapts to different workloads and supports application-specific frameworks for machine learning, search, databases and so on. Versal is a multi-market platform, applicable to data centres, networking and the edge.
The big trend now is the projected growth in AI inference. There is a lot going on in the field of machine learning, and the next stage will be the deployment of AI applications. The challenges are the rate of innovation (new algorithms, neural networks and so on), performance at low latency, low power consumption and acceleration. The rate of innovation has resulted in a wide range of neural network types: CNN, RNN, LSTM, MLP. These networks are changing very quickly, so the hardware must be very adaptable. Latency becomes a very important factor for applications such as machine learning and the edge.
The requirements for adaptable hardware are custom data flow, custom memory hierarchy and custom precision. This leads to domain-specific architectures (DSAs) on adaptable platforms. The Xilinx solutions are characterised by low latency, which makes it possible to take advantage of the AI engines.
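As a rough illustration of what "custom precision" can mean in practice, the sketch below quantises floating-point values to 8-bit integers, computes a dot product in integer arithmetic, and rescales the result. It is generic Python, not Xilinx code: the function names, the sample values and the symmetric 8-bit scheme are all assumptions chosen for illustration.

```python
def quantize(values, bits=8):
    """Map floats to signed integers of the given width (symmetric scheme)."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0  # fall back to 1.0 for all-zero input
    return [round(v / scale) for v in values], scale


def quantized_dot(a, b, bits=8):
    """Dot product accumulated on integers, rescaled to a float at the end."""
    qa, sa = quantize(a, bits)
    qb, sb = quantize(b, bits)
    acc = sum(x * y for x, y in zip(qa, qb))  # integer multiply-accumulate
    return acc * sa * sb


# Hypothetical sample data, not taken from any real network.
weights = [0.12, -0.5, 0.33, 0.9]
inputs = [1.0, 0.25, -0.75, 0.5]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = quantized_dot(weights, inputs)
print(f"float: {exact:.4f}  int8: {approx:.4f}")
```

The integer result stays close to the floating-point one while using far narrower arithmetic, which is the kind of trade-off that adaptable hardware lets a designer tune per application.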
Versal is suitable for a range of users, from hardware developers and system integrators to data scientists and application developers. Xilinx also offers the Alveo data centre accelerator cards, targeted at the needs of these developers.
A detailed explanation of Versal, the Adaptive Compute Acceleration Platform (ACAP), was delivered by Kirk Saban, VP of Product and Technical Marketing.
The Versal platform has multiple types of engine, designed to serve any type of application. Development of the product is based on a partnership with TSMC. Versal includes the next-generation Arm Cortex-A72 application processor, the Arm Cortex-R5 real-time processor (for functional safety and security applications) and the new platform management controller (PMC), which configures the device without any need for hardware control. The PMC allows users to manage the device through software, and enables the whole ecosystem to run without an RTL flow in the device.
Another advantage of the Versal range is its adaptable hardware engines. Here the foundational hardware has been re-architected for greater compute density; it enables a custom memory hierarchy and offers eight times faster dynamic reconfiguration, which suits the variety of AI applications now emerging.
The Versal range uses two types of intelligent engine:
- DSP engines, which enable high-precision floating point with low latency, and granular control for customised datapaths.
- AI engines, designed for high throughput, low latency and power efficiency. These are well suited to AI inference and advanced signal processing. The key point is that they form a tightly coupled array of vector processors, with local memory in each vector core. This brings very high compute capability to the application.
The AI engines add inference capability to the adaptable hardware, while the platform as a whole remains software programmable and hardware adaptable.
Xilinx has also designed a variety of interface technologies that are at the cutting edge of the semiconductor industry. The host interfaces include PCIe Gen4 x16, integrated AXI-DMA, and CCIX for acceleration attached to server-class CPUs. On the memory side, Versal supports the latest memory technologies: DDR4-3200, LPDDR4-4266 and high bandwidth memory (HBM). The collaboration with TSMC allows Xilinx to deliver a single-chip solution with integrated, stacked HBM in a single package. The Versal platform contains integrated protocol engines supporting 100G multi-rate Ethernet and 600G Ethernet, and 600G cryptographic engines (AES/IPSEC/MACSEC) for security use cases.
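For a sense of scale, the peak theoretical bandwidth of two of the interfaces named here can be worked out from their nominal transfer rates. The sketch below assumes a 64-bit DDR4 channel and the standard PCIe Gen4 figures (16 GT/s per lane with 128b/130b line encoding). The channel width and the helper names are assumptions for illustration, not Xilinx specifications, and real-world throughput will be lower than these peaks.

```python
def ddr_peak_gb_per_s(mega_transfers_per_s, bus_width_bits=64):
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * 1e6 * (bus_width_bits / 8) / 1e9


def pcie_peak_gb_per_s(giga_transfers_per_s, lanes, payload_bits=128, line_bits=130):
    """Peak one-direction bandwidth in GB/s after line-encoding overhead."""
    usable_gbit = giga_transfers_per_s * lanes * payload_bits / line_bits
    return usable_gbit / 8


ddr4_3200 = ddr_peak_gb_per_s(3200)         # assumed 64-bit channel
pcie_gen4_x16 = pcie_peak_gb_per_s(16, 16)  # 16 GT/s per lane, 16 lanes

print(f"DDR4-3200, 64-bit channel: {ddr4_3200:.1f} GB/s")
print(f"PCIe Gen4 x16, per direction: {pcie_gen4_x16:.1f} GB/s")
```

Under these assumptions a single DDR4-3200 channel peaks at about 25.6 GB/s and a PCIe Gen4 x16 link at roughly 31.5 GB/s in each direction, which helps explain why stacked HBM is attractive for bandwidth-hungry workloads.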
On the transceiver side, Versal products support a cutting-edge 32G power-optimised transceiver for edge applications, 58G PAM4 technology, and 112G PAM4 in the high-end products of the Versal platform.
Versal supports an integrated direct RF signal chain, including the next-generation multi-GSPS direct RF-ADC/DAC, integrated DDC/DUC and SD-FEC for 5G and DOCSIS. Versal also includes programmable I/O interfaces, including MIPI D-PHY >32 Gb/s for sensors, and supports NAND and storage-class memory as well as traditional LVDS and general-purpose I/O.
Versal is built around the Network-on-Chip concept, which provides high-speed connections between the different heterogeneous processing engines. The NoC is software programmable and available at boot, and offers high bandwidth and low latency, multi-terabit/s throughput, guaranteed QoS and significant power efficiency.
Versal is designed as a multi-market platform, supporting AI adoption across the edge, network and cloud. To serve these markets, Versal is offered as a set of product series that share the same fundamental building blocks. Three series include AI engines: AI Core, with a broad range of AI engines; AI Edge, with smaller AI engines; and AI RF. Versal is also available in the Versal Prime, Versal Premium and Versal HBM series. The AI Core and Versal Prime series will be available in Q2 2019. The other series will follow the next year.
The Prime series is designed for a broad range of applications. It has DSP engines instead of AI engines, together with the adaptable hardware and scalar processors, and is well suited to connectivity applications, inline acceleration and diverse workloads. Other applications for the Versal Prime series include communication test equipment, data centre network and storage acceleration, Nx100G Ethernet and OTN networking, broadcast switching, imaging, and avionics control.
The AI Core series delivers breakthrough AI inference throughput and is optimised for cloud, automotive and networking applications. It covers a very wide range in how much inference and workload acceleration it can provide, and has a large number of AI engines across the series.
Finally, the Versal platform is built on the ACAP, which is designed to be software programmable and hardware adaptable and to provide application acceleration. This makes it suitable for heterogeneous acceleration, for any application and any developer (hardware, software, data scientist and others).