Making robots more efficient with adaptive computing
As roboticists encounter limitations imposed by traditional processor architectures, customisation and parallelism are needed to meet forthcoming performance, security, and safety challenges. Victor Mayoral-Vilches, former Systems Architect at Xilinx, AMD Consultant, and Founder of Acceleration Robotics (previously Robotics System Architect, Adaptive & Embedded Computing Group, AMD), discusses.
Software developers targeting robotics applications face a growing struggle to meet performance requirements, guarantee real-time determinism, and ensure adequate safety and security. Increasingly, the general-purpose nature of the scalar (CPU) processor architectures at the heart of the machine, together with limits on performance scaling, presents a barrier to meeting the diverse requirements placed on today’s industrial robots. Common problems include timing inefficiencies that undermine determinism, excessive power consumption, and security issues. A further challenge to security is that the hardware cannot be reconfigured to update protection against evolving cyber-threats.
A new generation of computing platforms, better suited to the demands of robotics, is now emerging. These modules comprise heterogeneous processing elements that allow roboticists to build flexible compute architectures. This article assesses their make-up by examining the various compute resources that are available to roboticists, including CPUs, DSPs, GPUs, FPGAs, and ASICs. Each has specific strengths and therefore a continued role as the evolution of robotics technology progresses.
Compute technologies for robotics applications
Scalar processors (CPUs)
CPUs, as scalar processing elements, can handle complex algorithms with diverse decision trees and a broad set of libraries in an efficient manner. Although CPUs are highly flexible, and multi-core processors can handle different tasks running simultaneously without distractions or coordination problems, their underlying hardware is fixed. Most CPUs still operate on the stored-program computer principle, where data is brought to the processor from memory, operated on, and then written back to memory. The focal point of the architecture is the arithmetic logic unit (ALU), which requires data to be moved in and out for every operation. Fundamentally, each CPU operates in a sequential fashion, one instruction at a time, and many steps are typically needed to complete a task. Despite these drawbacks, scalar CPUs have a fundamental role in modern robot architectures. They are well suited to coordinating information flows across the various subsystems and components for sensing, actuation, and cognition.
Vector processors (DSPs, GPUs)
Vector processing elements (DSPs, GPUs) are more efficient at a narrower set of parallelisable compute functions, but they incur latency and efficiency penalties because of their inflexible memory hierarchies.
GPU architectures contain large numbers of cores, each optimised for a small set of specific tasks, and they are most efficient when these cores execute concurrently. Vector processors therefore overcome one of the major drawbacks of CPUs in robotics: they can process large amounts of data in parallel.
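The contrast between scalar and vector processing can be illustrated with a minimal Python sketch that counts idealised processing steps. The function names, the gain operation, and the lane count of 8 are assumptions chosen for illustration and do not describe any particular device; memory latency, which the text notes can penalise vector processors, is deliberately ignored.

```python
def scale_scalar(samples, gain):
    """Apply a gain one element per step, as an idealised scalar core would."""
    out, steps = [], 0
    for sample in samples:
        out.append(sample * gain)
        steps += 1
    return out, steps


def scale_vector(samples, gain, lanes=8):
    """Apply a gain `lanes` elements per step, as an idealised vector unit would.

    `lanes` is a hypothetical parameter, not a figure from real hardware.
    """
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    out, steps = [], 0
    for start in range(0, len(samples), lanes):
        chunk = samples[start:start + lanes]
        # Every element of the chunk is modelled as one parallel step.
        out.extend(sample * gain for sample in chunk)
        steps += 1
    return out, steps


if __name__ == "__main__":
    readings = list(range(20))
    _, scalar_count = scale_scalar(readings, 2)
    _, vector_count = scale_vector(readings, 2, lanes=8)
    print("scalar steps:", scalar_count)
    print("vector steps:", vector_count)
```

Both functions return the same data. Only the step count differs, and it shrinks roughly in proportion to the number of lanes.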
Programmable logic (FPGAs)
Unlike processors that contain general-purpose processing units and memory structures, programmable logic (FPGAs) can be precisely customised as needed to perform a particular compute function. Although this customisation is highly effective for latency-critical real-time applications, it imposes extra programming complexity. In addition, reconfiguring or re-programming an FPGA involves longer compile times than building software for scalar and vector processors.
Robot designers can use FPGAs to create runtime-reconfigurable robot hardware that can be re-programmed and adapted via software. These engines can handle data-flow computations quickly and efficiently and so are well suited to uses such as interfacing with sensors and actuators, as well as dealing with networking aspects. Designers can also create custom hardware acceleration kernels to handle data-processing tasks they would otherwise need to assign to vector processors.
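The data-flow style of computation mentioned above can be sketched in software as a chain of stages separated by registers, where every stage works on a different sample in the same cycle. The following Python model is an analogy only, not FPGA code: the name run_pipeline and the three example stages are hypothetical, and each stage is assumed to take exactly one cycle.

```python
_EMPTY = object()  # marks a pipeline register that currently holds no sample


def run_pipeline(samples, stages):
    """Push samples through a chain of single-cycle stages.

    Returns the processed samples and the number of cycles consumed.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    samples = list(samples)
    registers = [_EMPTY] * len(stages)
    results, cycles, next_input = [], 0, 0
    while len(results) < len(samples):
        cycles += 1
        # Walk from the last stage to the first so that each stage reads the
        # value its upstream neighbour produced in the previous cycle.
        for i in range(len(stages) - 1, 0, -1):
            upstream = registers[i - 1]
            registers[i] = _EMPTY if upstream is _EMPTY else stages[i](upstream)
        if next_input < len(samples):
            registers[0] = stages[0](samples[next_input])
            next_input += 1
        else:
            registers[0] = _EMPTY
        if registers[-1] is not _EMPTY:
            results.append(registers[-1])
    return results, cycles


if __name__ == "__main__":
    # Hypothetical sensor-conditioning stages: remove offset, scale, clamp.
    stages = [lambda x: x - 1, lambda x: x * 2, lambda x: max(x, 0)]
    outputs, cycles = run_pipeline([1, 2, 3, 4, 5], stages)
    print(outputs, cycles)
```

In this model, once the pipeline is full one result emerges per cycle, so N samples through S stages take S + N - 1 cycles rather than the S × N cycles needed when each sample is processed to completion before the next begins.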
Application-Specific Integrated Circuits (ASICs)
In an ASIC, the processing elements can be customised, as with an FPGA. However, once the design has been committed to silicon, it cannot be modified. This fixed architecture allows unmatched performance and power efficiency, as well as the lowest unit prices in high-volume production. On the other hand, ASICs can take months or even years to develop and, because they cannot be changed afterwards, they cannot adapt to keep the robot up to date with future productivity enhancements.
Adaptability is important because robotic algorithms and architectures are still evolving rapidly. An ASIC-based accelerator could fall significantly behind the state-of-the-art algorithms. Given the time taken to develop the ASIC, this could begin to happen soon after – or even before – it enters production. At this stage of the robot-technology lifecycle, the use of ASICs is therefore limited.
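The volume argument for ASICs comes down to simple arithmetic: a large one-off development cost is recovered only if the per-unit saving is multiplied over enough units. The Python sketch below computes that break-even volume. All figures in the usage example are invented for illustration and are not taken from the article or from any vendor's pricing.

```python
import math


def total_cost(fixed_cost, unit_cost, volume):
    """Total spend for `volume` units, given a one-off cost and a unit cost."""
    return fixed_cost + unit_cost * volume


def break_even_volume(asic_nre, asic_unit_cost, fpga_unit_cost, fpga_nre=0):
    """Smallest volume at which the ASIC is no more expensive than the FPGA.

    Returns None when the ASIC has no per-unit cost advantage and therefore
    never recovers its development cost.
    """
    saving_per_unit = fpga_unit_cost - asic_unit_cost
    if saving_per_unit <= 0:
        return None
    extra_fixed_cost = asic_nre - fpga_nre
    if extra_fixed_cost <= 0:
        return 0
    return math.ceil(extra_fixed_cost / saving_per_unit)


if __name__ == "__main__":
    # Hypothetical numbers: 2,000,000 development cost, 10 vs 50 per unit.
    print(break_even_volume(asic_nre=2_000_000, asic_unit_cost=10, fpga_unit_cost=50))
```

The calculation covers cost alone. It does not capture the risk described above, namely that the algorithm frozen into the ASIC may be out of date before that volume is reached.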
Realising adaptive computing in robotics
Robots are networks of networks that exchange data on a continuous basis throughout the entire machine, from sensors, to compute engines, and back to the actuators at their extremities. We can visualise these networks as the nervous system of the robot, which facilitates exchanging information. As in the human nervous system, these exchanges are critically dependent upon deterministic performance and real-time responsiveness if the robot is to behave coherently. This is difficult to guarantee using scalar and vector processors, with their fixed architectures.
The customised, highly parallel architectures implemented in FPGAs and ASICs offer the opportunity to overcome these limitations.
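Determinism of the kind described above is commonly quantified as jitter, the deviation of each control-loop period from its nominal value. The minimal Python sketch below assumes timestamps recorded in integer microseconds; the function name, the 1 kHz loop rate, and the tolerance are hypothetical values chosen for illustration.

```python
def period_jitter(timestamps_us, nominal_period_us, tolerance_us):
    """Return (worst-case jitter, number of periods outside the tolerance).

    `timestamps_us` holds the start time of each loop iteration in
    microseconds; integers are used to avoid floating-point rounding.
    """
    if len(timestamps_us) < 2:
        raise ValueError("at least two timestamps are required")
    periods = [
        later - earlier
        for earlier, later in zip(timestamps_us, timestamps_us[1:])
    ]
    deviations = [abs(period - nominal_period_us) for period in periods]
    worst = max(deviations)
    violations = sum(1 for deviation in deviations if deviation > tolerance_us)
    return worst, violations


if __name__ == "__main__":
    # Synthetic timestamps for a loop that should run every 1000 microseconds.
    stamps = [0, 1000, 2100, 3000, 4200]
    print(period_jitter(stamps, nominal_period_us=1000, tolerance_us=150))
```

A deterministic system keeps the worst-case figure bounded rather than merely keeping the average low, which is why the sketch reports the maximum deviation and the count of violations instead of a mean.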
The FPGA, in particular, by enabling software-defined hardware for robots, introduces a fundamental shift in the approach to software development in robotics. Instead of programming functionality in the CPU, working within the limitations imposed by the CPU’s pre-defined architecture and constraints, building a robotic behaviour with FPGAs is about programming an architecture that performs the desired task.
Roboticists need suitable tools and hardware to properly leverage the flexibility of FPGAs when building adaptable robots that exhibit deterministic, real-time behaviour. A System-on-Module (SOM) such as the Xilinx Kria K26 is one example. Designed for edge applications, it carries high-speed interfaces, memory, and power on board. It contains a Zynq UltraScale+ MPSoC, a System-on-Chip (SoC) that provides programmable logic cells and DSP slices while handling scalar and vector processing workloads with a quad-core application processor complex, a dual-core real-time processor, and a 2D/3D GPU.
In addition to the SOM, appropriate libraries and utilities are needed to build industrial-grade robotics solutions. The Kria Robotics Stack (KRS) (Figure 2) is tightly integrated with the Robot Operating System (ROS), which is the de facto framework for robot application development, and simplifies the use of hardware acceleration. The SOM provides native support for ROS 2, which boosts performance in robotics and industrial automation applications.
This stack uses the ROS 2 Software Development Kit (SDK) and works with the ROS 2 ecosystem to help build robot systems with deterministic, real-time performance using a modular approach. It leverages established techniques such as Quality of Service (QoS) mechanisms and Time-Sensitive Networking (TSN), and includes application-level acceleration kernels, ROS communication middleware, and a runtime tool that facilitates interactions with the FPGA. A hypervisor helps support mixed criticality using virtual machines.
Adaptive, accelerated computing leveraging FPGAs can enhance the performance of industrial robots while also improving energy efficiency and providing future-proof flexibility and security. Realising this next generation of machines requires suitable hardware, such as SOMs that combine FPGA logic with scalar processors and GPUs, as well as software and tools that can be used easily within a framework familiar to roboticists, such as ROS 2.