Bridging the FPGA and GPU worlds

In today’s world of high-performance embedded computing, systems need to handle massive data streams in real time, making every nanosecond matter. TECHWAY is pushing these limits by integrating NVIDIA’s GPUDirect RDMA technology into its PCIe FPGA platforms, providing a direct and highly efficient data path between acquisition hardware and GPU processing.

This approach opens the door to a new generation of responsive, low-latency applications across defence, aerospace, industrial systems, AI acceleration, and embedded vision.

TECHWAY’s PCIe solutions, built on AMD/Xilinx Kintex-7 and UltraScale+ FPGAs, integrate GPUDirect RDMA as a native feature within a unified development kit. This common PCIe architecture is shared across all TECHWAY boards, so developers can move between FPGA platforms while keeping the same software environment. This consistency simplifies integration, shortens development cycles and strengthens long-term maintainability.

With the TECHWAY driver built against the NVIDIA kernel module, data is sent directly from the FPGA card’s memory to GPU memory over PCIe, bypassing both the host CPU and system memory. This significantly reduces latency and improves overall data transfer efficiency, enabling real-time processing and advanced applications such as AI, image, and signal processing directly on incoming data.
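To make the benefit of skipping system memory concrete, the following back-of-envelope sketch compares a staged path (FPGA to system memory, then system memory to GPU) with a direct path (FPGA to GPU). It is a deliberately simple serial model, not a description of TECHWAY's driver: the `Hop` class, the bandwidth and setup figures, and the buffer size are all assumptions chosen for illustration, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One copy of the buffer across a link (all values assumed, not measured)."""
    name: str
    bandwidth_gb_per_s: float  # sustained throughput in gigabytes per second
    setup_us: float            # fixed per-transfer overhead in microseconds


def transfer_time_us(hops, size_bytes):
    """Serial, unpipelined model: each hop completes before the next begins."""
    total = 0.0
    for hop in hops:
        copy_us = size_bytes / (hop.bandwidth_gb_per_s * 1e9) * 1e6
        total += hop.setup_us + copy_us
    return total


# Hypothetical link figures, used only to make the arithmetic concrete.
PCIE = dict(bandwidth_gb_per_s=12.0, setup_us=5.0)

# Staged path: the buffer crosses PCIe twice and lands in system memory.
staged = [
    Hop("FPGA -> system memory", **PCIE),
    Hop("system memory -> GPU memory", **PCIE),
]
# Direct path: a single PCIe transfer straight into GPU memory.
direct = [Hop("FPGA -> GPU memory", **PCIE)]

size = 4 * 1024 * 1024  # one 4 MiB acquisition buffer (assumed size)
staged_us = transfer_time_us(staged, size)
direct_us = transfer_time_us(direct, size)
print(f"staged: {staged_us:.1f} us, direct: {direct_us:.1f} us")
```

Under these assumptions the direct path takes half the time of the staged one simply because it has one hop instead of two. A real system can overlap staged copies, so the gain in practice depends on the platform and should be measured rather than inferred from this model.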

A key advantage of GPUDirect RDMA is its ability to shift processing workloads from the FPGA to the GPU. The FPGA can then focus on real-time data acquisition and essential front-end tasks such as packetisation and preprocessing, before streaming data directly into GPU memory. This division of labour also takes full advantage of the GPU’s massively parallel architecture, its high-bandwidth GDDR memory and its optimised processing cores. These strengths make the GPU particularly effective at handling large datasets and repetitive, high-throughput operations, which suits fast and adaptive data processing in demanding environments.

By combining a TECHWAY FPGA board with an NVIDIA GPU, the system maximises computing efficiency and minimises FPGA logic usage, while GPUDirect RDMA keeps data-transfer latency consistently low. This approach also makes TECHWAY’s FPGA product range suitable for a wide variety of field applications, providing the flexibility and performance they require. For developers, this means moving tasks initially done in firmware to software-based development, similar to approaches used in High-Performance Computing (HPC). This shift benefits from the support of NVIDIA’s developer community and open-source toolkits, including CUDA tools.
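As an illustration of what moving a task from firmware to software can look like, the sketch below expresses a FIR filter as a function of one output sample at a time. FIR filtering is not named in this article; it is chosen here only as an assumed example of a front-end task often implemented in FPGA logic. Python is used for brevity, and the function names, tap values and input samples are all hypothetical.

```python
def fir_output(samples, taps, n):
    """Compute output sample n of a FIR filter: sum of taps[k] * samples[n - k].

    Samples before the start of the buffer are treated as zero.
    """
    acc = 0.0
    for k, tap in enumerate(taps):
        if n - k >= 0:
            acc += tap * samples[n - k]
    return acc


def fir_filter(samples, taps):
    """Apply the filter to a whole buffer.

    Each output depends only on the input buffer and the taps, never on
    another output, so the calls could run in any order or in parallel.
    """
    return [fir_output(samples, taps, n) for n in range(len(samples))]


taps = [0.25, 0.25, 0.25, 0.25]            # hypothetical 4-tap moving average
samples = [4.0, 8.0, 0.0, 4.0, 4.0, 4.0]   # hypothetical acquired samples
filtered = fir_filter(samples, taps)
print(filtered)
```

Because no output depends on another, this is the kind of workload that maps naturally onto a GPU: a CUDA port would typically run the body of `fir_output` in one thread per output sample, with the input buffer already sitting in GPU memory.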