January 7, 2026
As edge systems evolve to support AI-driven perception, high-speed data acquisition, and real-time control, a single processing architecture is often no longer sufficient. Many applications demand deterministic, low-latency control and I/O handling alongside high-throughput AI inference, creating a need for heterogeneous compute architectures. To address this challenge, iWave has validated a PCIe Gen3 x4 integration of Zynq UltraScale+ MPSoC and NVIDIA Jetson Orin, an architecture that combines FPGA-based real-time processing with GPU-accelerated AI.
In this integration, Zynq™ UltraScale+™ MPSoC acts as the real-time and data-ingestion engine, handling high-speed sensor interfaces, deterministic processing, and pre-processing in programmable logic. The NVIDIA® Jetson™ Orin Nano complements this by providing high-performance GPU and Tensor Core acceleration for compute-intensive AI workloads such as deep learning inference, vision analytics, and signal classification. Connected via a high-bandwidth PCIe interface, the solution enables efficient data movement between the two domains, allowing each processor to operate where it performs best.
The Jetson Orin Nano functions purely as a PCIe-attached AI accelerator, offloading more than 95% of deep-learning (DL) compute from the MPSoC. The MPSoC retains deterministic control, high-speed I/O, and data pre-processing in programmable logic.
In this architecture, the Jetson Orin Nano operates as the PCIe Root Port, while the ZU11EG SoM is configured as a PCIe Endpoint. The two platforms are interconnected through an M.2 NVMe edge-card interface with appropriate board-level adaptations.
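From the Root Port side, a useful first check is whether the Endpoint has enumerated and trained at the expected speed and width. The sketch below is an illustration only, not iWave's software. It assumes the Jetson runs Linux with the standard PCI sysfs tree and that the Endpoint reports the default Xilinx vendor ID 0x10EE (a custom FPGA design may use a different ID). The helper names `read_attr` and `find_endpoints` are hypothetical.

```python
from pathlib import Path

# Assumption: the FPGA endpoint keeps the default Xilinx PCI vendor ID.
XILINX_VENDOR_ID = 0x10EE


def read_attr(dev_dir, name):
    """Return a sysfs attribute as stripped text, or None if it is missing."""
    try:
        return (dev_dir / name).read_text().strip()
    except OSError:
        return None


def find_endpoints(vendor_id=XILINX_VENDOR_ID, root="/sys/bus/pci/devices"):
    """List PCI devices under `root` whose vendor ID matches `vendor_id`."""
    root = Path(root)
    found = []
    if not root.is_dir():
        return found
    for dev_dir in sorted(root.iterdir()):
        vendor = read_attr(dev_dir, "vendor")
        try:
            if vendor is None or int(vendor, 16) != vendor_id:
                continue
        except ValueError:
            continue  # skip entries with an unparsable vendor field
        found.append({
            "address": dev_dir.name,  # domain:bus:device.function
            "device_id": read_attr(dev_dir, "device"),
            "link_speed": read_attr(dev_dir, "current_link_speed"),
            "link_width": read_attr(dev_dir, "current_link_width"),
        })
    return found


if __name__ == "__main__":
    for endpoint in find_endpoints():
        print(endpoint)
```

On a link that trained as Gen3 x4, the reported speed should be 8 GT/s and the width 4. A lower value usually points to a lane or signal-integrity issue on the adapter rather than a software problem.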
On the software side:
This PCIe link enables low-latency, high-bandwidth data movement between the FPGA fabric and the Jetson GPU, allowing compute-intensive workloads such as deep learning inference and vision processing to be offloaded efficiently to the GPU.
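To put "high-bandwidth" in numbers, the following sketch estimates the theoretical ceiling of a PCIe Gen3 x4 link from the 8 GT/s per-lane rate and 128b/130b line coding. It is a back-of-the-envelope calculation: packet and protocol overhead are ignored, so sustained DMA throughput will be lower. The 1080p RGB frame size is an assumed example, not a figure from the iWave setup, and the function names are hypothetical.

```python
# Theoretical one-direction ceiling for a PCIe link; protocol overhead ignored.
GEN3_RATE_GT_S = 8.0             # raw signalling rate per lane for PCIe Gen3
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line coding used by Gen3
LANES = 4


def link_bandwidth_gbytes(rate_gt_s=GEN3_RATE_GT_S, lanes=LANES,
                          efficiency=ENCODING_EFFICIENCY):
    """Return the one-direction link bandwidth in GB/s (10^9 bytes/s)."""
    return rate_gt_s * efficiency * lanes / 8  # 8 bits per byte


def max_frames_per_second(width, height, bytes_per_pixel, bandwidth_gbytes):
    """Upper bound on uncompressed frames per second over the link."""
    frame_bytes = width * height * bytes_per_pixel
    return bandwidth_gbytes * 1e9 / frame_bytes


bandwidth = link_bandwidth_gbytes()
# Assumed example payload: uncompressed 1920x1080 RGB, 3 bytes per pixel.
fps_limit = max_frames_per_second(1920, 1080, 3, bandwidth)
print(f"Gen3 x4 ceiling: {bandwidth:.2f} GB/s")
print(f"1080p RGB frame limit: {fps_limit:.0f} frames/s")
```

The ceiling works out to just under 4 GB/s per direction, or several hundred uncompressed 1080p frames per second, which leaves ample headroom for a single-camera inference pipeline.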
To validate the heterogeneous compute architecture, iWave implemented an end-to-end object detection pipeline using the YOLOv3 deep learning model.
In this setup:
This closed-loop pipeline demonstrates low-latency operation and efficient workload partitioning between deterministic FPGA processing and GPU-based AI acceleration.
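For readers unfamiliar with the post-processing stage of a YOLO-style detector, it typically consists of confidence thresholding followed by non-maximum suppression (NMS). The sketch below is a generic pure-Python illustration of that step, not iWave's implementation. Decoding of the raw network outputs into boxes is omitted, and the thresholds, the detection tuple layout, and the function names are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(detections, score_threshold=0.5, iou_threshold=0.45):
    """Filter detections by score, then apply per-class greedy NMS.

    `detections` is a list of (box, score, class_id) tuples.
    """
    # Drop low-confidence boxes and visit the rest from best to worst.
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # Keep a box unless it overlaps a stronger box of the same class.
        if all(det[2] != k[2] or iou(det[0], k[0]) <= iou_threshold
               for k in kept):
            kept.append(det)
    return kept


sample = [
    ((10, 10, 110, 110), 0.90, 0),    # strong detection
    ((12, 12, 108, 112), 0.80, 0),    # near-duplicate of the first box
    ((200, 200, 260, 260), 0.70, 1),  # different object and class
    ((10, 10, 110, 110), 0.30, 0),    # below the score threshold
]
for box, score, class_id in postprocess(sample):
    print(class_id, score, box)
```

In a partitioned design like the one described, this step is light enough to run on either side of the link; where it runs is a design choice that depends on where the annotated output is consumed.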
MPSoC and Jetson Integration
Output image after YOLOv3 post-processing
This Jetson + Zynq™ UltraScale+™ MPSoC architecture is well suited for a wide range of real-time and AI-driven applications:
The integration of iWave’s ZU11EG-based Zynq™ UltraScale+™ MPSoC SoM with the NVIDIA Jetson Orin Nano over PCIe creates a powerful heterogeneous compute platform for next-generation edge AI systems. By combining deterministic FPGA-based processing with high-performance GPU acceleration, this solution enables scalable, low-latency, and AI-enabled data pipelines for demanding applications in vision, robotics, instrumentation, and communications.
iWave Global is a leading provider of embedded computing solutions, FPGA System on Modules, and ODM design services. With over 26 years of engineering excellence, iWave specializes in high-performance SoMs built on cutting-edge processor technologies. Through deep domain expertise in FPGA, RF, AI, and edge compute architectures, iWave partners with global OEMs to accelerate product development, reduce technical risk, and deliver reliable solutions for mission-critical applications.
For more information, reach out to us through mktg@iwave-global.com