Page 15 - TU Dresden - StuFoExpo 2024: Book of Abstracts
Exploration of a HW/SW FPGA-based Co-Design of a Visual
SLAM Algorithm
Arora, Bhavay | bhavay.arora@mailbox.tu-dresden.de
Faculty of Electrical and Computer Engineering, Technische Universität Dresden
_____________________________________________________________________________________________________
Are you fascinated by the technology behind autonomous vehicles, drones and augmented reality?
At the core of these innovations is Simultaneous Localization and Mapping (SLAM), a crucial
process that allows devices to navigate unknown environments while creating detailed maps and
determining their precise location. This research focuses on Visual SLAM, particularly the ORB-SLAM3
algorithm, which tackles this challenge by extracting Oriented FAST and Rotated BRIEF (ORB)
features from images: distinct patterns that enable the algorithm to identify and track key points,
facilitating accurate localization and mapping. While ORB-SLAM3 is renowned for its performance
and accuracy, achieving real-time processing on embedded devices remains challenging.
This work aims to accelerate the ORB-SLAM3 algorithm on the KRIA KR260 FPGA, a powerful
platform designed for robotics and industrial applications, by using a hardware/software co-design
approach. FPGAs are well-suited for this task compared to traditional CPUs and GPUs due to their
ability to provide low-energy, high-performance computing through parallel processing, making
them ideal for embedded systems.
Profiling tools like Chrono and Callgrind were used to identify the most time-consuming stages in
ORB-SLAM3. The ORB feature extraction component, which consumes 68% of the runtime in the
Tracking stage, was targeted for acceleration. Using a High-Level Synthesis (HLS) approach, which
automatically converts C/C++ algorithms into hardware descriptions that can run efficiently on
FPGAs, an accelerator was designed to speed up this component. The accelerator was integrated
with a user-space DMA buffer (u-dma-buf) to enable efficient data transfer between the hardware
and software components. The system’s performance was evaluated using the EuRoC Micro Aerial
Vehicle (MAV) dataset, computing metrics such as Absolute Trajectory Error (ATE) and Root Mean
Square Error (RMSE) to assess accuracy.
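
The accuracy figure mentioned above, the RMSE of the Absolute Trajectory Error, can be illustrated with a minimal sketch. It assumes the estimated and ground-truth positions are already associated by timestamp and expressed in the same aligned frame; the function name `ate_rmse` and the sample points are hypothetical and are not taken from the evaluation in this work.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose position errors between two aligned trajectories.

    Both arguments are sequences of (x, y, z) positions. Assumption: pose i
    of `estimated` corresponds to pose i of `ground_truth`, and both are
    already expressed in the same coordinate frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Squared Euclidean distance between each estimated and true position.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est_pose, gt_pose))
        for est_pose, gt_pose in zip(estimated, ground_truth)
    ]
    # Root of the mean squared error over all poses.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sample data, for illustration only.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.01, 0.0), (1.0, -0.02, 0.0), (2.0, 0.02, 0.0)]
print(ate_rmse(estimated, ground_truth))
```

In practice, trajectory evaluation tools first associate timestamps and align the estimated trajectory to the ground truth before computing this value; that step is omitted here.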
The hardware/software co-design achieved a 10x speedup in the ORB feature extraction stage
compared to the software-only version, while maintaining high accuracy with an RMSE value of
0.0168. Ongoing work focuses on accelerating other computationally expensive stages of
ORB-SLAM3, particularly Local and Global Bundle Adjustment, which are optimization processes
essential for refining the 3D map and camera pose. Accelerating these stages is crucial for unlocking
additional performance improvements, making ORB-SLAM3 more viable for real-time applications
with limited resources. Overall, this research highlights the potential of FPGA-based acceleration in
enhancing the efficiency and real-time performance of visual SLAM algorithms like ORB-SLAM3.
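
To see why the remaining stages matter, the reported figures can be combined using Amdahl's law. The sketch below is an estimate, not a measured result: it assumes that the 68% runtime share and the 10x speedup refer to the same workload and that the 10x figure already includes data-transfer overhead. The function name `overall_speedup` is hypothetical.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: speedup of a whole stage when only `fraction` of its
    runtime is accelerated by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Figures from the abstract: ORB extraction is 68% of Tracking, sped up 10x.
tracking_speedup = overall_speedup(0.68, 10.0)
print(round(tracking_speedup, 2))  # → 2.58

# Upper bound if ORB extraction took no time at all.
print(round(overall_speedup(0.68, float("inf")), 3))  # → 3.125
```

Under these assumptions the Tracking stage as a whole would run roughly 2.6x faster, and no further acceleration of ORB extraction alone could exceed about 3.1x, which is consistent with the focus on accelerating other stages.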