Abstract: Self-localization is a prerequisite for fully autonomous motion of mobile robots, and simultaneous localization and mapping (SLAM), a key localization technology, has attracted considerable attention. Inertial measurement units (IMUs) are widely used in SLAM systems because they are unaffected by the external environment. The SLAM backend optimizes the IMU states with nonlinear optimization methods, which in practice suffer from poor real-time performance and high energy consumption. To address this, an accelerator for IMU state optimization is designed on a field-programmable gate array (FPGA) platform. First, the data flow of the algorithm is analyzed so that circuits and data can be reused. Second, the sparsity specific to the algorithm is exploited to simplify computation and compress storage. Finally, the equation-solving step, which accounts for the largest share of the computation, is implemented as a configurable circuit; changing the configuration of this module lets the accelerator trade off performance, power consumption, and resource usage to meet the requirements of different scenarios. Experimental results on the Xilinx ZC706 platform show that, compared with the Intel i5-8250U and Arm Cortex-A57 processors, the circuit configured for high-performance scenarios achieves speedups of 26.7 times and 87 times, respectively, and the circuit configured for low-power scenarios achieves speedups of 17.4 times and 53.7 times, respectively. All configurations reduce energy consumption by more than 90%.
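The abstract does not specify the sparsity pattern that the accelerator exploits. As a rough illustration of why sparsity can compress storage, the sketch below assumes that each IMU constraint links only two consecutive states, so the normal-equation matrix is symmetric and block tridiagonal. The state dimension of 15 and both function names are assumptions made for this example, not values taken from the paper.

```python
# Illustrative only: compares dense storage with block-tridiagonal storage.
# ASSUMPTION: each state has STATE_DIM parameters and is coupled only to its
# immediate neighbours, giving a symmetric block-tridiagonal matrix.

STATE_DIM = 15  # hypothetical per-state dimension, not stated in the abstract


def dense_entries(num_states: int, state_dim: int) -> int:
    """Entries needed to store the full (num_states*state_dim)^2 matrix."""
    side = num_states * state_dim
    return side * side


def block_tridiagonal_entries(num_states: int, state_dim: int) -> int:
    """Entries needed when only the non-zero blocks are stored.

    There are num_states diagonal blocks and num_states - 1 off-diagonal
    blocks; by symmetry only one side of the off-diagonal is kept.
    """
    blocks = num_states + (num_states - 1)
    return blocks * state_dim * state_dim


if __name__ == "__main__":
    # State counts match the problem sizes reported in Table 2.
    for n in (20, 50, 80):
        dense = dense_entries(n, STATE_DIM)
        sparse = block_tridiagonal_entries(n, STATE_DIM)
        print(f"N={n}: dense={dense}, compressed={sparse}, "
              f"ratio={dense / sparse:.1f}")
```

Under these assumptions the dense storage grows quadratically with the number of states while the compressed storage grows linearly, which is one plausible reason a sparsity-aware design saves memory as N increases.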
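Table 1. Resource utilization of hardware design

| Circuit | LUT | FF | BRAM | DSP | LUT (%) | FF (%) | BRAM (%) | DSP (%) |
|---|---|---|---|---|---|---|---|---|
| C1 | 29057 | 29407 | 86.5 | 278 | 13.3 | 6.7 | 15.9 | 30.9 |
| C2 | 44229 | 41840 | 131.5 | 458 | 20.2 | 9.6 | 24.1 | 50.9 |
| C3 | 59617 | 54007 | 176.5 | 638 | 27.3 | 12.4 | 32.4 | 70.9 |

LUT: look-up table; FF: flip-flop; BRAM: block random access memory; DSP: digital signal processing unit. The first four columns give the number of resources used and the last four give the percentage of the available resources.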
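Table 2. Performance comparison between hardware and software implementations

| Number of states | x86 time (ms) | TX1 time (ms) | C1 time (ms) | C2 time (ms) | C3 time (ms) | Speedup x86 vs C1 | Speedup x86 vs C2 | Speedup x86 vs C3 | Speedup TX1 vs C1 | Speedup TX1 vs C2 | Speedup TX1 vs C3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N=20 | 19.09 | 41.38 | 0.84 | 0.65 | 0.59 | 22.73 | 29.37 | 32.36 | 49.26 | 63.66 | 70.14 |
| N=50 | 82.59 | 461.18 | 27.87 | 15.32 | 11.16 | 2.96 | 5.39 | 7.40 | 16.55 | 30.10 | 41.32 |
| N=80 | 126.89 | 662.65 | 60.33 | 32.05 | 22.64 | 2.10 | 3.96 | 5.60 | 10.98 | 20.68 | 29.27 |
| Average | 76.19 | 388.40 | 29.68 | 16.01 | 11.46 | 9.26 | 12.91 | 15.12 | 25.60 | 38.15 | 46.91 |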
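Table 3. Comparison of energy consumption (mJ)

| Number of states | Hardware C1 | Hardware C2 | Hardware C3 | Software x86 | Software TX1 |
|---|---|---|---|---|---|
| N=20 | 1.93 | 1.92 | 2.22 | 286.35 | 207.73 |
| N=50 | 63.91 | 45.29 | 41.95 | 1238.85 | 2315.12 |
| N=80 | 138.34 | 94.74 | 85.10 | 1903.35 | 3326.50 |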
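The speedup columns of Table 2 and the energy-saving claim in the abstract can be recomputed from the raw measurements. The sketch below is only a reader-side check: it assumes speedup is defined as software time divided by hardware time, and energy saving as one minus the ratio of hardware to software energy. The dictionaries copy the values from Tables 2 and 3; the function names are chosen for this example.

```python
# Execution times in ms, copied from Table 2.
times_ms = {
    20: {"x86": 19.09, "TX1": 41.38, "C1": 0.84, "C2": 0.65, "C3": 0.59},
    50: {"x86": 82.59, "TX1": 461.18, "C1": 27.87, "C2": 15.32, "C3": 11.16},
    80: {"x86": 126.89, "TX1": 662.65, "C1": 60.33, "C2": 32.05, "C3": 22.64},
}

# Energy in mJ, copied from Table 3.
energy_mj = {
    20: {"x86": 286.35, "TX1": 207.73, "C1": 1.93, "C2": 1.92, "C3": 2.22},
    50: {"x86": 1238.85, "TX1": 2315.12, "C1": 63.91, "C2": 45.29, "C3": 41.95},
    80: {"x86": 1903.35, "TX1": 3326.50, "C1": 138.34, "C2": 94.74, "C3": 85.10},
}

CPUS = ("x86", "TX1")
CIRCUITS = ("C1", "C2", "C3")


def speedup(n: int, cpu: str, circuit: str) -> float:
    """Assumed definition: software time divided by hardware time."""
    return times_ms[n][cpu] / times_ms[n][circuit]


def energy_saving(n: int, cpu: str, circuit: str) -> float:
    """Assumed definition: fraction of software energy avoided by the circuit."""
    return 1.0 - energy_mj[n][circuit] / energy_mj[n][cpu]


def min_energy_saving() -> float:
    """Smallest saving over every problem size, processor, and circuit."""
    return min(
        energy_saving(n, cpu, circuit)
        for n in energy_mj
        for cpu in CPUS
        for circuit in CIRCUITS
    )


if __name__ == "__main__":
    for n in times_ms:
        for cpu in CPUS:
            for circuit in CIRCUITS:
                print(f"N={n} {cpu} vs {circuit}: "
                      f"speedup={speedup(n, cpu, circuit):.2f}, "
                      f"saving={energy_saving(n, cpu, circuit):.1%}")
    print(f"minimum energy saving: {min_energy_saving():.1%}")
```

With these definitions the recomputed ratios agree with the speedup columns of Table 2 to two decimal places, and the smallest energy saving across all entries of Table 3 stays above 90%.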