An IMU state optimization accelerator for SLAM

LIU Qiang, LIU Weizhuang, YU Bo, LIU Shaoshan

Citation: LIU Q, LIU W Z, YU B, et al. An IMU state optimization accelerator for SLAM[J]. Journal of Beijing University of Aeronautics and Astronautics, 2023, 49(5): 1027-1035 (in Chinese). doi: 10.13700/j.bh.1001-5965.2021.0382

doi: 10.13700/j.bh.1001-5965.2021.0382
Funding: National Natural Science Foundation of China (61974102)
Corresponding author e-mail: qiangliu@tju.edu.cn

  • CLC number: TP368

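  • Abstract:

    Autonomous localization is the foundation for fully autonomous movement of mobile robots, and its key technology, simultaneous localization and mapping (SLAM), has attracted wide attention. Inertial measurement units (IMUs) are widely used in SLAM systems because they are unaffected by the external environment. The SLAM back end optimizes the IMU states with nonlinear optimization, which in practice suffers from poor real-time performance and high energy consumption. To address this, an accelerator for IMU state optimization is designed on a field-programmable gate array (FPGA) platform. The data flow of the algorithm is analyzed to enable circuit and data reuse, and the sparsity specific to the algorithm is exploited to simplify computation and compress storage. For the most computationally expensive step, solving the linear equations, the hardware circuit adopts a configurable design: changing the configuration of this module lets the accelerator trade off performance, power, and resource usage to meet the requirements of different scenarios. Experimental results on the Xilinx ZC706 platform show that, compared with the Intel i5-8250U and Arm Cortex-A57 processors, the accelerator achieves speedups of 26.7 times and 87 times, respectively, when the circuit is configured for high-performance scenarios, and 17.4 times and 53.7 times when it is configured for low-power scenarios. All circuit configurations reduce energy consumption by more than 90%.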
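
    The abstract identifies equation solving as the most expensive step, and the Figure 8 caption names Cholesky decomposition as the method. As a software reference only, the following minimal sketch shows how a symmetric positive-definite system H x = b can be solved by Cholesky factorization followed by forward and back substitution. The function names and the small example matrix are hypothetical and are not taken from the paper; the sketch says nothing about how the hardware circuit schedules these operations.

```python
import math


def cholesky(H):
    """Return lower-triangular L with H = L * L^T (H must be symmetric positive definite)."""
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of the columns already computed.
        diag = H[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(diag)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (H[i][j] - acc) / L[j][j]
    return L


def solve_cholesky(H, b):
    """Solve H x = b using H = L L^T: first L y = b, then L^T x = y."""
    L = cholesky(H)
    n = len(b)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y (row i of L^T is column i of L).
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Hypothetical 3x3 example, chosen only to be symmetric positive definite.
H_example = [[4.0, 2.0, 0.0],
             [2.0, 5.0, 3.0],
             [0.0, 3.0, 6.0]]
b_example = [2.0, 1.0, 3.0]
x_example = solve_cholesky(H_example, b_example)
```

    In a nonlinear least-squares back end, H would typically be the (damped) approximate Hessian and b the negative gradient, so x is the state update for one iteration; that mapping is an assumption about the general method, not a detail confirmed by this page.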

     

  • Figure 1.  Schematic diagram of IMU integration

    Figure 2.  Hardware design architecture

    Figure 3.  Jacobian matrix of the error function with respect to the states

    Figure 4.  Error, Jacobian and cost calculation circuit and its configuration modes

    Figure 5.  Stage one of the error, Jacobian and cost calculation circuit

    Figure 6.  Storage optimization of the H matrix

    Figure 7.  Storage optimization results with 20 and 80 optimization states

    Figure 8.  Linear equation solving circuit based on Cholesky decomposition

    Figure 9.  Computing schedule of the Cholesky decomposition basic units (CPEs)

    Table 1.  Resource utilization of hardware design

    Circuit   Resources used                       Utilization/%
              LUT      FF       BRAM     DSP       LUT     FF      BRAM    DSP
    C1        29057    29407     86.5    278       13.3     6.7    15.9    30.9
    C2        44229    41840    131.5    458       20.2     9.6    24.1    50.9
    C3        59617    54007    176.5    638       27.3    12.4    32.4    70.9

    LUT: look-up table; FF: flip-flop; BRAM: block random access memory; DSP: digital signal processing unit.

    Table 2.  Performance comparison between hardware and software implementations

    Number of   Execution time/ms                           FPGA speedup
    states      x86      TX1      C1      C2      C3        x86 vs C1   x86 vs C2   x86 vs C3   TX1 vs C1   TX1 vs C2   TX1 vs C3
    N=20         19.09    41.38    0.84    0.65    0.59     22.73       29.37       32.36       49.26       63.66       70.14
    N=50         82.59   461.18   27.87   15.32   11.16      2.96        5.39        7.40       16.55       30.10       41.32
    N=80        126.89   662.65   60.33   32.05   22.64      2.10        3.96        5.60       10.98       20.68       29.27
    Average      76.19   388.40   29.68   16.01   11.46      9.26       12.91       15.12       25.60       38.15       46.91
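
    The speedup columns of Table 2 can be reproduced from the execution times. The short sketch below recomputes them as software time divided by FPGA time; the dictionary layout and function names are illustrative choices, and the numbers are copied from the table.

```python
# Execution times in ms, copied from Table 2 (keys are the number of states N).
software_ms = {
    20: {"x86": 19.09, "TX1": 41.38},
    50: {"x86": 82.59, "TX1": 461.18},
    80: {"x86": 126.89, "TX1": 662.65},
}
fpga_ms = {
    20: {"C1": 0.84, "C2": 0.65, "C3": 0.59},
    50: {"C1": 27.87, "C2": 15.32, "C3": 11.16},
    80: {"C1": 60.33, "C2": 32.05, "C3": 22.64},
}


def speedup(software_time_ms, fpga_time_ms):
    """Speedup of the FPGA circuit over a software baseline."""
    return software_time_ms / fpga_time_ms


def mean_speedup(cpu, circuit):
    """Mean of the per-N speedups for one (CPU, circuit) pair."""
    values = [speedup(software_ms[n][cpu], fpga_ms[n][circuit]) for n in software_ms]
    return sum(values) / len(values)


for n in sorted(software_ms):
    for cpu in ("x86", "TX1"):
        for circuit in ("C1", "C2", "C3"):
            s = speedup(software_ms[n][cpu], fpga_ms[n][circuit])
            print(f"N={n} {cpu} vs {circuit}: {s:.2f}")
```

    Computed this way, the speedups in the "Average" row agree with the mean of the per-N speedups (for x86 vs C1, the mean of 22.73, 2.96 and 2.10 is about 9.26) rather than with the ratio of the average execution times.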

    Table 3.  Comparison of energy consumption (unit: mJ)

    Number of   Hardware energy               Software energy
    states      C1       C2       C3          x86        TX1
    N=20          1.93     1.92     2.22       286.35     207.73
    N=50         63.91    45.29    41.95      1238.85    2315.12
    N=80        138.34    94.74    85.10      1903.35    3326.50
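
    The abstract's statement that every circuit configuration saves more than 90% of the energy can be checked against Table 3. The sketch below computes the saving as 1 minus the ratio of hardware energy to software energy; the data layout and function names are illustrative, and the values are copied from the table.

```python
# Energy per run in mJ, copied from Table 3 (keys are the number of states N).
hardware_mj = {
    20: {"C1": 1.93, "C2": 1.92, "C3": 2.22},
    50: {"C1": 63.91, "C2": 45.29, "C3": 41.95},
    80: {"C1": 138.34, "C2": 94.74, "C3": 85.10},
}
software_mj = {
    20: {"x86": 286.35, "TX1": 207.73},
    50: {"x86": 1238.85, "TX1": 2315.12},
    80: {"x86": 1903.35, "TX1": 3326.50},
}


def energy_saving_percent(hardware_energy, software_energy):
    """Percentage of energy saved by the hardware relative to software."""
    return 100.0 * (1.0 - hardware_energy / software_energy)


def all_savings():
    """Return {(N, cpu, circuit): saving in percent} for every table entry."""
    savings = {}
    for n, circuits in hardware_mj.items():
        for cpu, sw in software_mj[n].items():
            for circuit, hw in circuits.items():
                savings[(n, cpu, circuit)] = energy_saving_percent(hw, sw)
    return savings


savings = all_savings()
worst_case = min(savings, key=savings.get)
print(worst_case, round(savings[worst_case], 1))
```

    The smallest saving in the table is for C1 against x86 at N=80, and it is still above 90%, which is consistent with the abstract.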
Publication history
  • Received:  2021-07-08
  • Accepted:  2021-11-02
  • Published online:  2021-11-16
  • Issue published:  2023-05-31
