
3D object detection based on multi-path feature pyramid network for stereo images

SU Kaiqi, YAN Weiqing, XU Jindong

Citation: SU Kaiqi, YAN Weiqing, XU Jindong, et al. 3D object detection based on multi-path feature pyramid network for stereo images[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(8): 1487-1494. doi: 10.13700/j.bh.1001-5965.2021.0525 (in Chinese)

3D object detection based on multi-path feature pyramid network for stereo images

doi: 10.13700/j.bh.1001-5965.2021.0525
Funds: 

National Natural Science Foundation of China 61801414

National Natural Science Foundation of China 62072391

National Natural Science Foundation of China 62066013

Shandong Provincial Natural Science Foundation ZR2019MF060

Shandong Province Higher Educational Science and Technology Key Program J18KZ016

More Information
    Corresponding author:

    YAN Weiqing, E-mail: wqyan@tju.edu.cn

  • CLC number: TP391

  • Abstract:

    3D object detection is an important scene-understanding task in computer vision and autonomous driving. Most existing stereo-image-based 3D object detection methods do not adequately account for the large scale differences among objects, so small-scale objects are easily missed and detection accuracy suffers. To address this problem, a 3D object detection method based on a multi-path feature pyramid network (MpFPN) for stereo images is proposed. MpFPN extends the feature pyramid network with a bottom-up path, a top-down path, and connections from the input feature maps to the output feature maps, providing the joint region proposal network with multi-scale features that carry richer semantic information and finer-grained spatial information. Experimental results show that on the KITTI 3D object detection dataset, the proposed method outperforms the compared methods in the easy, moderate, and hard settings.
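
    Below is a minimal PyTorch sketch of the multi-path idea described in the abstract: a standard top-down FPN pass, an extra bottom-up aggregation path, and skip connections from each input feature map to the corresponding output. The channel widths, layer names, and fusion by addition are illustrative assumptions for this sketch, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiPathFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 convs project every backbone level to a common channel width
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, 1) for c in in_channels)
        # 3x3 convs smooth the fused maps on the top-down path
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, padding=1) for _ in in_channels)
        # stride-2 convs carry fine-level features upward on the bottom-up path
        self.downsample = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1)
            for _ in in_channels[:-1])

    def forward(self, feats):
        # feats: backbone maps ordered from high resolution to low resolution
        laterals = [l(x) for l, x in zip(self.lateral, feats)]

        # top-down path: upsample coarse maps and add them to finer laterals
        td = [laterals[-1]]
        for lat in reversed(laterals[:-1]):
            up = F.interpolate(td[0], size=lat.shape[-2:], mode="nearest")
            td.insert(0, lat + up)
        td = [s(x) for s, x in zip(self.smooth, td)]

        # bottom-up path: push fine-level information back to coarse levels
        bu = [td[0]]
        for i in range(1, len(td)):
            bu.append(td[i] + self.downsample[i - 1](bu[i - 1]))

        # input-to-output connections: add the projected input of each level
        return [o + lat for o, lat in zip(bu, laterals)]

if __name__ == "__main__":
    feats = [torch.randn(1, c, s, s) for c, s in
             zip((256, 512, 1024, 2048), (64, 32, 16, 8))]
    outs = MultiPathFPN()(feats)
    print([tuple(o.shape) for o in outs])
```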

     

  • Figure 1.  Overall framework of the proposed network

    Figure 2.  3D object detection results on the KITTI validation set

    Table 1.  APbev/AP3D of car category on KITTI validation set (%)

    Method              Input   IoU=0.5                                   IoU=0.7
                                Easy         Moderate     Hard            Easy         Moderate     Hard
    MonoGRNet[24]       M       54.21/50.51  39.69/36.97  33.06/30.82     24.97/13.88  19.44/10.19  16.30/7.62
    M3D-RPN[5]          M       55.37/48.96  42.49/39.57  35.29/33.01     25.94/20.27  21.18/17.06  17.90/15.21
    AM3D[25]            M       72.64/68.86  51.82/49.19  44.21/42.24     43.75/32.23  28.39/21.09  23.87/17.26
    3DOP[9]             S       55.04/46.04  41.25/34.63  34.55/30.09     12.63/6.55   9.49/5.07    7.59/4.10
    TLNet[26]           S       62.46/59.51  45.99/43.71  41.92/37.99     29.22/18.15  21.88/14.26  18.83/13.72
    Stereo R-CNN[10]    S       87.13/85.84  74.11/66.28  58.93/57.24     68.50/54.11  48.30/36.69  41.47/31.07
    Proposed method     S       87.62/86.49  75.04/72.62  59.31/58.04     69.44/55.26  49.36/37.94  42.11/32.38

    Note: S denotes a stereo image pair as input and M a monocular image. In each cell the value before "/" is APbev and the value after "/" is AP3D.
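
    For reference, the APbev numbers above are based on the overlap between detected and ground-truth boxes in bird's-eye view. The sketch below shows one way such a rotated-box IoU can be computed with shapely, assuming a ground-plane box format (x, z, w, l, ry); it is an illustrative sketch, not the official KITTI evaluation code.

```python
import math
from shapely.geometry import Polygon

def bev_box_corners(x, z, w, l, ry):
    """Return the four ground-plane corners of a rotated box."""
    c, s = math.cos(ry), math.sin(ry)
    corners = [(l / 2, w / 2), (l / 2, -w / 2), (-l / 2, -w / 2), (-l / 2, w / 2)]
    return [(x + cx * c - cz * s, z + cx * s + cz * c) for cx, cz in corners]

def bev_iou(box_a, box_b):
    """Intersection-over-union of two rotated boxes in bird's-eye view."""
    pa, pb = Polygon(bev_box_corners(*box_a)), Polygon(bev_box_corners(*box_b))
    inter = pa.intersection(pb).area
    return inter / (pa.area + pb.area - inter)

# Example: for the car category a detection counts as correct at IoU >= 0.7
print(bev_iou((0.0, 10.0, 1.6, 3.9, 0.0), (0.2, 10.1, 1.6, 3.9, 0.05)))
```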

    Table 2.  APbev and AP3D of car category on KITTI validation set for the proposed method and Pseudo-LiDAR[11] (%)

    Method            APbev (IoU=0.7)                 AP3D (IoU=0.7)
                      Easy    Moderate  Hard          Easy    Moderate  Hard
    Proposed method   69.44   49.36     42.11         55.26   37.94     32.38
    PL+FP[11]         69.7    48.1      41.8          54.9    36.4      31.1

    Table 3.  Ablation experiment of the MpFPN approach on the KITTI dataset (%)

    Path   Conn   APbev (IoU=0.7)                 AP3D (IoU=0.7)
                  Easy    Moderate  Hard          Easy    Moderate  Hard
    ×      ×      65.92   46.11     40            52.25   34.69     30.27
    √      ×      68.01   48.15     41.21         54.78   36.88     31.42
    √      √      69.44   49.36     42.11         55.26   37.94     32.38
  • [1] WANG Z, JIA K. Frustum ConvNet: Sliding frustums to aggregate local point-wise features for amodal 3D object detection[C]//Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway: IEEE Press, 2019: 1742-1749.
    [2] SHI S, WANG X, LI H. PointRCNN: 3D object proposal generation and detection from point cloud[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 770-779.
    [3] QI C R, LITANY O, HE K, et al. Deep Hough voting for 3D object detection in point clouds[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9277-9286.
    [4] SHI S, GUO C, JIANG L, et al. PV-RCNN: Point-voxel feature set abstraction for 3D object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10529-10538.
    [5] BRAZIL G, LIU X. M3D-RPN: Monocular 3D region proposal network for object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9287-9296.
    [6] KU J, PON A D, WASLANDER S L. Monocular 3D object detection leveraging accurate proposals and shape reconstruction[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 11867-11876.
    [7] CHEN Y, TAI L, SUN K, et al. MonoPair: Monocular 3D object detection using pairwise spatial relationships[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 12093-12102.
    [8] LIU L, WU C, LU J, et al. Reinforced axial refinement network for monocular 3D object detection[C]//European Conference on Computer Vision. Berlin: Springer, 2020: 540-556.
    [9] CHEN X, KUNDU K, ZHU Y, et al. 3D object proposals using stereo imagery for accurate object class detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(5): 1259-1272.
    [10] LI P, CHEN X, SHEN S. Stereo R-CNN based 3D object detection for autonomous driving[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7644-7652.
    [11] WANG Y, CHAO W L, GARG D, et al. Pseudo-LiDAR from visual depth estimation: Bridging the gap in 3D object detection for autonomous driving[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 8445-8453.
    [12] SUN J, CHEN L, XIE Y, et al. Disp R-CNN: Stereo 3D object detection via shape prior guided instance disparity estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10548-10557.
    [13] GHIASI G, LIN T Y, LE Q V. NAS-FPN: Learning scalable feature pyramid architecture for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7036-7045.
    [14] ZHAO Q, SHENG T, WANG Y, et al. M2Det: A single-shot object detector based on multi-level feature pyramid network[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2019: 9259-9266.
    [15] TAN M, PANG R, LE Q V. EfficientDet: Scalable and efficient object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10781-10790.
    [16] CAO S, ZHANG X W, MA J W. Transscale feature aggregation network for multiscale pedestrian detection[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1786-1796 (in Chinese). doi: 10.13700/j.bh.1001-5965.2020.0069
    [17] LI X G, FU C P, LI X L, et al. Improved Faster R-CNN for multi-scale object detection[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(7): 1095-1101 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-JSJF201907005.htm
    [18] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2117-2125.
    [19] LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
    [20] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [21] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2961-2969.
    [22] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2009: 248-255.
    [23] GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2012: 3354-3361.
    [24] QIN Z, WANG J, LU Y. MonoGRNet: A geometric reasoning network for monocular 3D object localization[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2019: 8851-8858.
    [25] MA X, WANG Z, LI H, et al. Accurate monocular 3D object detection via color-embedded 3D reconstruction for autonomous driving[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6851-6860.
    [26] QIN Z, WANG J, LU Y. Triangulation learning network: From monocular to stereo 3D object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7615-7623.
    [27] CHANG J R, CHEN Y S. Pyramid stereo matching network[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 5410-5418.
    [28] KU J, MOZIFIAN M, LEE J, et al. Joint 3D proposal generation and object detection from view aggregation[C]//Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway: IEEE Press, 2018: 1-8.
Metrics
  • Article views:  361
  • HTML full-text views:  116
  • PDF downloads:  36
  • Citations: 0
Publication history
  • Received:  2021-09-06
  • Accepted:  2021-09-17
  • Published online:  2021-10-18
  • Issue published:  2022-08-20
