A multi-task traffic scene detection model based on cross-attention

NIU Guochen, WANG Xiaonan

Citation: NIU G C, WANG X N. A multi-task traffic scene detection model based on cross-attention[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(5): 1491-1499 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0610


doi: 10.13700/j.bh.1001-5965.2022.0610
Funds: Tianjin Science and Technology Plan Project (17ZXHLGX00120); Tianjin Research Innovation Project for Postgraduate Students (2021YJSO2S30); the Fundamental Research Funds for the Central Universities (3122022PY17); Civil Aviation University of China Postgraduate Research and Innovation Project (2021YJS025)
    Corresponding author: E-mail: niu_guochen@139.com

  • CLC number: TP391.4

  • Abstract:

    Perception is fundamental and critical to autonomous driving, yet most single models cannot perform detection tasks such as traffic object, drivable area, and lane line detection simultaneously. A multi-task traffic scene detection model based on cross-attention is proposed that detects traffic objects, drivable areas, and lane lines at the same time. An encoder-decoder network extracts initial features, which are enhanced by hybrid dilated convolution; a cross-attention module then yields separate segmentation and detection feature maps. Semantic segmentation is performed on the segmentation feature map, and object detection on the detection feature map. Experimental results show that, on the challenging BDD100K dataset, the proposed model outperforms other multi-task models in task accuracy and overall computational efficiency.
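
    The pipeline named in the abstract (dilated-convolution feature enhancement, then a cross-attention split into two task-specific maps) can be sketched in code. The PyTorch sketch below is an illustration under stated assumptions, not the paper's exact design: channel counts, dilation rates, head counts, and the use of nn.MultiheadAttention are stand-ins for the unstated internals of the HDC and CA modules (Figures 3 and 5).

```python
# Illustrative sketch only: all sizes and the attention formulation
# are assumptions, not the configuration reported in the paper.
import torch
import torch.nn as nn


class HDCBlock(nn.Module):
    """Hybrid dilated convolution: stacked 3x3 convs whose dilation
    rates (e.g., 1, 2, 5) share no common factor, so the combined
    receptive field has no holes, avoiding the gridding problem
    illustrated in Figure 4."""
    def __init__(self, channels: int, rates=(1, 2, 5)):
        super().__init__()
        self.body = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


class CrossAttention(nn.Module):
    """Toy cross-attention: each task branch queries the other branch's
    features, producing one detection and one segmentation map."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.det_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.seg_attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, f_det: torch.Tensor, f_seg: torch.Tensor):
        b, c, h, w = f_det.shape
        det = f_det.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        seg = f_seg.flatten(2).transpose(1, 2)
        det_out, _ = self.det_attn(det, seg, seg)  # detection queries segmentation
        seg_out, _ = self.seg_attn(seg, det, det)  # segmentation queries detection
        det_map = det_out.transpose(1, 2).reshape(b, c, h, w)
        seg_map = seg_out.transpose(1, 2).reshape(b, c, h, w)
        return det_map, seg_map


# Usage: enhance shared features with HDC, then split into task maps.
feats = torch.randn(1, 64, 40, 40)              # hypothetical backbone output
enhanced = HDCBlock(64)(feats)
det_map, seg_map = CrossAttention(64)(enhanced, enhanced)
print(det_map.shape, seg_map.shape)             # both (1, 64, 40, 40)
```

    Feeding the same tensor into both attention inputs here only keeps the sketch self-contained; in the model described, the two inputs would come from the two decoder branches.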


  • Figure 1.  Results of multi-task traffic scene detection

    Figure 2.  SPPF structure

    Figure 3.  CA structure

    Figure 4.  Illustration of the gridding problem

    Figure 5.  HDC structure

    Figure 6.  Multi-task traffic scene detection model structure

    Figure 7.  Visualization of the traffic object detection results

    Figure 8.  Visualization of the drivable area detection results

    Figure 9.  Visualization of the lane detection results

    Table 1.  Comparison of the traffic object detection results (unit: %)

    Model            Recall  mAP50
    Faster R-CNN[5]  77.2    55.6
    YOLOv5s          86.8    77.2
    MultiNet[16]     81.3    60.2
    DLT-Net[19]      89.4    68.4
    YOLOP[20]        89.2    76.5
    TDL-YOLO         88.6    78.0

    Table 2.  Comparison of the drivable area detection results (unit: %)

    Model         mIoU
    ERFNet[11]    68.7
    PSPNet[1]     89.6
    MultiNet[16]  71.6
    DLT-Net[19]   72.1
    YOLOP[20]     91.5
    TDL-YOLO      91.4

    Table 3.  Comparison of the lane detection results (unit: %)

    Model         Accuracy  IoU
    ENet[22]      34.12     14.64
    SCNN[2]       35.79     15.84
    ENet-SAD[13]  36.56     16.02
    YOLOP[20]     70.50     26.20
    TDL-YOLO      72.30     26.50

    Table 4.  Detection results under different lighting conditions

    Model      Lighting  mAP50/%  mIoU/%  Accuracy/%  IoU/%
    YOLOP[20]            77.8     91.7    71.3        26.6
                         77.4     91.0    71.0        26.3
                         73.7     91.2    69.1        25.5
    TDL-YOLO             79.5     91.5    73.1        26.9
                         79.0     91.4    72.8        26.5
                         75.0     91.2    71.2        26.0

    Table 5.  Ablation experiment (unit: %)

    Scheme        Recall  mAP50  mIoU  Accuracy  IoU
    Detection     88.5    77.0
    Segmentation                 92.0  74.7      27.0
    Multi-task    88.6    78.0   91.4  72.3      26.5
  • [1] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6230-6239.
    [2] PAN X G, SHI J P, LUO P, et al. Spatial as deep: Spatial CNN for traffic scene understanding[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2018, 32(1): 12301.
    [3] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 580-587.
    [4] GIRSHICK R. Fast R-CNN[C]// Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1440-1448.
    [5] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
    [6] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 21-37.
    [7] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 779-788.
    [8] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2980-2988.
    [9] TIAN Z, SHEN C H, CHEN H, et al. FCOS: Fully convolutional one-stage object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9627-9636.
    [10] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 3431-3440.
    [11] ROMERA E, ÁLVAREZ J M, BERGASA L M, et al. ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation[J]. IEEE Transactions on Intelligent Transportation Systems, 2018, 19(1): 263-272. doi: 10.1109/TITS.2017.2750080
    [12] YU C Q, WANG J B, PENG C, et al. BiSeNet: Bilateral segmentation network for real-time semantic segmentation[C]// Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 325-341.
    [13] HOU Y N, MA Z, LIU C X, et al. Learning lightweight lane detection CNNs by self attention distillation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 1013-1021.
    [14] TABELINI L, BERRIEL R, PAIXAO T M, et al. Keep your eyes on the lane: Real-time attention-guided lane detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 294-302.
    [15] QIN Z Q, WANG H Y, LI X. Ultra fast structure-aware deep lane detection[C]//European Conference on Computer Vision. Berlin: Springer, 2020: 276-291.
    [16] TEICHMANN M, WEBER M, ZOELLNER M, et al. MultiNet: Real-time joint semantic reasoning for autonomous driving[C]//Proceedings of the IEEE Intelligent Vehicles Symposium. Piscataway: IEEE Press, 2018: 1013-1020.
    [17] LIU Z W, FAN S H, QI M Y, et al. Multi-task perception algorithm of autonomous driving based on temporal fusion[J]. Journal of Traffic and Transportation Engineering, 2021, 21(4): 223-234 (in Chinese).
    [18] LIU J, CHEN L L, LI H B. A real-time detection model for multi-task traffic objects based on humanoid vision[J]. Automotive Engineering, 2021, 43(1): 50-58 (in Chinese).
    [19] QIAN Y Q, DOLAN J M, YANG M. DLT-Net: Joint detection of drivable areas, lane lines, and traffic objects[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(11): 4670-4679.
    [20] WU D, LIAO M W, ZHANG W T, et al. YOLOP: You only look once for panoptic driving perception[EB/OL]. (2021-08-25) [2022-06-25]. https://arxiv.org/abs/2108.11250.
    [21] YU F, CHEN H F, WANG X, et al. BDD100K: A diverse driving dataset for heterogeneous multitask learning[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 2633-2642.
    [22] PASZKE A, CHAURASIA A, KIM S, et al. ENet: A deep neural network architecture for real-time semantic segmentation[EB/OL]. (2016-06-30)[2022-06-25]. https://arxiv.org/abs/1606.02147.
Publication history
  • Received: 2022-07-12
  • Accepted: 2022-08-14
  • Published online: 2023-02-01
  • Issue date: 2024-05-29
