
Lightweight semantic segmentation algorithm based on GLCNet

MA Sugang, CHEN Qimei, HOU Zhiqiang, YANG Xiaobao, ZHANG Zixian

Citation: MA S G, CHEN Q M, HOU Z Q, et al. Lightweight semantic segmentation algorithm based on GLCNet[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(11): 3358-3366 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0822

doi: 10.13700/j.bh.1001-5965.2022.0822
Funds: National Natural Science Foundation of China (62072370); Science and Technology Project of Xi’an City (22GXFW0125)
    Corresponding author:

    E-mail: msg@xupt.edu.cn

  • CLC number: TP391.4
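  • Abstract:

    Most semantic segmentation algorithms based on convolutional neural networks carry a large number of parameters and high computational complexity, which limits their use in real-time processing scenarios. To address this problem, a lightweight semantic segmentation algorithm based on the global-local context network (GLCNet) is proposed. The algorithm consists mainly of a global-local context (GLC) module and a multi-resolution fusion (MRF) module. The GLC module learns the global information and the local context information of an image and uses residual connections to strengthen the dependencies between features. On this basis, the MRF module aggregates features from different stages: low-resolution features are upsampled and fused with high-resolution features to enhance the spatial information of high-level features. Tested on the Cityscapes and Camvid datasets, the algorithm reaches a mean intersection over union (mIoU) of 69.89% and 68.86%, respectively, and on a single NVIDIA Titan V GPU it runs at 87 frames/s and 122 frames/s, respectively. The experimental results show that the proposed algorithm strikes a good balance among segmentation accuracy, efficiency, and parameter count, with only 0.68×10⁶ parameters.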
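    The fusion step described in the abstract (upsample a low-resolution feature map, then combine it with a high-resolution one) can be illustrated with a minimal pure-Python sketch. This is not the authors' MRF module: nearest-neighbour upsampling and element-wise addition are assumptions chosen for simplicity, since the abstract does not state the interpolation method or the fusion operator, and the names `upsample_nearest` and `fuse` are hypothetical.

```python
def upsample_nearest(feature, factor):
    """Nearest-neighbour upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in feature:
        # Repeat every value `factor` times along the width.
        wide = [v for v in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(high_res, low_res):
    """Upsample low_res to the size of high_res and add element-wise.

    Assumption: fusion by addition; the paper may use a different operator.
    """
    factor = len(high_res) // len(low_res)
    if (len(low_res) * factor != len(high_res)
            or len(low_res[0]) * factor != len(high_res[0])):
        raise ValueError("high_res size must be an integer multiple of low_res size")
    up = upsample_nearest(low_res, factor)
    return [[h + u for h, u in zip(h_row, u_row)]
            for h_row, u_row in zip(high_res, up)]


high = [[1, 1, 1, 1] for _ in range(4)]  # 4x4 "high-resolution" map
low = [[10, 20], [30, 40]]               # 2x2 "low-resolution" map
fused = fuse(high, low)
```

    Here each value of the 2x2 map is spread over a 2x2 block before being added, so the fused map keeps the spatial size of the high-resolution input.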

     

  • Figure 1.  Overall framework of GLCNet

    Figure 2.  GLC module

    Figure 3.  Visual comparison results on the Cityscapes dataset

    Figure 4.  Visual comparison results on the Camvid dataset

    Table 1.  Test results of different algorithms on the Cityscapes dataset

    Algorithm        Backbone     Parameters (×10⁶)  Speed/(frames·s⁻¹)  mIoU/%
    ENet[14]         None         0.4                76.9                58.3
    SegNet[40]       VGG16        29.5               14.6                56.1
    ICNet[17]        PSPNet50     26.50              30.3                69.5
    BiSeNet[20]      Xception39   5.80               106                 68.4
    FSSNet[41]       None         0.2                51                  65.6
    SwiftNet[43]     MobileNetv2  2.4                27.7                69.7
    EDANet[31]       None         0.68               81                  67.3
    DFANet[18]       Xception     4.8                120                 67.1
    ESNet[15]        None         1.6                41.7                69.1
    Fast-SCNN[37]    None         1.11               123                 68.0
    LEDNet[22]       None         0.91               71                  70.6
    CGNet[38]        None         0.5                -                   64.8
    NDNet[42]        None         0.5                40                  65.3
    CFPNet[45]       None         0.55               30                  70.1
    BSDNet[44]       Xception     1.2                84.6                68.3
    BiSeNet V2[21]   None         3.40               156                 72.6
    SGCPNet[13]      MobileNet    0.61               178.5               69.5
    Proposed         None         0.68               87                  69.89
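
    For readers unfamiliar with the metric, the following minimal sketch computes mIoU from a confusion matrix using its standard definition (per-class intersection over union, averaged over classes). The function name `mean_iou` and the toy labels are hypothetical, and the paper's exact evaluation protocol (for example, how ignored labels are handled) is not given on this page, so this is an illustration under those assumptions rather than the authors' evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two flat sequences of class labels."""
    # confusion[t][p] counts pixels of true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # wrongly labelled as c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


target = [0, 0, 1, 1, 2, 2]  # toy ground-truth labels
pred = [0, 1, 1, 1, 2, 0]    # toy predicted labels
miou = mean_iou(pred, target, 3)
```

    In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so `miou` is 0.5; the tables report the same quantity as a percentage.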

    Table 2.  Test results of different algorithms on the Camvid dataset

    Algorithm      Backbone    Parameters (×10⁶)  mIoU/%
    ENet[14]       None        0.36               51.3
    SegNet[40]     VGG16       29.50              55.6
    BiSeNet[20]    Xception39  -                  65.6
    BiSeNet[20]    ResNet18    49                 68.7
    DFANet[18]     Xception    7.80               64.7
    DABNet[23]     None        0.76               66.4
    CGNet[38]      None        0.5                65.6
    RGPNet[49]     None        17.7               66.9
    FDDWNet[47]    None        0.8                66.9
    LDPNet[48]     None        0.8                67.3
    LRNNet[50]     None        0.67               67.6
    HPNet[51]      None        -                  68.0
    BCPNet[52]     MobileNet   0.61               67.8
    BSDNet[44]     ResNet50    22.8               67.8
    FBSNet[46]     None        0.62               68.9
    Proposed       None        0.68               68.86

    Table 3.  Ablation experiment results

                                 Fusion method
    Module                       Addition  Concatenation  Residual connection  MRF  mIoU/%  Parameters (×10⁶)
    GLC                                                                             66.31   0.80
    GLC                                                                             67.22   0.67
    GLC                                                                             67.39   0.67
    (2,2,2,4,4,8,8,16,16)                                                           67.39   0.67
    (2,2,2,2,4,4,8,8,16)                                                            67.25   0.67
    (2,2,2,2,4,8,8,16,16)                                                           67.61   0.67
    (2,2,2,2,2,4,8,8,16,16)                                                         67.49   0.69
    (1,1,1,1,4,4,8,8,12)                                                            68.15   0.67
    GLCNet                                                                          68.86   0.68
  • [1] SIAM M, GAMAL M, ABDEL-RAZEK M, et al. A comparative study of real-time semantic segmentation for autonomous driving[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2018: 700-710.
    [2] ZHENG Y X, HAO P Y, WU D E, et al. Medical image segmentation based on multi-layer features and spatial information distillation[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(8): 1409-1417 (in Chinese).
    [3] SHI W J, XU J W, ZHU D C, et al. RGB-D semantic segmentation and label-oriented voxelgrid fusion for accurate 3D semantic mapping[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(1): 183-197. doi: 10.1109/TCSVT.2021.3056726
    [4] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 3431-3440.
    [5] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]//Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin: Springer, 2015: 234-241.
    [6] YU C Q, WANG J B, GAO C X, et al. Context prior for scene segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 12413-12422.
    [7] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848. doi: 10.1109/TPAMI.2017.2699184
    [8] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6230-6239.
    [9] YUAN Y H, HUANG L, GUO J Y, et al. OCNet: Object context network for scene parsing[EB/OL]. (2021-03-15)[2022-09-01]. http://arxiv.org/abs/1809.00916.
    [10] CHENG H K, CHUNG J, TAI Y W, et al. CascadePSP: Toward class-agnostic and very high-resolution segmentation via global and local refinement[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 8887-8896.
    [11] HE J J, DENG Z Y, ZHOU L, et al. Adaptive pyramid context network for semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7511-7520.
    [12] NEKRASOV V, SHEN C H, REID I. Light-weight RefineNet for real-time semantic segmentation[EB/OL]. (2018-10-08)[2022-09-01]. http://arxiv.org/abs/1810.03272.
    [13] HAO S J, ZHOU Y, GUO Y R, et al. Real-time semantic segmentation via spatial-detail guided context propagation[J/OL]. IEEE Transactions on Neural Networks and Learning Systems, 2022: 1-12[2022-09-01]. https://ieeexplore.ieee.org/document/9729997. DOI: 10.1109/TNNLS.2022.3154443.
    [14] PASZKE A, CHAURASIA A, KIM S, et al. ENet: A deep neural network architecture for real-time semantic segmentation[EB/OL]. (2016-06-07)[2022-09-01]. http://arxiv.org/abs/1606.02147.
    [15] WANG Y, ZHOU Q, XIONG J, et al. ESNet: An efficient symmetric network for real-time semantic segmentation[C]//Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision. Berlin: Springer, 2019: 41-52.
    [16] MEHTA S, RASTEGARI M, CASPI A, et al. ESPNet: Efficient spatial pyramid of dilated convolutions for semantic segmentation[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 561-580.
    [17] ZHAO H S, QI X J, SHEN X Y, et al. ICNet for real-time semantic segmentation on high-resolution images[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 418-434.
    [18] LI H C, XIONG P F, FAN H Q, et al. DFANet: Deep feature aggregation for real-time semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 9514-9523.
    [19] HONG Y D, PAN H H, SUN W C, et al. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes[EB/OL]. (2021-09-01)[2022-09-01]. http://arxiv.org/abs/2101.06085.
    [20] YU C Q, WANG J B, PENG C, et al. BiSeNet: Bilateral segmentation network for real-time semantic segmentation[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 334-349.
    [21] YU C Q, GAO C X, WANG J B, et al. BiSeNet V2: Bilateral network with guided aggregation for real-time semantic segmentation[J]. International Journal of Computer Vision, 2021, 129(11): 3051-3068. doi: 10.1007/s11263-021-01515-2
    [22] WANG Y, ZHOU Q, LIU J, et al. LEDNet: A lightweight encoder-decoder network for real-time semantic segmentation[C]//Proceedings of the IEEE International Conference on Image Processing. Piscataway: IEEE Press, 2019: 1860-1864.
    [23] LI G, YUN I, KIM J, et al. DABNet: Depth-wise asymmetric bottleneck for real-time semantic segmentation[EB/OL]. (2019-10-01)[2022-09-01]. http://arxiv.org/abs/1907.11357.
    [24] GAO R. Rethinking dilated convolution for real-time semantic segmentation[EB/OL]. (2021-11-18)[2022-09-01]. http://arxiv.org/abs/2111.09957.
    [25] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
    [26] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 11531-11539.
    [27] HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 13708-13717.
    [28] GAO Z L, XIE J T, WANG Q L, et al. Global second-order pooling convolutional networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 3019-3028.
    [29] HUANG Z L, WANG X G, HUANG L C, et al. CCNet: Criss-cross attention for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 603-612.
    [30] QIN Z Q, ZHANG P Y, WU F, et al. FcaNet: Frequency channel attention networks[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2021: 763-772.
    [31] LO S Y, HANG H M, CHAN S W, et al. Efficient dense modules of asymmetric convolution for real-time semantic segmentation[C]//Proceedings of the ACM Multimedia Asia. New York: ACM, 2019: 1-6.
    [32] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 2818-2826.
    [33] HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[EB/OL]. (2017-04-17)[2022-09-01]. http://arxiv.org/abs/1704.04861.
    [34] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 1-9.
    [35] CORDTS M, OMRAN M, RAMOS S, et al. The Cityscapes dataset for semantic urban scene understanding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 3213-3223.
    [36] BROSTOW G J, SHOTTON J, FAUQUEUR J, et al. Segmentation and recognition using structure from motion point clouds[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2008: 44-57.
    [37] POUDEL R P K, LIWICKI S, CIPOLLA R. Fast-SCNN: Fast semantic segmentation network[EB/OL]. (2019-02-12)[2022-09-01]. http://arxiv.org/abs/1902.04502.
    [38] WU T Y, TANG S, ZHANG R, et al. CGNet: A light-weight context guided network for semantic segmentation[J]. IEEE Transactions on Image Processing, 2021, 30: 1169-1179. doi: 10.1109/TIP.2020.3042065
    [39] ROMERA E, ALVAREZ J M, BERGASA L M, et al. ERFNet: Efficient residual factorized convNet for real-time semantic segmentation[J]. IEEE Transactions on Intelligent Transportation Systems, 2017, 19(1): 263-272.
    [40] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
    [41] ZHANG X T, CHEN Z X, WU Q M J, et al. Fast semantic segmentation for scene perception[J]. IEEE Transactions on Industrial Informatics, 2019, 15(2): 1183-1192.
    [42] YANG Z G, YU H S, FU Q, et al. NDNet: Narrow while deep network for real-time semantic segmentation[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(9): 5508-5519. doi: 10.1109/TITS.2020.2987816
    [43] ORŠIĆ M, KREŠO I, BEVANDIĆ P, et al. In defense of pre-trained ImageNet architectures for real-time semantic segmentation of road-driving images[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 12599-12608.
    [44] YE L, ZENG J X, YANG Y, et al. BSDNet: Balanced sample distribution network for real-time semantic segmentation of road scenes[J]. IEEE Access, 2021, 9: 84034-84044. doi: 10.1109/ACCESS.2021.3087510
    [45] LOU A G, LOEW M. CFPNet: Channel-wise feature pyramid for real-time semantic segmentation[C]//Proceedings of the IEEE International Conference on Image Processing. Piscataway: IEEE Press, 2021: 1894-1898.
    [46] GAO G W, XU G A, LI J C, et al. FBSNet: A fast bilateral symmetrical network for real-time semantic segmentation[J]. IEEE Transactions on Multimedia, 2022, 25: 3273-3283.
    [47] LIU J, ZHOU Q, QIANG Y, et al. FDDWNet: A lightweight convolutional neural network for real-time semantic segmentation[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE Press, 2020: 2373-2377.
    [48] HU X G, JING L Y. LDPNet: A lightweight densely connected pyramid network for real-time semantic segmentation[J]. IEEE Access, 2020, 8: 212647-212658.
    [49] ARANI E, MARZBAN S, PATA A, et al. RGPNet: A real-time general purpose semantic segmentation[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2021: 3008-3017.
    [50] JIANG W H, XIE Z Z, LI Y Y, et al. LRNNet: A light-weighted network with efficient reduced non-local operation for real-time semantic segmentation[C]//Proceedings of the IEEE International Conference on Multimedia & Expo Workshops. Piscataway: IEEE Press, 2020: 1-6.
    [51] DONG G S, YAN Y, SHEN C H, et al. Real-time high-performance semantic image segmentation of urban street scenes[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(6): 3258-3274. doi: 10.1109/TITS.2020.2980426
    [52] HAO S J, ZHOU Y, GUO Y R. Bi-direction context propagation network for real-time semantic segmentation[EB/OL]. (2022-03-19)[2022-09-01]. http://arxiv.org/abs/2005.11034.
Publication history
  • Received:  2022-09-29
  • Accepted:  2022-11-07
  • Published online:  2022-11-30
  • Issue published:  2024-11-30
