Citation: MA S G, CHEN Q M, HOU Z Q, et al. Lightweight semantic segmentation algorithm based on GLCNet[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(11): 3358-3366 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0822
Most semantic segmentation algorithms based on convolutional neural networks have large numbers of parameters and high computational complexity, which limits their use in real-time processing scenarios. To address this, this paper proposes a lightweight semantic segmentation algorithm based on a global-local context network (GLCNet). The algorithm consists of a global-local context (GLC) module and a multi-resolution fusion (MRF) module. The GLC module learns the global and local context information of the image, using residual connections to strengthen the dependencies between features. Building on this, the MRF module aggregates features from different stages: low-resolution features are first upsampled and then fused with high-resolution features to enhance the spatial information of the higher-level features. On the Cityscapes and CamVid datasets, the algorithm achieves a mean intersection over union (mIoU) of 69.89% and 68.86%, respectively, at speeds of 87 frames/s and 122 frames/s on a single NVIDIA Titan V GPU. The experimental results show that the algorithm strikes a good balance among segmentation accuracy, efficiency, and model size, with only 0.68×10⁶ parameters.
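For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how it is commonly computed, not the evaluation code used in the paper. The function name `mean_iou`, its parameters, and the `ignore_index` convention are illustrative assumptions; labels are assumed to be flat sequences of integer class IDs in the range `0..num_classes-1`.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean intersection over union for two flat sequences of class IDs.

    Hypothetical helper for illustration; not taken from the paper.
    """
    intersection = [0] * num_classes  # pixels where prediction and label agree
    union = [0] * num_classes         # pixels predicted as or labelled as the class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabelled pixels (assumed convention)
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Average only over classes that occur in the prediction or the label.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 1, 1]
    labelled = [0, 1, 1, 1]
    print(mean_iou(predicted, labelled, num_classes=2))
```

In this toy case class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12. Benchmark evaluations typically accumulate the per-class counts over the whole test set before dividing, rather than averaging per-image scores.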