Citation: MA S G, CHEN Q M, HOU Z Q, et al. Lightweight semantic segmentation algorithm based on GLCNet[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(11): 3358-3366 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0822
Most semantic segmentation algorithms based on convolutional neural networks have large numbers of parameters and high computational complexity, which limits their use in real-time processing scenarios. This paper therefore proposes a lightweight semantic segmentation algorithm based on a global-local context network (GLCNet). The network consists of a global-local context (GLC) module and a multi-resolution fusion (MRF) module. The GLC module learns the global and local context information of the image and uses residual connections to strengthen the dependencies between features. On this basis, the MRF module aggregates features from different stages: low-resolution features are first upsampled and then fused with high-resolution features to enhance the spatial information of the higher-level features. On the Cityscapes and CamVid datasets, the algorithm achieves a mean intersection over union (mIoU) of 69.89% and 68.86%, respectively, at speeds of 87 frames/s and 122 frames/s on a single NVIDIA Titan V GPU. The experimental results show that the algorithm achieves a good balance among segmentation accuracy, efficiency, and parameter count, with only 0.68×10^6 parameters.
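The upsample-then-fuse step that the abstract attributes to the MRF module can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the interpolation method or the fusion operator, so nearest-neighbour upsampling and element-wise addition are assumptions here, and the names `upsample_nearest` and `fuse` are hypothetical. Feature maps are reduced to a single channel held in nested lists to keep the example dependency-free.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, rows x cols


def upsample_nearest(feat: FeatureMap, scale: int) -> FeatureMap:
    """Nearest-neighbour upsampling: repeat each value `scale` times along both axes.

    Assumption: the abstract does not name the interpolation method.
    """
    out: FeatureMap = []
    for row in feat:
        wide = [v for v in row for _ in range(scale)]   # widen the row
        out.extend(list(wide) for _ in range(scale))    # repeat the widened row
    return out


def fuse(low_res: FeatureMap, high_res: FeatureMap) -> FeatureMap:
    """Upsample `low_res` to the size of `high_res`, then add element-wise.

    Assumption: element-wise addition stands in for the unspecified fusion operator.
    """
    scale = len(high_res) // len(low_res)
    if (len(low_res) * scale != len(high_res)
            or len(low_res[0]) * scale != len(high_res[0])):
        raise ValueError("high_res must be an integer multiple of low_res in both dimensions")
    up = upsample_nearest(low_res, scale)
    return [[a + b for a, b in zip(r_up, r_hi)] for r_up, r_hi in zip(up, high_res)]


# A 2x2 low-resolution map fused with a 4x4 high-resolution map.
low = [[1.0, 2.0],
       [3.0, 4.0]]
high = [[0.5] * 4 for _ in range(4)]
fused = fuse(low, high)
```

In a real network the same idea would apply per channel to tensors, with a learned or bilinear upsampling layer in place of the nested-list repetition used here.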