
Semantic segmentation model for remote sensing images based on U-Net++ guided by dual attention

LIU Chunjuan (刘春娟), XIN Yuqiang (辛钰强), WU Xiaosuo (吴小所), YAN Haowen (闫浩文)

Citation: LIU C J, XIN Y Q, WU X S, et al. Semantic segmentation model for remote sensing images based on U-Net++ guided by dual attention[J]. Journal of Beijing University of Aeronautics and Astronautics, 2026, 52(5): 1366-1377 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2024.0122

Corresponding author:

    E-mail: wuxs_laser@lzjtu.edu.cn

  • CLC number: TP751

Semantic segmentation model for remote sensing images based on U-Net++ guided by dual attention

Funds: 

National Key Research and Development Program of China (2022YFB3903604); Key R & D Projects in Gansu Province (20YF8GA035); Gansu Provincial Natural Science Foundation (22JR5RA320); Lanzhou Jiaotong University Youth Science Fund (2021029)

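  • Abstract:

    Assigning land-cover class labels to the pixels of remote sensing images with semantic segmentation algorithms is an important part of intelligent remote sensing image interpretation. In high-resolution remote sensing images, large scale differences between land-cover classes and complex scenes lead to incomplete segmentation of object edges and low segmentation accuracy for small-scale objects. To address these problems, a dual-attention-guided U-Net++ semantic segmentation model is proposed. In the encoding stage, a dual-branch backbone network is built to extract features; mutual attention captures the dependencies between pixels of feature maps at different scales and adaptively fuses features of different scales at the same network depth, which increases the attention paid to small-scale objects. In the decoding stage, a mixed spatial and channel attention mechanism is introduced to narrow the semantic gap between the outputs of sub-decoders at different depths while fusing their semantic information and spatial position representations at different levels, which addresses fine segmentation in complex scenes. Experimental results show that the proposed algorithm reaches a mean intersection over union (mIoU) of 86.77% and 82.73% on the Potsdam and Vaihingen datasets, respectively, and a mean F1 score of 92.32% and 90.79%, respectively. Its overall performance is significantly better than that of the comparison algorithms, including U-Net++, FarSeg, DMAU-Net, and SAPNet, and its segmentation of small-scale objects improves markedly.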
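
    The mutual attention step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's MMA module: the function name `mutual_attention`, the scaled dot-product form, the residual fusion, and the absence of learned projections are all assumptions made for illustration. The sketch only shows how the pixels of two feature maps at different scales can attend to each other.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mutual_attention(feat_a, feat_b):
    """Fuse two feature maps with cross-attention in both directions.

    feat_a: (C, Ha, Wa) features from one branch (hypothetical layout)
    feat_b: (C, Hb, Wb) features from the other branch, at another scale
    Returns two fused maps with the same shapes as the inputs.
    """
    c = feat_a.shape[0]
    a = feat_a.reshape(c, -1).T                  # (Na, C), one row per pixel
    b = feat_b.reshape(c, -1).T                  # (Nb, C)
    scores = a @ b.T / np.sqrt(c)                # (Na, Nb) pixel affinities
    a_from_b = softmax(scores, axis=1) @ b       # each pixel of A attends to B
    b_from_a = softmax(scores.T, axis=1) @ a     # each pixel of B attends to A
    fused_a = (a + a_from_b).T.reshape(feat_a.shape)  # residual fusion (assumed)
    fused_b = (b + b_from_a).T.reshape(feat_b.shape)
    return fused_a, fused_b


rng = np.random.default_rng(0)
fine = rng.standard_normal((16, 8, 8))      # higher-resolution branch
coarse = rng.standard_normal((16, 4, 4))    # lower-resolution branch
fused_fine, fused_coarse = mutual_attention(fine, coarse)
print(fused_fine.shape, fused_coarse.shape)
```

    Because the attention weights are computed between every pixel pair, the two maps do not need the same spatial size; only the channel count must match. A trained module would normally add learned query, key, and value projections before the dot product.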

     

  • Figure 1.  U-Net++ guided by dual attention

    Figure 2.  Convolution block architecture

    Figure 3.  Multi-scale mutual attention module

    Figure 4.  Global-local attention fusion module

    Figure 5.  Qualitative comparison results of ablation experiments on Potsdam dataset

    Figure 6.  Qualitative comparison results of ablation experiments on Vaihingen dataset

    Figure 7.  Qualitative comparison results of comparative experiments on Potsdam dataset

    Figure 8.  Qualitative comparison results of comparative experiments on Vaihingen dataset

    Table 1.  Strategy description for ablation experiments

    Model                               Strategy description
    Baseline                            U-Net++ with a ResNet18 backbone
    Baseline + MMA                      Baseline plus dual-branch backbone input and the MMA module
    Baseline + GLAM                     Baseline plus the GLAM module
    DAU-Net++ (Baseline + MMA + GLAM)   Baseline plus dual-branch backbone input, the MMA module, and the GLAM module

    Table 2.  Quantitative assessment results of ablation experiments on Potsdam dataset

    Model             IoU/%                           mIoU/%  OA/%    mF1/%
                      Impervious surface, building, low vegetation, car
    Baseline          84.37 87.02 79.97 68.57 69.05   80.36   88.63   87.32
    Baseline + MMA    86.04 89.91 83.92 75.37 76.21   84.35   90.84   91.44
    Baseline + GLAM   85.87 90.71 84.64 76.19 75.39   84.66   91.21   91.73
    DAU-Net++         87.72 91.85 86.49 78.21 77.42   86.77   91.97   92.32
     Note: Bold font marks the best value for each metric, and underlining marks the second-best value.
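
    The IoU, mIoU, OA, and mF1 columns reported in the tables are conventionally derived from a pixel-level confusion matrix. The following is a sketch under that assumption: the function name `segmentation_metrics` and the toy three-class matrix are hypothetical, and the source does not state which classes enter the reported means, so the plain average over all classes used here is also an assumption.

```python
import numpy as np


def segmentation_metrics(conf):
    """Per-class IoU plus mIoU, OA and mean F1 from a confusion matrix.

    conf[i, j] = number of pixels of true class i predicted as class j.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                 # correctly labelled pixels per class
    fp = conf.sum(axis=0) - tp         # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp         # belong to the class, but missed
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {
        "IoU": iou,
        "mIoU": iou.mean(),            # unweighted mean over classes (assumed)
        "OA": tp.sum() / conf.sum(),   # overall pixel accuracy
        "mF1": f1.mean(),
    }


# Toy confusion matrix for three hypothetical classes.
toy = [[50, 2, 3],
       [4, 40, 1],
       [6, 0, 44]]
m = segmentation_metrics(toy)
print({k: np.round(v, 4) for k, v in m.items()})
```

    Per class, F1 and IoU are linked by F1 = 2·IoU / (1 + IoU), so the two columns always rank a single class the same way; the averaged values can still diverge across models.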

    Table 3.  Quantitative assessment results of ablation experiments on Vaihingen dataset

    Model             IoU/%                           mIoU/%  OA/%    mF1/%
                      Impervious surface, building, low vegetation, car
    Baseline          88.06 89.64 69.82 78.17 66.02   78.49   89.66   86.58
    Baseline + MMA    90.86 93.22 75.99 81.64 73.40   80.88   90.85   89.29
    Baseline + GLAM   91.15 93.41 75.61 83.95 71.68   81.03   91.04   88.84
    DAU-Net++         92.26 94.51 77.44 85.53 75.31   82.73   91.68   90.79
     Note: Bold font marks the best value for each metric, and underlining marks the second-best value.

    Table 4.  Quantitative assessment results of comparative experiments on Potsdam dataset

    Model             IoU/%                           mIoU/%  OA/%    mF1/%   Floating-point operation rate/(10^9 s^-1)
                      Impervious surface, building, low vegetation, car
    PSPNet[5]         80.78 85.32 78.55 56.21 65.84   76.17   88.32   85.79   67.95
    U-Net[10]         81.63 84.51 76.57 60.38 67.57   76.75   87.69   85.26   63.59
    U-Net++[11]       84.37 87.02 79.97 68.57 69.05   80.36   88.63   87.32   90.16
    EMANet[25]        83.64 90.38 81.15 72.72 71.60   81.87   89.38   88.59   126.49
    FarSeg[26]        84.39 89.63 80.95 74.62 70.46   82.55   89.13   87.89   193.23
    DA-IMRN[21]       87.86 90.21 81.62 74.53 74.39   84.96   90.24   89.73   167.68
    MACU-Net[27]      86.64 90.36 80.69 76.58 73.37   84.76   90.18   90.86   213.58
    DMAU-Net[22]      88.89 92.03 84.94 76.39 75.46   85.68   91.40   91.15   186.74
    SAPNet[23]        88.36 92.14 83.91 78.52 76.31   85.73   91.24   91.68   317.94
    DAU-Net++         87.72 91.85 86.49 78.21 77.42   86.77   91.97   92.32   257.41
     Note: Bold font marks the best value for each metric, and underlining marks the second-best value.

    Table 5.  Quantitative assessment results of comparative experiments on Vaihingen dataset

    Model             IoU/%                           mIoU/%  OA/%    mF1/%
                      Impervious surface, building, low vegetation, car
    PSPNet[5]         85.30 86.44 62.53 76.44 57.39   73.78   86.55   83.18
    U-Net[10]         84.76 87.49 63.28 74.81 59.36   72.18   85.79   83.24
    U-Net++[11]       88.06 89.64 69.82 78.17 66.02   78.49   89.66   86.58
    EMANet[25]        89.76 92.21 72.57 81.63 72.08   80.37   90.63   88.26
    FarSeg[26]        90.58 91.99 73.08 82.79 71.02   79.62   89.98   88.67
    DA-IMRN[21]       88.94 93.61 74.56 82.61 72.83   80.89   90.47   89.45
    MACU-Net[27]      92.21 93.29 73.82 82.95 73.28   81.42   90.58   89.53
    DMAU-Net[22]      92.74 95.26 76.51 83.79 73.64   81.38   90.75   89.96
    SAPNet[23]        92.58 94.62 75.16 84.16 74.24   81.69   91.08   89.44
    DAU-Net++         92.26 94.51 77.44 85.53 75.31   82.73   91.68   90.79
     Note: Bold font marks the best value for each metric, and underlining marks the second-best value.
  • [1] YANG J, ZHANG J Y. U-shaped semantic segmentation network of high-resolution remote sensing images embedded with self-attention mechanism[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(5): 1514-1527 (in Chinese).
    [2] WU Y H, ZHANG Z Z, HUA B, et al. Autonomous cloud detection for remote sensing images using convolutional neural network[J]. Journal of Harbin Institute of Technology, 2020, 52(12): 27-34 (in Chinese).
    [3] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 3431-3440.
    [4] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. (2017-06-17)[2024-03-01]. https://arxiv.org/abs/1706.05587.
    [5] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6230-6239.
    [6] WANG X, LI Z S, HUANG Y P, et al. Multimodal medical image segmentation using multi-scale context-aware network[J]. Neurocomputing, 2022, 486: 135-146.
    [7] DOU F R, ZHANG C F, HU D, et al. EASNet: a multiscale attention semantic segmentation network combined with asymmetric convolution[J]. Journal of Electronic Imaging, 2022, 31(4): 043034.
    [8] LUO J, ZHAO L, ZHU L, et al. Multi-scale receptive field fusion network for lightweight image super-resolution[J]. Neurocomputing, 2022, 493: 314-326.
    [9] LIN D, SHEN D G, SHEN S T, et al. ZigZagNet: fusing top-down and bottom-up context for object segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 7482-7491.
    [10] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]//Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin: Springer, 2015: 234-241.
    [11] ZHOU Z W, SIDDIQUEE M M R, TAJBAKHSH N, et al. U-Net++: a nested U-Net architecture for medical image segmentation[C]//Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Berlin: Springer, 2018: 3-11.
    [12] CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 833-851.
    [13] HE X, ZHOU Y, ZHAO J Q, et al. Swin Transformer embedding U-Net for remote sensing image semantic segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 4408715.
    [14] CUI W, WANG F, HE X, et al. Multi-scale semantic segmentation and spatial relationship recognition of remote sensing images based on an attention model[J]. Remote Sensing, 2019, 11(9): 1044.
    [15] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 3-19.
    [16] QI X Q, LI K Q, LIU P K, et al. Deep attention and multi-scale networks for accurate remote sensing image segmentation[J]. IEEE Access, 2020, 8: 146627-146639.
    [17] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
    [18] FU J, LIU J, TIAN H J, et al. Dual attention network for scene segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 3141-3149.
    [19] HUANG Z L, WANG X G, WEI Y C, et al. CCNet: criss-cross attention for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(6): 6896-6908.
    [20] OKTAY O, SCHLEMPER J, LE FOLGOC L, et al. Attention U-Net: learning where to look for the pancreas[EB/OL]. (2018-05-20)[2024-03-01]. https://arxiv.org/abs/1804.03999.
    [21] ZOU L, ZHANG Z F, DU H J, et al. DA-IMRN: dual-attention-guided interactive multi-scale residual network for hyperspectral image classification[J]. Remote Sensing, 2022, 14(3): 530.
    [22] YANG Y, DONG J W, WANG Y H, et al. DMAU-Net: an attention-based multiscale max-pooling dense network for the semantic segmentation in VHR remote-sensing images[J]. Remote Sensing, 2023, 15(5): 1328.
    [23] LI X, XU F, LIU F, et al. A synergistical attention model for semantic segmentation of remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5400916.
    [24] LIU C J, QIAO Z, YAN H W, et al. Semantic segmentation network of remote sensing images based on dual path supervision[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(3): 732-741 (in Chinese).
    [25] LI X, ZHONG Z S, WU J L, et al. Expectation-maximization attention networks for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 9166-9175.
    [26] ZHENG Z, ZHONG Y F, WANG J J, et al. Foreground-aware relation network for geospatial object segmentation in high spatial resolution remote sensing imagery[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4095-4104.
    [27] LI R, DUAN C X, ZHENG S Y, et al. MACU-Net for semantic segmentation of fine-resolution remotely sensed images[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 8007205.
Publication history
  • Received:  2024-03-04
  • Accepted:  2024-04-08
  • Published online:  2024-06-27
  • Issue published:  2026-05-26
