
A Few-Shot Segmentation Method Combining Global and Local Similarity

LIU Yuxuan, MENG Fanman, LI Hongliang, YANG Jiaying, WU Qingbo, XU Linfeng

Citation: LIU Yuxuan, MENG Fanman, LI Hongliang, et al. A few-shot segmentation method combining global and local similarity[J]. Journal of Beijing University of Aeronautics and Astronautics, 2021, 47(3): 665-674. doi: 10.13700/j.bh.1001-5965.2020.0450 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2020.0450

Details
    Author biographies:

    LIU Yuxuan   Male, M.S. candidate. Research interests: computer vision and image segmentation.

    MENG Fanman   Male, Ph.D., associate professor and doctoral supervisor. Research interests: intelligent image analysis and deep learning.

    Corresponding author:

    MENG Fanman, E-mail: fmmeng@uestc.edu.cn

  • CLC number: G202;U285.49;G623.58

A few-shot segmentation method combining global and local similarity

Funds: 

National Natural Science Foundation of China 61871087

Natural Science Foundation of Sichuan Science and Technology Department 2018JY0141

  • Abstract:

    To address the problem of extracting the common information between support and query images in few-shot segmentation, a new few-shot segmentation model is proposed that combines global and local similarity to achieve more generalizable few-shot segmentation. Specifically, based on the similarity between the global and local features of the support and query images, a novel attention map generator is proposed to produce the attention map and region segmentation of the query image. The proposed attention map generator consists of two cascaded modules: a global guider and a local guider. In the global guider, a new exponential-function-based global similarity metric is proposed to model the relation between the query image features and the global features of the support image, outputting foreground-enhanced query features. In the local guider, a local relation matrix is introduced to model the local similarity between support and query features, yielding a class-agnostic attention map. Extensive experiments on the Pascal-5i dataset show that the method reaches 59.9% mIoU under the 1-shot setting and 61.9% mIoU under the 5-shot setting, outperforming existing methods.
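    The two cascaded modules described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the exact exponential global similarity and the local relation matrix are not specified here, so the forms below (exp(cos − 1) as the global metric, a softmax-normalized feature inner product as the local relation) and all function names are assumptions for illustration only.

    ```python
    import numpy as np

    def global_guider(query_feat, support_proto):
        """Foreground-enhance query features with a global similarity map.

        query_feat: (C, H, W) query feature map; support_proto: (C,) global
        (e.g. masked-average) support feature. The exponential similarity
        below is an assumed stand-in for the paper's exponential metric.
        """
        C, H, W = query_feat.shape
        q = query_feat.reshape(C, -1)                      # (C, HW)
        # cosine similarity between each query location and the support prototype
        cos = (support_proto @ q) / (
            np.linalg.norm(support_proto) * np.linalg.norm(q, axis=0) + 1e-8)
        sim = np.exp(cos - 1.0)                            # in (0, 1], peaks at cos = 1
        return query_feat * sim.reshape(1, H, W)           # foreground-enhanced features

    def local_guider(query_feat, support_feat, support_mask):
        """Class-agnostic attention map from a local relation matrix.

        Each query location attends to every support location (a non-local
        style relation matrix); the attention mass falling on the support
        foreground mask becomes that query location's attention value.
        """
        C, H, W = query_feat.shape
        q = query_feat.reshape(C, -1)                      # (C, HW)
        s = support_feat.reshape(C, -1)                    # (C, HW)
        relation = q.T @ s                                 # (HW, HW) local relation matrix
        relation = np.exp(relation - relation.max(axis=1, keepdims=True))
        relation /= relation.sum(axis=1, keepdims=True)    # softmax over support locations
        attn = relation @ support_mask.reshape(-1)         # mass on support foreground
        return attn.reshape(H, W)                          # values in [0, 1]
    ```

    Cascading the two (global enhancement of the query features, then local attention against the support features) mirrors the described pipeline; the real model operates on deep backbone features and feeds the attention map to an up-sampling network for the final segmentation.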

     

  • Figure 1.  General framework of proposed method

    Figure 2.  Structure of attention map generator

    Figure 3.  Detailed structure of global guider

    Figure 4.  Detailed structure of local guider

    Figure 5.  Framework of up-sampling network

    Figure 6.  Some visualized high-quality segmentation results

    Figure 7.  Some visualized low-quality segmentation results

    Table  1.   Split of the four subsets of Pascal-5i

    Subset   Classes
    Fold0    aeroplane, bicycle, bird, boat, bottle
    Fold1    bus, car, cat, chair, cow
    Fold2    dining table, dog, horse, motorbike, person
    Fold3    potted plant, sheep, sofa, train, tv/monitor
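    The tables that follow report two metrics, mIoU and FB-IoU. As a reading aid, here is a minimal sketch of how these metrics are commonly computed in this line of work; the exact evaluation protocol follows the cited benchmarks, and the helper names `iou`, `miou`, and `fb_iou` are hypothetical.

    ```python
    import numpy as np

    def iou(pred, gt):
        """Intersection over union of two boolean masks (empty union -> 1.0)."""
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        return 1.0 if union == 0 else inter / union

    def miou(per_class_fg_ious):
        """mIoU: mean foreground IoU over the five held-out classes of one fold."""
        return float(np.mean(per_class_fg_ious))

    def fb_iou(pred, gt):
        """FB-IoU: mean of foreground IoU and background IoU for one prediction."""
        return 0.5 * (iou(pred, gt) + iou(~pred, ~gt))
    ```

    mIoU averages only over foreground classes and is therefore harder to inflate with large backgrounds, which is why both metrics are reported side by side.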

    Table  2.   Comparative 1-shot mIoU results (%) of proposed method and existing methods with different backbone networks

    Backbone    Method       F0     F1     F2     F3     Mean
    VGG16       OSLSM[4]     33.6   55.3   40.9   33.5   40.8
                Co-FCN[29]   36.7   50.6   44.9   32.4   41.2
                SG-One[1]    40.2   58.4   48.4   38.4   46.4
                PANet[6]     42.3   58.0   51.1   41.2   48.2
                FWB[8]       47.0   59.6   52.6   48.3   51.9
                Ours         50.3   65.3   53.0   50.8   54.9
    ResNet50    CANet[2]     52.5   65.9   51.3   51.9   55.4
                LTM[3]       54.6   65.6   56.6   51.3   57.0
                CRNet[30]    -      -      -      -      55.7
                Ours         54.6   67.8   57.4   52.1   58.0
    ResNet101   FWB[8]       51.3   64.5   56.7   52.2   56.2
                Ours         57.5   68.7   58.7   54.5   59.9

    Table  3.   Comparative 5-shot mIoU results (%) of proposed method and existing methods with different backbone networks

    Backbone    Method       F0     F1     F2     F3     Mean
    VGG16       OSLSM[4]     35.9   58.1   42.7   39.1   44.0
                Co-FCN[29]   37.5   50.0   44.1   33.9   41.4
                SG-One[1]    41.9   58.6   48.6   39.4   47.1
                PANet[6]     51.8   64.6   59.8   46.5   55.7
                FWB[8]       50.9   62.9   56.5   50.1   55.1
                Ours         50.3   66.3   54.7   55.3   56.7
    ResNet50    CANet[2]     55.5   67.8   51.9   53.2   57.1
                LTM[3]       56.4   66.6   56.9   56.8   59.2
                CRNet[30]    -      -      -      -      58.8
                Ours         54.8   68.1   59.9   56.2   59.8
    ResNet101   FWB[8]       54.9   67.4   62.2   55.3   60.0
                Ours         58.1   69.8   60.8   58.9   61.9

    Table  4.   Comparative 1-shot FB-IoU results (%) of proposed method and existing methods with different backbone networks

    Backbone    Method       F0     F1     F2     F3     Mean
    VGG16       OSLSM[4]     -      -      -      -      61.3
                Co-FCN[29]   -      -      -      -      60.1
                SG-One[1]    -      -      -      -      63.1
                PANet[6]     -      -      -      -      66.5
                Ours         68.6   77.3   65.3   68.4   69.9
    ResNet50    CANet[2]     71.0   76.7   54.0   67.2   67.2
                CRNet[30]    -      -      -      -      66.8
                Ours         71.5   78.7   70.6   69.1   72.5
    ResNet101   Ours         73.7   79.4   71.8   70.2   73.8

    Table  5.   Comparative 5-shot FB-IoU results (%) of proposed method and existing methods with different backbone networks

    Backbone    Method       F0     F1     F2     F3     Mean
    VGG16       OSLSM[4]     -      -      -      -      61.5
                Co-FCN[29]   -      -      -      -      60.2
                SG-One[1]    -      -      -      -      65.9
                PANet[6]     -      -      -      -      70.7
                Ours         68.0   77.6   66.5   71.8   71.0
    ResNet50    CANet[2]     74.2   80.3   57.0   66.8   69.6
                CRNet[30]    -      -      -      -      71.5
                Ours         71.1   78.8   72.7   69.7   73.1
    ResNet101   Ours         73.5   80.0   72.6   72.9   74.8

    Table  6.   Comparative mIoU results (%) of global similarity metrics

    Global metric                        1-shot   5-shot
    Cosine distance                      51.3     53.2
    Channel-wise concatenation           46.5     47.2
    Proposed global similarity metric    53.8     55.7

    Table  7.   Comparative mIoU results (%) of 5-shot fusion schemes

    Scheme                       mIoU
    Averaged attention maps      59.1
    Proposed k-shot scheme       59.7

    Table  8.   Ablation results (mIoU, %) of global guider and local guider

    Global guider   Local guider   1-shot   5-shot
    yes             no             53.8     55.7
    no              yes            56.8     59.0
    yes             yes            58.0     59.7

    Table  9.   Ablation results (mIoU, %) of loss functions

    Lseg   La    Lseg0   1-shot   5-shot
    yes    no    no      55.4     57.7
    yes    yes   no      56.6     58.6
    yes    no    yes     55.9     57.9
    yes    yes   yes     58.0     59.7
  • [1] ZHANG X, WEI Y, YANG Y, et al. SG-One: Similarity guidance network for one-shot semantic segmentation[J]. IEEE Transactions on Cybernetics, 2020, 50(9): 3855-3865. doi: 10.1109/TCYB.2020.2992433
    [2] ZHANG C, LIN G, LIU F, et al. CANet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5217-5226.
    [3] YANG Y, MENG F, LI H, et al. A new local transformation module for few-shot segmentation[C]//International Conference on Multimedia Modeling. Berlin: Springer, 2020: 76-87.
    [4] SHABAN A, BANSAL S, LIU Z, et al. One-shot learning for semantic segmentation[EB/OL]. [2020-07-18]. https://arxiv.org/abs/1709.03410.
    [5] TIAN P, WU Z, QI L, et al. Differentiable meta-learning model for few-shot semantic segmentation[C]//Proceedings of the AAAI Conference on Artificial Intelligence. [S. l. ]: AAAI Press, 2020: 12087-12094.
    [6] WANG K, LIEW J H, ZOU Y, et al. PANet: Few-shot image semantic segmentation with prototype alignment[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9197-9206.
    [7] LIU J, QIN Y. Prototype refinement network for few-shot segmentation[EB/OL]. (2020-05-09)[2020-07-18]. https://arxiv.org/abs/2002.03579.
    [8] NGUYEN K, TODOROVIC S. Feature weighting and boosting for few-shot segmentation[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 622-631.
    [9] FINN C, ABBEEL P, LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[EB/OL]. (2017-07-18)[2020-07-18]. https://arxiv.org/abs/1703.03400.
    [10] KIM J, KIM T, KIM S, et al. Edge-labeling graph neural network for few-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 11-20.
    [11] SNELL J, SWERSKY K, ZEMEL R. Prototypical networks for few-shot learning[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2017: 4077-4087.
    [12] SUNG F, YANG Y, ZHANG L, et al. Learning to compare: Relation network for few-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 1199-1208.
    [13] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 3431-3440.
    [14] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. [2020-07-18]. https://arxiv.org/abs/1706.05587.
    [15] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848. doi: 10.1109/TPAMI.2017.2699184
    [16] LIN G, MILAN A, SHEN C, et al. RefineNet: Multi-path refinement networks for high-resolution semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1925-1934.
    [17] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted intervention. Berlin: Springer, 2015: 234-241.
    [18] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2881-2890.
    [19] PAN C Y, HUANG J, HAO J G, et al. Survey of weakly supervised learning integrating zero-shot and few-shot learning[J]. Journal of Systems Engineering and Electronics, 2020, 42(10): 2246-2256(in Chinese). doi: 10.3969/j.issn.1001-506X.2020.10.13
    [20] SUN W Y, JIN Z, ZHAO H T, et al. Cross-domain few-shot face presentation attack detection method based on deep feature augmentation[J/OL]. Computer Science, 2020[2020-09-27]. http://kns.cnki.net/kcms/detail/50.1075.TP.20200911.1518.024.html (in Chinese).
    [21] YU C C, SA L B, MA X Q, et al. Few-shot parts surface defect detection based on the metric learning[J]. Chinese Journal of Scientific Instrument, 2020(7): 214-223(in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-YQXB202007024.htm
    [22] LEI X D, WANG H T, ZHAO Z Z. Small sample airborne LiDAR point cloud classification based on transfer learning[J]. Chinese Journal of Lasers, 2020, 47(11): 1110002(in Chinese).
    [23] LI W Y, SHUAI R J, GUO H. Overcoming catastrophic forgetting in few-shot learning[J]. Computer Applications and Software, 2020, 37(7): 136-141(in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-JYRJ202007023.htm
    [24] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7794-7803.
    [25] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2009: 248-255.
    [26] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2014-11-18)[2020-07-18]. https://arxiv.org/abs/1409.1556.
    [27] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [28] KRÄHENBÜHL P, KOLTUN V. Efficient inference in fully connected CRFs with Gaussian edge potentials[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2011: 109-117.
    [29] RAKELLY K, SHELHAMER E, DARRELL T, et al. Conditional networks for few-shot semantic segmentation[C]//International Conference on Learning Representations. 2018.
    [30] LIU W, ZHANG C, LIN G, et al. CRNet: Cross-reference networks for few-shot segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4165-4173.
Figures (7) / Tables (9)
Publication history
  • Received:  2020-08-24
  • Accepted:  2020-09-11
  • Available online:  2021-03-20
