RGB-T crowd counting method with multi-scale perception and infrared feature enhancement

ZHENG Diwen, SHI Yangyu, XIE Chengjie, LU Shuhua

Citation: ZHENG D W, SHI Y Y, XIE C J, et al. RGB-T crowd counting method with multi-scale perception and infrared feature enhancement[J]. Journal of Beijing University of Aeronautics and Astronautics, 2026, 52(6): 2208-2218 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2024.0250
Funding: Double First-Class Innovation Research Project for People's Public Security University of China (2023SYL08)

Corresponding author: LU Shuhua, E-mail: lushuhua@ppsuc.edu.cn

CLC number: TP391.4

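Abstract:

RGB-T crowd counting aims to generate crowd density maps from the complementary information in visible-light and thermal images, so that crowds can be counted in low-light scenes. To address the scale variation and background interference that arise during cross-modal information fusion, an RGB-T crowd counting method based on multi-scale perception and infrared feature enhancement (MSENet) is proposed. The method introduces an RGB-T feature fusion mechanism (RTFM), which extracts multi-scale features through a multi-branch structure and includes an infrared enhancement structure designed to fully capture crowd information in thermal images. Dense connections and an information divergence mechanism pass the complementary features to each modality, so that complementary feature representations are reused and the features of each modality are enhanced. Comparative experiments were conducted on the RGBT-CC and ShanghaiTechRGBD datasets. The results show that the proposed method outperforms several existing advanced methods (on RGBT-CC it reaches a GAME(0) of 12.21 and an RMSE of 21.59) and has good accuracy, robustness, and generalization.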

Figure 1.  RGB images and thermal images in different lighting conditions

Figure 2.  Architecture of MSENet

Figure 3.  Architecture of RTFM

Figure 4.  Cross-modal feature fusion

Figure 5.  Architecture of the thermal enhanced structure

Figure 6.  Crowd density maps generated in different illumination conditions

Figure 7.  Comparison of the results of ablation experiments

Table 1.  Comparison of results on the RGBT-CC dataset

| Method | Framework | GAME(0) | GAME(1) | GAME(2) | GAME(3) | RMSE |
|---|---|---|---|---|---|---|
| IADM+BL [3] | BL | 15.61 | 19.95 | 24.69 | 32.89 | 28.18 |
| CSCA+BL [4] | BL | 14.32 | 18.91 | 23.81 | 32.47 | 26.01 |
| MAT [6] | BL | 12.35 | 16.29 | 20.81 | 29.09 | 22.53 |
| CSA-Net [8] | BL | 12.45 | 16.46 | 21.48 | 30.62 | 21.64 |
| IADM+CSRNet [3] | CSRNet | 17.94 | 21.44 | 26.17 | 33.33 | 30.91 |
| CSA-Net [8] | CSRNet | 15.77 | 19.40 | 24.14 | 30.14 | 29.17 |
| CNCTrans [5] | CNN+Transformer | 13.96 | 17.98 | 23.03 | 31.15 | 24.55 |
| TAFNet [7] | VGG16 | 12.38 | 16.98 | 21.86 | 30.19 | 22.45 |
| MSENet (ours) | BL | 12.21 | 16.32 | 20.69 | 28.94 | 21.59 |
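The GAME(L) and RMSE columns can be computed from predicted and ground-truth density maps. The sketch below follows the usual definition of the Grid Average Mean absolute Error from [25]: each image is split into 4^L non-overlapping cells, the absolute count error is summed over the cells, and the result is averaged over the test images, so GAME(0) is the ordinary mean absolute error. The function names `game` and `evaluate`, and the use of 2-D NumPy arrays as density maps, are assumptions made for illustration; the authors' evaluation code is not shown on this page.

```python
import numpy as np


def game(pred, gt, level):
    """GAME(level) error for one pair of 2-D density maps (illustrative helper)."""
    cells = 2 ** level  # the grid is cells x cells, i.e. 4**level regions
    h, w = pred.shape
    # Integer cell borders; linspace also copes with sizes not divisible by `cells`.
    ys = np.linspace(0, h, cells + 1).astype(int)
    xs = np.linspace(0, w, cells + 1).astype(int)
    err = 0.0
    for i in range(cells):
        for j in range(cells):
            p = pred[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].sum()
            g = gt[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].sum()
            err += abs(p - g)  # absolute count error inside this cell
    return float(err)


def evaluate(preds, gts, levels=(0, 1, 2, 3)):
    """Average GAME(L) over all images and compute RMSE of the total counts."""
    scores = {}
    for level in levels:
        per_image = [game(p, g, level) for p, g in zip(preds, gts)]
        scores[f"GAME({level})"] = float(np.mean(per_image))
    diffs = np.array([p.sum() - g.sum() for p, g in zip(preds, gts)])
    scores["RMSE"] = float(np.sqrt(np.mean(diffs ** 2)))
    return scores


if __name__ == "__main__":
    # One person predicted in the wrong corner: the total count is right,
    # so GAME(0) and RMSE are zero, but the finer grids expose the misplacement.
    pred = np.zeros((4, 4))
    pred[0, 0] = 1.0
    gt = np.zeros((4, 4))
    gt[3, 3] = 1.0
    print(evaluate([pred], [gt]))
```

Because finer grids penalize people counted in the wrong cell, GAME(L) for nested grids does not decrease as L grows, which matches the left-to-right trend in each row of the table.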

Table 2.  Results under different illumination conditions on the RGBT-CC dataset

| Illumination | Method | GAME(0) | GAME(1) | GAME(2) | GAME(3) | RMSE |
|---|---|---|---|---|---|---|
| Bright | IADM+CSRNet [3] | 20.36 | 23.57 | 28.49 | 36.29 | 32.57 |
| Bright | TAFNet [7] | 15.57 | 20.65 | 26.67 | 36.17 | 24.25 |
| Bright | CNCTrans [5] | 15.05 | 19.04 | 24.21 | 32.91 | 25.00 |
| Bright | MSENet (ours) | 13.81 | 17.89 | 23.26 | 31.12 | 24.84 |
| Dark | IADM+CSRNet [3] | 15.44 | 19.23 | 23.79 | 30.28 | 29.11 |
| Dark | TAFNet [7] | 14.20 | 19.20 | 24.00 | 31.63 | 27.50 |
| Dark | CNCTrans [5] | 13.34 | 17.38 | 21.73 | 29.16 | 24.70 |
| Dark | MSENet (ours) | 11.62 | 16.19 | 20.06 | 27.63 | 21.27 |

Table 3.  Comparison of results on the ShanghaiTechRGBD dataset

| Method | GAME(0) | GAME(1) | GAME(2) | GAME(3) | RMSE |
|---|---|---|---|---|---|
| UC-Net [26] | 10.81 | 15.24 | 22.04 | 32.98 | 15.70 |
| HDFNet [27] | 8.32 | 13.93 | 17.97 | 22.62 | 13.01 |
| BBS-Net [28] | 6.26 | 8.53 | 11.80 | 16.46 | 9.26 |
| IADM+BL [3] | 7.13 | 9.28 | 13.00 | 19.53 | 10.27 |
| CSCA+BL [4] | 5.68 | 7.70 | 10.45 | 15.88 | 8.66 |
| CSA-Net [8] | 4.57 | 5.82 | 8.02 | 12.47 | 6.83 |
| MSENet (ours) | 4.75 | 6.29 | 8.96 | 12.04 | 7.55 |

Table 4.  Comparison of results of ablation experiments

| Method | GAME(0) | GAME(1) | GAME(2) | GAME(3) | RMSE |
|---|---|---|---|---|---|
| Baseline | 15.43 | 18.61 | 24.17 | 31.13 | 25.22 |
| Baseline+Fusion | 12.79 | 16.94 | 22.02 | 29.70 | 22.05 |
| Baseline+Fusion+TES | 12.21 | 16.32 | 20.69 | 28.94 | 21.59 |
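To read the ablation results as relative gains rather than raw errors, the percentage reduction of each metric can be computed directly from the table. The snippet below is only an arithmetic aid: the values are copied from the ablation table, while the names `ABLATION` and `relative_reduction` are hypothetical and not part of the authors' code.

```python
# Values copied from the ablation table (RGBT-CC dataset).
METRICS = ("GAME(0)", "GAME(1)", "GAME(2)", "GAME(3)", "RMSE")
ABLATION = {
    "Baseline": (15.43, 18.61, 24.17, 31.13, 25.22),
    "Baseline+Fusion": (12.79, 16.94, 22.02, 29.70, 22.05),
    "Baseline+Fusion+TES": (12.21, 16.32, 20.69, 28.94, 21.59),
}


def relative_reduction(reference, variant):
    """Percentage by which each error metric drops from `reference` to `variant`."""
    return {
        name: 100.0 * (ref - var) / ref
        for name, ref, var in zip(METRICS, ABLATION[reference], ABLATION[variant])
    }


if __name__ == "__main__":
    for variant in ("Baseline+Fusion", "Baseline+Fusion+TES"):
        gains = relative_reduction("Baseline", variant)
        print(variant, {k: round(v, 1) for k, v in gains.items()})
```

With the table's values, the full model lowers GAME(0) by about 21% and RMSE by about 14% relative to the baseline, and every metric improves again when TES is added on top of the fusion module.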

Table 5.  Comparison of results for different numbers of branches in RTFM on the RGBT-CC dataset

| Number of branches | GAME(0) | GAME(1) | GAME(2) | GAME(3) | RMSE |
|---|---|---|---|---|---|
| 1 | 17.95 | 23.61 | 30.84 | 39.42 | 32.26 |
| 2 | 16.22 | 21.81 | 28.03 | 38.51 | 28.54 |
| 3 | 14.27 | 18.95 | 24.16 | 34.92 | 24.37 |
| 4 | 12.74 | 16.94 | 21.67 | 30.63 | 21.81 |
| 5 | 13.47 | 18.11 | 23.06 | 33.03 | 22.24 |
| 6 | 14.53 | 19.46 | 25.23 | 35.62 | 23.68 |
  • [1] LIU C H, CHEN Y F, HE X Y, et al. A scale aggregation and spatial-aware network for multi-view crowd counting[J]. IEEE Access, 2022, 10: 108604-108613.
    [2] GAO G S, GAO J Y, LIU Q J, et al. CNN-based density estimation and crowd counting: a survey[EB/OL]. (2020-03-28)[2024-01-03]. https://doi.org/10.48550/arXiv.2003.12783.
    [3] LIU L, CHEN J, WU H, et al. Cross-modal collaborative representation learning and a large-scale RGBT benchmark for crowd counting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 4821-4831.
    [4] ZHANG Y, CHOI S, HONG S. Spatio-channel attention blocks for cross-modal crowd counting[C]//Proceedings of the Computer Vision–ACCV 2022. Berlin: Springer, 2023: 22-40.
    [5] ZHANG S H, WANG W, ZHAO W B, et al. A cross-modal crowd counting method combining CNN and cross-modal transformer[J]. Image and Vision Computing, 2023, 129: 104592.
    [6] WU Z T, LIU L B, ZHANG Y, et al. Multimodal crowd counting with mutual attention transformers[C]//Proceedings of the IEEE International Conference on Multimedia and Expo. Piscataway: IEEE Press, 2022: 1-6.
    [7] TANG H H, WANG Y, CHAU L P. TAFNet: A three-stream adaptive fusion network for RGB-T crowd counting[C]//Proceedings of the IEEE International Symposium on Circuits and Systems. Piscataway: IEEE Press, 2022: 3299-3303.
    [8] LI H, ZHANG J G, KONG W H, et al. CSA-Net: cross-modal scale-aware attention-aggregated network for RGB-T crowd counting[J]. Expert Systems with Applications, 2023, 213: 119038.
    [9] CHENG J H, CHEN Z J, ZHANG X Y, et al. Exploit the potential of multi-column architecture for crowd counting[EB/OL]. (2020-07-28)[2024-01-30]. https://doi.org/10.48550/arXiv.2007.05779.
    [10] BAI S, HE Z Q, QIAO Y, et al. Adaptive dilated network with self-correction supervision for counting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4593-4602.
    [11] LIU L B, QIU Z L, LI G B, et al. Crowd counting with deep structured scale integration network[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE Press, 2020: 1774-1783.
    [12] LIU L B, ZHEN J J, LI G B, et al. Dynamic spatial-temporal representation learning for traffic flow prediction[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(11): 7169-7183.
    [13] LIU W Z, SALZMANN M, FUA P. Context-aware crowd counting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 5094-5103.
    [14] DENG L J, ZHOU Q H, WANG S H, et al. Deep learning in crowd counting: a survey[J]. CAAI Transactions on Intelligence Technology, 2024, 9(5): 1043-1077.
    [15] LI B, HUANG H B, ZHANG A, et al. Approaches on crowd counting and density estimation: a review[J]. Pattern Analysis and Applications, 2021, 24(3): 853-874.
    [16] JIANG X L, XIAO Z H, ZHANG B C, et al. Crowd counting and density estimation by trellis encoder-decoder networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6126-6135.
    [17] YU Y, ZHU H L, QIAN J, et al. Survey on deep learning based crowd counting[J]. Journal of Computer Research and Development, 2021, 58(12): 2724-2747 (in Chinese).
    [18] ZHANG Y Y, ZHOU D S, CHEN S Q, et al. Single-image crowd counting via multi-column convolutional neural network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 589-597.
    [19] SAM D B, SURYA S, BABU R V. Switching convolutional neural network for crowd counting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 4031-4039.
    [20] LI Y H, ZHANG X F, CHEN D M. CSRNet: dilated convolutional neural networks for understanding the highly congested scenes[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 1091-1100.
    [21] MA Z H, WEI X, HONG X P, et al. Bayesian loss for crowd count estimation with point supervision[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 6141-6150.
    [22] PENG T, LI Q, ZHU P F. RGB-T crowd counting from drone: a benchmark and MMCCN network[C]//Proceedings of the Computer Vision-ACCV 2020. Berlin: Springer, 2021: 497-513.
    [23] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the Computer Vision-ECCV 2018. Berlin: Springer, 2018: 3-19.
    [24] LIAN D Z, LI J, ZHENG J, et al. Density map regression guided detection network for RGB-D crowd counting and localization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 1821-1830.
    [25] GUERRERO-GÓMEZ-OLMEDO R, TORRE-JIMÉNEZ B, LÓPEZ-SASTRE R, et al. Extremely overlapping vehicle counting[C]//Proceedings of the Pattern Recognition and Image Analysis. Berlin: Springer, 2015: 423-431.
    [26] ZHANG J, FAN D P, DAI Y C, et al. UC-Net: uncertainty inspired RGB-D saliency detection via conditional variational autoencoders[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 8579-8588.
    [27] PANG Y W, ZHANG L H, ZHAO X Q, et al. Hierarchical dynamic filtering network for RGB-D salient object detection[C]//Proceedings of the Computer Vision-ECCV 2020. Berlin: Springer, 2020: 235-252.
    [28] FAN D P, ZHAI Y J, BORJI A, et al. BBS-Net: RGB-D salient object detection with a bifurcated backbone strategy network[C]//Proceedings of the Computer Vision-ECCV 2020. Berlin: Springer, 2020: 275-292.
Publication history
  • Received:  2024-04-24
  • Accepted:  2024-09-13
  • Published online:  2024-09-19
  • Issue published:  2026-06-30
