
Video summary generation based on multi-feature image and visual saliency

JIN Haiyan, CAO Tian, XIAO Cong, XIAO Zhaolin

Citation: JIN Haiyan, CAO Tian, XIAO Cong, et al. Video summary generation based on multi-feature image and visual saliency[J]. Journal of Beijing University of Aeronautics and Astronautics, 2021, 47(3): 441-450. doi: 10.13700/j.bh.1001-5965.2020.0479 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2020.0479
Funds:

Technology Innovation Leading Program of Shaanxi 2020CGXNG-026

Natural Science Basic Research Program of Shaanxi 2019JM-221

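Details
    Author biographies:

    JIN Haiyan   Female, Ph.D., professor, doctoral supervisor, CCF member. Main research interests: computer vision, image processing, and intelligent information processing.

    CAO Tian   Female, master's student. Main research interests: computer vision and image processing.

    XIAO Cong   Male, master's student. Main research interests: computer vision and image processing.

    XIAO Zhaolin   Male, Ph.D., associate professor, master's supervisor, CCF member. Main research interests: computer vision and computational photography.

    Corresponding author:

    XIAO Zhaolin, E-mail: xiaozhaolin@xaut.edu.cn

  • CLC number: TP391.41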

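  • Abstract:

    Video summarization, the efficient extraction of video content, has long been an active research topic in computer vision. Detection based only on simple image features such as color and texture can no longer produce effective and complete video summaries. Building on the visual attention pyramid model, this paper proposes an improved center-surround video summarization method with variable ratio and double-contrast calculation. First, a superpixel method partitions the video image sequence into pixel blocks to speed up image computation. Then, differences in image contrast features under different color backgrounds are detected and fused. Finally, optical flow motion information is incorporated, and the static and dynamic image saliency results are merged to extract video key frames; during key frame extraction, a perceptual hash function is used to judge frame similarity, which completes the generation of the video summary. Simulation experiments on the Segtrack V2, ViSal, and OVP datasets show that the proposed method can effectively extract regions of interest in images and produce a video summary represented as a sequence of key-frame images.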
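    The similarity check in the final step can be illustrated with a minimal sketch. The abstract does not say which perceptual hash or which threshold the authors use, so everything below is an assumption made for illustration: a common DCT-based perceptual hash (32x32 downsampling, 8x8 low-frequency block, median threshold), nearest-neighbor resizing, and a hypothetical Hamming-distance threshold `max_distance=10`. The helper names `phash`, `hamming`, and `select_key_frames` are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def resize_nearest(gray, size):
    """Nearest-neighbor downsample of a 2-D grayscale array to size x size."""
    rows = (np.arange(size) * gray.shape[0] / size).astype(int)
    cols = (np.arange(size) * gray.shape[1] / size).astype(int)
    return gray[np.ix_(rows, cols)].astype(np.float64)

def phash(gray, size=32, keep=8):
    """DCT-based perceptual hash (assumed variant): keep*keep boolean bits."""
    small = resize_nearest(gray, size)
    d = dct_matrix(size)
    coeffs = d @ small @ d.T                 # 2-D DCT of the small image
    low = coeffs[:keep, :keep].flatten()     # low-frequency block
    median = np.median(low[1:])              # ignore the DC term for the threshold
    return low > median

def hamming(h1, h2):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(h1 != h2))

def select_key_frames(frames, max_distance=10):
    """Keep a candidate frame only if its hash differs from every kept frame
    by more than max_distance bits (the threshold is an assumed value)."""
    kept, hashes = [], []
    for idx, frame in enumerate(frames):
        h = phash(frame)
        if all(hamming(h, other) > max_distance for other in hashes):
            kept.append(idx)
            hashes.append(h)
    return kept

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame_a = rng.uniform(0, 255, size=(64, 64))   # synthetic stand-in frames
    frame_b = rng.uniform(0, 255, size=(64, 64))
    print(select_key_frames([frame_a, frame_a.copy(), frame_b]))
```

    In this sketch the candidate frames would be the ones already selected by the fused saliency result; the hash is used only to discard candidates that are near-duplicates of a frame that has already been kept.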

     

  • Figure 1.  Effect comparison of dynamic saliency map before and after adjustment

    Figure 2.  Adaptive fusion of saliency results

    Figure 3.  Main method content and overall technical framework of key frame extraction

    Figure 4.  Enhancement results of saliency detection effect

    Figure 5.  Comparison of saliency maps of datasets among different methods

    Figure 6.  F-measure on different datasets

    Figure 7.  Results of video "v20.flv" and "v101.flv" under different summarization algorithms

    Figure 8.  Results of sports video under different summarization algorithms

    Table 1.  Comparison of sports videos under various summarization algorithms

    Algorithm  Accuracy  Error rate  Miss rate  Precision  Recall  F-measure
    OV         0.58      0.08        0.42       0.88       0.58    0.70
    VSUMM      0.42      0.08        0.58       0.83       0.42    0.56
    STIMO      0.67      0.08        0.33       0.89       0.67    0.76
    SD         0.33      0.25        0.67       0.57       0.33    0.42
    KBKS       0.50      0.08        0.50       0.86       0.50    0.63
    Proposed   0.92      0           0.08       0.92       0.92    0.92
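    The page does not state how the F-measure column of Table 1 is computed. As a sanity check, the sketch below assumes the balanced F-measure (the harmonic mean of precision and recall, beta = 1) and recomputes it from the precision and recall columns; the function name `f_measure` and the label "Proposed" for the paper's own method are illustrative choices.

```python
def f_measure(precision, recall, beta=1.0):
    """F-measure; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# (precision, recall) pairs copied from Table 1
table_1 = {
    "OV": (0.88, 0.58),
    "VSUMM": (0.83, 0.42),
    "STIMO": (0.89, 0.67),
    "SD": (0.57, 0.33),
    "KBKS": (0.86, 0.50),
    "Proposed": (0.92, 0.92),
}

# Recompute the F-measure column, rounded to two decimals as in the table
recomputed = {name: round(f_measure(p, r), 2) for name, (p, r) in table_1.items()}

for name, value in recomputed.items():
    print(f"{name}: {value:.2f}")
```

    Under this assumption the recomputed values agree with the F-measure column of Table 1 after rounding to two decimals, which suggests the table reports the balanced F-measure.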
Publication history
  • Received:  2020-08-31
  • Accepted:  2020-10-27
  • Published online:  2021-03-20
