Low-latency video coding techniques

SONG Li, LIU Xiaoyong, WU Guoqing, ZHU Chen, HUANG Yan, XIE Rong, ZHANG Wenjun

Citation: SONG Li, LIU Xiaoyong, WU Guoqing, et al. Low-latency video coding techniques[J]. Journal of Beijing University of Aeronautics and Astronautics, 2021, 47(3): 558-571. doi: 10.13700/j.bh.1001-5965.2020.0463 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2020.0463
Funds: 

National Key R&D Program of China 2019YFB1802701

National Natural Science Foundation of China 61671296

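Details
    About the authors:

    SONG Li  Male, Ph.D., professor, doctoral supervisor. Main research interests: novel video coding, big data compression, mobile computational vision

    LIU Xiaoyong  Male, Ph.D. candidate. Main research interests: scalable video coding, rate control

    WU Guoqing  Male, master's student. Main research interest: video coding

    ZHU Chen  Male, Ph.D. candidate. Main research interest: video coding

    HUANG Yan  Male, Ph.D. candidate. Main research interest: video coding

    XIE Rong  Female, Ph.D., associate professor, master's supervisor. Main research interests: video coding and transcoding, image/video processing

    ZHANG Wenjun  Male, Ph.D., professor, doctoral supervisor. Main research interests: image communication and digital television, broadband wireless transmission, system-on-chip design

    Corresponding author:

    SONG Li, E-mail: song_li@sjtu.edu.cn

  • CLC number: TN919.8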

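  • Abstract:

    With the wide application of video coding and video transmission technologies, demand for video has surged, and real-time video communication has become an important research topic in the video industry, with the core goals of a better user experience and lower latency. Low-latency video coding is a key component of real-time video communication applications: reducing encoding latency effectively reduces the overall latency of the system. First, the sources of latency in a video transmission system are analyzed, and the mechanisms that produce encoding latency are described in detail, starting from the general video coding framework. Second, the mainstream video coding standards in China and abroad are surveyed, and the principles and models of rate-distortion optimization are introduced, providing a theoretical basis for the design of low-latency video encoders. Finally, techniques for reducing encoding latency are described along several dimensions (reference structure, pipeline design, coding mode search, rate control and hardware acceleration), representative low-latency video coding schemes from industry are summarized, the limitations of existing low-latency video coding techniques are briefly explained, and future directions are discussed.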
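As an illustration of the rate-distortion optimization that the abstract refers to, the classical Lagrangian formulation [4] picks, among the candidate coding modes of a block, the one that minimizes the cost J = D + λR, where D is the distortion and R the rate. The sketch below is a minimal example under stated assumptions: the mode names, the distortion and rate figures, and the λ values are hypothetical and are not taken from the paper or from any real encoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeCandidate:
    """One candidate coding mode for a block (hypothetical values)."""
    name: str
    distortion: float  # e.g. sum of squared errors after reconstruction
    rate_bits: float   # bits needed to signal the mode and its residual


def rd_cost(candidate: ModeCandidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def select_mode(candidates, lam):
    """Return the candidate with the smallest Lagrangian cost."""
    if not candidates:
        raise ValueError("at least one candidate mode is required")
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative numbers only; they do not come from any real encoder.
CANDIDATES = [
    ModeCandidate("intra", distortion=120.0, rate_bits=90.0),
    ModeCandidate("inter", distortion=150.0, rate_bits=40.0),
    ModeCandidate("skip", distortion=400.0, rate_bits=2.0),
]

if __name__ == "__main__":
    for lam in (0.1, 1.0, 10.0):
        best = select_mode(CANDIDATES, lam)
        print(f"lambda={lam}: {best.name} (J={rd_cost(best, lam):.1f})")
```

With this toy data, a larger λ weights rate more heavily, so the selected mode moves from intra to inter to skip as λ grows.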

     

  • Figure 1.  Reference structure of RA mode for HEVC

    Figure 2.  Reference structure of LDP mode for HEVC

    Figure 3.  End-to-end latency of frame-level encoding and slice/macroblock-level encoding[22]

    Figure 4.  Encoding processes at frame, slice and block levels[23]

    Figure 5.  Tile parallelization and wavefront parallel processing[22]

    Figure 6.  Encoding block reference structure for refreshed and unrefreshed regions[45]

    Figure 7.  Comparison of different encoding schemes

    Table  1.   Rate control accuracy comparison of different methods[52]

    Sequence      HM16.19          Ref. [51]        Ref. [52]
                  F/%    Seq/%     F/%    Seq/%     F/%    Seq/%
    Mobisode      2.55   0.010     1.98   0.010     1.01   0.003
    RaceHorses    1.53   0.008     1.43   0.007     0.66   0.004
    PartyScene    2.33   0.009     2.11   0.008     0.29   0.003
    Chromakey     2.55   0.010     2.32   0.009     0.33   0.003
    Wave          2.22   0.009     2.01   0.008     0.28   0.002
    Parkrun       2.73   0.011     2.38   0.009     0.41   0.003
    Kimono        1.22   0.007     1.34   0.008     0.67   0.004
    Tennis        2.76   0.011     2.41   0.009     0.44   0.004
    Average       2.24   0.009     2.00   0.009     0.51   0.003

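    Table  2.   Configuration of different encoding schemes

    Encoding scheme SVT-HEVC TPCast JPEG-XS WHDI
    Coding modules Inter/intra prediction × ×
    Transform
    Quantization
    Entropy coding Minimal ×
    Latency optimization techniques Reference structure simplification × ×
    Hardware acceleration ×
    Intra refresh × × ×
    Parallelism × ×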
  • [1] CULLEN C. The global internet phenomena report[EB/OL]. [2020-08-26]. https://www.sandvine.com/press-releases/sandvine-releases-2019-global-internet-phenomena-report.
    [2] GAO W, ZHAO D B, MA S W. Principles of digital video coding technology[M]. 2nd ed. Beijing: Science Press, 2018: 17(in Chinese).
    [3] SCHREIER R M, ROTHERMEL A. A latency analysis on H.264 video transmission systems[C]//Proceedings of the 2008 Digest of Technical Papers-International Conference on Consumer Electronics. Piscataway: IEEE Press, 2008: 307-308.
    [4] SULLIVAN G J, WIEGAND T. Rate-distortion optimization for video compression[J]. IEEE Signal Processing Magazine, 1998, 15(6): 74-90. doi: 10.1109/79.733497
    [5] BICHON M, LE T J, ROPERT M, et al. Inter-block dependencies consideration for intra coding in H.264/AVC and HEVC standards[C]//Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2017: 1537-1541.
    [6] BICHON M, LE T J, ROPERT M, et al. Low complexity joint RDO of prediction units couples for HEVC intra coding[C]//Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2018: 1733-1737.
    [7] PANG C, AU O C, ZOU F, et al. Optimal distortion redistribution in block-based image coding using successive convex optimization[C]//Proceedings of the 2011 IEEE International Conference on Multimedia and Expo Workshops. Piscataway: IEEE Press, 2011: 1-5.
    [8] WU Q, XIONG J, LUO B, et al. A novel joint rate distortion optimization scheme for intra prediction coding in H.264/AVC[J]. IEICE Transactions on Information and Systems, 2014, 97(4): 989-992. http://ci.nii.ac.jp/naid/130003394928
    [9] YANG T, ZHU C, FAN X, et al. Source distortion temporal propagation model for motion compensated video coding optimization[C]//Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops. Piscataway: IEEE Press, 2012: 85-90.
    [10] LI S, ZHU C, GAO Y, et al. Lagrangian multiplier adaptation for rate-distortion optimization with inter-frame dependency[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(1): 117-129. doi: 10.1109/TCSVT.2015.2450131
    [11] GONZÁLEZ-DE-SUSO J L, MARTÍNEZ-ENRÍQUEZ E, DÍAZ-DE-MARÍA F. Adaptive Lagrange multiplier estimation algorithm in HEVC[J]. Signal Processing: Image Communication, 2017, 56: 40-51. doi: 10.1016/j.image.2017.04.010
    [12] YANG K, WAN S, GONG Y, et al. An efficient Lagrangian multiplier selection method based on temporal dependency for rate-distortion optimization in H.265/HEVC[J]. Signal Processing: Image Communication, 2017, 57: 68-75. doi: 10.1016/j.image.2017.05.006
    [13] ZHANG F, BULL D R. An adaptive Lagrange multiplier determination method for rate-distortion optimisation in hybrid video codecs[C]//Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE Press, 2015: 671-675.
    [14] WANG X, SONG L, LUO Z, et al. Lagrangian method based rate-distortion optimization revisited for dependent video coding[C]//Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE Press, 2017: 3021-3025.
    [15] LI C, WU D, XIONG H. Delay-power-rate-distortion model for wireless video communication under delay and energy constraints[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2014, 24(7): 1170-1183. doi: 10.1109/TCSVT.2014.2302517
    [16] LI C, XIONG H, WU D. Delay-rate-distortion optimized rate control for wireless video communication[C]//Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE Press, 2014: 5996-6000.
    [17] HUANG B, CHEN Z, CAI Q, et al. Rate-distortion-complexity optimized coding mode decision for HEVC[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(3): 795-809. doi: 10.1109/TCSVT.2019.2893396
    [18] CARBALLEIRA P, CABRERA J, ORTEGA A, et al. A graph-based approach for latency modeling and optimization in multiview video encoding[C]//Proceedings of the 2011 3DTV Conference: The True Vision-Capture, Transmission and Display of 3D Video (3DTV-CON). Piscataway: IEEE Press, 2011: 1-4.
    [19] WENGER S. Temporal scalability using P-pictures for low-latency applications[C]//Proceedings of the 1998 IEEE Second Workshop on Multimedia Signal Processing. Piscataway: IEEE Press, 1998: 559-564.
    [20] PAN Z, JIN P, LEI J, et al. Fast reference frame selection based on content similarity for low complexity HEVC encoder[J]. Journal of Visual Communication and Image Representation, 2016, 40: 516-524. doi: 10.1016/j.jvcir.2016.07.018
    [21] PARK S H, DONG T, JANG E S. Low complexity reference frame selection in QTBT structure for JVET future video coding[C]//Proceedings of the 2018 International Workshop on Advanced Image Technology (IWAIT). Piscataway: IEEE Press, 2018: 1-4.
    [22] DESHPANDE S, HANNUKSELA M M, KAZUI K, et al. An improved hypothetical reference decoder for HEVC[C]//International Society for Optics and Photonics. Bellingham: SPIE-INT SOC Optical Engineering, 2013: 866608.
    [23] SCHREIER R M, RAHMAN A M T I, KRISHNA-MURTHY G, et al. Architecture analysis for low-delay video coding[C]//Proceedings of the 2006 IEEE International Conference on Multimedia and Expo Workshops. Piscataway: IEEE Press, 2006: 2053-2056.
    [24] MEENDERINCK C, AZEVEDO A, JUURLINK B, et al. Parallel scalability of video decoders[J]. Journal of Signal Processing Systems, 2009, 57(2): 173-194. doi: 10.1007/s11265-008-0256-9
    [25] CHI C C, ALVAREZ-MESA M, JUURLINK B, et al. Parallel scalability and efficiency of HEVC parallelization approaches[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1827-1838. doi: 10.1109/TCSVT.2012.2223056
    [26] MOHAMED M, NEJMEDDINE B, NOUREDDINE B, et al. Performance evaluation of frame-level parallelization in HEVC intra coding using heterogeneous multicore platforms[C]//Proceedings of the 2018 International Conference on Applied Smart Systems (ICASS). Piscataway: IEEE Press, 2018: 1-6.
    [27] CHEN K, SUN J, DUAN Y, et al. A novel wavefront-based high parallel solution for HEVC encoding[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(1): 181-194. doi: 10.1109/TCSVT.2015.2418651
    [28] WANG H, XIAO B, WU J, et al. A collaborative scheduling-based parallel solution for HEVC encoding on multicore platforms[J]. IEEE Transactions on Multimedia, 2018, 20(11): 2935-2948. doi: 10.1109/TMM.2018.2830120
    [29] OHM J R, SULLIVAN G J, SCHWARZ H, et al. Comparison of the coding efficiency of video coding standards-Including high efficiency video coding (HEVC)[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1669-1684. doi: 10.1109/TCSVT.2012.2221192
    [30] BOSSEN F, BROSS B, SUHRING K, et al. HEVC complexity and implementation analysis[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1685-1696. doi: 10.1109/TCSVT.2012.2221255
    [31] TOPIWALA P, KRISHNAN M, DAI W. Performance comparison of VVC, AV1 and EVC[C]//Proceedings of Applications of Digital Image Processing XLⅡ. Bellingham: SPIE-INT SOC Optical Engineering, 2019.
    [32] SULLIVAN G J, OHM J R, HAN W J, et al. Overview of the high efficiency video coding (HEVC) standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1649-1668. doi: 10.1109/TCSVT.2012.2221191
    [33] ZHANG T, SUN M T, ZHAO D, et al. Fast intra-mode and CU size decision for HEVC[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 27(8): 1714-1726. doi: 10.1109/TCSVT.2016.2556518
    [34] MIN B, CHEUNG R C C. A fast CU size decision algorithm for the HEVC intra encoder[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2015, 25(5): 892-896. doi: 10.1109/TCSVT.2014.2363739
    [35] XIONG J, LI H, WU Q, et al. A fast HEVC inter CU selection method based on pyramid motion divergence[J]. IEEE Transactions on Multimedia, 2014, 16(2): 559-564. doi: 10.1109/TMM.2013.2291958
    [36] PAN Z, KWONG S, SUN M T, et al. Early MERGE mode decision based on motion estimation and hierarchical depth correlation for HEVC[J]. IEEE Transactions on Broadcasting, 2014, 60(2): 405-412. doi: 10.1109/TBC.2014.2321682
    [37] LIU Z, YU X, GAO Y, et al. CU partition mode decision for HEVC hardwired intra encoder using convolution neural network[J]. IEEE Transactions on Image Processing, 2016, 25(11): 5088-5103. doi: 10.1109/TIP.2016.2601264
    [38] RYU S, KANG J. Machine learning-based fast angular prediction mode decision technique in video coding[J]. IEEE Transactions on Image Processing, 2018, 27(11): 5525-5538. doi: 10.1109/TIP.2018.2857404
    [39] XU M, LI T, WANG Z, et al. Reducing complexity of HEVC: A deep learning approach[J]. IEEE Transactions on Image Processing, 2017, 27(10): 5044-5059. http://ieeexplore.ieee.org/abstract/document/8384310/
    [40] AMESTOY T, MERCAT A, HAMIDOUCHE W, et al. Tunable VVC frame partitioning based on lightweight machine learning[J]. IEEE Transactions on Image Processing, 2019, 29: 1313-1328. http://ieeexplore.ieee.org/document/8826595
    [41] TRAN T D, LIU L K, WESTERINK P H. Low-delay MPEG-2 video coding[C]//Proceedings of the SPIE-The International Society for Optical Engineering. Bellingham: SPIE-INT SOC Optical Engineering, 1997: 510-516.
    [42] TOURAPIS A M, LEONTARIS A, SUHRING K, et al. H.264/14496-10 AVC reference software manual[EB/OL]. [2020-07-04]. http://iphone.hhi.de/suehring/tml/JM%20Reference%20Software%20Manual%(JVT-AE010).pdf.
    [43] DELA CRUZ A R, CAJOTE R D. Low complexity adaptive intra-refresh rate for real-time wireless video transmission[C]//Proceedings of the Signal and Information Processing Association Annual Summit and Conference (APSIPA). Piscataway: IEEE Press, 2014: 1-5.
    [44] ZHOU Y R, LI G Q, NING S S. A new feedback-based intra refresh method for robust video coding[C]//Proceedings of the 2015 International Conference on Computer Science and Applications (CSA). Piscataway: IEEE Press, 2015: 218-221.
    [45] HENDRY, WANG Y K, CHEN J L, et al. CE11/AHG14: Test 3.1-An approach for support of GRA/GDR[EB/OL]. [2020-07-04]. http://phenix.it-sudparis.eu/jvet/doc_end_user/documents/15_Gothenburg/wg11/JVET-O124-v3.zip.
    [46] LEE Y G, SONG B C. An intra-frame rate control algorithm for ultralow delay H.264/advanced video coding (AVC)[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2009, 19(5): 747-752. doi: 10.1109/TCSVT.2009.2017413
    [47] CHOI H, NAM J, YOO J, et al. Rate control based on unified RQ model for HEVC[EB/OL]. [2020-07-04]. http://phenix.int-evry.fr/jct/doc_end_user/documents/8_San%20Jose/wg11/JCTVC-H0213-v3.zip.
    [48] CHOI H, YOO J, NAM J, et al. Pixel-wise unified rate-quantization model for multi-level rate control[J]. IEEE Journal of Selected Topics in Signal Processing, 2013, 7(6): 1112-1123. doi: 10.1109/JSTSP.2013.2272241
    [49] WANG S, MA S, WANG S, et al. Quadratic ρ-domain based rate control algorithm for HEVC[C]//Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2013: 1695-1699.
    [50] LI B, LI H, LI L, et al. λ domain rate control algorithm for high efficiency video coding[J]. IEEE Transactions on Image Processing, 2014, 23(9): 3841-3854. doi: 10.1109/TIP.2014.2336550
    [51] GAO W, KWONG S, JIA Y. Joint machine learning and game theory for rate control in high efficiency video coding[J]. IEEE Transactions on Image Processing, 2017, 26(12): 6074-6089. doi: 10.1109/TIP.2017.2745099
    [52] KWONG S, ZHOU M, WEI X, et al. Rate control method based on deep reinforcement learning for dynamic video sequences in HEVC[J]. IEEE Transactions on Multimedia, 2020, 99: 1. http://www.researchgate.net/publication/341198848_Rate_Control_Method_Based_on_Deep_Reinforcement_Learning_for_Dynamic_Video_Sequences_in_HEVC
    [53] JIANG C, NOOSHABADI S. A scalable massively parallel motion and disparity estimation scheme for multiview video coding[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(2): 346-359. doi: 10.1109/TCSVT.2015.2402853
    [54] SHAHID M U, AHMED A, MARTINA M, et al. Parallel H.264/AVC fast rate-distortion optimized motion estimation by using a graphics processing unit and dedicated hardware[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2015, 25(4): 701-715. doi: 10.1109/TCSVT.2014.2351111
    [55] XIAO B, WANG H, WU J, et al. A multi-grained parallel solution for HEVC encoding on heterogeneous platforms[J]. IEEE Transactions on Multimedia, 2019, 21(12): 2997-3009. doi: 10.1109/TMM.2019.2916462
    [56] MOMCILOVIC S, ROMA N, SOUSA L. Exploiting task and data parallelism for advanced video coding on hybrid CPU + GPU platforms[J]. Journal of Real-Time Image Processing, 2016, 11(3): 571-587. doi: 10.1007/s11554-013-0357-y
    [57] INTEL. Scalable video technology for HEVC encoder (SVT-HEVC Encoder)[EB/OL]. (2020-08-20)[2020-08-26]. https://github.com/OpenVisualCloud/SVT-HEVC.
    [58] SURUR. TPCAST's wireless Adapter coming to the Oculus Rift next week[EB/OL]. (2017-11-08)[2020-08-26]. https://mspoweruser.com/tpcasts-wireless-adapter-coming-oculus-rift-next-week/.
    [59] BUYSSCHAERT C, DESCAMPE A, FÖßEL S, et al. Overview of JPEG XS[EB/OL]. 2018[2020-08-26]. https://jpeg.org/jpegxs/index.html.
    [60] AMIMON, Ltd. Wireless home digital interface™ specification v1.0 Revision 33[EB/OL]. [2020-07-04]. https://www.amimon.com/proavl.
    [61] NETINT. Codensity T408 video transcoder product brief[EB/OL]. [2020-08-26]. https://www.netint.ca/product/t400_transcoder/.
    [62] INTEL. Scalable video technology for the visual cloud[EB/OL]. [2020-08-26]. https://01.org/sites/default/files/documentation/svt_aws_wp.pdf.
    [63] CAST. H264-E-BPF ultra-fast AVC/H.264 baseline profile encoder[EB/OL]. [2020-08-26]. https://www.cast-inc.com/compression/avc-hevc-video-compression/h264-e-bpf/.
    [64] RICHTER T, KEINERT J, FOESSEL S, et al. JPEG-XS-A high-quality mezzanine image codec for video over IP[J]. SMPTE Motion Imaging Journal, 2018, 127(9): 39-49. doi: 10.5594/JMI.2018.2862098
    [65] REZNIC Z. Successive refinement video compression: USA, WO2018025211A1[P]. 2018-02-08.
Publication history
  • Received:  2020-08-26
  • Accepted:  2020-09-19
  • Published online:  2021-03-20
