Journal of Beijing University of Aeronautics and Astronautics ›› 2019, Vol. 45 ›› Issue (10): 2044-2050. doi: 10.13700/j.bh.1001-5965.2019.0039


A shifter look-up table technique based on HXDSP (一种基于HXDSP的移位器查找表技术)

YE Hong (叶鸿)1, GU Naijie (顾乃杰)1, LIN Chuanwen (林传文)2, ZHANG Xiaoci (张孝慈)1, CHEN Rui (陈瑞)1

  1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China;
  2. Department of Computer Science and Technology, Hefei University, Hefei 230601, China
  • Received: 2019-01-29 Online: 2019-10-20 Published: 2019-10-31
  • Corresponding author: GU Naijie E-mail: gunj@ustc.edu.cn
  • About the authors: YE Hong, male, Ph.D. candidate. Main research interests: parallel computing, architecture optimization; GU Naijie, male, Ph.D., professor, doctoral supervisor. Main research interests: network computing, big data processing and analysis, cloud computing and applications, online algorithms, software and code optimization, large-scale software code inspection, deep learning software and hardware systems, parallel and distributed computing; LIN Chuanwen, male, Ph.D. Main research interests: compiler optimization, deep learning training optimization and applications; CHEN Rui, male, M.S. Main research interests: architecture optimization.
  • Supported by:
    Anhui Province Science and Technology Major Project (18030901011); Scientific Research and Development Fund Project of Hefei University (19ZR03ZDA)

Abstract: The rapid growth of high-performance signal processing applications poses great challenges to the computing speed and throughput of the corresponding processors. The shifter is an important component of a digital signal processor (DSP). This paper adds a dedicated random access memory (RAM) and look-up table (LUT) to the shifter and adjusts its instruction set and architecture accordingly, with the aim of improving processor utilization and data transfer rate. In addition, using the shifter together with the corresponding look-up table instructions, shift, extraction, arithmetic, and logical operations can be performed while data is being temporarily stored. Part of the computation is thereby merged directly into the reads and writes of the shifter RAM, which significantly improves the utilization of the arithmetic units. Experiments show that temporary storage based on the shifter look-up table achieves a throughput close to that of the transfer bus, and that the fast Fourier transform (FFT), a signal processing algorithm, achieves a speedup of about 1.15 to 1.20.

Key words: digital signal processor (DSP), shifter, look-up table (LUT), single instruction multiple data (SIMD), very long instruction word (VLIW)
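
The idea in the abstract of folding a shift or extraction into the read of a shifter-local RAM can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `ShifterLUT`, its methods, the 32-bit word width, and the 256-entry size are hypothetical names and values chosen for illustration, and they do not describe the actual HXDSP instruction set or hardware.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word width (illustrative only)


class ShifterLUT:
    """Toy model of a small RAM attached to a shifter.

    Hypothetical: not the HXDSP ISA, just the general idea of
    combining a data operation with a temporary-storage access.
    """

    def __init__(self, size=256):
        # Dedicated storage local to the shifter (size is an assumption).
        self.ram = [0] * size

    def store(self, addr, word):
        # Plain write: park a word in the shifter RAM.
        self.ram[addr] = word & MASK32

    def load_extract(self, addr, shift, width):
        # Read a word and return a right-shifted bit field in one step,
        # instead of a separate load followed by shift and mask.
        return (self.ram[addr] >> shift) & ((1 << width) - 1)

    def load_add(self, addr, operand):
        # Read a word and add an operand on the way out (wraps at 32 bits).
        return (self.ram[addr] + operand) & MASK32


lut = ShifterLUT()
lut.store(3, 0xABCD1234)
high_half = lut.load_extract(3, shift=16, width=16)  # upper 16 bits
low_byte = lut.load_extract(3, shift=0, width=8)     # lowest 8 bits
```

In this model each `load_*` call stands in for one combined access, which is the kind of merging the abstract credits for the improved utilization of the arithmetic units. Real hardware timing and throughput are not modeled here.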
