
Node level parallel and optimization method of satellite time serial data mining

BAO Junpeng, YANG Ke, ZHOU Jing

Citation: BAO Junpeng, YANG Ke, ZHOU Jing, et al. Node level parallel and optimization method of satellite time serial data mining[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(12): 2470-2478. doi: 10.13700/j.bh.1001-5965.2018.0334 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2018.0334
Funds: Supported by the Key Laboratory for Fault Diagnosis and Maintenance of Spacecraft in Orbit of China
    About the authors:

    BAO Junpeng  Male, Ph.D., associate professor, doctoral supervisor. Main research interests: machine learning, data mining, artificial intelligence

    YANG Ke  Male, master's student. Main research interests: machine learning, data mining

    ZHOU Jing  Female, master's student. Main research interests: machine learning, data mining

    Corresponding author:

    BAO Junpeng, E-mail: baojp@mail.xjtu.edu.cn

  • CLC number: V19; TP311.11
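  • Abstract:

    Intelligent satellite technology places growing demands on the mining of satellite time series data. Satellite data usually involve a very large amount of computation, so serial execution takes a long time. Taking multi-type feature analysis of the satellite anomalous change process as a typical example, this paper examines several strategies for parallelizing and optimizing time series data mining on a typical heterogeneous computing node with a multi-core CPU and a GPU. The strategies target common data mining operations, namely window partitioning and vector similarity calculation, feature extraction, Fourier transform, and clustering, and they include vectorization, multiprocessing, and GPU computing. The applicability of these strategies is analyzed and compared experimentally. The results show that combining several optimization strategies according to the task at hand yields a significant performance improvement.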
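    As a rough illustration of the vectorization strategy applied to window partitioning and vector similarity calculation, the sketch below compares a plain Python loop with a NumPy version. It is not the authors' implementation: the function names, the choice of cosine similarity, and the use of the first window as the reference are all assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def window_similarity_loop(series, width, step):
    """Serial baseline: cosine similarity of every window to the first window.

    Cosine similarity is an assumed measure, chosen only for illustration.
    """
    ref = series[:width]
    ref_norm = np.linalg.norm(ref)
    sims = []
    for start in range(0, len(series) - width + 1, step):
        window = series[start:start + width]
        denom = ref_norm * np.linalg.norm(window)
        sims.append(float(ref @ window / denom) if denom > 0 else 0.0)
    return np.array(sims)


def window_similarity_vectorized(series, width, step):
    """Vectorized version: build all windows as a strided view, then use
    one matrix-vector product instead of a Python-level loop."""
    windows = sliding_window_view(series, width)[::step]  # shape (n_windows, width)
    ref = windows[0]
    dots = windows @ ref
    norms = np.linalg.norm(windows, axis=1) * np.linalg.norm(ref)
    # Windows with zero norm get similarity 0.0 instead of a division error.
    return np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal(100_000)  # synthetic stand-in for telemetry
    slow = window_similarity_loop(data, width=64, step=16)
    fast = window_similarity_vectorized(data, width=64, step=16)
    print(np.allclose(slow, fast))
```

    Both functions return the same values; the vectorized one moves the per-window work into compiled NumPy routines, which is the general idea behind the serial optimization step. A multiprocessing or GPU variant would split the `windows` array across workers or device threads.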

     

  • Figure 1.  Flowchart of multi-type feature analysis of the anomalous change process (MFAP)

    Figure 2.  Partial update diagram of offset vector

    Figure 3.  Acquisition flowchart of similarity list based on multi-core CPU parallelization

    Figure 4.  Time consumption of similarity acquisition of multiple window vectors

    Figure 5.  Flowchart of similarity calculation based on GPU parallelization

    Figure 6.  Time consumption of single window vector similarity calculation

    Figure 7.  Time consumption of feature extraction from windows of different sizes

    Figure 8.  Feature calculation efficiency after optimization by different methods

    Figure 9.  Clustering efficiency before and after parallel optimization

    Figure 10.  CUDA adaptive clustering flowchart

    Figure 11.  An example of the CUDA adaptive clustering process

    Table 1.  Time consumption of the adaptive period acquisition algorithm before and after serial code optimization

    Data size    Time before optimization/s    Time after optimization/s    Speedup
    221280       25.1508                       1.6256                       15.5
    490440       127.2525                      2.1814                       58.3
    1028760      562.1715                      2.5923                       216.9
    2105400      2306.4696                     3.7552                       614.2
    6094920      19213.1193                    6.0797                       3160.2
    12238320     78204.0426                    16.8001                      4655.0

    Table 2.  Time consumption before and after optimization by different methods

    Offset δ    No optimization/s    Serial optimization/s    Best parallel/s
    10          95.2851              33.5918                  18.4036
    30          58.5769              11.6972                  10.0488
    90          19.2697              4.0383                   5.7167
    180         9.6326               1.9752                   4.0208
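    The Table 2 timings can be turned into speedup ratios to see where each strategy pays off. The short sketch below copies the numbers from the table; the helper names are hypothetical. It suggests that the best parallel variant wins at small offsets (δ = 10, 30), while the serially optimized code is faster at larger offsets (δ = 90, 180). One plausible reading, which the table itself does not state, is that parallel start-up and data-transfer overhead outweighs the gain once the workload becomes small.

```python
# Timings in seconds copied from Table 2; keys are the offset δ.
# Tuple order: (no optimization, serial optimization, best parallel).
timings = {
    10: (95.2851, 33.5918, 18.4036),
    30: (58.5769, 11.6972, 10.0488),
    90: (19.2697, 4.0383, 5.7167),
    180: (9.6326, 1.9752, 4.0208),
}


def speedups(baseline, serial_opt, parallel_best):
    """Return (serial speedup, parallel speedup) relative to the baseline."""
    return baseline / serial_opt, baseline / parallel_best


def faster_variant(serial_opt, parallel_best):
    """Name the optimized variant with the lower running time."""
    return "serial" if serial_opt < parallel_best else "parallel"


summary = {
    delta: (speedups(*row), faster_variant(row[1], row[2]))
    for delta, row in timings.items()
}

if __name__ == "__main__":
    for delta, ((s_serial, s_parallel), winner) in summary.items():
        print(f"delta={delta}: serial x{s_serial:.2f}, "
              f"parallel x{s_parallel:.2f}, faster: {winner}")
```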
Publication history
  • Received:  2018-06-07
  • Accepted:  2018-07-27
  • Published online:  2018-12-20
