留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

一种增量并行式动态图异常检测算法

韩涛 兰雨晴 肖利民 刘艳芳

韩涛, 兰雨晴, 肖利民, 等 . 一种增量并行式动态图异常检测算法[J]. 北京航空航天大学学报, 2018, 44(1): 117-124. doi: 10.13700/j.bh.1001-5965.2017.0019
引用本文: 韩涛, 兰雨晴, 肖利民, 等 . 一种增量并行式动态图异常检测算法[J]. 北京航空航天大学学报, 2018, 44(1): 117-124. doi: 10.13700/j.bh.1001-5965.2017.0019
HAN Tao, LAN Yuqing, XIAO Limin, et al. Incremental and parallel algorithm for anomaly detection in dynamic graphs[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(1): 117-124. doi: 10.13700/j.bh.1001-5965.2017.0019(in Chinese)
Citation: HAN Tao, LAN Yuqing, XIAO Limin, et al. Incremental and parallel algorithm for anomaly detection in dynamic graphs[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(1): 117-124. doi: 10.13700/j.bh.1001-5965.2017.0019(in Chinese)

一种增量并行式动态图异常检测算法

doi: 10.13700/j.bh.1001-5965.2017.0019
详细信息
    作者简介:

    韩涛 女, 博士研究生。主要研究方向:社交网络、数据挖掘、大数据

    兰雨晴 男, 博士, 副教授, 硕士生导师。主要研究方向:操作系统、大数据、数据安全

    肖利民 男, 博士, 教授, 博士生导师。主要研究方向:高性能计算机系统、大数据

    刘艳芳 女, 博士研究生。主要研究方向:可信计算、软件自动化测试、大数据

    通讯作者:

    兰雨晴, E-mail: lanyuqing@buaa.edu.cn

  • 中图分类号: TP391

Incremental and parallel algorithm for anomaly detection in dynamic graphs

More Information
  • 摘要:

    图结构异常检测可以发现金融欺诈行为、网络入侵和可疑的社交行为。针对当前检测图异常算法的计算复杂度高、不能处理大规模动态图的缺点,研究并提出了一种增量并行式的算法以便更有效地发现和检测大规模动态图中的异常。该算法使用时间滑动窗口对图进行划分,在初始化阶段选取N个子图,使用最小描述长度(MDL)原理并行检测正常模式和异常模式,并行迭代地检测其他子图中的正常结构和异常结构。在多个大规模图数据集上的实验结果表明,检测动态图结构异常准确率达到96%,召回率达到85%,运行时间减少了一个数量级。同时还讨论了滑动窗口大小和并行数量对算法运行时间的影响。

     

  • 图 1  DPADS算法并行处理子图

    Figure 1.  Parallel processing subgraphs of DPADS algorithm

    图 2  DPADS算法流程图

    Figure 2.  Flowchart of DPADS algorithm

    图 3  DPADS与PLAD算法运行时间比较

    Figure 3.  Comparison of running time betweenDPADS and PLAD algorithms

    图 4  准确率和召回率

    Figure 4.  Accurate rate and recall rate

    图 5  ROC曲线

    Figure 5.  ROC curve

    图 6  M值对准确率和召回率的影响

    Figure 6.  Influence of M value on accurate rate and recall rate

    图 7  正常模式和异常模式

    Figure 7.  Normal pattern and abnormal pattern

    图 8  DPADS算法在每个子图上的运行时间并标记发现异常的子图

    Figure 8.  Running time of DPADS algorithm in each subgraph and marked subgraphs with abnormal pattern

    图 9  DPADS算法在合成数据集上随窗口大小变化的运行时间

    Figure 9.  Running time of DPADS algorithm on synthetic datasets with different windows size

    表  1  实验数据集

    Table  1.   Experimental data sets

    数据名称类型结点个数边个数
    YouTube无向图1 134 8902 987 624
    LiveJ有向图484 75168 993 773
    Math有向时序图24 81850 650
    下载: 导出CSV
  • [1] AHMED N K, NEVILLE J, KOMPELLA R.Network sampling:From static to streaming graphs[J].ACM Transactions on Knowledge Discovery from Data(TKDD), 2014, 8(2):7:1-7:56. https://www.researchgate.net/profile/Nesreen_Ahmed3/publication/233409333_Network_Sampling_From_Static_to_Streaming_Graphs/links/5772cb1d08ae2b93e1a7cd80.pdf
    [2] EBERLE W, HOLDER L.Anomaly detection in data represented as graphs[J].Intelligent Data Analysis, 2007, 11(6):663-689. https://content.iospress.com/articles/intelligent-data-analysis/ida00309
    [3] EBERLE W, HOLDER L, GRAVES J.Insider threat detection using a graph-based approach[J].Journal of Applied Security Research, 2011, 6(1):32-81. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.720.715
    [4] NOBLE C C, COOK D J. Graph-based anomaly detection[C]//Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2003: 631-636.
    [5] AKOGLU L, MCGLHON M, FALOUSTSOS C. OddBall: Spotting anomalies in weighted graphs[C]//Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining. Berlin: Springer-Verlag, 2010, 3: 410-421.
    [6] FEIGENBAUM J, KANNAN S, MCGREGOR A, et al.On graph problems in a semi-streaming model[J].Theoretical Computer Science, 2005, 348(2-3):207-216. doi: 10.1016/j.tcs.2005.09.013
    [7] DEMETRESCU C, FINOCCHI I, RIBICHINI A.Trading off space for passes in graph streaming problems[J].ACM Transactions on Algorithms(TALG), 2009, 6(1):6:1-6:17. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.527.8557
    [8] AGGARWAL G, DATAR M, RAJAGOPALAN S, et al. On the streaming model augmented with a sorting primitive[C]//Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science(FOCS). Washington, D. C. : IEEE Computer Society, 2004: 540-549.
    [9] SARMA A, GOLLAPUDI S, PANIGRAHY R. Estimating PageRank on graph streams[C]//Proceedings of the 27th ACM Sigmod-Sigact-Sigart Symposium on Principles of Database Systems. New York: ACM Press, 2008: 69-78.
    [10] SHIN K, ELIASSI-RAD T, FALOUTSOS C. CoreScope: Graph mining using k-core analysis-Patterns, anomalies and algorithms[C]//2016 IEEE 16th International Conference on Data Mining (ICDM). Washington, D. C. : IEEE Computer Society, 2017: 469-478.
    [11] BRIDGES R A, COLLINS J P, FERRAGUT E M, et al. Multi-level anomaly detection on time-varying graph data[C]//2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). New York: ACM Press, 2016: 579-583.
    [12] EBERLE W, HOLDER L. A partitioning approach to scaling anomaly detection in graph streams[C]//2014 IEEE International Conference on Big Data. Washington, D. C. : IEEE Computer Society, 2014: 17-24.
    [13] AKOGLU L, TONG H, KOUTRA D.Graph based anomaly detection and description:A survey[J].Data Mining and Knowledge Discovery, 2015, 29(3):626-688. doi: 10.1007/s10618-014-0365-y
    [14] 吴烨, 钟志农, 熊伟, 等.一种高效的属性图聚类算法[J].计算机学报, 2013, 36(8):1704-1713. http://www.cqvip.com/QK/90818X/201308/46956448.html

    WU Y, ZHONG Z N, XIONG W, et al.An efficient method for attributed graph clustering[J].Chinese Journal of Computers, 2013, 36(8):1704-1713(in Chinese). http://www.cqvip.com/QK/90818X/201308/46956448.html
    [15] EBERLE W, HOLDER L. Incremental anomaly detection in graphs[C]//2013 IEEE 13th International Conference on Data Mining Workshops. Washington, D. C. : IEEE Computer Society, 2013: 521-528.
    [16] EPASTO A, LATTANZI S, SOZIO M. Efficient densest subgraph computation in evolving graphs[C]//Proceedings of the 24th International Conference on World Wide Web. Geneva: International World Wide Web Conferences Steering Committee, 2015: 300-310.
    [17] YANG J, LESKOVEC J. Defining and evaluating network communities based on ground-truth[C]//Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics. New York: ACM Press, 2012, 3: 1-3: 8.
  • 加载中
图(9) / 表(1)
计量
  • 文章访问数:  541
  • HTML全文浏览量:  21
  • PDF下载量:  457
  • 被引次数: 0
出版历程
  • 收稿日期:  2017-01-16
  • 录用日期:  2017-02-06
  • 刊出日期:  2018-01-20

目录

    /

    返回文章
    返回
    常见问答