Volume 45 Issue 9
Sep. 2019
XIA Bang, EMILION Richard, WANG Huiwen, et al. Comparison between EM algorithm and dynamical clustering algorithm for Dirichlet mixture samples[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(9): 1805-1811. doi: 10.13700/j.bh.1001-5965.2018.0752 (in Chinese)

Comparison between EM algorithm and dynamical clustering algorithm for Dirichlet mixture samples

doi: 10.13700/j.bh.1001-5965.2018.0752
Funds:

National Natural Science Foundation of China 71420107025

More Information
  • Corresponding author: XIA Bang, E-mail: 15011176214@163.com
  • Received Date: 25 Dec 2018
  • Accepted Date: 15 Mar 2019
  • Publish Date: 20 Sep 2019
  • Abstract: The Dirichlet distribution is a continuous multivariate probability distribution parameterized by a positive vector, and is widely used in problems involving proportional structures. An expectation-maximization (EM) algorithm and a dynamical clustering algorithm for Dirichlet mixture samples are presented, their mathematical derivations are given, and the iteration steps of each algorithm are laid out. The clustering performance of the two machine learning algorithms on Dirichlet samples is then compared through digital simulation experiments. Six evaluation criteria are computed: log-likelihood value, program running time, number of iterations to convergence, clustering accuracy, true positive rate (TPR), and false positive rate (FPR). The simulation results show that the EM algorithm achieves higher clustering accuracy but lower computational efficiency, while the dynamical clustering algorithm runs more efficiently at the cost of some clustering accuracy. In practical applications, it is therefore advisable to weigh the relative requirements of accuracy and efficiency before selecting a suitable algorithm to cluster Dirichlet samples.
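The EM scheme described in the abstract (alternating between computing component responsibilities and updating mixture parameters) can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it assumes `scipy` is available, and the M-step here uses a simple weighted moment-matching estimate of the Dirichlet parameters rather than the full maximum-likelihood update the paper derives.

```python
import numpy as np
from scipy.stats import dirichlet

def fit_dirichlet_moments(X, w):
    """Weighted method-of-moments estimate of Dirichlet parameters.

    Uses E[x_i] = a_i / a0 and the second moment of the first
    coordinate to recover the precision a0, then scales the mean.
    """
    w = w / w.sum()
    m = w @ X                       # weighted mean of each coordinate
    m2 = w @ (X[:, 0] ** 2)         # weighted second moment of coordinate 0
    a0 = (m[0] - m2) / max(m2 - m[0] ** 2, 1e-12)   # precision estimate
    return np.maximum(m * a0, 1e-3)                 # keep parameters positive

def em_dirichlet_mixture(X, K, n_iter=50, seed=0):
    """EM for a K-component Dirichlet mixture on compositional data X (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    alpha = rng.uniform(0.5, 5.0, size=(K, d))      # random initial parameters
    for _ in range(n_iter):
        # E-step: log responsibilities, stabilized before exponentiating
        logp = np.stack(
            [np.log(pi[k]) + dirichlet.logpdf(X.T, alpha[k]) for k in range(K)],
            axis=1,
        )
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and component parameters
        pi = r.mean(axis=0)
        alpha = np.array([fit_dirichlet_moments(X, r[:, k]) for k in range(K)])
    return pi, alpha, r
```

A dynamical clustering variant would replace the soft responsibilities `r` with a hard assignment (`r.argmax(axis=1)`) before the M-step, which is what trades some accuracy for speed in the comparison above.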


