Modifiable Squeezer cluster algorithm used in large-scale matrix
Abstract: The Squeezer algorithm is analyzed and, based on a definition of the distance between two matrices, a modifiable (improved) Squeezer algorithm is proposed for clustering large numbers of matrices of the same dimension. In addition to a distance threshold, the improved algorithm sets a threshold on the cluster radius to control classification accuracy, and detailed algorithm steps are given for clustering a large number of matrices. For each set of matrices obtained by clustering, a centroid and a radius are defined to describe the properties of the set. The proposed algorithm prevents the chain effect, in which a cluster keeps expanding and clustering accuracy degrades as a result. Simulation experiments verify that the proposed algorithm clusters well and is broadly applicable.
Key words:
- matrix
- cluster analysis
- Squeezer algorithm
- threshold
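The abstract does not state the formulas for the matrix distance, the centroid, or the radius, so the following is only a minimal sketch of how a single-pass, Squeezer-style clustering with both thresholds might look. It assumes the Frobenius norm as the distance, the element-wise mean as the centroid, and the largest member-to-centroid distance as the radius; the function names and the parameters `dist_threshold` and `radius_threshold` are hypothetical and are not taken from the paper.

```python
import math

def distance(a, b):
    """Frobenius distance between two equal-shaped matrices (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def centroid(members):
    """Element-wise mean of the matrices in one cluster (assumed definition)."""
    n = len(members)
    rows, cols = len(members[0]), len(members[0][0])
    return [[sum(m[i][j] for m in members) / n for j in range(cols)]
            for i in range(rows)]

def radius(members):
    """Largest distance from any member to the centroid (assumed definition)."""
    c = centroid(members)
    return max(distance(m, c) for m in members)

def cluster_matrices(matrices, dist_threshold, radius_threshold):
    """Single-pass, Squeezer-style clustering with an extra radius check."""
    clusters = []  # each cluster is a list of member matrices
    for m in matrices:
        # Find the cluster whose centroid is nearest to the incoming matrix.
        best, best_d = None, math.inf
        for members in clusters:
            d = distance(m, centroid(members))
            if d < best_d:
                best, best_d = members, d
        # Join the nearest cluster only if both thresholds hold; the radius
        # check is what stops a cluster from growing link by link.
        if (best is not None and best_d <= dist_threshold
                and radius(best + [m]) <= radius_threshold):
            best.append(m)
        else:
            clusters.append([m])
    return clusters

# Five 2x2 matrices that differ only in one entry, forming a "chain" 0..4.
chain = [[[float(v), 0.0], [0.0, 0.0]] for v in range(5)]

# Distance threshold only (radius check disabled): the chain merges into one cluster.
print([len(c) for c in cluster_matrices(chain, 3.0, math.inf)])  # [5]

# With a radius threshold, the chain is split instead of growing unchecked.
print([len(c) for c in cluster_matrices(chain, 3.0, 1.2)])  # [3, 2]
```

In this toy example the distance threshold alone lets every matrix join the first cluster, while the radius threshold caps how far the cluster can spread. This is one plausible reading of how the radius threshold limits the chain effect described in the abstract.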