A Comparison Study of Two Methods for Principal Component Analysis of Interval Data
Abstract: The computational cost of vertices principal component analysis (VPCA) grows exponentially with the number of variables. To address this, Cazes P. proposed a simplified method that computes the correlation matrix of the vertices matrix directly, which eliminates a large amount of redundant computation and resolves the curse of dimensionality in VPCA. This paper compares the computational procedures and the results of the two methods and shows that they are fully equivalent in their results. The simplified method, however, has a simpler computational procedure, requires less storage, and runs faster. An empirical analysis further confirms the conclusions of the theoretical analysis.
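The equivalence described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and the direct formulas are derived here under the assumption that every vertex of every interval hyperrectangle carries equal weight. Under that assumption, the off-diagonal covariances of the vertices matrix reduce to the covariances of the interval midpoints, and each variance reduces to the mean of (a² + b²)/2 minus the squared mean midpoint, where a and b are the lower and upper bounds.

```python
import itertools
import numpy as np


def vertices_matrix(lower, upper):
    """Enumerate all 2**p vertices of each of the n interval observations."""
    n, p = lower.shape
    bounds = np.stack([lower, upper], axis=0)  # shape (2, n, p)
    cols = np.arange(p)
    rows = []
    for i in range(n):
        # choice[j] == 0 picks the lower bound of variable j, 1 the upper bound
        for choice in itertools.product((0, 1), repeat=p):
            rows.append(bounds[np.array(choice), i, cols])
    return np.array(rows)  # shape (n * 2**p, p)


def vertex_correlation_bruteforce(lower, upper):
    """Correlation matrix computed from the explicit vertices matrix."""
    return np.corrcoef(vertices_matrix(lower, upper), rowvar=False)


def vertex_correlation_direct(lower, upper):
    """Same correlation matrix computed without building the vertices."""
    n = lower.shape[0]
    mid = (lower + upper) / 2.0
    mean_mid = mid.mean(axis=0)
    centered = mid - mean_mid
    # Off-diagonal entries: covariance of the midpoints
    cov = centered.T @ centered / n
    # Diagonal entries: variance of each vertices-matrix column
    var = ((lower ** 2 + upper ** 2) / 2.0).mean(axis=0) - mean_mid ** 2
    np.fill_diagonal(cov, var)
    std = np.sqrt(var)
    return cov / np.outer(std, std)


# Synthetic interval data, for illustration only
rng = np.random.default_rng(0)
lower = rng.normal(size=(6, 4))
upper = lower + rng.uniform(0.1, 2.0, size=(6, 4))

r_brute = vertex_correlation_bruteforce(lower, upper)
r_direct = vertex_correlation_direct(lower, upper)
print(np.allclose(r_brute, r_direct))
```

The brute-force route stores n·2^p rows, while the direct route only touches the two n×p bound matrices, which is the storage and speed difference the abstract refers to.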
Keywords:
- symbolic data analysis
- interval data
- principal component analysis
- VPCA
- correlation matrix
[1] Diday E, Noirhomme-Fraiture M. Symbolic data analysis and the SODAS software
[2] Cazes P, Chouakria A, Diday E, et al. Extension de l’analyse en composantes principales à des données de type intervalle
[3] Chouakria A, Diday E, Cazes P. An improved factorial representation of symbolic objects // Studies and Research, Proceedings of the Conference on Knowledge Extraction and Symbolic Data Analysis: KESDA’98. Luxembourg: Office for Official Publications of the European Communities, 1998: 276-289.
[4] Lauro C, Palumbo F. Principal components analysis of interval data: a symbolic data analysis approach
[5] Palumbo F, Lauro C. A PCA for interval valued data based on midpoints and radii
[6] Irpino A. ‘Spaghetti’ PCA analysis: an extension of principal components analysis to time dependent interval data
[7] Gioia F, Lauro C. Principal component analysis on interval data