Data Reduction Based on the Equivalence Class Partition of Attribute Set
Abstract: Owing to the high dimensionality and huge volume of data in large databases, data reduction plays an increasingly important role in knowledge discovery in databases. Existing data reduction methods for identifying redundant and irrelevant attributes fall into three types: enumeration search, heuristic search, and randomized search. All three are inefficient and may lose important information, so they need improvement to meet the requirements of data mining. Equivalence class partition based on the attribute set can identify redundant and irrelevant attributes in essence, and the number of equivalence classes in a partition can be computed quickly and accurately, which makes data reduction more effective and data mining more efficient.
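The idea of partitioning records into equivalence classes over an attribute set can be illustrated with a minimal Python sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes a common rough-set-style criterion: an attribute is treated as redundant or irrelevant if dropping it leaves the number of equivalence classes unchanged. The function names (`partition`, `count_classes`, `reduce_attributes`) and the sample table are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes: two rows are
    equivalent when they agree on every attribute in attrs."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes[key].append(i)
    return list(classes.values())


def count_classes(rows, attrs):
    """Number of equivalence classes induced by attrs."""
    return len({tuple(row[a] for a in attrs) for row in rows})


def reduce_attributes(rows, attrs):
    """Greedily drop attributes whose removal keeps the class count
    unchanged (assumed criterion, not taken from the paper)."""
    kept = list(attrs)
    target = count_classes(rows, kept)
    for a in list(attrs):
        trial = [b for b in kept if b != a]
        # Removing an attribute can only merge classes, so an equal
        # count means the partition itself is unchanged.
        if count_classes(rows, trial) == target:
            kept = trial
    return kept


# Hypothetical sample table: id_code duplicates id_label, const never varies.
rows = [
    {"id_code": 1, "id_label": "a", "color": "red", "const": 0},
    {"id_code": 2, "id_label": "b", "color": "red", "const": 0},
    {"id_code": 3, "id_label": "c", "color": "blue", "const": 0},
    {"id_code": 1, "id_label": "a", "color": "blue", "const": 0},
]

reduced = reduce_attributes(rows, ["id_code", "id_label", "color", "const"])
print(reduced)
```

In this sketch the result depends on the order in which attributes are tried: of two attributes that duplicate each other, the one tested first is dropped. In a database setting, the class count for an attribute subset could presumably also be obtained with a grouped or distinct count query, though the abstract does not say how the authors compute it.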
Key words:
- data reduction
- databases
- artificial intelligence
- equivalence class partition
- data mining