Key-feature-based clustering algorithm for search engine results
Abstract: Users of web search engines often have to sift through a long ranked list of results to find the information they need. To help them locate valuable web documents quickly and effectively, a key-feature clustering (KFC) algorithm is presented that differs from the vector space model (VSM) approach. The algorithm first selects important words from the keywords of the search results as key features. It then analyzes the relationships among these features and clusters them. Finally, it clusters the documents based on the resulting feature clusters. Experimental results demonstrate the effectiveness of the algorithm.
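The three stages named in the abstract can be illustrated with a minimal sketch. The abstract does not say how feature importance or feature relationships are measured, so the choices below are assumptions made only for illustration: document frequency as the importance score, Jaccard overlap of document sets as the feature relationship, and single-link merging for feature clustering. The function names and the `top_k` and `min_jaccard` parameters are hypothetical, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def select_key_features(docs, top_k=6):
    """Stage 1: pick top_k keywords by document frequency (assumed importance measure)."""
    df = Counter(word for doc in docs for word in set(doc))
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(df, key=lambda w: (-df[w], w))
    return ranked[:top_k]


def cluster_features(docs, features, min_jaccard=0.5):
    """Stage 2: merge features whose document sets overlap strongly (assumed relation)."""
    postings = {f: {i for i, doc in enumerate(docs) if f in doc} for f in features}
    parent = {f: f for f in features}

    def find(f):
        # Union-find lookup with path halving.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    for a, b in combinations(features, 2):
        union = postings[a] | postings[b]
        overlap = len(postings[a] & postings[b]) / len(union) if union else 0.0
        if overlap >= min_jaccard:
            parent[find(a)] = find(b)

    groups = {}
    for f in features:
        groups.setdefault(find(f), set()).add(f)
    return list(groups.values())


def cluster_documents(docs, feature_clusters):
    """Stage 3: assign each document to the feature cluster it shares most features with."""
    assignment = []
    for doc in docs:
        words = set(doc)
        scores = [len(words & cluster) for cluster in feature_clusters]
        best = max(range(len(scores)), key=scores.__getitem__) if scores else None
        # Documents that share no key feature with any cluster stay unassigned (None).
        assignment.append(best if best is not None and scores[best] > 0 else None)
    return assignment


# Hypothetical keyword lists, one per search result.
docs = [
    ["car", "engine", "dealer"],
    ["car", "engine", "price"],
    ["cat", "wildlife", "habitat"],
    ["cat", "wildlife", "zoo"],
]
features = select_key_features(docs, top_k=4)
clusters = cluster_features(docs, features)
print(cluster_documents(docs, clusters))
```

On this toy input the first two results fall into one group and the last two into another. Because the features are clustered first, documents are grouped without building full term vectors for each document, which is the contrast with VSM that the abstract points to.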
Key words:
- search engines
- algorithm
- feature extraction
- document clustering
- vector space model
- KFC algorithm