Journal of Beijing University of Aeronautics and Astronautics 2010, Vol. 36, Issue 4: 500-503
Query clustering using user-query logs
Jia Rongfei, Jin Maozhong, Wang Xiaobo*
School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100191, China

Abstract: A new query clustering method based on user-query logs was presented. Traditional clustering techniques focus on queries and click-through logs, which are often sparse, and the resulting average cluster size is often small. In contrast, the user-query log is much denser but also noisier. To reduce the influence of noise and discover similar queries, queries issued by the same user in the same session were assumed to be mostly similar. Based on this assumption, a new similarity measure using query co-occurrence relations was used to create a query neighbor vector space. Each query was represented by a vector consisting of its neighbors, and the similarity function for clustering was calculated from these query neighbor vectors. An adjusted version of density-based spatial clustering of applications with noise (DBSCAN) was applied to generate the clusters. Experiments on a real dataset of 95,262 queries show a precision of 79.77% and a recall of 48.21%, with an average cluster size of 51.
Keywords: clustering algorithms; search engines; data mining
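
The pipeline described in the abstract (session co-occurrence, neighbor vectors, vector similarity, density-based clustering) can be illustrated with a minimal sketch. The abstract does not give the exact similarity measure or the adjustments made to DBSCAN, so this sketch assumes cosine similarity over co-occurrence counts and uses textbook DBSCAN. The function names, the thresholds `min_sim` and `min_pts`, the choice to include a query's own session count in its vector, and the sample sessions are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def neighbor_vectors(sessions):
    """Map each query to a sparse vector of session co-occurrence counts.

    Assumption: a query's own session count is kept as a component, so two
    queries that always appear together get identical vectors.
    """
    vectors = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        unique = set(session)
        for q in unique:
            for r in unique:
                vectors[q][r] += 1
    return {q: dict(v) for q, v in vectors.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def dbscan(vectors, min_sim=0.5, min_pts=2):
    """Textbook DBSCAN (not the paper's adjusted variant); label -1 means noise."""
    queries = sorted(vectors)

    def region(q):
        # All queries at least min_sim similar to q, including q itself.
        return [r for r in queries if cosine(vectors[q], vectors[r]) >= min_sim]

    labels = {}
    cluster_id = 0
    for q in queries:
        if q in labels:
            continue
        seeds = region(q)
        if len(seeds) < min_pts:
            labels[q] = -1  # provisional noise; may become a border point later
            continue
        labels[q] = cluster_id
        queue = [s for s in seeds if s != q]
        while queue:
            r = queue.pop()
            if labels.get(r) == -1:
                labels[r] = cluster_id  # noise reached from a core point: border
            if r in labels:
                continue
            labels[r] = cluster_id
            reachable = region(r)
            if len(reachable) >= min_pts:  # r is a core point: keep expanding
                queue.extend(reachable)
        cluster_id += 1
    return labels


if __name__ == "__main__":
    # Hypothetical sessions: each inner list is one user's queries in one session.
    sessions = [
        ["cheap flights", "flight tickets", "airfare"],
        ["cheap flights", "airfare"],
        ["python tutorial", "learn python"],
        ["learn python", "python tutorial", "python course"],
        ["weather"],
    ]
    for query, label in sorted(dbscan(neighbor_vectors(sessions)).items()):
        print(label, query)
```

Because `region` compares every pair of queries, the sketch runs in quadratic time. A log with tens of thousands of queries, such as the one reported in the abstract, would likely need an inverted index over neighbors to stay practical.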
Received: 2009-07-10

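Supported by the National High Technology Research and Development Program of China (863 Program) (2007AA010302) and the National Natural Science Foundation of China (60603039, 90718018)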

About author: Jia Rongfei (b. 1981), male, PhD candidate, cjrf@sei.buaa.edu.cn.
Jia Rongfei, Jin Maozhong, Wang Xiaobo. Query clustering using user-query logs[J]. JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND A, 2010, V36(4): 500-503
http://bhxb.buaa.edu.cn//CN/ or http://bhxb.buaa.edu.cn//CN/Y2010/V36/I4/500
Copyright 2010 by Journal of Beijing University of Aeronautics and Astronautics