Journal of Beijing University of Aeronautics and Astronautics 2004, Vol. 30 Issue (09): 835-838
Study on an XML approximately duplicated data cleaning method
Chen Wei, Ding Qiulin*
Computer Application Institute, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

Abstract: Because semi-structured XML data plays an important role in data cleaning, this paper studies how to clean approximately duplicated XML data. An efficient cleaning method for approximately duplicated XML data is proposed. The method is adaptable: any approximate-duplicate detection algorithm can be plugged into it. A detection algorithm based on tree edit distance is presented, which detects approximately duplicated XML data efficiently. Lower and upper bounds on the tree edit distance are then used to optimize the detection algorithm. The improved algorithm avoids computing the tree edit distance for pairs of XML data where it is not needed, which reduces the complexity of the approximate-duplicate computation. This work lays a foundation for further research on cleaning approximately duplicated XML data.
Keywords: rules library; algorithms library; data cleaning; extensible markup language (XML); approximately duplicated data
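
The bound-based filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say which bounds the paper uses, so the label-bag lower bound, the top-down alignment upper bound, the unit edit costs, the nested-tuple tree representation, and all function names below are assumptions chosen for illustration. The exact distance is only computed when the two cheap bounds cannot decide whether a pair falls within the threshold.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from functools import lru_cache

# Assumed representation: a tree is a nested tuple (label, (child, child, ...)).

def to_tree(elem):
    """Convert an ElementTree element to a nested tuple; leaf text becomes a leaf node."""
    kids = tuple(to_tree(c) for c in elem)
    text = (elem.text or "").strip()
    if text and not kids:
        kids = ((text, ()),)
    return (elem.tag, kids)

@lru_cache(maxsize=None)
def size(tree):
    return 1 + sum(size(c) for c in tree[1])

def labels(tree):
    bag = Counter([tree[0]])
    for child in tree[1]:
        bag.update(labels(child))
    return bag

def lower_bound(t1, t2):
    """Label-bag bound: each unit-cost edit changes either bag difference by at most 1."""
    a, b = labels(t1), labels(t2)
    return max(sum((a - b).values()), sum((b - a).values()))

@lru_cache(maxsize=None)
def upper_bound(t1, t2):
    """Top-down alignment: roots are matched, child subtrees are aligned as sequences."""
    (l1, c1), (l2, c2) = t1, t2
    prev = [0]
    for y in c2:
        prev.append(prev[-1] + size(y))
    for x in c1:
        cur = [prev[0] + size(x)]
        for j, y in enumerate(c2, 1):
            cur.append(min(prev[j] + size(x),                # delete subtree x
                           cur[j - 1] + size(y),             # insert subtree y
                           prev[j - 1] + upper_bound(x, y))) # match x with y
        prev = cur
    return (l1 != l2) + prev[-1]

@lru_cache(maxsize=None)
def forest_dist(f1, f2):
    """Exact unit-cost edit distance between ordered forests (rightmost-root recursion)."""
    if not f1:
        return sum(size(t) for t in f2)
    if not f2:
        return sum(size(t) for t in f1)
    (l1, c1), (l2, c2) = f1[-1], f2[-1]
    return min(
        forest_dist(f1[:-1] + c1, f2) + 1,   # delete rightmost root of f1
        forest_dist(f1, f2[:-1] + c2) + 1,   # insert rightmost root of f2
        forest_dist(c1, c2) + forest_dist(f1[:-1], f2[:-1]) + (l1 != l2),
    )

def tree_edit_distance(t1, t2):
    return forest_dist((t1,), (t2,))

def is_approx_duplicate(t1, t2, threshold, exact=tree_edit_distance):
    """Filter with cheap bounds first; call the exact distance only if they disagree."""
    if lower_bound(t1, t2) > threshold:
        return False
    if upper_bound(t1, t2) <= threshold:
        return True
    return exact(t1, t2) <= threshold

a = to_tree(ET.fromstring("<book><title>XML</title><year>2004</year></book>"))
b = to_tree(ET.fromstring("<book><title>XML</title><year>2003</year></book>"))
print(is_approx_duplicate(a, b, threshold=1))  # True, decided by the bounds alone
```

The `exact` parameter reflects the adaptability claimed in the abstract: any other detection function with the same signature could be passed in. The memoized recursion is only practical for small trees; a production system would likely substitute a faster exact algorithm.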
Received 2003-06-02;
About author: Chen Wei (1976-), chenweich@tom.com.
Chen Wei, Ding Qiulin. Study on an XML approximately duplicated data cleaning method[J]. JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND A, 2004, V30(09): 835-838
http://bhxb.buaa.edu.cn//CN/ or http://bhxb.buaa.edu.cn//CN/Y2004/V30/I09/835
Copyright 2010 by Journal of Beijing University of Aeronautics and Astronautics