Key technology of deep web search
Abstract: Deep web resources are web databases that can be reached through web search and similar interfaces. Because they differ from static (surface) web pages, traditional search engine technology does not search them well. This paper studies several key technologies for deep web search, including automatic discovery of deep web resources and ontology-based extraction of deep web data. Experiments verify the usability and efficiency of the proposed techniques and the relevant algorithms. Based on them, a new search engine system for deep web resources was designed and implemented; it acquires information about deep web resources and uses that information to extract structured data, which it serves to users or other application systems. The system has been applied in the major national project "national portal of science and technology infrastructure", with good results.
Key words:
- search engines
- information retrieval
- semantics