
Deep Web Database Selection Using Topic Distribution
Abstract: As more and more information is hidden in the Deep Web, finding the Web databases most relevant to a user's query has become an urgent problem. This paper proposes a method based on the topic distribution of Web databases for Web database selection in Deep Web data integration. The method first builds a content summary of each Web database in the form of topic coverage, then uses selected Web databases to obtain the topics of the user query, and finally selects Web databases from the query topics and the topic distribution matrix. Experiments on real Web databases show that the method achieves high query recall and also effectively reduces the cost of building database content summaries.
Authors: Zheng Dong (郑东), Shi Huaji (施化吉)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, No. 10, pp. 136-139, 215 (5 pages)
Funding: National Natural Science Foundation of China (No. 60572112)
Keywords: Deep Web; Web database selection; topic distribution; topic coverage
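
The final selection step described in the abstract (choosing Web databases from the query topics and a topic distribution matrix) can be illustrated with a minimal sketch. This record does not give the paper's actual scoring formula, so the weighted-sum score below is an assumption, and the function name `rank_databases`, the database names, the topic labels, and all numbers are hypothetical values chosen only for illustration.

```python
from typing import Dict, List, Tuple


def rank_databases(
    coverage: Dict[str, Dict[str, float]],
    query_topics: Dict[str, float],
    k: int,
) -> List[Tuple[str, float]]:
    """Rank databases by a weighted sum of their topic coverage.

    coverage maps database name -> {topic: coverage value}.
    query_topics maps topic -> weight of that topic for the query.
    The weighted sum is an assumed scoring rule, not the paper's formula.
    """
    scores = {}
    for db, topic_cov in coverage.items():
        # Topics a database does not cover contribute zero to its score.
        scores[db] = sum(
            weight * topic_cov.get(topic, 0.0)
            for topic, weight in query_topics.items()
        )
    # Sort by descending score; break ties by name for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical topic coverage matrix: one row per Web database.
coverage = {
    "db_books": {"literature": 0.7, "computing": 0.2, "travel": 0.1},
    "db_tech": {"literature": 0.1, "computing": 0.8, "travel": 0.1},
    "db_trips": {"literature": 0.0, "computing": 0.1, "travel": 0.9},
}

# Hypothetical query topics with weights (how they are obtained is out of scope here).
query_topics = {"computing": 0.75, "literature": 0.25}

top = rank_databases(coverage, query_topics, k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

Representing each matrix row as a dictionary keeps databases with sparse topic coverage cheap to store; a dense array would work equally well if every database is described over the same fixed topic set.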


