Literature Review on Automatic Metadata Extraction of Scientific Paper  Cited by: 6
Abstract  This paper first analyzes the metadata of scientific papers from two perspectives: metadata attributes and metadata granularity. On this basis, it analyzes and synthesizes domestic and international research on automatic metadata extraction from scientific papers in two respects: theoretical research and applied practice. Finally, it points out the characteristics and shortcomings of existing research.
Source  Computer Systems & Applications (《计算机系统应用》), 2013, No. 3, pp. 11-15 (5 pages)
Funding  Humanities and Social Sciences Research Planning Fund of the Ministry of Education (09XJA870003)
Keywords  scientific paper; automatic metadata extraction; rule-based extraction; template-based extraction; machine-learning extraction