Text Information Extraction with a Multi-Template Hidden Markov Model Based on Text Blocks (cited by: 4)

Using text blocks based on multiple templates hidden Markov model for text information extraction
Abstract: Training data for text information extraction come from varied sources, which makes it hard to learn optimal model parameters. To address this, a text information extraction algorithm based on a multi-template hidden Markov model is proposed. The algorithm uses typesetting format, delimiters, and similar cues to segment the text into blocks. On that basis, the training data are divided into several form templates, and the initial probability and transition probability parameters of the hidden Markov model are trained separately for each template. Finally, these are combined with emission probability parameters trained uniformly over all the data to extract text information. Experimental results show that the new algorithm outperforms a simple hidden Markov model in precision and recall.
Source: Journal of Shandong University (Natural Science), indexed in CAS, CSCD, and the Peking University Core Journals list, 2006, No. 3, pp. 25-28 (4 pages)
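Funding: Fujian Province Young Science and Technology Talent Innovation Project (2005J051); Natural Science Foundation of Fujian Province (A0510024); Guangdong Province Key Breakthrough Project in Key Fields (2005A10207003)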
Keywords: text information extraction; hidden Markov model; multiple templates; text block
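
The pipeline outlined in the abstract (segment text into blocks, train initial and transition probabilities per template, share one set of emission probabilities, then extract) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The label set `STATES`, the toy training samples, the add-one smoothing, the bag-of-words emission model, and the rule of keeping the template whose Viterbi path scores highest are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter, defaultdict

STATES = ["author", "title", "venue"]  # hypothetical label set


def split_blocks(text):
    """Split text into blocks at line breaks and list separators (assumed cues)."""
    return [b.strip() for b in re.split(r"[,;\n]+", text) if b.strip()]


def train(samples):
    """samples: list of (template_id, [(block, state), ...])."""
    init = defaultdict(Counter)   # per template: first-state counts
    trans = defaultdict(Counter)  # per template: (prev, next) counts
    emit = defaultdict(Counter)   # per state: word counts, shared by all templates
    for template, seq in samples:
        init[template][seq[0][1]] += 1
        for (_, prev), (_, nxt) in zip(seq, seq[1:]):
            trans[template][(prev, nxt)] += 1
        for block, state in seq:
            emit[state].update(block.lower().split())
    vocab = {w for counts in emit.values() for w in counts}
    return init, trans, emit, vocab


def log_init(model, t, s):
    c = model[0][t]
    return math.log((c[s] + 1) / (sum(c.values()) + len(STATES)))


def log_trans(model, t, p, s):
    c = model[1][t]
    row = sum(c[(p, q)] for q in STATES)
    return math.log((c[(p, s)] + 1) / (row + len(STATES)))


def log_emit(model, s, block):
    emit, vocab = model[2], model[3]
    total = sum(emit[s].values()) + len(vocab) + 1  # add-one smoothing
    return sum(math.log((emit[s][w] + 1) / total) for w in block.lower().split())


def viterbi(model, template, blocks):
    """Best state path for the blocks under one template's parameters."""
    score = {s: log_init(model, template, s) + log_emit(model, s, blocks[0])
             for s in STATES}
    back = []
    for block in blocks[1:]:
        prev, score, ptr = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log_trans(model, template, p, s))
            score[s] = prev[best] + log_trans(model, template, best, s) + log_emit(model, s, block)
            ptr[s] = best
        back.append(ptr)
    last = max(STATES, key=score.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return score[last], path[::-1]


def extract(model, text):
    """Label each block, keeping the template with the highest path score (assumed rule)."""
    blocks = split_blocks(text)
    _, path = max((viterbi(model, t, blocks) for t in model[0]), key=lambda r: r[0])
    return list(zip(blocks, path))


# Toy data: template "A" lists the author first, template "B" lists the title first.
samples = [
    ("A", [("Smith J", "author"), ("Hidden Markov models for text", "title"), ("Machine Learning Journal", "venue")]),
    ("A", [("Lee K", "author"), ("Text extraction with templates", "title"), ("Pattern Recognition Journal", "venue")]),
    ("B", [("Information extraction models", "title"), ("Chen W", "author"), ("Computing Journal", "venue")]),
    ("B", [("Markov chains for text", "title"), ("Wang L", "author"), ("Computing Journal", "venue")]),
]
model = train(samples)
print(extract(model, "Lee K, Hidden Markov models for text; Machine Learning Journal"))
```

Keeping the emission counts shared lets every template benefit from all the labeled words, while the per-template initial and transition counts capture the different field orders of each source layout.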
