Feature Engineering for Predicate Identification and Classification in Semantic Analysis (语义分析中谓词标识的特征工程)

Cited by: 7


Abstract: The predicate is the most important constituent of a sentence, and whether it is identified correctly has a large effect on semantic analysis. The performance of predicate labeling depends directly on a large number of features, so how those features are organized is especially important. This paper selects 7 basic features and more than 30 new features, together with their combinations, and uses a maximum entropy classifier. Starting from the basic features and adding the features that prove beneficial raises the F1 score for predicate identification by about 5 percentage points (from 84.7% to 89.8%) and the F1 score for predicate classification (word sense identification) by about 2 percentage points (from 80.3% to 82.1%). The results show that the new features and their combinations substantially improve system performance.
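The abstract says only that beneficial features were added on top of the basic ones; it does not spell out the selection procedure. One common way to do this is greedy forward selection, sketched below as an illustration rather than as the authors' method. The names `f1_score`, `greedy_feature_selection`, `evaluate`, and `min_gain` are hypothetical. `evaluate` is assumed to train a classifier (for example, a maximum entropy model) on the given feature set and return its F1 on held-out data.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 is the harmonic mean of precision and recall; returns 0.0 when undefined."""
    denom = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denom if denom else 0.0


def greedy_feature_selection(
    baseline: Iterable[str],
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Add one candidate feature at a time, keeping it only if the score improves.

    `evaluate` is a caller-supplied (hypothetical) function that scores a
    feature set, e.g. by training a classifier and measuring held-out F1.
    """
    selected = list(baseline)
    remaining = [c for c in candidates if c not in selected]
    best = evaluate(frozenset(selected))

    while remaining:
        # Score every remaining candidate when added to the current set.
        scored = [(evaluate(frozenset(selected + [c])), c) for c in remaining]
        top_score, top_feature = max(scored)
        # Stop once no candidate improves the score by more than min_gain.
        if top_score - best <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score

    return selected, best
```

Each round retrains once per remaining candidate, so with 30 or more candidate features the cost grows quickly; a cheaper variant would test each candidate once against the baseline and keep every one that helps.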

Authors: Wang Hong-lin (汪红林), Wang Hong-ling (王红玲), Zhou Guo-dong (周国栋)

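Affiliations: School of Computer Science and Technology, Soochow University; Jiangsu Provincial Key Laboratory of Computer Information Processing Technology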

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 9, pp. 134-137 (4 pages)

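Funding: National Natural Science Foundation of China (No. 60673041); National High Technology Research and Development Program of China (863 Program) (No. 2006AA01Z147); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060285008)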

Keywords: predicate identification and predicate classification; semantic analysis; feature engineering; maximum entropy classifier

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]


