Abstract
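For the task of anaphoricity determination for Uyghur noun phrases, this paper proposes a method based on semantic features that uses a Stacked Nonnegative Constrained Autoencoder (SNCAE). To improve the sparsity of the autoencoder's hidden-layer activations and the quality of the reconstructed data, the NCAE nonnegativity-constraint algorithm is used to impose a nonnegativity constraint on the connection weights. By analyzing referential phenomena in Uyghur noun phrases, 15 features are extracted; SNCAE then derives deep semantic features from them, and a Softmax classifier is introduced to complete the anaphoricity determination task. On this task, the method's positive-class and negative-class accuracy are 8.259% and 4.158% higher than those of SVM, and 1.884% and 1.590% higher than those of a stacked autoencoder (SAE), indicating that the SNCAE-based method is better suited than SVM and SAE to anaphoricity determination in Uyghur text.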
Focusing on the Uyghur noun phrase anaphoricity determination task, this paper proposes a method based on semantic features that uses a Stacked Nonnegative Constrained Autoencoder (SNCAE). Through analysis of referential phenomena in Uyghur noun phrases, 15 semantic features are extracted and then fed into the SNCAE to extract deep semantic features. Finally, a Softmax classifier is used to complete the recognition task. Compared with the Support Vector Machine (SVM), positive accuracy and negative accuracy increase by 8.259% and 4.158%, respectively; compared with the Stacked Autoencoder (SAE), they increase by 1.884% and 1.590%, respectively.
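The nonnegativity constraint on connection weights mentioned in the abstract can be illustrated with a minimal NumPy sketch of a single autoencoder layer's loss. This is an assumption-laden illustration, not the paper's implementation: the abstract does not give the loss formula, so the quadratic penalty on negative weights, the KL sparsity term, the hidden size of 8, and the hyperparameters `alpha`, `beta`, and `rho` are all hypothetical choices. Only the input dimension of 15 features comes from the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nonneg_penalty(W):
    """Quadratic penalty applied only to negative weights (assumed form)."""
    neg = np.minimum(W, 0.0)          # keeps negative entries, zeros the rest
    return 0.5 * np.sum(neg ** 2)

def ncae_loss(X, W1, b1, W2, b2, alpha=0.003, beta=3.0, rho=0.05):
    """Reconstruction error + KL sparsity term + nonnegativity penalty.

    alpha, beta, rho are hypothetical hyperparameters, not values from the paper.
    """
    H = sigmoid(X @ W1 + b1)          # hidden-layer activations
    X_hat = sigmoid(H @ W2 + b2)      # reconstruction of the input
    recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))
    # Mean activation per hidden unit, clipped to keep the logarithms finite.
    rho_hat = np.clip(H.mean(axis=0), 1e-8, 1 - 1e-8)
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    penalty = alpha * (nonneg_penalty(W1) + nonneg_penalty(W2))
    return recon + beta * kl + penalty

# Toy usage: 4 samples with 15 features (as in the abstract), 8 hidden units (assumed).
rng = np.random.default_rng(0)
X = rng.random((4, 15))
W1, b1 = rng.normal(scale=0.1, size=(15, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(8, 15)), np.zeros(15)
loss = ncae_loss(X, W1, b1, W2, b2)
```

In a stacked setup, several such layers would be trained one after another and the final hidden representation passed to a Softmax classifier; that part is omitted here.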
Source
《中文信息学报》
CSCD
Peking University Core Journal (北大核心)
2017, No. 5, pp. 92-98, 113 (8 pages in total)
Journal of Chinese Information Processing
Funding
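National Natural Science Foundation of China (61563051, 61662074)
National Natural Science Foundation of China (61262064)
National Natural Science Foundation of China (61331011)
Autonomous Region Science and Technology Talent Training Project (QN2016YX0051)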