Research on an Automatic Grading Model for HSK Reading Texts Based on Supervised Learning


Abstract: To address the lack of effective reference standards and analysis tools for judging the difficulty of HSK (Hanyu Shuiping Kaoshi, the Chinese Proficiency Test) reading materials and mapping them to levels, this study takes reading texts from past HSK examination papers as its research object. Text readability features were extracted, and nine supervised learning algorithms, including support vector machine, random forest, and extreme gradient boosting, were used to build models that automatically assign a self-selected text to the corresponding HSK level. The grading performance of each model was evaluated with multiple metrics, including accuracy and AUC, and the best model was deployed as an online tool. The results show that supervised learning performs well in analyzing and grading HSK reading texts. Of the nine models, extreme gradient boosting graded best, with an accuracy of 0.913 and an AUC of 0.994. The grading model and online tool can grade self-selected HSK texts with high accuracy, helping users select texts in a targeted way and improving learning efficiency.
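The two ends of the pipeline summarized in the abstract, feature extraction and metric-based evaluation, can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the abstract does not list the readability features used, so the three surface features here are hypothetical stand-ins, and it does not say how AUC was aggregated across HSK levels, so macro-averaged one-vs-rest AUC is assumed. The classifier itself (for example, extreme gradient boosting) is omitted; the evaluation functions only expect its predicted labels and per-class probabilities.

```python
import re


def readability_features(text):
    """Return a few illustrative surface features of a Chinese text.

    These features are hypothetical stand-ins, not the paper's feature set.
    """
    # Keep only CJK characters for character-level statistics.
    chars = re.findall(r"[\u4e00-\u9fff]", text)
    # Split on common sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[。！？!?]", text) if s.strip()]
    n_chars = len(chars)
    return {
        "char_count": n_chars,
        "mean_sentence_length": n_chars / max(len(sentences), 1),
        "type_token_ratio": len(set(chars)) / max(n_chars, 1),
    }


def accuracy(y_true, y_pred):
    """Fraction of texts assigned to the correct level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, prob_rows, classes):
    """Macro-averaged one-vs-rest AUC (assumed aggregation scheme).

    prob_rows[i][k] is the predicted probability that text i has level
    classes[k]; every level must occur at least once in y_true.
    """
    aucs = []
    for k, c in enumerate(classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[k] for row in prob_rows]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    print(readability_features("我爱学习。你好吗？"))
    print(accuracy([1, 2, 3, 3], [1, 2, 3, 2]))
```

In practice the feature dictionaries would be turned into vectors and passed to a trained classifier, and the two metric functions would be applied to its held-out predictions.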

Authors: REN Meng; WANG Fangwei (College of Chinese and Literature, Hebei Normal University, Shijiazhuang, Hebei 050024, China; College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang, Hebei 050024, China)

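Affiliations: College of Chinese and Literature, Hebei Normal University; College of Computer and Cyber Security, Hebei Normal University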

Source: Journal of Hebei University of Science and Technology (河北科技大学学报), 2024, No. 2, pp. 150-158 (9 pages). Indexed in CAS and the Peking University Core Journals list (北大核心).

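Funding: National Natural Science Foundation of China (61572170); Hebei Normal University 2023 Humanities and Social Sciences Intramural Research Fund (S23AI001).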

Keywords: natural language processing; supervised learning; HSK reading text; readability feature; grading model

Classification number: TP391.77 [Automation and Computer Technology: Computer Application Technology]

