数据挖掘技术在学习者作文特征分析中的应用研究 (Cited by: 7)

An Analysis of Learner Compositions through the Application of Data Mining Technology
Abstract: This paper examines how data mining technology can be applied to the analysis of features of Japanese compositions. Analyses of lexical density and text features show that the composition is a distinct genre that differs significantly from other native-speaker corpora. It is characterized by low lexical density, relatively infrequent use of nouns and numerals, a high proportion of verbs and adjectives, short sentences, and a low degree of written-language style. Compositions produced by learners differ markedly from those produced by native speakers: learner compositions contain more descriptions of states and fewer descriptions of actions, with a lower proportion of verbs and auxiliary verbs. By comparison, level-eight compositions are closer to native-speaker compositions, especially those of the upper-grade group, although learners' command of some vocabulary remains insufficient. Word co-occurrence networks show that as proficiency level rises, learners' descriptions become more detailed and specific, their vocabulary gradually approaches that of native speakers, and errors decrease markedly, yet learners never fully escape interference from their mother tongue.
Author: Mao Wenwei (毛文伟), Shanghai International Studies University, China
Source: Journal of Japanese Language Study and Research (《日语学习与研究》), CSSCI, 2022, No. 2, pp. 72-81 (10 pages)
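Funding: This paper is an interim research outcome of the 2019 National Social Science Fund of China project "A Study of the Cognitive Mechanisms of Chinese Learners of Japanese Based on Data Mining Technology" (Project No. 19BYY201). Principal investigator: Mao Wenwei.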
Keywords: second language acquisition (SLA); lexical density; text features; one-way ANOVA; word co-occurrence network