Abstract
Assessing SQL programming ability is difficult and prone to deviation because of multi-factor influences and fuzzy boundaries. To address this, the paper proposes an automated programming assessment model for SQL (SQL-APAM) based on structural similarity matching. The overall framework of the model combines static and dynamic assessment methods. Submitted SQL statements are normalized and tokenized, converted into an equivalent sequence of token pairs, and then used to build a corresponding structure tree (S-tree). The model computes the similarity between this tree and the target tree with a tree edit distance algorithm improved in two respects, the cost model and the sub-structure contribution factor, and obtains a similarity threshold. Finally, drawing on the idea of the normal distribution, the model maps the similarity value to score intervals, uses the similarity threshold to adjust for the deviation introduced by the influencing factors, and gives a quantitative assessment result for the SQL program. The model is analyzed and validated experimentally on data, with a training data set used to tune its parameters and optimize the model.
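The pipeline described in the abstract (normalize, tokenize, build a tree, compare trees, map similarity to a score) can be illustrated with a minimal sketch. It is not the paper's algorithm: the two-level clause tree stands in for the S-tree, a Selkow-style top-down edit distance with unit costs replaces the improved tree edit distance (no custom cost model, no sub-structure contribution factor), and the names `tokenize`, `build_tree`, `tree_distance`, `similarity`, and `to_score`, along with the values of `threshold`, `mu`, and `sigma`, are all assumptions made for illustration.

```python
import re
from statistics import NormalDist

# Clause keywords that open a subtree (assumed subset of SQL).
CLAUSES = ("select", "from", "where", "group by", "having", "order by")

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

    def size(self):
        return 1 + sum(c.size() for c in self.children)

def tokenize(sql):
    """Normalize case and whitespace, then split into tokens.
    Simplification: string literals are lowercased too."""
    sql = re.sub(r"\s+", " ", sql.strip().rstrip(";").lower())
    pattern = (r"\bgroup by\b|\border by\b|[a-z_][\w.]*|\d+(?:\.\d+)?"
               r"|'[^']*'|<=|>=|<>|!=|[^\s\w]")
    return re.findall(pattern, sql)

def build_tree(tokens):
    """Two-level tree: root -> clause nodes -> token leaves.
    Subqueries are flattened; a real S-tree would nest them."""
    root = Node("query")
    current = root
    for tok in tokens:
        if tok in CLAUSES:
            current = Node(tok)
            root.children.append(current)
        else:
            current.children.append(Node(tok))
    return root

def tree_distance(a, b):
    """Selkow-style top-down edit distance with unit costs:
    relabel a node, or insert/delete a whole subtree."""
    m, n = len(a.children), len(b.children)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + a.children[i - 1].size()
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + b.children[j - 1].size()
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            ca, cb = a.children[i - 1], b.children[j - 1]
            d[i][j] = min(d[i - 1][j] + ca.size(),      # delete subtree
                          d[i][j - 1] + cb.size(),      # insert subtree
                          d[i - 1][j - 1] + tree_distance(ca, cb))
    return (a.label != b.label) + d[m][n]

def similarity(submitted_sql, target_sql):
    """Distance normalized by total tree size, giving a value in (0, 1]."""
    ta = build_tree(tokenize(submitted_sql))
    tb = build_tree(tokenize(target_sql))
    return 1.0 - tree_distance(ta, tb) / (ta.size() + tb.size())

def to_score(sim, threshold=0.95, mu=0.6, sigma=0.2):
    """Hypothetical mapping to a 0-100 score via a normal CDF.
    threshold, mu and sigma are illustrative values, not the paper's."""
    if sim >= threshold:
        return 100.0
    return round(100.0 * NormalDist(mu, sigma).cdf(sim), 1)
```

With this sketch, two statements that differ only in letter case, whitespace, or a trailing semicolon get a similarity of 1.0, while changing one column name in `select name from t` lowers it slightly. In the paper, the cost model, the contribution factors, and the score mapping are tuned on a training data set, which this sketch does not attempt.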
Source
《计算机工程与科学》
CSCD
Peking University Core Journal (北大核心)
2010, Issue 11, pp. 92-96 (5 pages)
Computer Engineering & Science
Funding
Jiangsu Province High-Tech Research Funding Project (BG2007028)
Keywords
similarity analysis
automated assessment
tokenization
tree edit distance
normal distribution