Abstract
In automatic item generation (AIG), item parameters are predicted from the stimulus features specified in the cognitive item design, and therefore involve more complex sources of uncertainty than parameters calibrated from empirical data. This paper empirically examined the accuracy of ability estimation in computerized adaptive testing when predicted parameters of abstract reasoning test (ART) items, generated with the cognitive design system approach, were used. The study showed that predicted item parameters were distributed more toward the middle of the scale than their calibrated counterparts. This regression effect affected both the magnitude of errors in ability estimation and item selection during adaptive testing. After controlling for differences in item selection, errors of ability estimates based on predicted parameters were larger than those based on calibrated parameters, but the difference was not pronounced. The two sets of ability estimates were highly correlated, and the differences between corresponding estimates were small across almost the entire ability distribution.
Automatic item generation has become a promising area of recent research. In automatic item generation, items with targeted psychometric properties are generated during testing. Its feasibility rests on the fact that items are generated from a set of observable stimulus features, which are mapped onto the cognitive variables underlying item solution and calibrated through cognitive psychometric models. The parameters of a generated item can then be predicted from the specific combination of calibrated stimulus features in that item. Predicted item parameters, compared with those calibrated from empirical data, involve more complex sources of uncertainty. Although the relationship between the sufficiency of the cognitive model of item solving and the adequacy of item parameter prediction can be justified theoretically, the degree to which predicted parameters affect various aspects of testing is an empirical question that needs to be explored. This paper investigated the impact of predicted item parameters on ability estimates in a computerized adaptive testing environment, using abstract reasoning test (ART) items generated with the cognitive design system approach (Embretson, 1998). The item bank contained 150 items with two sets of item difficulties: one predicted from the item design features and the other calibrated from sample data. Each of the 263 subjects who participated in the study received two subtests, one based on predicted parameters and the other on calibrated parameters. The item bank was split into two parallel halves based on predicted item parameters so that no item was administered twice to the same subject. Subjects were randomly assigned to one of the four testing procedures defined by crossing parameter type (predicted versus calibrated) with item bank half (first versus second).
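The prediction step described above — deriving an item's difficulty from calibrated weights on its design features — can be sketched as a linear model over stimulus features, in the spirit of the linear logistic test model. The feature names, weights, and intercept below are hypothetical illustrations, not values from the paper:

```python
# Sketch: predicting item difficulty from calibrated design-feature weights,
# in the spirit of the linear logistic test model (LLTM).
# Feature names and weight values are hypothetical, for illustration only.

def predict_difficulty(features, weights, intercept=0.0):
    """Predicted difficulty: b = intercept + sum_k weight_k * feature_k."""
    return intercept + sum(weights[k] * v for k, v in features.items())

# Hypothetical calibrated weights for ART-like stimulus features
weights = {"num_rules": 0.45, "abstraction": 0.30, "distortion": 0.20}

# A generated item described by its design-feature values
item = {"num_rules": 3, "abstraction": 1, "distortion": 0}
b_pred = predict_difficulty(item, weights, intercept=-1.0)  # -> 0.65
```

Because the weights themselves carry estimation error, the uncertainty of `b_pred` compounds the uncertainty of each feature weight, which is why predicted parameters involve more complex sources of uncertainty than directly calibrated ones.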
Results of the study showed a clear regression-to-the-mean effect of the predicted item parameters relative to their calibrated counterparts. Inward biases of ability estimates from the subtest using predicted item parameters were observed when ability estimates were compared across subtests within subjects. Compared with their counterparts based on calibrated parameters, standard errors of ability estimates from the subtest using predicted parameters were larger in the mid-range of the scale, where the regression-to-the-mean effect is minimal, and smaller over the rest of the scale, possibly owing to the joint impact of the increased uncertainty of predicted item parameters, estimation biases, and limitations of the item bank at various ability levels. When ability was estimated from the same subtest using the two types of item parameters, a very high correlation (.995) was obtained and no bias was observed over almost the entire scale. Standard errors of ability estimates were larger for predicted parameters, but the differences were small.
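The regression-to-the-mean effect reported above shows up as a reduced spread of predicted difficulties around the mean. A minimal sketch on synthetic data (the shrinkage factor and noise level here are assumptions, not the paper's estimates):

```python
# Sketch: the regression-to-the-mean effect as reduced spread of predicted
# vs. calibrated item difficulties. Data are synthetic; the 0.7 shrinkage
# factor and 0.3 noise SD are illustrative assumptions.
import random
import statistics

random.seed(0)

# Calibrated difficulties for a 150-item bank, roughly N(0, 1)
calibrated = [random.gauss(0.0, 1.0) for _ in range(150)]

# Imperfect prediction shrinks values toward the mean and adds noise
predicted = [0.7 * b + random.gauss(0.0, 0.3) for b in calibrated]

sd_cal = statistics.stdev(calibrated)
sd_pred = statistics.stdev(predicted)

# Predicted difficulties cluster more tightly around the mean,
# so extreme items are underrepresented at the ends of the scale.
print(sd_pred < sd_cal)  # True
```

This tighter clustering is what degrades item selection at the extremes of the ability scale: the adaptive algorithm, working from predicted parameters, sees fewer items that appear sufficiently easy or hard.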
Source
Acta Psychologica Sinica (《心理学报》)
CSSCI
CSCD
Peking University Core Journal (北大核心)
2010, No. 7, pp. 802-812 (11 pages)
Funding
Supported by the Shanghai Pujiang Program (上海市浦江人才计划)