The Impact on Ability Estimates of Predicted Parameters from Cognitively Designed Items in a Computerized Adaptive Testing Environment

Cited by: 1
Abstract  Automatic item generation (AIG) has become a promising area of recent research. In AIG, items with targeted psychometric properties are generated during testing. Its feasibility rests on the fact that items are generated from a set of observable stimulus features, which map onto the cognitive variables underlying item solution and are calibrated through cognitive psychometric models. The parameters of a generated item can then be predicted from the specific combination of calibrated stimulus features it contains. Compared with parameters calibrated from empirical data, predicted item parameters involve more complex sources of uncertainty. Although the relationship between the sufficiency of the cognitive model of item solving and the adequacy of item parameter prediction can be justified theoretically, the degree to which predicted parameters affect various aspects of testing is an empirical question that needs to be explored.

This paper investigated the impact of predicted item parameters on ability estimates in a computerized adaptive testing (CAT) environment, using abstract reasoning test (ART) items generated with the cognitive design system approach (Embretson, 1998). The item bank contained 150 items with two sets of item difficulties: one predicted from the item design features and the other calibrated from sample data. Each of the 263 subjects who participated in the study received two subtests, one based on predicted parameters and the other on calibrated parameters. The item bank was split into two parallel halves on the basis of the predicted parameters so that no item was administered twice to the same subject. Subjects were randomly assigned to one of the four testing procedures formed by crossing parameter type (predicted versus calibrated) with item-bank half (first versus second).

Results showed a clear regression-to-the-mean effect in the predicted item parameters relative to their calibrated counterparts; this regression effect influenced both the magnitude of ability-estimation error and item selection during the adaptive test. When ability estimates were compared across subtests within subjects, estimates from the subtest using predicted parameters were biased inward. Standard errors of ability estimates from the predicted-parameter subtest were larger than those from the calibrated-parameter subtest in the middle of the scale, where the regression-to-the-mean effect is minimal, and smaller over the rest of the scale, possibly reflecting the joint impact of increased uncertainty in the predicted parameters, estimation biases, and limitations of the item bank at various ability levels. When ability was estimated from the same subtest with the two types of parameters, thereby controlling for differences in item selection, a very high correlation (.995) was obtained and no bias was observed over almost the entire scale; standard errors were larger for the predicted parameters, but the differences were small.
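The parameter-prediction step described in the abstract can be made concrete with a small numerical sketch. In an LLTM-style cognitive psychometric model, a generated item's difficulty is a weighted sum of its design-feature scores, with weights calibrated on previously administered items. The Q-matrix, feature weights, and intercept below are hypothetical values for illustration only, not the calibrated ART model reported in the paper.

```python
import numpy as np

# Q-matrix: one row per generated item, one column per design-feature
# score (e.g., number of rules, abstractness of correspondence).
# All values here are illustrative assumptions.
Q = np.array([
    [1, 0, 2],
    [2, 1, 1],
    [3, 1, 0],
])

# Feature weights (eta) and intercept, as if calibrated with an
# LLTM-style model on previously administered items (hypothetical).
eta = np.array([0.45, 0.80, 0.30])
intercept = -1.2

# Predicted difficulty of each generated item: b_i = q_i . eta + c
b_predicted = Q @ eta + intercept
print(b_predicted)   # [-0.15  0.8   0.95]
```

Because the weights themselves carry estimation error, predicted difficulties inherit uncertainty beyond ordinary sampling error, which is the extra layer of uncertainty the paper highlights.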
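To see how using predicted rather than calibrated difficulties can shift both item selection and ability estimation in CAT, the following sketch runs a Rasch-model adaptive test over a simulated 150-item bank, once with each parameter set. The shrinkage factor, prior, grid-based EAP scoring, and test length are assumptions chosen for illustration; this is not the study's actual CAT procedure.

```python
import numpy as np

rng = np.random.default_rng(7)
n_items, test_len, true_theta = 150, 20, 0.8

b_calibrated = rng.normal(0.0, 1.0, n_items)
# Hypothetical regression-to-the-mean: predictions shrink toward 0.
b_predicted = 0.7 * b_calibrated + rng.normal(0.0, 0.3, n_items)

grid = np.linspace(-4.0, 4.0, 161)
prior = np.exp(-0.5 * grid**2)          # N(0, 1) prior, unnormalized

def prob(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def run_cat(b_used, seed=42):
    """Adaptive test selected and scored with b_used; responses are
    always generated from the calibrated ('true') difficulties."""
    resp_rng = np.random.default_rng(seed)  # same response stream per run
    posterior = prior.copy()
    theta_hat = 0.0
    administered = []
    for _ in range(test_len):
        p = prob(theta_hat, b_used)
        info = p * (1.0 - p)                # Rasch item information
        info[administered] = -np.inf        # exclude used items
        item = int(np.argmax(info))         # maximum-information selection
        administered.append(item)
        x = resp_rng.random() < prob(true_theta, b_calibrated[item])
        like = prob(grid, b_used[item])
        posterior *= like if x else (1.0 - like)  # Bayesian update
        theta_hat = float(np.sum(grid * posterior) / np.sum(posterior))
    se = float(np.sqrt(np.sum((grid - theta_hat) ** 2 * posterior)
                       / np.sum(posterior)))
    return theta_hat, se

print("calibrated parameters:", run_cat(b_calibrated))
print("predicted parameters: ", run_cat(b_predicted))
```

Because selection maximizes information at the current ability estimate, even modest shrinkage in the difficulty scale changes which items are administered; this is why the study compares estimates from the same subtest to control for item-selection differences before comparing estimation errors.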
Author  Yang Xiangdong (杨向东)
Source  Acta Psychologica Sinica (《心理学报》), 2010, Issue 7, pp. 802-812 (11 pages); indexed in CSSCI, CSCD, and the Peking University Core Journals list
Funding  Supported by the Shanghai Pujiang Program (Shanghai Pujiang Talent Plan)
Keywords  automatic item generation; cognitive design system approach; abstract reasoning test (ART); computerized adaptive testing; predicted item parameters

References (21)

  • 1. Bejar, I. I. (1990). A generative analysis of a three-dimensional spatial task. Applied Psychological Measurement, 21, 1-24.
  • 2. Bejar, I. I., Lawless, R. R., Morley, M. E., Wagner, M. E., Bennett, R. E., & Revuelta, J. (2003). A feasibility study of on-the-fly item generation in adaptive testing. Journal of Technology, Learning, and Assessment, 2(3). Available from http://www.jtla.org
  • 3. Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of processing in the Raven's Progressive Matrices Test. Psychological Review, 97, 404-431.
  • 4. de Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York, NY: Springer-Verlag.
  • 5. Embretson, S. E. (1980). Multicomponent latent trait models for ability tests. Psychometrika, 45, 479-494.
  • 6. Embretson, S. E. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93, 179-197.
  • 7. Embretson, S. E. (1985). Introduction to the problem of test design. In S. Embretson (Ed.), Test design: Developments in psychology and psychometrics (pp. 3-17). New York, NY: Academic Press.
  • 8. Embretson, S. E. (1994). Application of cognitive design systems to test development. In C. R. Reynolds (Ed.), Cognitive assessment: A multidisciplinary perspective (pp. 107-135). New York, NY: Plenum Press.
  • 9. Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380-396.
  • 10. Embretson, S. E. (1999). Generating items during testing: Psychometric issues and models. Psychometrika, 64, 407-433.
