The Impact on Ability Estimates of Predicted Parameters from Cognitively Designed Items in a Computerized Adaptive Testing Environment

Cited by: 1
Abstract  Automatic item generation (AIG) has become a promising area of recent research. In AIG, items with targeted psychometric properties are generated during testing. Its feasibility rests on the fact that items are generated from a set of observable stimulus features, which map onto the cognitive variables underlying item solution and are calibrated through cognitive psychometric models. The parameters of a generated item can then be predicted from the specific combination of calibrated stimulus features it contains. Compared with parameters calibrated from empirical data, predicted item parameters involve more complex sources of uncertainty. Although the relationship between the sufficiency of the cognitive model of item solving and the adequacy of item parameter prediction can be justified theoretically, the degree to which predicted parameters affect various aspects of testing is an empirical question that needs to be explored.

This paper investigated the impact of predicted item parameters on ability estimates in a computerized adaptive testing (CAT) environment, using abstract reasoning test (ART) items generated with the cognitive design system approach (Embretson, 1998). The item bank contained 150 items with two sets of item difficulties: one predicted from the item design features and the other calibrated from sample data. Each of the 263 subjects who participated in the study received two subtests, one based on predicted parameters and the other on calibrated parameters. The item bank was split into two parallel halves on the basis of the predicted parameters so that no item was administered twice to the same subject. Subjects were randomly assigned to one of the four testing procedures formed by crossing parameter type (predicted versus calibrated) with item-bank half (first versus second).

Results showed a clear regression-to-the-mean effect in the predicted item parameters relative to their calibrated counterparts; this regression effect influenced both the magnitude of ability-estimation error and item selection during the adaptive test. When ability estimates were compared across subtests within subjects, estimates from the subtest using predicted parameters were biased inward. Standard errors of ability estimates from the predicted-parameter subtest were larger than those from the calibrated-parameter subtest in the middle of the scale, where the regression-to-the-mean effect is minimal, and smaller over the rest of the scale, possibly reflecting the joint impact of increased uncertainty in the predicted parameters, estimation biases, and limitations of the item bank at various ability levels. When ability was estimated from the same subtest with the two types of parameters, thereby controlling for differences in item selection, a very high correlation (.995) was obtained and no bias was observed over almost the entire scale; standard errors were larger for the predicted parameters, but the differences were small.
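The parameter-prediction step described in the abstract can be made concrete with a small numerical sketch. In an LLTM-style cognitive psychometric model, a generated item's difficulty is a weighted sum of its design-feature scores, with weights calibrated on previously administered items. The Q-matrix, feature weights, and intercept below are hypothetical values for illustration only, not the calibrated ART model reported in the paper.

```python
import numpy as np

# Q-matrix: one row per generated item, one column per design-feature
# score (e.g., number of rules, abstractness of correspondence).
# All values here are illustrative assumptions.
Q = np.array([
    [1, 0, 2],
    [2, 1, 1],
    [3, 1, 0],
])

# Feature weights (eta) and intercept, as if calibrated with an
# LLTM-style model on previously administered items (hypothetical).
eta = np.array([0.45, 0.80, 0.30])
intercept = -1.2

# Predicted difficulty of each generated item: b_i = q_i . eta + c
b_predicted = Q @ eta + intercept
print(b_predicted)   # [-0.15  0.8   0.95]
```

Because the weights themselves carry estimation error, predicted difficulties inherit uncertainty beyond ordinary sampling error, which is the extra layer of uncertainty the paper highlights.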
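To see how using predicted rather than calibrated difficulties can shift both item selection and ability estimation in CAT, the following sketch runs a Rasch-model adaptive test over a simulated 150-item bank, once with each parameter set. The shrinkage factor, prior, grid-based EAP scoring, and test length are assumptions chosen for illustration; this is not the study's actual CAT procedure.

```python
import numpy as np

rng = np.random.default_rng(7)
n_items, test_len, true_theta = 150, 20, 0.8

b_calibrated = rng.normal(0.0, 1.0, n_items)
# Hypothetical regression-to-the-mean: predictions shrink toward 0.
b_predicted = 0.7 * b_calibrated + rng.normal(0.0, 0.3, n_items)

grid = np.linspace(-4.0, 4.0, 161)
prior = np.exp(-0.5 * grid**2)          # N(0, 1) prior, unnormalized

def prob(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def run_cat(b_used, seed=42):
    """Adaptive test selected and scored with b_used; responses are
    always generated from the calibrated ('true') difficulties."""
    resp_rng = np.random.default_rng(seed)  # same response stream per run
    posterior = prior.copy()
    theta_hat = 0.0
    administered = []
    for _ in range(test_len):
        p = prob(theta_hat, b_used)
        info = p * (1.0 - p)                # Rasch item information
        info[administered] = -np.inf        # exclude used items
        item = int(np.argmax(info))         # maximum-information selection
        administered.append(item)
        x = resp_rng.random() < prob(true_theta, b_calibrated[item])
        like = prob(grid, b_used[item])
        posterior *= like if x else (1.0 - like)  # Bayesian update
        theta_hat = float(np.sum(grid * posterior) / np.sum(posterior))
    se = float(np.sqrt(np.sum((grid - theta_hat) ** 2 * posterior)
                       / np.sum(posterior)))
    return theta_hat, se

print("calibrated parameters:", run_cat(b_calibrated))
print("predicted parameters: ", run_cat(b_predicted))
```

Because selection maximizes information at the current ability estimate, even modest shrinkage in the difficulty scale changes which items are administered; this is why the study compares estimates from the same subtest to control for item-selection differences before comparing estimation errors.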
Author  Yang Xiangdong (杨向东)
Source  Acta Psychologica Sinica (《心理学报》), 2010, Issue 7, pp. 802-812 (11 pages); indexed in CSSCI, CSCD, and the Peking University Core Journals list
Funding  Supported by the Shanghai Pujiang Program (Shanghai Pujiang Talent Plan)
Keywords  automatic item generation; cognitive design system approach; abstract reasoning test (ART); computerized adaptive testing; predicted item parameters

References (21)

  • 1. Bejar, I. I. (1990). A generative analysis of a three-dimensional spatial task. Applied Psychological Measurement, 21, 1-24.
  • 2. Bejar, I. I., Lawless, R. R., Morley, M. E., Wagner, M. E., Bennett, R. E., & Revuelta, J. (2003). A feasibility study of on-the-fly item generation in adaptive testing. Journal of Technology, Learning, and Assessment, 2(3). Available from http://www.jtla.org
  • 3. Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of processing in the Raven's Progressive Matrices Test. Psychological Review, 97, 404-431.
  • 4. de Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York, NY: Springer-Verlag.
  • 5. Embretson, S. E. (1980). Multicomponent latent trait models for ability tests. Psychometrika, 45, 479-494.
  • 6. Embretson, S. E. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93, 179-197.
  • 7. Embretson, S. E. (1985). Introduction to the problem of test design. In S. Embretson (Ed.), Test design: Developments in psychology and psychometrics (pp. 3-17). New York, NY: Academic Press.
  • 8. Embretson, S. E. (1994). Application of cognitive design systems to test development. In C. R. Reynolds (Ed.), Cognitive assessment: A multidisciplinary perspective (pp. 107-135). New York, NY: Plenum Press.
  • 9. Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380-396.
  • 10. Embretson, S. E. (1999). Generating items during testing: Psychometric issues and models. Psychometrika, 64, 407-433.
