Abstract
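To address the sleeping phenomenon, in which high-ability examinees answer easy items incorrectly, data can be analyzed with the four-parameter logistic model. This study took real data from a psychological test and an achievement test, fitted both traditional models and the four-parameter logistic model, and compared the models' fit indices and parameter estimates. The results show that the four-parameter logistic model improves fit, increases the accuracy of the estimates, and effectively corrects the underestimation of high-ability examinees' ability. We recommend using the four-parameter logistic model for data analysis when necessary.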
High-ability test-takers may occasionally answer an easy question incorrectly, which is called the sleeping phenomenon (Wright, 1977). In such situations, the four-parameter logistic model (4PM) may be uniquely suited to characterizing the data. The 4PM was proposed by Barton and Lord (1981), who added a d parameter to allow upper asymptotes to be less than 1.00. The more general formulation of the 4PM (Waller & Reise, 2010) treats d as an item-specific upper asymptote. In addition, a three-parameter logistic model for reversed data (3PM_R) has been discussed, which suits situations with a sleeping phenomenon but no guessing. In previous research, the 4PM provided good fit for some psychological tests, such as the MMPI. For achievement tests, however, Barton and Lord found in their earlier work that the 4PM failed to improve the likelihood or to significantly change any ability estimates for the datasets collected by ETS. Is it therefore really inappropriate to use the 4PM in achievement tests? Moreover, most previous research focused on differences in parameter estimates based on simulated data, whereas how often the sleeping phenomenon happens in real situations still deserves study. In our research, we fitted seven models to the Taylor Manifest Anxiety Scale (TMA) and a large-scale Maths test. The Maths test dataset was also used to construct two different distributions: an approximately normal distribution (skewness = 0.097) and a negatively skewed distribution (skewness = -0.199). The models compared were the Rasch model, the two-parameter logistic model (2PM), the three-parameter logistic model (3PM), the 3PM with reversed scores on each item (3PM_R), the 4PM, the 4PM with equal guessing parameters (4PM_c), and the 4PM with equal d parameters (4PM_d). The R package sirt was used to estimate all the models in our study.
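The role of the d parameter can be illustrated with a minimal Python sketch of the item response function. This is not the code used in the study (which relied on the R package sirt). The symbol names a (discrimination), b (difficulty), and c (guessing), the function name `irf_4pm`, the omission of any scaling constant, and all numeric values are assumptions chosen for illustration only.

```python
import math

def irf_4pm(theta, a=1.0, b=0.0, c=0.0, d=1.0):
    """Probability of a correct response under a four-parameter logistic model.

    theta: respondent ability; a: discrimination; b: difficulty;
    c: lower asymptote (guessing); d: upper asymptote (sleeping).
    Assumed form: c + (d - c) / (1 + exp(-a * (theta - b))).
    Fixing d = 1 gives the 3PM; additionally fixing c = 0 gives the 2PM;
    additionally fixing a = 1 gives a Rasch-type model.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical easy item (b = -2) met by a high-ability respondent (theta = 3).
p_3pm = irf_4pm(3.0, a=1.5, b=-2.0, c=0.2)         # upper asymptote fixed at 1
p_4pm = irf_4pm(3.0, a=1.5, b=-2.0, c=0.2, d=0.9)  # upper asymptote below 1

print(f"3PM: {p_3pm:.4f}  4PM: {p_4pm:.4f}")
```

With d below 1, even a very able respondent retains a non-negligible chance of missing the easy item, so a single such miss lowers the ability estimate less than it would under the 3PM.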
To investigate the differences among these models, we computed: (1) the model fit indices AIC and BIC; (2) the correlations between the item parameter estimates of the best-fitting logistic model with a d parameter and those of the second-best-fitting model without a d parameter, for all items and after the easiest 5, 10, and 10 items were deleted; (3) the correlations between the ability parameter estimates of the two models discussed in (2), for all respondents and for the top 1000, 500, 300, 200, and 100 respondents. The results indicated that (1) the Rasch model showed the worst fit for all datasets; for the TMA data the 3PM_R showed the best fit, and for the Maths tests the 4PM showed the best fit; (2) the difficulty parameters were quite similar in the two compared models, but there was a larger difference between the discrimination parameters, and the negatively skewed Standard Maths test data showed similar results; when the easiest items were deleted, the correlation of the discrimination parameters became larger, especially for the negatively skewed Standard Maths test; (3) the ability parameters of the two compared models correlated highly when all respondents were included, but the correlations for the top 1000, 500, 300, 200, and 100 groups were relatively small, especially for the top 100 respondents. In conclusion, the 4PM is necessary in both psychological tests and achievement tests. Practitioners deciding whether to choose the 4PM should consider the type of test, the purpose of the test, and the computational complexity together.
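The comparison criteria named above can be sketched in Python as follows. This is an illustrative sketch, not the authors' procedure: the helper names are hypothetical, the AIC and BIC formulas are the standard definitions, and ranking the "top" respondents by the first model's ability estimates is an assumption, since the abstract does not state how the top groups were selected.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion; smaller values indicate better fit."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; the penalty grows with sample size."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def top_n_correlation(theta_a, theta_b, n_top):
    """Correlate two sets of ability estimates among the n_top respondents.

    Assumption: respondents are ranked by theta_a (the first model).
    """
    order = sorted(range(len(theta_a)), key=lambda i: theta_a[i], reverse=True)
    keep = order[:n_top]
    return pearson([theta_a[i] for i in keep], [theta_b[i] for i in keep])

# Made-up ability estimates from two models, for illustration only.
theta_model_a = [0.0, 1.0, 2.0, 3.0]
theta_model_b = [0.1, 0.9, 2.5, 2.4]
r_all = pearson(theta_model_a, theta_model_b)
r_top2 = top_n_correlation(theta_model_a, theta_model_b, 2)
```

In this toy data the two models agree closely over the whole sample but disagree on the ordering of the two highest-ranked respondents, which is the kind of pattern a top-group correlation is meant to expose.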
Authors
刘玥
刘红云
Liu Yue; Liu Hongyun (School of Psychology, Beijing Normal University, Beijing 100875)
Source
《心理学探新》
CSSCI
Peking University Core Journals (北大核心)
2018, Issue 3, pp. 228-235 (8 pages)
Psychological Exploration