Abstract
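This study explores how gene expression programming (GEP) can be used to model measurement data from self-report scales. Scores for 400 middle school students were collected with the Williams Creativity Assessment Packet and the Need for Cognition Scale; after data cleaning, the scores of 383 participants were retained as the modeling data set. Harman's single-factor test revealed no common method bias. A uniform design was used to optimize the configuration of five GEP parameters, and the model with the smallest test error was found under the experimental condition with the highest test fitness. The predictive accuracy of the GEP model was compared with that of models built with BP neural networks, support vector regression, multivariate linear regression, and quadratic polynomial regression. The results show that GEP can be used to model self-report scale data, that its models achieve higher predictive accuracy than those built with traditional methods, and that the models are robust.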
Traditional analytical models such as regression often struggle to represent the complex relations among psychological variables. Neural networks and support vector regression can be used instead, but the models they produce are implicit: they do not yield an explicit expression relating the variables. Gene expression programming (GEP) can build explicit models from the observed variables. At present, most data modeled with GEP have been obtained by objective methods, whereas much psychological measurement data comes from self-report instruments and is affected by many subjective factors. Can such data be modeled with GEP? How large is the modeling error? Does GEP have any advantage over multivariate linear regression or polynomial regression? Is it more accurate than neural networks and support vector regression? This paper explores these questions. Responses from 400 middle school students were collected with the Williams Creativity Assessment Packet and the Need for Cognition Scale. Seventeen students were excluded because of abnormal responses, and the data from the remaining 383 students were retained for modeling. Harman's single-factor test found no common method bias. Five GEP parameters were optimized with a uniform design: head length, number of genes, fitness function, number of chromosomes, and mutation probability. Each parameter had nine levels, and the uniform design combined them into different test conditions. The condition with the maximum fitness was identified experimentally, and the GEP program was run 10 times under this condition. The accuracy of the resulting models was calculated, the model with the minimum error was identified, and its expression tree was drawn.
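To make the notion of a GEP expression tree concrete, the following minimal sketch decodes a gene written in the breadth-first (Karva) notation commonly used in GEP and evaluates the resulting tree. The function set, the variable names `a` to `d`, and the sample gene `Q*+-abcd` are illustrative assumptions; they are not the model or the settings reported in this study.

```python
import math
import operator

# Illustrative function set (an assumption, not the one used in the study).
# "Q" stands for square root; any symbol not listed here is a terminal.
BINARY = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "/": operator.truediv}
ARITY = {**{symbol: 2 for symbol in BINARY}, "Q": 1}


def decode(k_expression):
    """Decode a K-expression into a tree by filling it level by level."""
    nodes = [(symbol, []) for symbol in k_expression]
    queue = [nodes[0]]      # nodes whose children still need to be attached
    next_free = 1           # index of the next unused symbol in the gene
    position = 0
    while position < len(queue):
        symbol, children = queue[position]
        position += 1
        for _ in range(ARITY.get(symbol, 0)):
            child = nodes[next_free]
            next_free += 1
            children.append(child)
            queue.append(child)
    return nodes[0]         # symbols after next_free are non-coding


def to_infix(node):
    """Render the tree as a readable formula."""
    symbol, children = node
    if not children:
        return symbol
    if len(children) == 1:
        return f"{symbol}({to_infix(children[0])})"
    left, right = children
    return f"({to_infix(left)} {symbol} {to_infix(right)})"


def evaluate(node, values):
    """Evaluate the tree for a dict mapping terminal names to numbers."""
    symbol, children = node
    if symbol not in ARITY:
        return values[symbol]
    args = [evaluate(child, values) for child in children]
    if symbol == "Q":
        return math.sqrt(args[0])
    return BINARY[symbol](args[0], args[1])


tree = decode("Q*+-abcd")
print(to_infix(tree))                                     # Q(((a + b) * (c - d)))
print(evaluate(tree, {"a": 1, "b": 2, "c": 5, "d": 2}))   # 3.0
```

Because the decoded tree is an explicit formula, a model found by GEP can be read and checked directly, unlike the weights of a neural network.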
Models of the relation between need for cognition and creative personality traits were also built with BP neural networks, support vector regression, multivariate linear regression, and polynomial regression, and compared with the GEP model. The results showed that (a) of the ten GEP models, model 10, with four independent variables, had the highest accuracy; (b) the expressions of the ten models differed but their predictive errors were very close, which supports the robustness of GEP modeling; and (c) the predictive errors were 1.28 for GEP, 2.76 for BP neural networks, 2.31 for support vector regression, 3.21 for polynomial regression, and 3.86 for multivariate linear regression. It can be concluded that (a) data from self-report instruments can be modeled with GEP even though they are affected by many subjective factors; (b) GEP modeling is more accurate than the other intelligent computing methods (neural networks, support vector regression) and than traditional statistical methods (multivariate linear regression, polynomial regression); and (c) models built with GEP are robust: their predictive accuracy is similar even though their mathematical formulae differ considerably.
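The size of the reported differences is easier to judge as ratios. The short sketch below ranks the five methods by the predictive errors quoted above and expresses each error relative to the smallest one. The abstract does not name the error metric, so the numbers are treated simply as "lower is better"; the variable and function names are illustrative.

```python
# Predictive errors exactly as reported in the abstract.
# The error metric itself is not named there (assumption: lower is better).
errors = {
    "GEP": 1.28,
    "BP neural network": 2.76,
    "Support vector regression": 2.31,
    "Polynomial regression": 3.21,
    "Multivariate linear regression": 3.86,
}

# Sort methods from smallest to largest error.
ranked = sorted(errors.items(), key=lambda item: item[1])
best_name, best_error = ranked[0]


def ratio_to_best(error):
    """Return how many times larger an error is than the smallest one."""
    return error / best_error


for name, error in ranked:
    print(f"{name:32s} {error:5.2f}  x{ratio_to_best(error):.2f}")
```

On these figures, the next-best method (support vector regression) has about 1.8 times the error of GEP, and multivariate linear regression about 3 times.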
Source
Acta Psychologica Sinica (《心理学报》)
CSSCI
CSCD
Peking University Core Journals
2013, No. 6, pp. 704-714 (11 pages)
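Funding
National Social Science Fund of China, Education Program (BBA080050)
National Natural Science Foundation of China (71071065, 71131004)
Jiangsu Province First-Level Key Discipline "Psychology" funded project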
Keywords
self-report instrument
gene expression programming
modeling
creativity
uniform design