Funding: This research project is supported by the Science Foundation of Beijing Language and Culture University (supported by the Fundamental Research Funds for the Central Universities) (21YBB35), the Hainan Provincial Natural Science Foundation of China (620RC562), the Program of Hainan Association for Science and Technology Plans to Youth R&D Innovation (Grant No. QCXM201910), and the Postdoctoral Science Foundation of China (2021M690338).
Abstract: Traditional linear statistical methods cannot provide effective predictions because of the complexity of the human mind. In this paper, we apply machine learning to funding allocation decision making and explore two questions: whether the personal characteristics of evaluators help predict the outcome of an evaluation decision, and how to improve the accuracy of machine learning methods on the imbalanced dataset of grant funding. Because funding data is characterized by an imbalanced class distribution, we propose a slacked weighted entropy decision tree (SWE-DT), which assigns a weight to each class with the help of a slacked factor. The experimental results show that the SWE decision tree performs well, with a sensitivity of 0.87, a specificity of 0.85 and an average accuracy of 0.75. It also provides satisfactory classification performance, with an Area Under Curve (AUC) of 0.87. This implies that the proposed method classifies minority class instances accurately and is suitable for imbalanced datasets. Adding evaluator factors to the model improves sensitivity by over 9%, specificity by nearly 8% and average accuracy by 7%, which demonstrates the feasibility of using evaluators' characteristics as predictors. By using machine learning to predict evaluation decisions from the personal characteristics of evaluators, this work enriches the literature on decision making and machine learning.
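As a rough illustration of the idea behind class-weighted entropy and of the sensitivity and specificity metrics reported in this abstract, the following minimal Python sketch may help. It is not the authors' SWE-DT: the abstract does not define the slacked factor, so the `class_weights` mapping here is a hypothetical stand-in supplied by the caller, and the function names are illustrative only.

```python
import math
from collections import Counter


def weighted_entropy(labels, class_weights):
    """Shannon entropy (base 2) of a label list after scaling each
    class count by a caller-supplied weight.

    ASSUMPTION: class_weights is a hypothetical stand-in for the
    per-class weights that the paper derives from its slacked factor;
    classes missing from the mapping default to weight 1.0.
    """
    counts = Counter(labels)
    weighted = {c: n * class_weights.get(c, 1.0) for c, n in counts.items()}
    total = sum(weighted.values())
    entropy = 0.0
    for w in weighted.values():
        p = w / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary predictions.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A ratio with a zero denominator is returned as NaN.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data: 1 funded (minority) vs. 9 unfunded applications.
labels = [1] + [0] * 9
plain = weighted_entropy(labels, {})                    # unweighted entropy
balanced = weighted_entropy(labels, {1: 9.0, 0: 1.0})   # minority upweighted
```

In this toy example, upweighting the minority class by 9 makes the two classes contribute equally, so a split criterion built on this entropy would no longer treat the node as nearly pure.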
Funding: J. Yao would like to thank the Program of Hainan Association for Science and Technology Plans to Youth R&D Innovation [QCXM201910], the Scientific Research Setup Fund of Hainan University [KYQD(ZR)1837] and the National Natural Science Foundation of China [61802092] for their support. G. Hu would like to thank the Fundamental Research Project of Shenzhen Municipality [JCYJ20170817115335418] for its support.
Abstract: Owing to their outstanding ability to process large volumes of high-dimensional data, machine learning models have been used in many applications, such as pattern recognition, classification, spam filtering, data mining and forecasting. As a well-established machine learning algorithm, K-Nearest Neighbor (KNN) has been widely used in different settings, yet its use in selecting qualified applicants for funding is almost new. The major problem lies in how to accurately determine the importance of attributes. In this paper, we propose a Feature-weighted Gradient Descent K-Nearest Neighbor (FGDKNN) method to classify funding applicants into two types: approved or not approved. FGDKNN is based on a gradient descent learning algorithm that updates the weights. It updates the weight of labels by iteratively minimizing the error ratio, so that the importance of attributes can be described better. We investigate the performance of FGDKNN on data from Beijing Innofund. The results show that FGDKNN performs about 23%, 20%, 18% and 15% better than KNN, SVM, DT and ANN, respectively. Moreover, FGDKNN converges quickly under different training scales and performs well under different settings.
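To make the role of attribute weights in a nearest-neighbor classifier concrete, here is a minimal Python sketch of KNN with a feature-weighted Euclidean distance. It is an illustration under stated assumptions, not the FGDKNN algorithm itself: the abstract does not give the gradient descent update rule, so the weights are simply passed in, and the names `weighted_distance` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance in which each squared feature difference is
    scaled by a non-negative weight (a weight of 0 ignores the feature)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k training points closest to `query`
    under the feature-weighted distance.

    ASSUMPTION: `weights` is supplied by the caller; the paper learns
    such weights iteratively, and that learning step is not reproduced.
    """
    order = sorted(
        range(len(train_X)),
        key=lambda i: weighted_distance(train_X[i], query, weights),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data: two applicants described by two attributes.
train_X = [(0.0, 1.0), (1.0, 9.0)]
train_y = [0, 1]            # 0 = not approved, 1 = approved
query = (0.9, 1.0)

by_first = knn_predict(train_X, train_y, query, weights=(1.0, 0.0), k=1)
by_second = knn_predict(train_X, train_y, query, weights=(0.0, 1.0), k=1)
```

The same query receives different labels depending on which attribute carries the weight, which is why determining attribute importance accurately matters for this kind of classifier.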