
Confidence Interval of AUC Measure Based on K-Fold Cross-Validated Beta Distribution (cited by: 13)
Abstract In statistical machine learning research, the AUC (Area Under the ROC Curve) measure based on K-fold cross-validation is often used to evaluate the performance of classification algorithms. A point estimate, however, carries no information about variance. For this reason, a general-purpose symmetric confidence interval (interval estimate) for the AUC measure, constructed from the K-fold cross-validated t distribution under a normality assumption, has been proposed. These symmetric confidence intervals often exhibit a low degree of confidence or a long interval length, which can easily lead to liberal statistical inference. A theoretical analysis of the AUC measure shows that its true distribution is in fact asymmetric, so approximating it with a symmetric distribution is inappropriate. For the two-class classification problem, this paper therefore proposes a new asymmetric confidence interval for the AUC measure based on the K-fold cross-validated Beta distribution. Experiments on simulated and real data show that the proposed confidence interval outperforms the traditional symmetric confidence interval based on the K-fold cross-validated t distribution.
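The contrast between a symmetric t-based interval and an asymmetric Beta-based interval can be illustrated with a minimal sketch. The abstract does not give the paper's actual construction, so everything here is an assumption made for illustration: the fold AUC values are hypothetical, the t interval is the textbook mean ± t·s/√K form, and the Beta interval is obtained by matching a Beta distribution to the mean and to the squared standard error of the fold AUCs (method of moments), which is not necessarily the authors' method. The function names `t_interval`, `beta_interval`, and `quantile` are likewise hypothetical.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def beta_pdf(x, a, b):
    """Density of Beta(a, b) for 0 < x < 1, computed in log space."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def quantile(pdf, lo, hi, p, n=20000):
    """Approximate the p-quantile (0 < p < 1) of a density on [lo, hi]."""
    h = (hi - lo) / n
    # Midpoint rule: probability mass of each of the n cells.
    mass = [pdf(lo + (i + 0.5) * h) * h for i in range(n)]
    target = p * sum(mass)  # renormalise to absorb discretisation error
    acc = 0.0
    for i, m in enumerate(mass):
        if acc + m >= target:
            # Interpolate linearly inside the cell that crosses the target.
            return lo + (i + (target - acc) / m) * h
        acc += m
    return hi


def t_interval(aucs, level=0.95):
    """Symmetric interval: mean +/- t quantile * s / sqrt(K)."""
    k = len(aucs)
    mean = statistics.mean(aucs)
    se = statistics.stdev(aucs) / math.sqrt(k)
    # The t density is truncated to [-60, 60]; the omitted tail mass is negligible here.
    q = quantile(lambda x: t_pdf(x, k - 1), -60.0, 60.0, 0.5 + level / 2)
    return mean - q * se, mean + q * se


def beta_interval(aucs, level=0.95):
    """Asymmetric interval from a moment-matched Beta (illustrative assumption)."""
    k = len(aucs)
    mean = statistics.mean(aucs)
    var = statistics.variance(aucs) / k  # squared standard error of the mean
    if var <= 0 or not 0 < mean < 1:
        raise ValueError("need 0 < mean < 1 and a positive variance")
    common = mean * (1 - mean) / var - 1
    if common <= 0:
        raise ValueError("variance too large for a Beta moment match")
    a, b = mean * common, (1 - mean) * common
    alpha = 1 - level
    lower = quantile(lambda x: beta_pdf(x, a, b), 0.0, 1.0, alpha / 2)
    upper = quantile(lambda x: beta_pdf(x, a, b), 0.0, 1.0, 1 - alpha / 2)
    return lower, upper


# Hypothetical AUC values from a 10-fold cross-validation run.
fold_aucs = [0.91, 0.95, 0.88, 0.97, 0.93, 0.96, 0.90, 0.94, 0.98, 0.92]
print("t interval:   ", t_interval(fold_aucs))
print("Beta interval:", beta_interval(fold_aucs))
```

Because the Beta distribution is supported on [0, 1], the second interval can never extend past 1, and for a mean AUC above 0.5 its lower arm is longer than its upper arm. The t interval is symmetric about the mean by construction and can cross 1 when the fold AUCs are high and variable.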
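Authors 王钰 (WANG Yu); 赵晓艳 (ZHAO Xiaoyan); 杨杏丽 (YANG Xingli); 李济洪 (LI Jihong) (School of Modern Educational Technology, Shanxi University, Taiyuan 030006; School of Mathematical Sciences, Shanxi University, Taiyuan 030006; School of Software, Shanxi University, Taiyuan 030006)
Source Journal of Systems Science and Mathematical Sciences (《系统科学与数学》), CSCD, Peking University Core Journal, 2020, No. 9, pp. 1564-1577 (14 pages)
Funding Supported by the Applied Basic Research Program of Shanxi Province (201801D211002, 201901D111034), the National Statistical Science Research Project (2017LY04), the National Natural Science Foundation of China (61806115), and the Open Research Fund of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science, Ministry of Education (KLATASDS2007)
Keywords AUC measure; confidence interval; Beta distribution; K-fold cross-validation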