Abstract
Automatic license plate recognition (ALPR) data contain a fair number of outliers that do not reflect usual traffic conditions and that distort measurements of travel time variability. The valid data, which represent usual traffic conditions, are drawn from multiple populations, so their probability distribution is multimodal and skewed, and a mixture with a fixed number of component distributions rarely fits them accurately. This in turn makes it hard to identify the outliers, whose distribution has a long right tail. The paper uses a K-component mixture model based on the log-normal distribution, determining the number of components K dynamically in order to separate the two kinds of data and to obtain the best fit to the distribution of the valid data. On taxi and private-car sample data, the algorithm identifies outliers well and measures the travel time variability of the two travel modes accurately. The experiments show that outliers markedly distort the statistics used to measure travel time variability and that, if they are not filtered out, travelers may misjudge their travel decisions.
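The abstract does not say how K is chosen or how outliers are separated from valid data, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a log-normal mixture can be fitted as a Gaussian mixture on log travel times using EM, that K is picked by minimizing BIC, and that an observation is an outlier when it lies more than three standard deviations (in log space) above the slowest component carrying at least 5% of the weight. The function names, the BIC criterion, and both thresholds are hypothetical choices made for illustration.

```python
import math


def _pdf(v, mu, var):
    """Normal density; applied to log travel times it models a log-normal."""
    return math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_mixture(logs, k, iters=200):
    """EM for a k-component 1-D Gaussian mixture on log travel times."""
    x = sorted(logs)
    n = len(x)
    mean = sum(x) / n
    var0 = max(sum((v - mean) ** 2 for v in x) / n, 1e-4)
    w = [1.0 / k] * k
    mu = [x[int((j + 0.5) * n / k)] for j in range(k)]  # start at quantiles
    var = [var0 / k] * k
    for _ in range(iters):
        sw, sx, sxx = [0.0] * k, [0.0] * k, [0.0] * k
        for v in x:
            # E-step: responsibility of each component for this point
            p = [w[j] * _pdf(v, mu[j], var[j]) for j in range(k)]
            s = sum(p) or 1e-300
            for j in range(k):
                r = p[j] / s
                sw[j] += r
                sx[j] += r * v
                sxx[j] += r * v * v
        for j in range(k):
            # M-step: re-estimate weight, mean, and (floored) variance
            w[j] = sw[j] / n
            if sw[j] > 1e-12:
                mu[j] = sx[j] / sw[j]
                var[j] = max(sxx[j] / sw[j] - mu[j] ** 2, 1e-4)
    return w, mu, var


def log_likelihood(logs, model):
    w, mu, var = model
    total = 0.0
    for v in logs:
        dens = sum(w[j] * _pdf(v, mu[j], var[j]) for j in range(len(w)))
        total += math.log(max(dens, 1e-300))
    return total


def select_k(times, k_max=4):
    """Pick K by minimum BIC (an assumed criterion, not from the paper)."""
    logs = [math.log(t) for t in times]
    best = None
    for k in range(1, k_max + 1):
        model = fit_mixture(logs, k)
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        bic = -2 * log_likelihood(logs, model) + n_params * math.log(len(logs))
        if best is None or bic < best[0]:
            best = (bic, k, model)
    return best[1], best[2]


def flag_outliers(times, model, min_weight=0.05, n_sigma=3.0):
    """Hypothetical rule: flag times beyond the slowest substantial component."""
    w, mu, var = model
    main = [j for j in range(len(w)) if w[j] >= min_weight] or list(range(len(w)))
    slow = max(main, key=lambda j: mu[j])
    cutoff = math.exp(mu[slow] + n_sigma * math.sqrt(var[slow]))
    return [t for t in times if t > cutoff]
```

With travel times in seconds, `k, model = select_k(times)` followed by `flag_outliers(times, model)` returns the observations treated as outliers under these assumptions. Any variability statistic would then be computed on the remaining data.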
Authors
Wang Zhaoyue (王召月)
Yuan Shaoxin (袁绍欣)
School of Information Engineering, Chang'an University, Xi'an 710064, Shaanxi, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2019, No. 12, pp. 232-238, 255 (8 pages)
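Funding
National Natural Science Foundation of China (61703054)
Program of Introducing Talents of Discipline to Universities (111 Project) (B14043)
Fundamental Research Funds for the Central Universities (300102248204)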