Abstract
Motion estimation (ME) is the most complex and time-consuming stage of video coding. To analyze and optimize its performance, this paper extracts standalone ME algorithm modules from several popular open-source video encoders and builds a set of program inputs that vary in video resolution and video content, which together form a complete benchmark suite. A profiling tool based on hardware performance counters is used to quantitatively analyze the runtime efficiency and microarchitectural characteristics of these algorithms, exposing their performance differences on current mainstream processor architectures. The experimental results show that ME takes the longest on complex and high-resolution video, and that the instruction-level parallelism (ILP) of most algorithms is low and differs little among them. The last-level cache (LLC) miss rate and branch misprediction rate of the algorithms are both low, below 0.01% and 7% respectively.
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
2014, Issue 4, pp. 295-300, 304 (7 pages)
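Funding
National Natural Science Foundation of China (60970023)
National Basic Research Program of China (973 Program) (2011CB302501)
National High Technology Research and Development Program of China (863 Program) (2012AA010902, 2012AA010901)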
Keywords
video coding
motion estimation (ME)
diamond search
hexagon search
video content
resolution
microarchitecture