Abstract
How to target Internet advertisements precisely in a big-data environment has long been a central problem in computational advertising. As an important measure of online advertising effectiveness, accurate prediction of click-through rate (CTR) affects the interests of publishers, users, and advertisers alike. Current mainstream methods extract features and build a single CTR prediction model; their weakness is that using a single weight to measure a feature's effect on CTR is too one-sided. Following the idea of divide and conquer, this study proposes a hybrid model based on user similarity and feature differentiation. The model first evaluates user similarity with a Gaussian mixture distribution and partitions users into several groups. A sub-model is then built for each group, and the sub-models are combined, so as to capture the differing effects of the same feature on different groups and thereby predict ad clicks more accurately. Experiments on an advertising dataset from a real Internet company, together with a detailed comparison against mainstream methods, verify the effectiveness of the approach.
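The pipeline the abstract describes (soft user grouping, one sub-model per group, then a combination) can be illustrated with a minimal sketch. The abstract does not specify how the sub-models are combined or what form they take, so this sketch assumes logistic sub-models blended by the Gaussian mixture's posterior group probabilities. All parameter values, the one-dimensional user feature, and the function names are hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian (user features are 1-D here only for brevity)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(user_feature, components):
    """Posterior probability that a user belongs to each group.

    components: list of (mixing_weight, mean, variance) tuples, assumed to
    have been fitted beforehand (e.g. by EM); fitting is not shown here.
    """
    scores = [w * gaussian_pdf(user_feature, m, v) for w, m, v in components]
    total = sum(scores)
    return [s / total for s in scores]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_ctr(user_feature, ad_features, components, group_weights):
    """Blend per-group logistic sub-models by the user's group responsibilities.

    group_weights[k] holds the feature weights of the sub-model for group k,
    so the same ad feature can carry a different weight in each group.
    """
    resp = responsibilities(user_feature, components)
    ctr = 0.0
    for r, weights in zip(resp, group_weights):
        z = sum(w * x for w, x in zip(weights, ad_features))
        ctr += r * sigmoid(z)
    return ctr

# Hypothetical two-group setup: the first ad feature raises CTR for
# group 0 and lowers it for group 1.
components = [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)]
group_weights = [[1.0, -0.5], [-1.0, 0.5]]
ad = [1.0, 0.0]

print(predict_ctr(0.0, ad, components, group_weights))  # user near group 0
print(predict_ctr(5.0, ad, components, group_weights))  # user near group 1
```

With a single global weight, the first ad feature would have to push CTR in one direction for every user; here its effect flips between the two groups, which is the kind of feature differentiation the abstract argues for.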
Authors
PAN Shu-min (潘书敏)
YAN Na (颜娜)
XIE Jin-kui (谢瑾奎)
Department of Computer Science and Technology, East China Normal University, Shanghai 200241, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2017, Issue 2, pp. 283-289 (7 pages)
Keywords
Computational advertising
CTR prediction
User similarity
Feature differentiation
Hybrid model