Abstract
Network spammers (网络水军, paid posters) greatly degrade the quality of online information and interfere with the channels through which people obtain ordinary online information. Based on collected user information, this paper analyzes the behavior patterns of network spammers, studies their characteristics, and extracts several spammer feature indicators. The weight of each indicator is determined by the entropy method, a topic recognition model is used to reduce the dimensionality of the user features, and the two are combined with the multi-index comprehensive index method to build an automatic recognition model for network spammers that uses data mining to find abnormal users. Experimental results show that the model achieves a precision of 82.4% and a recall of 88.6%.
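The abstract names the entropy method and the multi-index comprehensive index method but does not give formulas. The sketch below shows the textbook form of entropy weighting followed by a weighted composite score. It is not the paper's implementation: the three indicators (posts per day, duplicate-content ratio, and a constant column) and all sample values are hypothetical, and the topic-model dimensionality reduction step is omitted.

```python
import math


def min_max_normalize(rows):
    """Scale each indicator (column) to [0, 1]; constant columns become all zeros."""
    normalized_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        normalized_cols.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*normalized_cols)]


def entropy_weights(norm_rows):
    """Textbook entropy weighting: indicators with lower entropy get higher weight."""
    n = len(norm_rows)
    if n < 2:
        raise ValueError("at least two users are needed to compute entropy")
    divergences = []
    for col in zip(*norm_rows):
        total = sum(col)
        if total == 0:
            # A constant indicator carries no information, so its weight is zero.
            divergences.append(0.0)
            continue
        probs = [v / total for v in col]
        # Entropy scaled to [0, 1]; terms with p == 0 contribute nothing.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_div = sum(divergences)
    if total_div == 0:
        return [1.0 / len(divergences)] * len(divergences)
    return [d / total_div for d in divergences]


def composite_scores(norm_rows, weights):
    """Weighted sum of normalized indicators, one score per user."""
    return [sum(w * v for w, v in zip(weights, row)) for row in norm_rows]


# Hypothetical data: one row per user, columns are
# (posts per day, duplicate-content ratio, constant indicator).
users = [
    [2, 0.10, 5],
    [3, 0.15, 5],
    [40, 0.90, 5],
    [4, 0.20, 5],
]

normalized = min_max_normalize(users)
weights = entropy_weights(normalized)
scores = composite_scores(normalized, weights)
most_suspicious = max(range(len(scores)), key=scores.__getitem__)
```

In this sketch a higher composite score marks a user as more anomalous. The cutoff that separates spammers from ordinary users is not stated in the abstract and would have to be chosen on labeled data.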
Source
Laser Journal (《激光杂志》)
Peking University Core Journal (北大核心)
2016, No. 12, pp. 110-113 (4 pages)
Funding
China Postdoctoral Science Foundation (2012M510110)
Henan Province Key Science and Technology Research Project (132102310003)
Keywords
network spammer
topic recognition
entropy method
feature selection
data mining