Abstract
With the advent of the big data era, data has become critically important, yet acquiring data remains a persistent difficulty in data mining. Mature social networks make data easier to obtain, but acquisition methods still need study. By analyzing how information is stored in social networks, this paper constructs a model for acquiring sensitive data from social networks. The model obtains a user's gender, date of birth, location, and similar information from the user's personal profile, analyzes the user's interests through browsing records, and finally uses the friend list to obtain the sensitive data of users across the user's entire social network. A web crawler written in Python collects the data. Taking Sina Weibo as an example, the paper studies the acquisition rate of users' sensitive data. The experiment found that occupation had the lowest acquisition rate among all the data collected, while the acquisition rates of the other information were higher.
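The friend-list traversal and the per-field acquisition rate described in the abstract can be illustrated with a minimal sketch. This is not the paper's crawler: the in-memory `PROFILES` data, the `fetch_profile` helper, the field names, and the definition of acquisition rate (the share of crawled users for whom a field was obtained) are all assumptions made for illustration. No network access is involved, and the browsing-record interest analysis is left out.

```python
from collections import deque

# Hypothetical toy data standing in for parsed profile pages (assumption).
# None marks a field the user left blank or hid.
PROFILES = {
    "u1": {"gender": "F", "birth_date": "1990-01-01", "location": "Fuzhou",
           "occupation": None, "friends": ["u2", "u3"]},
    "u2": {"gender": "M", "birth_date": None, "location": "Xiamen",
           "occupation": "engineer", "friends": ["u1", "u3"]},
    "u3": {"gender": "F", "birth_date": "1988-05-12", "location": "Beijing",
           "occupation": None, "friends": ["u4"]},
    "u4": {"gender": "M", "birth_date": "1995-07-30", "location": None,
           "occupation": None, "friends": []},
}
FIELDS = ("gender", "birth_date", "location", "occupation")


def fetch_profile(user_id):
    """Stand-in for the crawler's fetch-and-parse step (hypothetical helper)."""
    return PROFILES.get(user_id)


def crawl(seed, max_users=100):
    """Breadth-first traversal of the friend graph starting from one user."""
    seen = {seed}
    queue = deque([seed])
    collected = {}
    while queue and len(collected) < max_users:
        uid = queue.popleft()
        profile = fetch_profile(uid)
        if profile is None:
            continue  # profile unavailable; skip this user
        collected[uid] = {field: profile.get(field) for field in FIELDS}
        for friend in profile.get("friends", []):
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return collected


def acquisition_rates(collected):
    """Share of crawled users for whom each field was obtained (assumed definition)."""
    total = len(collected)
    if total == 0:
        return {field: 0.0 for field in FIELDS}
    return {
        field: sum(1 for rec in collected.values() if rec[field] is not None) / total
        for field in FIELDS
    }


records = crawl("u1")
rates = acquisition_rates(records)
for field, rate in rates.items():
    print(f"{field}: {rate:.2f}")
```

The visited set keeps mutual friendships from being crawled twice, and `max_users` bounds the traversal. The toy data is arranged so that occupation is the least often filled field, mirroring the direction of the reported finding, not its actual figures.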
Source
《软件导刊》
2018, No. 3, pp. 56-58 (3 pages)
Software Guide
Funding
Basic Scientific Research Project for Fujian Provincial Public-Welfare Research Institutes (2015R1008-5)