Abstract
To obtain location data from Sina Weibo posts, this paper proposes a Python-based method for acquiring Sina Weibo location data and, following this method, implements a program that collects such data. The program uses simulated login, web page parsing, and keyword matching to obtain the required post text, user information, and post location data. Experiments show that the program can collect location and other Sina Weibo data for a specific area, with an adjustable collection speed, which makes follow-up data mining research on microblogs possible.
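The parsing, keyword-matching, and adjustable-speed steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the HTML markup, the class names (`post`, `user`, `text`, `place`), and the helper names `parse_posts`, `filter_by_region`, and `crawl` are all hypothetical, the sample page is an in-memory string rather than a fetched Weibo page, and simulated login is left out entirely.

```python
import re
import time
from dataclasses import dataclass

# Hypothetical markup for illustration only; real Weibo pages are
# structured differently and change over time.
SAMPLE_PAGE = """
<div class="post"><a class="user">user_a</a><p class="text">Morning run</p><span class="place">Ganzhou, Zhanggong District</span></div>
<div class="post"><a class="user">user_b</a><p class="text">No location here</p></div>
<div class="post"><a class="user">user_c</a><p class="text">Dinner</p><span class="place">Nanchang, Donghu District</span></div>
"""

# One pattern per post block, plus one pattern per field inside a block.
POST_RE = re.compile(r'<div class="post">(.*?)</div>', re.S)
FIELD_RE = {
    "user": re.compile(r'<a class="user">(.*?)</a>'),
    "text": re.compile(r'<p class="text">(.*?)</p>'),
    "place": re.compile(r'<span class="place">(.*?)</span>'),
}


@dataclass
class Post:
    user: str
    text: str
    place: str  # empty string when the post carries no location


def parse_posts(html: str) -> list[Post]:
    """Extract user, text, and place from every post block in the page."""
    posts = []
    for block in POST_RE.findall(html):
        fields = {}
        for name, pattern in FIELD_RE.items():
            match = pattern.search(block)
            fields[name] = match.group(1) if match else ""
        posts.append(Post(**fields))
    return posts


def filter_by_region(posts: list[Post], keyword: str) -> list[Post]:
    """Keep only posts whose place string contains the region keyword."""
    return [p for p in posts if keyword in p.place]


def crawl(pages: list[str], keyword: str, delay_seconds: float = 1.0) -> list[Post]:
    """Process pages in order, pausing between pages to control speed."""
    results = []
    for index, html in enumerate(pages):
        if index > 0:
            time.sleep(delay_seconds)  # adjustable collection speed
        results.extend(filter_by_region(parse_posts(html), keyword))
    return results
```

Calling `crawl([SAMPLE_PAGE], "Ganzhou")` keeps only the post by `user_a`, and a larger `delay_seconds` slows collection. A real collector would more likely use a proper HTML parser than regular expressions, which are used here only to keep the sketch dependency-free.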
Authors
杜翔
蔡燕
兰小机
DU Xiang; CAI Yan; LAN Xiaoji (1. West Campus Management Committee, Jiangxi University of Science and Technology, Ganzhou 341000, China; 2. School of Architectural and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Source
《江西理工大学学报》
CAS
2018, No. 5, pp. 90-96 (7 pages)
Journal of Jiangxi University of Science and Technology
Funding
National Natural Science Foundation of China (Grant No. 41561085)