Abstract
Accurate, timely, and efficient acquisition of agricultural data is the prerequisite and foundation for information analysis and early warning across the whole agricultural industry chain, and it is key to making that work more professional and standardized. Addressing the large volume of agricultural information available on the Internet, and taking maize price data as an example, this study designed strategies for data crawling and normalized storage. First, requests and responses for the target web pages were set up with the Scrapy framework; after the page layout was analyzed, the key information was crawled in a loop, and regular expressions were used to extract the crawled content into formatted data. The data were then stored locally as a Microsoft Excel spreadsheet or written to a database. Finally, Echarts was used to visualize the data on the Web, thereby realizing the mining and utilization of agricultural network data.
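The extraction and storage steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the HTML fragment, its values, the table layout, and the names `ROW_PATTERN`, `extract_records`, `store_records`, and `maize_price` are all assumptions made for illustration, and the standard-library `sqlite3` module stands in for whichever database the study used. The crawling step (Scrapy) and the visualization step (Echarts) are left out.

```python
import re
import sqlite3

# Made-up HTML fragment standing in for a crawled page body.
# The real page layout and the price values are NOT given in the source.
SAMPLE_HTML = """
<tr><td>2017-11-01</td><td>Market A</td><td>1.72 yuan/kg</td></tr>
<tr><td>2017-11-02</td><td>Market B</td><td>1.75 yuan/kg</td></tr>
<tr><td>n/a</td><td>Market C</td><td>--</td></tr>
"""

# Hypothetical row pattern: ISO date, market name, numeric price in yuan/kg.
ROW_PATTERN = re.compile(
    r"<tr><td>(\d{4}-\d{2}-\d{2})</td>"
    r"<td>([^<]+)</td>"
    r"<td>(\d+(?:\.\d+)?)\s*yuan/kg</td></tr>"
)


def extract_records(html):
    """Return (date, market, price) tuples for rows that match ROW_PATTERN.

    Rows that do not match (for example, missing dates or prices) are skipped.
    """
    return [
        (date, market.strip(), float(price))
        for date, market, price in ROW_PATTERN.findall(html)
    ]


def store_records(conn, records):
    """Insert records into a maize_price table, ignoring exact repeats."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS maize_price ("
        "date TEXT, market TEXT, price_yuan_per_kg REAL, "
        "UNIQUE(date, market))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO maize_price VALUES (?, ?, ?)", records
    )
    conn.commit()


def main():
    records = extract_records(SAMPLE_HTML)
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    store_records(conn, records)
    for row in conn.execute("SELECT * FROM maize_price ORDER BY date"):
        print(row)
    conn.close()


if __name__ == "__main__":
    main()
```

The `UNIQUE(date, market)` constraint together with `INSERT OR IGNORE` is one possible way to keep repeated crawls from duplicating rows; whether the original system deduplicated in this way is not stated in the source.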
Source
Shandong Agricultural Sciences (《山东农业科学》)
2018, No. 1, pp. 142-147 (6 pages)
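Funding
Youth Research Fund of the Shandong Academy of Agricultural Sciences (2016YQN47)
Agricultural Science and Technology Innovation Project of the Shandong Academy of Agricultural Sciences (CXGC2016B15)
Shandong Province Major Applied Technology Innovation Project "Research and Application of an Internet of Things-Based Big Data Platform for Protected Vegetables"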