Abstract
Meteorological data now grows at a rate of terabytes per hour. Traditional relational databases and file storage systems struggle to store and manage data at this scale, so applications built on large-scale heterogeneous meteorological data cannot scale, and researchers cannot explore the data efficiently. To address these problems, researchers have applied distributed computing and storage technologies based on frameworks such as MapReduce and HBase separately to meteorological applications; however, to the best of our knowledge, the two have not yet been studied in combination. Therefore, using the large volume of Doppler weather radar data accumulated in recent years, a 3D storm tracking method based on the combination of MapReduce and HBase was studied. In addition, a set of distributed service interfaces for point, line, surface, and volume access to radar data was implemented on top of the conventional standardized REST interface. Compared with a conventional single-machine data storage and access interface based on standardized REST, the proposed method improves efficiency by about 100%. Finally, a retrospective storm tracking run over three years of radar data (2007 to 2009) from the Pearl River Delta region further validated the performance advantages of the proposed method in computation and storage management.
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2017, No. 4, pp. 941-944 (4 pages)
Keywords
distributed computing framework
storm tracking method
long time series analysis