Trajectory Big Data Processing Based on Frequent Activity (Cited by: 10)

Abstract: With the rapid development and wide use of the Global Positioning System in devices such as smart phones and touch pads, many people share their personal experience through their trajectories while visiting places of interest. Trajectory query processing has therefore emerged in recent years to help users find their best trajectories. However, given the huge number of trajectory points and text descriptions, such as the activities practiced by users at these points, organizing these data in an index becomes tedious, and a parallel method becomes indispensable. In this paper, we investigate the problem of distributed trajectory query processing based on distance and frequent activities. The query is specified by the start and final points of the trajectory, a distance threshold, and a set of frequent activities involved in the points of interest of the trajectory. As a result, the query returns the shortest trajectory that includes the most frequent activities with high support and high confidence. To simplify query processing, we have implemented the Distributed Mining Trajectory R-Tree index (DMTR-Tree). For this method, we first manage the large trajectory dataset in distributed R-Tree indexes. Then, for each index, we apply the Apriori frequent itemset algorithm to each point to select the frequent activity set. To speed up these algorithms, we use the Apache Spark cluster computing framework with MapReduce as the programming model. The experimental results show that the DMTR-Tree index and the query-processing algorithm are efficient and scalable.
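The frequent-activity step mentioned in the abstract can be illustrated with a minimal, single-machine sketch of Apriori. This is not the authors' DMTR-Tree or Spark implementation; the function names `apriori` and `confidence`, the input format (one set of activity labels per visit to a point), and the sample data are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    transactions: list of sets of activity labels (hypothetical input format).
    min_support: fraction in (0, 1].
    """
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single activities.
    current = {}
    for item in {item for t in transactions for item in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return frequent[both] / frequent[a]


# Hypothetical activity sets recorded by different visitors at one point.
visits = [
    {"coffee", "shopping"},
    {"coffee", "shopping", "museum"},
    {"coffee", "museum"},
    {"shopping"},
]
result = apriori(visits, min_support=0.5)
rule_conf = confidence(result, {"coffee"}, {"shopping"})
```

In a distributed setting such as the one the abstract describes, the support counts would be computed per partition and then aggregated; the sketch keeps everything in memory to show only the join, prune, and count steps.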

Authors: Amina Belhassena, Hongzhi Wang

Affiliation: School of Computer Science and Technology

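Source: Tsinghua Science and Technology (indexed in SCIE, EI, CAS, CSCD), 2019, Issue 3, pp. 317-332 (16 pages). Chinese title: Journal of Tsinghua University (Natural Science Edition, English Edition)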

Funding: Partially supported by the National Natural Science Foundation of China (Nos. U1509216 and 61472099); the National Sci-Tech Support Plan (No. 2015BAH10F01); the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province (No. LC2016026); and the MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology

Keywords: distributed R-tree; trajectory; frequent activity; query

Classification: N [General Natural Science]

