摘要
针对当前图摘要方法压缩率较高,图压缩算法无法直接被用于下游任务分析的问题,提出一种图摘要与图压缩的融合算法,即基于节点相似性分组与图压缩的图摘要算法(GSNSC)。首先,初始化节点为超节点,并根据相似度对超节点分组;其次,将每个组的超节点合并,直到达到指定次数或指定节点数;再次,在超节点之间添加超边和校正边以恢复原始图;最后,对于图压缩部分,判断对每个超节点的邻接边压缩和摘要的代价,并选择二者中代价较小的执行。在Web-NotreDame、Web-Google和Web-Berkstan等6个数据集上进行了图压缩率和图查询实验。实验结果表明,在6个数据集上,与SLUGGER(Scalable Lossless sUmmarization of Graphs with HiERarchy)算法相比,所提算法的压缩率至少降低了23个百分点;与SWeG(Summarization of Web-scale Graphs)算法相比,所提算法的压缩率至少降低了13个百分点;在Web-NotreDame数据集上,所提算法的度误差比SWeG降低了41.6%。以上验证了所提算法具有更好的图压缩率和图查询准确度。
To solve the problem that the current graph summarization methods have high compression ratios and the graph compression algorithms cannot be directly used in downstream tasks,a fusion algorithm of graph summarization and graph compression was proposed,which called Graph Summarization algorithm based on Node Similarity grouping and graph Compression(GSNSC).Firstly,the nodes were initialized as super nodes,and the super nodes were grouped according to the similarity.Secondly,the super nodes of each group were merged until the specified number of times or nodes were reached.Thirdly,super edges and corrected edges were added between the super nodes for reconstructing the original graph.Finally,for the graph compression part,the cost of compressing and summarizing the adjacent edges of each super node were judged,and the less expensive one in these two was selected to execute.Experiments of graph compression ratio and graph query were conducted on six datasets such as Web-NotreDame,Web-Google and Web-Berkstan.Experimental results on six datasets show that,the proposed algorithm has the compression ratio reduced by at least 23 percentage points compared with SLUGGER(Scalable Lossless sUmmarization of Graphs with HiERarchy)algorithm,and the compression ratio decreased by at least 13 percentage points compared with SWeG(Summarization of Web-scale Graphs)algorithm.Experimental results on Web-NotreDame dataset show that the degree error of the proposed algorithm is reduced by 41.6%compared with that of SWeG algorithm.The above verifies that the proposed algorithm has better graph compression ratio and graph query accuracy.
作者
宏宇
陈鸿昶
张建朋
黄瑞阳
HONG Yu;CHEN Hongchang;ZHANG Jianpeng;HUANG Ruiyang(Cyberspace Security College,Zhengzhou University,Zhengzhou Henan 450002,China;Institute of Information Technology,Information Engineering University,Zhengzhou Henan 450002,China)
出处
《计算机应用》
CSCD
北大核心
2023年第10期3047-3053,共7页
journal of Computer Applications
基金
国家自然科学基金资助项目(62002384)
中国博士后科学基金资助项目(2020M683760)。
关键词
图摘要
图压缩
图查询
超边
最小描述长度
graph summarization
graph compression
graph query
super edge
Minimum Description Length(MDL)