Abstract
In recent years, network representation learning (NRL) has attracted growing attention as an effective way to analyze heterogeneous information networks (HINs) by representing nodes in a low-dimensional space. Random-walk-based methods are currently the most common approach to network representation learning, but most of them rely on shallow neural networks and therefore struggle to capture the structural information of heterogeneous networks. The graph convolutional network (GCN) is a popular method for deep learning on graphs that makes better use of network topology, but existing GCN designs target homogeneous information networks and ignore the rich semantic information in the network. To mine both the semantic information and the highly nonlinear structural information in heterogeneous information networks, and thereby improve the quality of the learned representation, this paper proposes MG2vec, a heterogeneous network representation learning algorithm based on graph convolution with fused meta-paths. The algorithm first obtains the rich semantic information in a heterogeneous information network through a meta-path-based relevance measure. It then applies a graph convolutional network for deep learning to capture the features of nodes and their neighbors, compensating for the limited ability of shallow models to capture network structure, so that semantic and structural information are both better integrated into the low-dimensional node representations. Experiments are carried out on the DBLP and IMDB datasets. Compared with DeepWalk, node2vec, and Metapath2vec, MG2vec achieves higher precision and better performance on multi-label classification tasks: precision and Macro-F1 reach 94.49% and 94.16%, respectively, which are up to 26.05% and 28.73% higher than DeepWalk. The experimental results show that MG2vec outperforms classical network representation learning algorithms and yields a better representation of heterogeneous information networks.
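The two stages named in the abstract (a meta-path-based relevance measure, followed by graph convolution) can be illustrated with a minimal NumPy sketch. This is not the MG2vec implementation. The abstract does not say which relevance measure is used or how meta-paths are fused, so the sketch assumes a PathSim-style measure over a single author-paper-author meta-path and the widely used symmetric-normalized GCN propagation rule. The function names `pathsim` and `gcn_layer`, the toy matrix `author_paper`, and the random weights are all hypothetical.

```python
import numpy as np


def pathsim(commuting):
    """PathSim-style relevance (an assumed measure, not confirmed by the abstract).

    `commuting[i, j]` counts meta-path instances between nodes i and j.
    The score is 2 * M[i, j] / (M[i, i] + M[j, j]), or 0 when the
    denominator is 0.
    """
    commuting = np.asarray(commuting, dtype=float)
    diag = np.diag(commuting)
    denom = diag[:, None] + diag[None, :]
    return np.divide(2.0 * commuting, denom,
                     out=np.zeros_like(commuting), where=denom > 0)


def gcn_layer(adj, features, weight):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # >= 1 for non-negative adj
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


# Toy heterogeneous network: 3 authors x 4 papers (hypothetical data).
author_paper = np.array([[1, 1, 0, 0],
                         [0, 1, 1, 0],
                         [0, 0, 0, 1]], dtype=float)

# Commuting matrix of the meta-path author-paper-author.
commuting = author_paper @ author_paper.T

# Semantic relevance between authors, used as a weighted adjacency.
sim = pathsim(commuting)
adj = sim - np.diag(np.diag(sim))           # drop self-similarity; the layer adds self-loops

# One-hot input features and randomly initialized (untrained) weights.
rng = np.random.default_rng(0)
weight = rng.normal(size=(3, 2))
embeddings = gcn_layer(adj, np.eye(3), weight)
```

In this toy network, authors 0 and 1 share one paper and therefore receive a non-zero relevance score, while author 2 is connected to no one and keeps only its self-loop inside the layer. A real pipeline would train the weights against a classification loss and would need some way of combining several meta-paths, which the abstract mentions but does not detail.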
Authors
蒋宗礼
李苗苗
张津丽
JIANG Zong-li; LI Miao-miao; ZHANG Jin-li (Department of Information Technology, Beijing University of Technology, Beijing 100124, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2020, No. 7, pp. 231-235 (5 pages)
Computer Science
Keywords
Network representation learning
Heterogeneous information network
Meta-path
Semantic information
Network structure information
Graph convolutional networks