社区搜索用于返回包含给定查询结点且符合查询条件的密集连通子图.目前,大部分已有社区搜索方法主要关注社区的结构,没有考虑到特定应用中资源受限的情况,且忽略了社区的属性特征,无法满足用户对社区搜索的个性化要求.针对该问题,本文...社区搜索用于返回包含给定查询结点且符合查询条件的密集连通子图.目前,大部分已有社区搜索方法主要关注社区的结构,没有考虑到特定应用中资源受限的情况,且忽略了社区的属性特征,无法满足用户对社区搜索的个性化要求.针对该问题,本文提出了规模受限的影响力社区搜索(Size-Constrained Influential Community search,SCIC),设计了基于深度优先搜索的基础算法,在此基础上进一步提出了基于结点预处理、剪枝规则和贪心策略的优化算法,用于减少冗余计算,加速枚举过程.在10个不同规模的数据集上进行实验,实验结果表明基础算法在搜索获得的社区规模和影响力上均优于已有算法,同时,本文提出的优化算法能够显著提升搜索效率,将响应时间缩减至基础算法的1%.展开更多
提出了一种基于马尔可夫链的离群点检测(outlier detection algorithms based on Markov chain,MRKFOD)算法。该算法把基本数据集看作一个加权无向图,数据集中的每个数据表示一个节点,用每条加权边表示节点之间的相似度;形成一个邻接矩...提出了一种基于马尔可夫链的离群点检测(outlier detection algorithms based on Markov chain,MRKFOD)算法。该算法把基本数据集看作一个加权无向图,数据集中的每个数据表示一个节点,用每条加权边表示节点之间的相似度;形成一个邻接矩阵,把邻接矩阵当作马尔可夫链中的概率转移矩阵;寻求概率转移矩阵的主要特征向量;把每个节点的主要特征向量值作为每个数据的离群度。实验结果表明,该算法与其他高维离群点挖掘算法相比,在效率及有效处理的维数方面均有显著提高。展开更多
The introduction of the social networking platform has drastically affected the way individuals interact. Even though most of the effects have been positive, there exist some serious threats associated with the intera...The introduction of the social networking platform has drastically affected the way individuals interact. Even though most of the effects have been positive, there exist some serious threats associated with the interactions on a social networking website. A considerable proportion of the crimes that occur are initiated through a social networking platform [1]. Almost 33% of the crimes on the internet are initiated through a social networking website [1]. Moreover activities like spam messages create unnecessary traffic and might affect the user base of a social networking platform. As a result preventing interactions with malicious intent and spam activities becomes crucial. This work attempts to detect the same in a social networking platform by considering a social network as a weighted graph wherein each node, which represents an individual in the social network, stores activities of other nodes with respect to itself in an optimized format which is referred to as localized data set. The weights associated with the edges in the graph represent the trust relationship between profiles. The weights of the edges along with the localized data set are used to infer whether nodes in the social network are compromised and are performing spam or malicious activities.展开更多
文摘社区搜索用于返回包含给定查询结点且符合查询条件的密集连通子图.目前,大部分已有社区搜索方法主要关注社区的结构,没有考虑到特定应用中资源受限的情况,且忽略了社区的属性特征,无法满足用户对社区搜索的个性化要求.针对该问题,本文提出了规模受限的影响力社区搜索(Size-Constrained Influential Community search,SCIC),设计了基于深度优先搜索的基础算法,在此基础上进一步提出了基于结点预处理、剪枝规则和贪心策略的优化算法,用于减少冗余计算,加速枚举过程.在10个不同规模的数据集上进行实验,实验结果表明基础算法在搜索获得的社区规模和影响力上均优于已有算法,同时,本文提出的优化算法能够显著提升搜索效率,将响应时间缩减至基础算法的1%.
文摘提出了一种基于马尔可夫链的离群点检测(outlier detection algorithms based on Markov chain,MRKFOD)算法。该算法把基本数据集看作一个加权无向图,数据集中的每个数据表示一个节点,用每条加权边表示节点之间的相似度;形成一个邻接矩阵,把邻接矩阵当作马尔可夫链中的概率转移矩阵;寻求概率转移矩阵的主要特征向量;把每个节点的主要特征向量值作为每个数据的离群度。实验结果表明,该算法与其他高维离群点挖掘算法相比,在效率及有效处理的维数方面均有显著提高。
文摘The introduction of the social networking platform has drastically affected the way individuals interact. Even though most of the effects have been positive, there exist some serious threats associated with the interactions on a social networking website. A considerable proportion of the crimes that occur are initiated through a social networking platform [1]. Almost 33% of the crimes on the internet are initiated through a social networking website [1]. Moreover activities like spam messages create unnecessary traffic and might affect the user base of a social networking platform. As a result preventing interactions with malicious intent and spam activities becomes crucial. This work attempts to detect the same in a social networking platform by considering a social network as a weighted graph wherein each node, which represents an individual in the social network, stores activities of other nodes with respect to itself in an optimized format which is referred to as localized data set. The weights associated with the edges in the graph represent the trust relationship between profiles. The weights of the edges along with the localized data set are used to infer whether nodes in the social network are compromised and are performing spam or malicious activities.