snap-社交网络分析

Snap简介

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges.

安装方法

snap官网地址

1.从上面地址下载python版本压缩包

2.解压得到本地目录下 C:\Users\Joker\snap

3.按进入cmd，cd进入该目录，运行 python setup.py install。

linux系统下也是一样的方法。

注意：snap只支持python2.7版本。windows系统也需要64位系统。

文档和数据

snap相关文档说明

文档中包括了一些基本的数据类型。还包括了一些比较容易用到的功能的代码说明：输出网络的基本信息，可视化网络，度分布图，各种网络分析的功能通常一句代码就解决了。

社交网络数据

snap上收集了一些社交网络数据，并且免费的分享给大家使用，免去了大家自己从网上爬数据的麻烦。业界良心啊。

连接预测

from __future__ import division

import snap
import math

G=snap.LoadEdgeList(snap.PUNGraph,'predict_delete.txt',0,1)
d_common={}
d_jaccard={}
d_adamic={}


for N1 in G.Nodes():
    #print node_list
    for N2 in G.Nodes():
        if((int(N1.GetId())<int(N2.GetId()))and not((N1.GetId()==N2.GetId())or (N1.IsNbrNId(N2.GetId())))):
            node_list_all=set([])
            value_adamic=0
            for Id in N1.GetOutEdges():
                #save the neighbors
                node_list_all.add(Id)

            #sage the same neighbors of n1 and n2
            node_list_same=set([])
            for Id in N2.GetOutEdges():

                #save the all neighbors of n1 and n2
                if(Id in node_list_all):
                    node_list_same.add(Id)
                    node_list_n=set([])
                    for N_n in G.GetNI(Id).GetOutEdges():
                        #print N_n
                        node_list_n.add(N_n)

                    if(len(node_list_n)>0):
                        value_adamic=value_adamic+1/(math.log(len(node_list_n)))
                    else:
                        value_adamic=value_adamic+1

                node_list_all.add(Id)
            #if (len(node_list_same)>0):
                #print node_list_all,node_list_same      
            key=str(N1.GetId())+'-'+str(N2.GetId())
            value_common=len(node_list_same)
            value_jaccard=len(node_list_same)/len(node_list_all)

            d_common[key]=int(value_common)
            d_jaccard[key]=int(value_jaccard)
            d_adamic[key]=int(value_adamic)

s_common=sorted(d_common.items(),key=lambda item:item[1],reverse=True)
s_jaccard=sorted(d_jaccard.items(),key=lambda item:item[1],reverse=True)
s_adamic=sorted(d_adamic.items(),key=lambda item:item[1],reverse=True)


sample=set([])
with open ('predict_d.txt','r') as f:
    for line in f.readlines():
        data=line.strip().split()
        ids=str(data[0])+'-'+str(data[1])
        sample.add(ids)
count_common=0
count_jaccard=0
count_adamic=0
num=500
#common neighbors
with open ('predict_common.txt','w') as f:
    for i in range(0,num):
        f.write(str(s_common[i][0])+'\n')
        if(str(s_common[i][0]) in sample):
            count_common=count_common+1
    precise=count_common/num
    f.write('precise='+str(precise))
#jaccard
with open ('predict_jaccard.txt','w') as f:
    for i in range(0,num):
        f.write(str(s_jaccard[i][0])+'\n')
        if(str(s_jaccard[i][0]) in sample):
            count_jaccard=count_jaccard+1
    precise=count_jaccard/num
    f.write('precise='+str(precise))
#adamic
with open ('predict_adamic.txt','w') as f:
    for i in range(0,num):
        f.write(str(s_adamic[i][0])+'\n')
        if(str(s_adamic[i][0]) in sample):
            count_adamic=count_adamic+1
    precise=count_adamic/num
    f.write('precise='+str(precise))

通过三个方法计算每两对节点的score值。排序，排序top-n的为新出现的连接。

最终比较三个预测方法，adamic方法的准确率最高。

common：两个节点共同的邻居的数量

jaccard：两个节点共同邻居的邻居的数量/两个节点总邻居的数量

adamic：对两个节点共同邻居最加一个权重w，若某个共同邻居的邻居数非常高，那这个权重w就可以很小。

snap-社交网络分析

继续阅读

来自python的【条件控制/语句循环/break/continue/else/pass】一、条件控制二、语句循环

无法解析的外部符号 wmain，该符号在函数 "void cdecl mainCRTStartupHelper(struct HINSTANCE *,unsigned short con......

TestLink导出用例转换工具(XML2Excel)

YAML简介和PyYAML安全操作YAML支持的类型YAML的优点：yaml的基本语法python操作

Small tricks

libsvm for python 安装

学习软件测试基础测试第七天

Zeppelin 配置访问 REST APIApache Zeppelin Configuration REST API

【Torch】最简洁logging使用指南

27. Remove Element(列表)题目代码

Cloud Studio初体验

使用 ctypes 进行 Python 和 C 的混合编程

【python】【数据处理】画多维数据分布图

【python】netconf协议对接管理设备

「Python 网络自动化」NETCONF —— Python 使用 NETCONF 管理配置 H3C 网络设备

在python中创建excel并写入