
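Community detection (2): Famous Researchers, Conferences and Journals, Research Topics (Knowledge Graph), Method Surveys, Tools, Datasets, Open Courses

I've gathered some material and am posting it here first; I'll expand it gradually later.

Famous Researchers

(Not exhaustive; just a few whose names I come across often.)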

M.E.J. Newman

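2004 Finding and evaluating community structure in networks

2006 Modularity and community structure in networks

Centers on the well-known modularity measure (first defined in the 2004 paper above), which rewards communities that are densely linked internally and sparsely linked to one another.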
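To make the modularity idea concrete, here is a minimal sketch for an undirected, unweighted graph. It uses the standard form Q = Σ_c [ e_c / m − (d_c / 2m)² ], where m is the number of edges, e_c the number of edges inside community c, and d_c the total degree of the nodes in c. The function name, the edge-list input and the toy graph are my own assumptions for illustration; none of this is code from Newman's papers.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by one bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
lumped = {n: "A" for n in range(6)}

q_split = modularity(edges, split)    # 5/14, about 0.357
q_lumped = modularity(edges, lumped)  # 0: one big community scores nothing
```

Splitting the toy graph into its two triangles scores higher than lumping everything together, which is exactly the "dense inside, sparse between" preference described above.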

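2006 Mixture models and exploratory analysis in networks

Finds communities without assuming in advance what kind of structure to look for, which is a bit mind-bending; the objective is really to group nodes that share the same linking pattern.

2008 Hierarchical structure and the prediction of missing links in networks

This one appeared in Nature. Hierarchical structure can describe the organization of a complex network and can therefore be used to predict edges. Still judging a hierarchy by how accurate its communities are? That's weak; watch the master reconstruct the network directly from the hierarchy!

2011 Stochastic blockmodels and community structure in networks

The degree-corrected stochastic block model. Long live the block model!

2012 Communities, modules and large-scale structure in networks

An introductory read on community detection, published in Nature Physics. A Chinese translation exists (I couldn't find the URL just now; leave your email if you want it).

2010 Networks: An Introduction

A book by Newman; the table of contents is on its website. The coverage is fairly introductory.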

Steve Gregory

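2007 An algorithm to find overlapping community structure in networks

Extends the GN (Girvan-Newman) algorithm to overlapping communities, roughly by allowing vertices to be split as well.

He is fond of this kind of extension: another paper of his turns any non-overlapping algorithm into an overlapping one, roughly by first splitting vertices with the method here and then running the non-overlapping algorithm.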

2010 Finding overlapping communities in networks by label propagation

A label propagation method.
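As a rough illustration of label propagation, the sketch below implements only the basic non-overlapping variant: every node starts with its own label and repeatedly adopts the most frequent label among its neighbors. It is not Gregory's overlapping algorithm, which lets a node carry several labels at once. The fixed node order and the smallest-label tie-break are assumptions I made to keep the run deterministic; published versions typically use random order and random tie-breaking.

```python
from collections import Counter


def label_propagation(adj, max_rounds=100):
    """Basic (non-overlapping) label propagation.

    adj: dict mapping each node to the set of its neighbors.
    Returns a dict mapping each node to its final label.
    """
    labels = {node: node for node in adj}  # every node starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):           # assumption: fixed visiting order
            if not adj[node]:
                continue                   # isolated nodes keep their label
            counts = Counter(labels[nbr] for nbr in adj[node])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[node] in candidates:
                continue                   # current label is already a winner
            labels[node] = min(candidates)  # assumption: deterministic tie-break
            changed = True
        if not changed:
            break
    return labels


# Toy graph (hypothetical): two separate triangles.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
       3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = label_propagation(adj)
```

On this toy graph each triangle ends up sharing one label. With a deterministic tie-break like the one above, weakly separated groups can easily collapse into a single label, which is one reason real implementations randomize.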

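2011 Fuzzy overlapping communities in networks

Argues that overlap comes in two kinds, crisp and fuzzy (roughly, hard and soft overlap), and evaluates how well existing methods detect each kind.

Yong-Yeol Ahn

2010 Link communities reveal multiscale complexity in networks

It feels like link communities took off the moment this paper appeared in Nature = =

The method is simple: define a similarity between edges, then run hierarchical clustering.

The experiments are very extensive!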
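The "edge similarity plus hierarchical clustering" recipe can be sketched as follows. Two edges that share a node k, say (i, k) and (j, k), are scored by the Jaccard overlap of the inclusive neighborhoods of i and j (each node together with its neighbors). As a simplification that is my own assumption, the sketch merges every pair of edges whose similarity reaches a fixed threshold, which corresponds to cutting a single-linkage dendrogram at that height; the paper instead chooses the cut by a partition-density criterion that is not reproduced here.

```python
from itertools import combinations


def link_communities(edges, threshold):
    """Group edges by similarity; return the node sets of the edge groups."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Inclusive neighborhood: the node itself plus its neighbors.
    inclusive = {n: nbrs | {n} for n, nbrs in adj.items()}

    def similarity(e1, e2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            return 0.0  # edges that do not share exactly one node
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        return len(inclusive[i] & inclusive[j]) / len(inclusive[i] | inclusive[j])

    # Union-find over edges: merging above the threshold is single linkage.
    parent = {e: e for e in edges}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for e1, e2 in combinations(edges, 2):
        if similarity(e1, e2) >= threshold:
            parent[find(e1)] = find(e2)

    groups = {}
    for e in edges:
        groups.setdefault(find(e), set()).update(e)
    return sorted(sorted(g) for g in groups.values())


# Toy graph (hypothetical): two triangles sharing node 2.
bowtie = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
communities = link_communities(bowtie, threshold=0.5)
# communities == [[0, 1, 2], [2, 3, 4]]; node 2 belongs to both
```

Because the clustering is over edges, a node such as 2 falls into every community that one of its edges belongs to, so overlap comes for free.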

Tim S. Evans

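2009 Line graphs, link partitions, and overlapping communities

You can't talk about link communities without mentioning Evans's line graph: he maps edges to nodes, so that conventional node-based methods can be used to obtain link communities.

Evans and Ahn even issued a statement saying the two of them worked independently; both just happened to be working on link communities ╮( ̄▽ ̄)╭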
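The edge-to-node mapping can be illustrated with the plain, unweighted line graph: each edge of the original graph becomes a node, and two such nodes are connected when the original edges share an endpoint. This is only the simplest version, written under my own assumptions about names and input format; the paper also considers weighted variants, which are not shown here.

```python
from itertools import combinations


def line_graph(edges):
    """Return the line graph as a dict: edge -> set of adjacent edges."""
    lg = {e: set() for e in edges}
    for e1, e2 in combinations(edges, 2):
        if set(e1) & set(e2):  # the two edges share an endpoint
            lg[e1].add(e2)
            lg[e2].add(e1)
    return lg


def nodes_of(edge_group):
    """Map a community of line-graph nodes (edges) back to original nodes."""
    return sorted({n for e in edge_group for n in e})


# Toy graph (hypothetical): the path 0 - 1 - 2 - 3.
path = [(0, 1), (1, 2), (2, 3)]
lg = line_graph(path)
# lg[(1, 2)] == {(0, 1), (2, 3)}
```

Any node-partitioning method run on `lg` yields groups of edges; `nodes_of` turns each group back into a (possibly overlapping) set of original nodes.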

Peter J. Mucha

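2010 Community Structure in Time-Dependent, Multiscale, and Multiplex Networks

This one appeared in Science. It covers multislice networks, e.g. time-dependent ones, ones with multiple edge types, and ones at multiple resolutions. The trick is neat: link each node to its own copies in the other networks, which ties all the networks together into one.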
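The coupling trick can be sketched as a graph construction: each node gets one copy per slice, edges inside a slice stay as they are, and extra edges connect a node's copies across slices. The sketch below couples only consecutive slices, as one might for time-ordered data, and calls the coupling weight `omega`; both choices, and all the names, are assumptions for illustration. The paper's generalized quality function, which is what actually gets optimized on the coupled network, is not reproduced here.

```python
def couple_slices(slices, omega=1.0):
    """Join several network slices into one weighted edge list.

    slices: list of edge lists, one per slice (e.g. one per time step).
    Returns (node_copy_a, node_copy_b, weight) triples, where a node copy
    is a (node, slice_index) pair.
    """
    edges = []
    # Intra-slice edges keep weight 1.0.
    for s, slice_edges in enumerate(slices):
        for u, v in slice_edges:
            edges.append(((u, s), (v, s), 1.0))
    # Inter-slice edges: link each node to its own copy in the next slice.
    for s in range(len(slices) - 1):
        nodes_here = {n for e in slices[s] for n in e}
        nodes_next = {n for e in slices[s + 1] for n in e}
        for n in sorted(nodes_here & nodes_next):
            edges.append(((n, s), (n, s + 1), omega))
    return edges


# Toy data (hypothetical): two time steps sharing only node "b".
slices = [[("a", "b")], [("b", "c")]]
coupled = couple_slices(slices, omega=0.5)
```

For a multiplex network with unordered edge types, one would couple every pair of slices rather than only neighbors in the list.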

Vincent Blondel

2008 Fast unfolding of communities in large networks

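BGLL (now better known as the Louvain method), an (unrivaled) fast method for non-overlapping community detection whose objective function is modularity. I read through its code carefully; it is written in C++, to the point that everything I wrote afterwards follows the same style…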

Gergely Palla

2005 Uncovering the overlapping community structure of complex networks in nature and society

2007 Quantifying social group evolution

Two Nature papers: the 2005 one presents the classic clique method; the 2007 one covers how communities evolve.

2006 CFinder: locating cliques and overlapping modules in biological networks

CFinder, the tool for the classic clique method; free to use after filling in a form.
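The clique method (clique percolation) can be sketched in a few lines: find all k-cliques, treat two k-cliques as adjacent when they share k − 1 nodes, and take each connected group of cliques as one community. The brute-force clique search below is only meant for tiny graphs and is my own simplification; it is not how CFinder is implemented.

```python
from itertools import combinations


def k_clique_communities(edges, k=3):
    """Clique percolation on a small graph with sortable node labels."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    # Brute force: every k-subset whose members are pairwise adjacent.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(b in adj[a] for a, b in combinations(c, 2))]

    # Union-find over cliques; adjacent cliques share k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Toy graphs (hypothetical).
bowtie = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]   # triangles share 1 node
diamond = [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]          # triangles share 2 nodes

bowtie_result = k_clique_communities(bowtie, k=3)    # [[0, 1, 2], [2, 3, 4]]
diamond_result = k_clique_communities(diamond, k=3)  # [[0, 1, 2, 3]]
```

In the bow-tie the two triangles share only one node, so they stay separate and node 2 overlaps both communities; in the diamond they share an edge and merge.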

Renaud Lambiotte

Co-authored the line graph work with Evans and BGLL with Blondel.

Liu Huan

Conferences and Journals

Keywords: community detection, social network, social network analysis, complex network, cluster, graph partition

Nature

Science

AAAI

WWW

ICDM

SIGKDD

SIGMOD

PKDD

PAKDD

TKDD

SDM

CIKM

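Proceedings of the National Academy of Sciences 9.681

New Journal of Physics 4.177

Physical Review E 2.255

Journal of Statistical Mechanics: Theory and Experiment 1.7

Journal of Physics A: Mathematical and Theoretical 1.540

The European Physical Journal B 1.534

Physica A: Statistical Mechanics and its Applications 1.373

EPL (Europhysics Letters)

PLOS One 4.096

Complex networks

Social networks 2.931

Network Science

The number to the right of a journal name is its impact factor; these change every year, and I've forgotten which year these are from…

The list above is again just the venues I see often. Besides the data mining ones there are plenty of physics venues; yes, a large crowd of physicists works on this, Mark Newman for one = = In fact, people from biology, sociology, physics, mathematics and computer science all work on it; it's an interdisciplinary field, after all.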

Research Topics (Knowledge Graph)

Related wiki pages

http://en.wikipedia.org/wiki/Community_structure

http://en.wikipedia.org/wiki/Cluster_analysis

[Figure: map of related disciplines]

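A paper's research focus can be roughly characterized along the following dimensions (my own summary; criticism welcome):

Flat cluster: the clustering result is a single partition of the network; most results take this form.

Hierarchical cluster: hierarchical clustering; the result is a dendrogram of nested communities.

Overlapping (fuzzy/crisp assignment): a member can belong to several communities.

Non-overlapping (hard assignment): a member can belong to only one community.

Static network: the network is fixed and does not change over time; this is the usual case.

Dynamic network: the network changes over time.

Multiplex network: the edges in the network come in several types.

Bipartite network: the nodes in the network come in two types (by extension, there can be more types).

Density community: the goal is communities that are densely linked internally.

Bipartite community: the goal is communities that are sparsely linked internally, usually by partitioning the network into a bipartite or multipartite graph.

Mixture community: the goal is communities whose members share a similar linking pattern; a mix of the two above.

Come to think of it, most definitions of a community depend on the algorithm: whatever the algorithm detects is what gets called a community = =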

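Global: uses global information to detect a community partition of the whole network.

Local: uses local information, e.g. looking only at a node's neighbors when considering that node. It can detect communities in one part of the network, e.g. given a node, finding the community structure around it. This is a very practical application, especially when the data is very large.

Incremental (online): the algorithm supports online updates; when some nodes (edges) are added or removed there is no need to rerun everything, a simple adjustment is enough. Suited to large networks that change in real time.

Further research topics include: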

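Node properties (hub, periphery): studies the roles of nodes, e.g. whether a node is a key node, a central node, a peripheral node, a leader, a follower, and so on.

Spread process: studies how information spreads, e.g. the spread of public opinion or of a virus.

Link prediction: predicts missing edges; in effect, recommendation.

Evaluation: judging how good a detection result is requires evaluation metrics, and there is still no generally accepted good one. Compare directly against real networks with ground-truth labels? Small networks are not convincing, and for large datasets the definition of a community is not even necessarily the same. Some good papers build their own datasets and measure with their own metrics. So some people have run dedicated series of experiments to evaluate current algorithms from a more objective standpoint, which is a research topic in its own right.

Visualization: evaluation metrics give a quantitative analysis, but that is still just a pile of numbers; people like to see pictures, so how to display community structure visually is also an open problem.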
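For the Evaluation item above, one widely used way to compare a detected partition with ground-truth labels is normalized mutual information (NMI). The text does not name a specific metric, so treat this as one common choice rather than the recommended one; the sketch handles non-overlapping partitions only, and the function name and toy labels are my own.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information between two non-overlapping partitions.

    labels_a, labels_b: equal-length sequences; position i holds the
    community label of node i in each partition.
    Returns 2 * I(A; B) / (H(A) + H(B)), a value between 0 and 1.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(c / n * log(c * n / (count_a[a] * count_b[b]))
                 for (a, b), c in joint.items())
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    if h_a + h_b == 0:
        return 1.0  # both partitions put everything in one community
    return 2 * mutual / (h_a + h_b)


truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]

score_same = nmi(truth, same_up_to_renaming)  # 1: identical up to label names
score_unrelated = nmi(truth, unrelated)       # 0: knowing one tells nothing
```

NMI ignores how the labels are named, which matters because a detection algorithm has no way of knowing the names used in the ground truth.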

Method Surveys


From http://blog.sciencenet.cn/blog-798640-677758.html

http://blog.sina.com.cn/s/blog_63891e610101722t.html

(Placeholder: I'll write my own summary here.)

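Survey Papers

2010 Community detection in graphs

A survey that reads like a reference handbook = =

2012 Communities, modules and large-scale structure in networks

An introductory read on community detection, published in Nature Physics; a Chinese translation exists.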

2012 Temporal networks

Summarizes methods for analyzing network structure that changes over time.

2013 Overlapping community detection in networks: The state-of-the-art and comparative study

A survey of overlapping algorithms.

Tools

Gephi

Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs.

Runs on Windows, Linux and Mac OS X. Gephi is open-source and free.

http://gephi.org/users/download/

NetLogo

NetLogo is a multi-agent programmable modeling environment. It is used by tens of thousands of students, teachers and researchers worldwide. It also powers HubNet participatory simulations. It is authored by Uri Wilensky and developed at the CCL. You can download it free of charge.

http://ccl.northwestern.edu/netlogo/download.shtml

Pajek

Pajek (Slovene word for Spider) is a program, for Windows, for analysis and visualization of large networks. It is freely available, for noncommercial use, at its download page.

http://pajek.imfm.si/doku.php?id=download

igraph

igraph is a free software package for creating and manipulating undirected and directed graphs. It includes implementations for classic graph theory problems like minimum spanning trees and network flow, and also implements algorithms for some recent network analysis methods, like community structure search.

http://igraph.sourceforge.net/download.html

Cytoscape

Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. A lot of Apps are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web.

http://www.cytoscape.org/download.html

Other code

http://code.google.com/p/community-detection/ (C++)

http://code.google.com/p/linloglayout/ (Java)

From <http://blog.sina.com.cn/s/blog_67532f7c0100qakz.html>

http://blog.sciencenet.cn/blog-404069-297233.html (tools)

MatlabBGL is a Matlab package for working with graphs. It uses the Boost Graph Library to efficiently implement the graph algorithms. MatlabBGL is designed to work with large sparse graphs with hundreds of thousands of nodes.

From <https://www.cs.purdue.edu/homes/dgleich/packages/matlab_bgl/>

Datasets

http://www.cs.cmu.edu/~enron/

http://www.informatik.uni-trier.de/~ley/db/

http://socialnetworks.mpi-sws.org/data-imc2007.html

http://www.cs.bris.ac.uk/~steve/networks/

 http://www.cs.bris.ac.uk/~steve/networks/peacockpaper/

http://cran.r-project.org/web/packages/timeordered/index.html

http://www.facebook.com/press/info.php?statistics

http://www.cs.cornell.edu/projects/kddcup/datasets.html

http://www-personal.umich.edu/~mejn/netdata/

http://www.cise.ufl.edu/research/sparse/mat/Pajek/

http://arnetminer.org/download

http://yeast-complexes.russelllab.org/complexview.pl?rm=complex_list

http://thebiogrid.org/

http://mips.helmholtz-muenchen.de/genre/proj/yeast/

http://www.yeastgenome.org/

http://vlado.fmf.uni-lj.si/pub/networks/data/

http://archive.routeviews.org/

http://blog.sciencenet.cn/blog-40109-279160.html

 http://deim.urv.cat/~aarenas/data/welcome.htm

Open Courses

https://www.coursera.org/course/sna

This course will use social network analysis, both its theory and computational tools, to make sense of the social and information networks that have been fueled and rendered accessible by the internet.

http://cm.dce.harvard.edu/2014/01/14328/publicationListing.shtml
