====== Social Networks ====== **What is Social Network Analysis?** Social Network Analysis (SNA) is the study of the relationships (links) between people (nodes) in a network. SNA combines elements of mathematical graph theory, sociology, and matrix algebra to product quantitative measures that describe patterns of association. * Related Items * **Small world problem** * Small world experiment * Six degrees of separation * **Scale free problem** * Power law distribution * Rich get richer * Robust yet fragile * The strength of weak ties * Guanxi in Chinese society * The rule of 150 ===== Models ===== * Random graph [[http://en.wikipedia.org/wiki/Random_network|on Wikipedia]] * A random graph is obtained by starting with a set of n vertices and adding edges between them at random. Most commonly studied is the [[http://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model|Erdős-Rényi model]], called G(n,p), in which each of the possible edges occurs independently with probability p. A closely related model, G(n,M) assigns equal probability to all graphs with exactly M edges. * Properties of G(n, p): * mean degree = p(n-1); * average path length L ~ ln(n)/ln(); * clustering coefficient C = p = /n << 1, which means that random graphs usually have a small clustering coefficient. * possion degree distribution * Despite their simplicity and power, the ER graphs fail to explain two important properties observed in real-world networks: * By assuming a constant and independent probability of two nodes being connected, they do not account for local clustering and triadic closures. This fact is formally expressed by the ER graphs having a low clustering coefficient. * They do not account for the formation of hubs. Formally, the degree distribution of ER graphs converges to a Poisson distribution, rather than a power law observed in most real-world, scale-free networks. * Small-world network [[http://en.wikipedia.org/wiki/Small-world_network|on Wikipedia]] * Watts and Strogatz (WS) model [[http://en.wikipedia.org/wiki/Watts_and_Strogatz_model|on Wikipedia]] * Many empirical graphs are well modeled by small-world networks. Social networks, the connectivity of the Internet, and gene networks all exhibit small-world network characteristics. * A certain category of small-world networks were identified as a class of random graphs by Duncan Watts and Steven Strogatz in 1998. They noted that graphs could be classified according to their clustering coefficient and their mean-shortest path length. While many random graphs exhibit a small mean-shortest path (varying typically as the logarithm of the number of nodes) they also usually have a small clustering coefficient. * Watts and Strogatz measured that in fact many real-world networks have a small shortest path but also a clustering coefficient significantly higher than expected by random chance. Watts and Strogatz proposed **a simple model of random graphs** with (i) **a small average shortest path** and (ii) **a large clustering coefficient**. * Limitations * The major limitation of the model is that it produces graphs that are homogeneous in degree. In contrast, real networks are often scale-free networks inhomogeneous in degree, having hubs and a scale-free degree distribution. Such networks are better described by the preferential attachment family of models, such as the Barabási–Albert (BA) model. * The model also implies a fixed number of nodes and thus cannot be used to model network growth. * Scale-free network [[http://en.wikipedia.org/wiki/Scale-free_network|on Wikipedia]] * Barabási–Albert (BA) model [[http://en.wikipedia.org/wiki/Barabasi-Albert_model|on Wikipedia]] | Generalized scale-free model [[http://en.wikipedia.org/wiki/Generalized_scale-free_model|on Wikipedia]] * Scale-free networks show a power law degree distribution like most real networks. * Network **growth** and **preferential attachment** have been shown to create networks with power law degree distributions. * Social-circles network model [[http://en.wikipedia.org/wiki/Social-circles_network_model|on Wikipedia]] ===== Issues in Social Networks ===== * Advertising for social networks [[http://www.adchap.com/|Ad Chap]] * Social search [[http://en.wikipedia.org/wiki/Social_search|on Wikipedia]] * Visualizing online social networks [[http://jheer.org/publications/2005-Vizster-InfoVis.pdf|pdf]] ===== Resources ===== ==== Research People & Organization ==== * [[http://www.cs.cornell.edu/home/kleinber/|Jon Kleinberg]] - Cornell * [[http://research.yahoo.com/bouncer_user/106|Duncan J. Watts]] - [[http://www.sociology.columbia.edu/fac-bios/watts/faculty.html|Columbia]] and [[http://research.yahoo.com/bouncer_user/106|Yahoo! Research]] * the author of **Six Degrees: The Science of a Connected Age** (W.W. Norton, 2003) and **Small Worlds: The Dynamics of Networks between Order and Randomness** (Princeton University Press, 1999). * [[http://www.nd.edu/~alb/|Albert-László Barabási]] - University of Notre Dame * Scale fee network in Science @ 1999 * [[http://www.cs.cmu.edu/~christos/|Christos Faloutsos]] | [[http://www.cs.cmu.edu/~jure/|Jure Leskovec]] - CMU * [[http://www.ladamic.com/|Lada Adamic]] - University of Michigan * [[http://www.tomkinshome.com/andrew/|Andrew Tomkins]] - [[http://research.yahoo.com/Search_Technologies|Yahoo! Research Search Technology]] * [[http://www.ics.uci.edu/~jmadden/|Joshua O'Madadhain]] - UCI PhD, now in Google, developer of [[http://jung.sourceforge.net/|JUNG]] * His interests include machine learning, data mining, information retrieval, social network analysis, combinatorial optimization, and the design of data structures and algorithms. He is currently investigating the use of machine learning techniques to create predictive and descriptive models for event-based network data. * Academic Research Centers * [[http://www.nd.edu/~networks/|Center for Complex Network Research]] - Albert-László Barabási's lab in Univ. of Notre Dame * [[http://www.hks.harvard.edu/netgov/html/sna.htm|Social Network Analysis]] - Program on Networked Governance, Harvard University * [[http://www.casos.cs.cmu.edu/|CASOS]] Institute at Carnegie Mellon *NEW* * [[http://www.datalab.uci.edu/|UCI DataLab]] | [[http://www.ics.uci.edu/~smyth/|Padhraic Smyth]] - UCI (JUNG developer) * Corporate Research Centers * [[http://www.hpl.hp.com/research/scl/people/huberman/index.html|Bernardo Huberman]] | [[http://www.hpl.hp.com/research/scl/|Social Computing Lab]] - HP * [[http://www.research.ibm.com/SocialComputing/index.html|Social Computing Group]] | [[http://domino.research.ibm.com/cambridge/research.nsf/pages/index.html|IBM Watson Research Center, Cambridge]]- IBM * [[http://research.microsoft.com/scg/|Social Computing Group]] - Microsoft Research ==== Toolbox & Library ==== * [[http://jung.sourceforge.net/|JUNG]] - the Java Universal Network/Graph Framework -- is a software library that provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network. Developed by Scott White, Joshua O'Madadhain, Danyel Fisher, Yan-Biao Boey at UCI. [Open source, free, and has a wide variety of algorithms available. //Recommended by Hongbo//] * [[http://www.libsna.org/|libSNA]] is the premier open source library for conducting social network analysis (SNA) research. [Open source, free, few basic network statistics, in Python] * [[http://www.analytictech.com/ucinet/ucinet.htm|UCINET]] software package. [Commercial, slow, and cannot be embedded into applications: you can't call UCINET in an end-user display] * [[http://vlado.fmf.uni-lj.si/pub/networks/pajek/|Pajek]] [[http://pajek.imfm.si/doku.php|Pajek Wiki]] - Analysis and Visualization of Large Scale Networks, Program for Large Network Analysis * [[http://www.insna.org/INSNA/soft_inf.html|Computer Programs for Social Network Analysis]] in INSNA.org * Social network analysis software - [[http://en.wikipedia.org/wiki/Social_network_analysis_software|on Wikipedia]] * [[http://www.casos.cs.cmu.edu/computational_tools/tools.html|Computational Models and Social Network Tools]] in CASOS@CMU ==== Data Sets ==== * [[http://www.insna.org/INSNA/data_inf.html|Data Sets]] in INSNA.org * [[http://vlado.fmf.uni-lj.si/pub/networks/data/|Pajek datasets]] * [[http://www.casos.cs.cmu.edu/computational_tools/data.php|Network Analysis Data]] in CASOS@CMU ===== Metrics in social network analysis ===== * ** Centrality ** [[http://en.wikipedia.org/wiki/Centrality|on Wikipeida]] - There are various measures of the centrality of a vertex within a graph that determine the relative importance of a vertex within the graph. There are four measures of centrality that are widely used in network analysis: degree centrality, betweenness, closeness, and eigenvector centrality. * **Degree centrality**: The count of the number of ties to other actors in the network. If the network is directed (meaning that ties have direction), then we usually define two separate measures of degree centrality, namely **indegree** and **outdegree**. Indegree is a count of the number of ties directed to the node, and outdegree is the number of ties that the node directs to others. For positive relations such as friendship or advice, we normally interpret indegree as a form of popularity, and outdegree as gregariousness. * **Betweenness**: Degree an individual lies between other individuals in the network; the extent to which a node is directly connected only to those other nodes that are not directly connected to each other; an intermediary; liaisons; bridges. Therefore, it's the number of people who a person is connected to indirectly through their direct links. * **Closeness**: The degree an individual is near all other individuals in a network (directly or indirectly). It reflects the ability to access information through the "grapevine" of network members. Thus, closeness is the inverse of the sum of the shortest distances between each individual and every other person in the network. * **Eigenvector centrality**: a measure of the importance of a node in a network. It assigns relative scores to all nodes in the network based on the principle that connections to nodes having a high score contribute more to the score of the node in question. Google's PageRank is a variant of the Eigenvector centrality measure. * **Flow betweenness centrality**: The degree that a node contributes to sum of maximum flow between all pairs of nodes (not that node). * **Centralization** [[http://en.wikipedia.org/wiki/Centralization|on Wikipedia]] - The difference between the n of links for each node divided by maximum possible sum of differences. A centralized network will have many of its links dispersed around one or a few nodes, while a decentralized network is one in which there is little variation between the n of links each node possesses. * **Clustering coefficient** [[http://en.wikipedia.org/wiki/Clustering_coefficient|on Wikipeida]] - A measure of the likelihood that two associates of a node are associates themselves. A higher clustering coefficient indicates a greater 'cliquishness'. The clustering coefficient C_i for a vertex v_i is the proportion of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them. * C_i = {|{e_{jk}}|}/{{k_i}({k_i}-1)} : v_j,v_k in N_i, e_{jk} in E. * {C} = {1}/{n} sum{i=1}{n} {C_i}. * **Cohesion** [[http://en.wikipedia.org/wiki/Structural_cohesion|on Wikipedia]] - The degree to which actors are connected directly to each other by cohesive bonds. Groups are identified as ‘cliques’ if every actor is directly tied to every other actor, ‘social circles’ if there is less stringency of direct contact, which is imprecise, or as structurally cohesive blocks if precision is wanted. * **Structural cohesion** - The minimum number of members who, if removed from a group, would disconnect the group. It is also useful to know that k-cohesive graphs (or k-components) are always a subgraph of a k-core, although a k-core is not always k-cohesive. A k-core is simply a subgraph in which all nodes have at least k neighbors but it need not even be connected. ^ ^| Some illustrative examples are presented in the gallery below: | | ^|{{kb:networktopology-ring.png?80}} | {{kb:networktopology-fullyconnected.png?80|}} | ^|The 6-node ring in the graph has connectivity-2 or a level 2 of structural cohesion because the removal of two nodes is needed to disconnect it.| A 6-node clique is a 5-component, structural cohesion 5. | * ** Density** [[http://en.wikipedia.org/wiki/Dense_graph|on Wikipedia]] - The degree a respondent's ties know one another/ proportion of ties among an individual's nominees. Network or global-level density is the proportion of ties in a network relative to the total number possible (sparse versus dense networks). * D = {2|E|}/{|V|(|V|-1)}, |E| denotes the number of edges, |V| the number of vertices, the maximal density is 1, which is similar to the C_i * **Path Length** - The distances between pairs of nodes in the network. * **Average path length** is the average of these distances between all pairs of nodes. * **Diameter** - The maximum shortest path length * **Radiality** - Degree an individual’s network reaches out into the network and provides novel information and influence * **Reach** - The degree any member of a network can reach other members of the network. * **Structural equivalence** [[http://en.wikipedia.org/wiki/Equivalence_relation|on Wikipedia]] - Refers to the extent to which actors have a common set of linkages to other actors in the system. The actors don’t need to have any ties to each other to be structurally equivalent. * **Structural hole** - Static holes that can be strategically filled by connecting one or more links to link together other points. Linked to ideas of social capital: if you link to two people who are not linked you can control their communication. * **Degree distribution** [[http://en.wikipedia.org/wiki/Degree_distribution|on Wikipedia]] - Degree distribution gives the probability distribution of degrees in a network. Its use originates from the study of random graph by Paul Erdős and Alfréd Rényi, and it has become an important concept which describes the topology of complex networks. Power-law distribution by Albert-László Barabási. * **Size** - the number of nodes * **Triadic Census** - TriadicCensus is a standard social network tool that counts, for each of the different possible configurations of three vertices, the number of times that that configuration occurs in the given graph. [[http://www.educa.fmf.uni-lj.si/datana/pub/networks/pajek/triade.pdf|16 possible triads]] * **Structural Holes** - Calculates some of the measures from Burt's text [[http://www.feem-web.it/ctn/papers/05_Goyal.pdf|"Structural Holes: The Social Structure of Competition"]]. In Structural Holes, Ron Burt (1992; 1995) describes a set of new measures based on ego networks. One key set of measures is concerned with the notion of redundancy. The general meaning of redundancy is clear: a person's ego network has redundancy to the extent that her contacts are connected to each other as well. [[http://www.analytictech.com/connections/v20(1)/holes.htm|Unpacking Burt's Redundancy Measures]] /* S(f)(t)=a_{0}+sum{n=1}{+infty}{a_{n} cos(n omega t)+b_{n} sin(n omega t)} delim{lbrace}{matrix{3}{1}{{3x-5y+z=0} {sqrt{2}x-7y+8z=0} {x-8y+9z=0}}}{ } delim{|}{{1/N} sum{n=1}{N}{gamma(u_n)} - {1}/{2 pi} int{0}{2 pi}{gamma(t) dt}}{|} <= epsilon/3 */ ===== References ===== ==== Papers ==== ==Model of Small-World Networks and Search== * D. J. Watts and S. H. Strogatz. [[http://www.sociology.columbia.edu/pdf-files/watts08.pdf|Collective dynamics of 'small-world' networks]]. Nature, 393:440-442 (1998). * Networks must exist on a sliding scale between clustered and random, producing tightly knit groups of friends but also short paths that reach throughout the whole network. Proposed two concepts to measure the small world, which are HIGH clustering coefficient and SHORT average path length. * J. Kleinberg. [[http://www.cs.cornell.edu/home/kleinber/nat00.pdf|Navigation in a Small World]]. Nature 406(2000), 845. * J. Kleinberg. [[http://www.caip.rutgers.edu/~virajb/readinglist/smallworld.pdf|The small-world phenomenon: An algorithmic perspective]]. Proc. 32nd ACM Symposium on Theory of Computing, 2000. * J. Kleinberg. [[http://www.cs.cornell.edu/home/kleinber/nips14.pdf|Small-World Phenomena and the Dynamics of Information]]. Advances in Neural Information Processing Systems (NIPS) 14, 2001. * D. J. Watts, P. S. Dodds, and M. E. J. Newman. [[http://www.sociology.columbia.edu/pdf-files/watts05.pdf|Identity and search in social networks]]. Science, 296, 1302-1305 (2002). * Oskar Sandberg and Ian Clarke. [[http://www.arxiv.org/abs/cs?papernum=0607025|The Evolution of Navigable Small-World Networks]]. arxiv cs.DS/0607025, July 2006. ==Scale-free network / Degree distribution== * R. Albert, H. Jeong, A.-L. Barabási. [[http://www.nd.edu/~alb/Publication06/062%20Diameter%20of%20the%20world%20wide%20web/Diameter%20of%20the%20world%20wide%20web.pdf|Diameter of the world wide web]]. Nature 401, 130-131 (1999). * A.-L. Barabási, R. Albert. [[http://www.nd.edu/~alb/Publication06/063%20Emergence%20of%20scaling%20in%20random%20networks/Emergence%20of%20scaling%20in%20random%20networks.pdf|Emergence of scaling in random networks]]. Science 286, 509-512 (1999). * A.-L. Barabási, R. Albert, H. Jeong, G. Bianconi. [[http://www.nd.edu/~alb/Publication06/071%20Power-law%20distribution%20of%20the%20World%20Wide%20Web/Power-law%20distribution%20of%20the%20World%20Wide%20Web.pdf|Power-law distribution of the World Wide Web]]. Science 287, 2115 (2000). ==Search a social network== * Lada A. Adamic, Rajan M. Lukose, Amit R. Puniyani, Bernardo A. Huberman. [[http://www.hpl.hp.com/research/idl/papers/plsearch/pre46135.pdf|Search in Power-Law Networks]]. Phys. Rev. E, 64 46135 (2001). * P. S. Dodds, R. Muhamad, and D. J. Watts. [[http://www.sociology.columbia.edu/pdf-files/watts04.pdf|An experimental study of search in global social networks]]. Science, 301, 827-829 (2003). * Lada A. Adamic and Eytan Adar. [[http://www-personal.umich.edu/~ladamic/papers/socialsearch/adamicsocialsearch.pdf|How to search a social network]]. Social Networks, 27(3):187-203, July 2005. * G. Kossinets and D. J. Watts. [[http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2005Sci...311...88K&link_type=ARTICLE&db_key=GEN&high=24520|Empirical Analysis of Evolving Social Networks]]. Science, 311, 88-90 (2006). == Good Survey papers on complex social networks == * R. Albert, A.LBarabási. [[http://www.nd.edu/~alb/Publication06/092%20Statistical%20mechanics%20of%20complex%20networks/Statistical%20mechanics%20of%20complex%20networks.pdf|Statistical mechanics of complex networks]]. Rev. Modern Phys. 74(1), 47–97 (2002). * M. E. J. Newman. [[http://epubs.siam.org/sam-bin/getfile/SIREV/articles/42480.pdf|The structure and function of complex networks]]. SIAM Review, 2003, 45, 167-256. == Good Related Courses == * [[http://www.cs.cornell.edu/Courses/cs685/2007fa/|The Structure of Information Networks]] by Kleinberg @ Cornell * "The past decade has seen a convergence of social and technological networks, with systems such as the World Wide Web characterized by the interplay between rich information content, the millions of individuals and organizations who create it, and the technology that supports it. This course covers recent research on the structure and analysis of such networks, and on models that abstract their basic properties. Topics include combinatorial and probabilistic techniques for link analysis, centralized and decentralized search algorithms, network models based on random graphs, and connections with work in the social sciences." == Good Related Tutorials == * Tools for large networks: Structure and diffusion [[http://www2008.org/program/program-tutorials-TF1.html| in WWW2008]] by Jure Leskovec and Christos Faloutsos ==== Books ==== * [[http://faculty.ucr.edu/~hanneman/ |Robert A. Hanneman]] and Mark Riddle. [[http://faculty.ucr.edu/~hanneman/networks/nettext.pdf|Introduction to Social Network Methods]]. UC Riverside, 2005. * Peter J. Carrington, John Scott, Stanley Wasserman. Models and Methods in Social Network Analysis. Cambridge University Press, 2005. [[http://www.amazon.com/Methods-Network-Analysis-Structural-Sciences/dp/0521600979/ref=pd_sim_b_title_3/102-5829959-8639313|on Amazon]] * John P Scott. Social Network Analysis: A Handbook (Paperback). Sage Publications Ltd, 2000. * Pierre Baldi, Paolo Frasconi, and Padhraic Smyth. Modeling the Internet and the Web: Probabilistic Methods and Algorithms. John Wiley and Sons, June 2003. [[http://www.amazon.com/exec/obidos/ASIN/0470849061/qid%3D1051677544/sr%3D11-1/ref%3Dsr_11_1/102-3198847-7819334|on Amazon]], [[http://ibook.ics.uci.edu/|online]] ==== Articles ==== * [[http://neuroscience.ntu.edu.tw/ep/10.htm|社會網絡分析(Social Network Analysis)]] * [[http://www.uaf.edu/northern/big_world.html|Could it be a Big World After All? The `Six Degrees of Separation' Myth]], J. Kleinfeld. Society, April 2002. * [[http://bradfitz.com/social-graph-problem/|Thoughts on the Social Graph]], August, 2007. * [[http://venturebeat.com/2008/01/31/googles-marissa-mayer-social-search-is-the-future/|Google’s Marissa Mayer: Social search is the future]], Interviewed by VentureBest * [[http://www.marketingpilgrim.com/2008/02/is-social-search-coming-from-google.html|Is Social Search Coming from Google?]], February 1, 2008 * [[http://www.readwriteweb.com/archives/google_social_search_is_coming.php|Social Search is Coming]], February 1, 2008 * [[http://searchenginewatch.com/showPage.html?page=3628912|The Evolution of Social Networks]] by By Levy Cohen, Search Engine Watch, Mar 27, 2008 * [[http://news.bbc.co.uk/2/hi/programmes/click_online/7318288.stm|Evolution of the social network]] by By Marc Cieslak, BBC Click, 29 March 2008 ==== Videos & PPTs ==== * Duncan Watts and Kleinberg discuss the small-world problem on [[http://www.sciencefriday.com/pages/2003/Aug/hour1_080803.html|NPR's Science Friday]] (8 August 2003). * **Challenges in Social Network Data: Processes, Privacy and Paradoxes** by Jon Kleinberg [[http://videolectures.net/kdd07_kleinberg_cisnd/|on Videolectures]] * **Social networks and trust : NetTrust** [[http://www.youtube.com/watch?v=IGBKI1QBAr8|on Youtube]] * **Social Network Analysis: A Research Perspective** by [[http://research.microsoft.com/~danyelf/|Danyel Fisher]] (Microsoft)[[http://research.microsoft.com/~danyelf/presentations/FisherPoliticsOnline.ppt|ppt]] ==== Others ==== * [[http://www.insna.org/|International Network for Social Network Analysis (INSNA)]] * [[http://www.orgnet.com/|orgnet.com]] * [[http://www.socialnetworks.org/|Social Network References (Academic Bibliography)]] * [[http://www.cscs.umich.edu/~crshalizi/notebooks/social-networks.html|Social Networks notes]] of Prof. Cosma Rohilla Shalizi * [[http://stat.gamma.rug.nl/socnet.htm|Social Network Analysis Page]] of Prof. Tom Snijders * [[http://www.danah.org/SNSResearch.html|Research on Social Network Sites]] of [[http://www.danah.org/|danah boyd]]