Social Network Analysis
American Sociological Association
San Francisco, August 2004
James Moody

Introduction

We live in a connected world:

“To speak of social life is to speak of the association between people – their associating in work and in play, in love and in war, to trade or to worship, to help or to hinder. It is in the social relations men establish that their interests find expression and their desires become realized.”
Peter M. Blau, Exchange and Power in Social Life, 1964

"If we ever get to the point of charting a whole city or a whole nation, we would have … a picture of a vast solar system of intangible structures, powerfully influencing conduct, as gravitation does in space. Such an invisible structure underlies society and has its influence in determining the conduct of society as a whole."
J.L. Moreno, New York Times, April 13, 1933

These patterns of connection form a social space that can be seen in multiple contexts:

Source: Linton Freeman, “See you in the funny pages,” Connections, 23, 2000, 32-42.

High Schools as Networks

And yet, standard social science analysis methods do not take this space into account.

“For the last thirty years, empirical social research has been dominated by the sample survey. But as usually practiced, …, the survey is a sociological meat grinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it.”
Allen Barton, 1968 (Quoted in Freeman 2004)

Moreover, the complexity of the relational world makes it impossible to identify social connectivity using only our intuitive understanding.
Social Network Analysis (SNA) provides a set of tools to empirically extend our theoretical intuition of the patterns that construct social structure.

Why do Networks Matter?
Local vision
Why networks matter:
• Intuitive: “goods” travel through contacts between actors, which can reflect a power distribution or influence attitudes and behaviors. Our understanding of social life improves if we account for this social space.
• Less intuitive: patterns of inter-actor contact can have effects on the spread of “goods” or power dynamics that could not be seen focusing only on individual behavior.

Social network analysis is:
• a set of relational methods for systematically understanding and identifying connections among actors.
SNA:
• is motivated by a structural intuition based on ties linking social actors
• is grounded in systematic empirical data
• draws heavily on graphic imagery
• relies on the use of mathematical and/or computational models.
• Social Network Analysis embodies a range of theories relating types of observable social spaces and their relation to individual and group behavior.

1. Introduction
2. Social Network Data
   a. Basic data elements
   b. Collecting network data
   c. Basic data structures
3. Measuring Networks
   a. Flows of goods within networks
      1) Topology
      2) Time
   b. Structure of Social Space
      1) Small Worlds, Scale-Free, Triads
      2) Cohesive Groups
      3) Role Positions
4. Modeling with Networks
   a. Modeling Behaviors with Networks
      1) Peer attribute models
      2) Network Autocorrelation Models
      3) Dyad / QAP Models
   b. Modeling Network Structure
      1) QAP for network structure
      2) Exponential Random Graph Models
5. SNA Computer Programs

Social Network Data
The units of interest in a network are the combined sets of actors and their relations.
We represent actors with points and relations with lines.
Actors are referred to variously as: nodes, vertices, or points.
Relations are referred to variously as: edges, arcs, lines, or ties.
Example: [Figure: a five-node graph on nodes a, b, c, d, e]

In general, a relation can be:
Binary or Valued
Directed or Undirected
[Figure: four versions of the example graph on nodes a-e: undirected, binary; directed, binary; undirected, valued; directed, valued]

Social network data are substantively divided by the number of modes in the data.

1-mode data represent edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.).
Examples: communication, friendship, giving orders, sending email.
1-mode data are usually singly reported (each person reports on their friends), but you can use multiple-informant data, which is more common in child development research (Cairns and Cairns).

2-mode data represent nodes from two separate classes, where all ties are across classes.
Examples:
People as members of groups
People as authors on papers
Words used often by people
Events in the life history of people
The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or groups connected to each other through common membership.
There may be multiple relations of multiple types connecting your nodes.

We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people they are connected to (alters).
- Example: 1985 GSS module
- May include estimates of connections among alters
2) Partial network
- Ego networks plus some amount of tracing to reach contacts of contacts
- Something less than a full account of connections among all pairs of actors in the relevant population
- Example: CDC contact tracing data for STDs
3) Complete or “Global” data
- Data on all actors within a particular (relevant) boundary
- Never exactly complete (due to missing data), but boundaries are set
- Example: Coauthorship data among all writers in the social sciences, friendships among all students in a classroom
For the most part, I will be discussing techniques surrounding global networks today, though I will briefly mention some standard uses of ego-network data.

Collecting Network Data
Data capture any connection between the nodes. Sources include surveys, published accounts, special informants, etc.
In general, you can only draw conclusions about relations among the set of nodes you have collected, so it is important to observe as much of the network as possible.
See W&F, chap. 2, on different types of data collection.

If you use surveys to collect data, some general rules of thumb:
a) Network data collection can be time consuming. It is better (I think) to have breadth over depth. Having detailed information on <50% of the sample will make it very difficult to draw conclusions about the general network structure.
b) Question format:
• If you ask people to recall names (an open list format), fatigue will result in under-reporting
• If you ask people to check off names from a full list, you can often get over-reporting
c) It is common to limit people to ~5 nominations. This will bias network stats for stars, but is sometimes the best choice to avoid fatigue.
d) Concrete relational indicators are best (who did you talk to?)
over attitudes that are harder to define (who do you like?)

Existing Sources of Social Network Data
1) Check INSNA: the International Network for Social Network Analysis
2) Many secondary sources (particularly for 2-mode data)
3) National Longitudinal Study of Adolescent Health (Add Health)

Basic Data Structures
Working with pictures. There is no standard way to draw a sociogram: each of these is equivalent:
[Figure: alternative drawings of the same sociogram]

In general, graphs are cumbersome to work with analytically, though there is a great deal of good work to be done on using visualization to build network intuition.
I recommend using layouts that optimize on the feature you are most interested in, and find that either a hierarchical layout or a force-directed layout is best.

From pictures to matrices
[Figure: the example graph on nodes a-e drawn as an undirected, binary graph and as a directed, binary graph, each beside its adjacency matrix]

From matrices to lists
Adjacency matrix (undirected, binary):
   a b c d e
a  0 1 0 0 0
b  1 0 1 0 0
c  0 1 0 1 1
d  0 0 1 0 1
e  0 0 1 1 0
Adjacency list:
a: b
b: a c
c: b d e
d: c e
e: c d
Arc list:
a b
b a
b c
c b
c d
c e
d c
d e
e c
e d

Measuring Networks: Flow
“Goods” flow through networks:

In addition to the simple probability that one actor passes information on to another ($p_{ij}$), two factors affect flow through a network:
Topology
- the shape, or form, of the network
- Example: one actor cannot pass information to another unless they are either directly or indirectly connected
Time
- the timing of contact matters
- Example: an actor cannot pass information he has not yet received

Two features of the network’s topology are known to be important: connectivity and centrality.
Connectivity refers to how actors in one part of the network are connected to actors in another part of the network.
• Reachability: Is it possible for actor i to reach actor j?
This can only be true if there is a chain of contact from one actor to another.
• Distance: Given they can be reached, how many steps are they from each other?
• Number of paths: How many different paths connect each pair?

Without full network data, you can’t distinguish actors with limited information potential from those more deeply embedded in a setting.
[Figure: actors a, b, and c]

Reachability
Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them.
[Figure: two example graphs on nodes a-f]
Paths can be directed, leading to a distinction between “strong” and “weak” components.

If you can trace a sequence of relations from one actor to another, then the two are reachable. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component.
Intuitively, a component is the set of people who are all connected by a chain of relations.

This example contains many components.

Distance & number of paths
Distance is measured by the (weighted) number of relations separating a pair:
Actor “a” is:
1 step from 4
2 steps from 5
3 steps from 4
4 steps from 3
5 steps from 1

Paths are the different routes one can take. Node-independent paths are particularly important.
[Figure: nodes a and b]
There are 2 independent paths connecting a and b.
There are many non-independent paths.

Probability of transfer by distance and number of paths, assuming a constant $p_{ij}$ of 0.6:
[Figure: probability of transfer (0 to 1.2) plotted against path distance (2 to 6), with separate curves for 1, 2, 5, and 10 paths]

Reachability in Colorado Springs (Sexual contact only)
• High-risk actors over 4 years
• 695 people represented
• Longest path is 17 steps
• Average distance is about 5 steps
• Average person is within 3 steps of 75 other people
• 137 people connected through 2 independent paths, core of 30 people connected through 4 independent paths
(Node size = log of degree)

Centrality
Centrality refers to (one dimension of) location, identifying where an actor resides in a network.
• For example, we can compare actors at the edge of the network to actors at the center.
• In general, this is a way to formalize intuitive notions about the distinction between insiders and outsiders.

At the individual level, one dimension of position in the network can be captured through centrality.
Conceptually, centrality is fairly straightforward: we want to identify which nodes are in the ‘center’ of the network. In practice, identifying exactly what we mean by ‘center’ is somewhat complicated, but substantively we often have reason to believe that people at the center are very important.
Three standard centrality measures capture a wide range of “importance” in a network:
• Degree
• Closeness
• Betweenness

The most intuitive notion of centrality focuses on degree. Degree is the number of ties, and the actor with the most ties is the most important:
$C_D(n_i) = X_{i+} = \sum_j X_{ij}$

If we want to measure the degree to which the graph as a whole is centralized, we look at the dispersion of centrality:
Simple: variance of the individual centrality scores.
$S_D^2 = \left[ \sum_{i=1}^{g} \left( C_D(n_i) - \bar{C}_D \right)^2 \right] / g$
Or, using Freeman’s general formula for centralization (which ranges from 0 to 1):
$C_D = \frac{\sum_{i=1}^{g} \left[ C_D(n^*) - C_D(n_i) \right]}{(g-1)(g-2)}$
where g is the number of actors and $C_D(n^*)$ is the largest degree centrality observed in the network.

Degree Centralization Scores
[Figure: four example networks]
Freeman: 1.0, Variance: 3.9
Freeman: .02, Variance: .17
Freeman: .07, Variance: .20
Freeman: 0.0, Variance: 0.0

A second measure of centrality is closeness centrality. An actor is considered important if he/she is relatively close to all other actors.
Closeness is based on the inverse of the distance of each actor to every other actor in the network.
Closeness Centrality:
$C_C(n_i) = \left[ \sum_{j=1}^{g} d(n_i, n_j) \right]^{-1}$
Normalized Closeness Centrality:
$C'_C(n_i) = (C_C(n_i))(g-1)$

Closeness Centrality in the examples
[Figure: four example networks] C=0.0, C=1.0, C=0.36, C=0.28

Betweenness Centrality:
Model based on communication flow: a person who lies on communication paths can control communication flow, and is thus important. Betweenness centrality counts the number of shortest paths between j and k that actor i resides on.
[Figure: example graph on nodes a-h]

Betweenness Centrality:
$C_B(n_i) = \sum_{j<k} g_{jk}(n_i) / g_{jk}$
where $g_{jk}$ = the number of geodesics connecting j and k, and $g_{jk}(n_i)$ = the number that actor i is on.
Usually normalized by:
$C'_B(n_i) = C_B(n_i) / [(g-1)(g-2)/2]$

Betweenness Centrality:
[Figure: four example networks] Centralization: 1.0, Centralization: .59, Centralization: .31, Centralization: 0

Actors that appear very different when seen individually are comparable in the global network.
(Node size proportional to betweenness centrality)

Time
Two factors that affect network flows:
Topology
- the shape, or form, of the network
- simple example: one actor cannot pass information to another unless they are either directly or indirectly connected
Time
- the timing of contacts matters
- simple example: an actor cannot pass information he has not yet received.

Timing in networks
A focus on contact structure has often slighted the importance of network dynamics, though a number of recent pieces are addressing this.
Time affects networks in two important ways:
1) The structure itself evolves, in ways that will affect the topology and thus flow.
2) The timing of contact constrains information flow.

Drug Relations, Colorado Springs
Data on drug users in Colorado Springs, over 5 years
[Figures: Year 1 through Year 5; from Year 2 onward, current year in red, past relations in gray]

What impact does timing have on flow through the network?
[Figure: hypothetical contact network on nodes A-F; numbers above lines indicate contact periods (2-5, 8-9, 3-5)]

The path graph for the hypothetical contact network
[Figure: path graph on nodes A-F]
While clearly important, this is not often handled well by current software.

Measuring Networks: Structure & Social Space
The second broad division for measuring networks steps back to generalized features of the global network.
These factors are almost always of interest because of what they imply about how goods move through the network, but have resulted in a distinct line of methods and substantive research.
We focus on 3 such factors today:
1) Basic structure of large-scale networks
2) Cohesive Peer Groups
3) Identifying Role positions (blockmodels)

Measuring Networks: Large-Scale Models
Small World Networks
Based on Milgram’s (1967) famous work, the substantive point is that networks are structured such that even when most of our connections are local, any pair of people can be connected by a fairly small number of relational steps.
Works on 2 parameters:
1) The Clustering Coefficient (C) = average proportion of closed triangles
2) The average distance (L) separating nodes in the network

C is large and L is small = SW graphs
• High probability that a node’s contacts are connected to each other.
• Small average distance between nodes

In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically.
Watts demonstrates that small world properties can occur in graphs with a surprisingly small number of shortcuts.
Diffusion / flow implications are unclear, but seem similar to a random graph where local clusters are reduced to a single point.

Scale-Free Networks
Across a large number of substantive settings, Barabási points out that the distribution of network involvement (degree) is highly and characteristically skewed.
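
The two small-world parameters described above, the clustering coefficient C and the average distance L, can be sketched in a few lines of Python. This is a minimal illustration, not the method behind any figure in these slides: the ring lattice, its size (12 nodes, 2 neighbors per side), the shortcut between nodes 0 and 6, and all function names are assumptions chosen for the example.

```python
from collections import deque

def ring_lattice(n, k):
    """Hypothetical ordered network: each node is tied to its k nearest
    neighbors on each side of a ring (undirected)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def clustering_coefficient(adj):
    """Average, over nodes, of the share of a node's contact pairs that are
    themselves tied (closed triangles)."""
    total = 0.0
    for nbrs in adj.values():
        nbrs = list(nbrs)
        d = len(nbrs)
        if d < 2:
            continue  # assumed convention: such nodes contribute 0
        closed = sum(1 for a in range(d) for b in range(a + 1, d)
                     if nbrs[b] in adj[nbrs[a]])
        total += closed / (d * (d - 1) / 2)
    return total / len(adj)

def average_distance(adj):
    """Mean shortest-path length over all reachable ordered pairs,
    using breadth-first search from every node."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

lattice = ring_lattice(12, 2)
C_before = clustering_coefficient(lattice)
L_before = average_distance(lattice)

# Add one hypothetical shortcut across the ring.
lattice[0].add(6)
lattice[6].add(0)
C_after = clustering_coefficient(lattice)
L_after = average_distance(lattice)

print(f"before shortcut: C={C_before:.3f}  L={L_before:.3f}")
print(f"after shortcut:  C={C_after:.3f}  L={L_after:.3f}")
```

On a graph this small the drop in L is modest; the dramatic reductions mentioned above show up in much larger lattices, where a single shortcut bridges regions that were many steps apart.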
Many large networks are characterized by a highly skewed distribution of the number of partners (degree):
$p(k) \sim k^{-\gamma}$

The scale-free model focuses on the distance-reducing capacity of high-degree nodes, as ‘hubs’ create shortcuts that carry network flow.

Colorado Springs High-Risk (Sexual contact only)
• Network is approximately scale-free, with an exponent of -1.3
• But connectivity does not depend on the hubs.

Social Cohesion
White, D. R. and F. Harary. 2001. "The Cohesiveness of Blocks in Social Networks: Node Connectivity and Conditional Density." Sociological Methodology 31:305-59.
Moody, James and Douglas R. White. 2003. "Structural Cohesion and Embeddedness: A Hierarchical Conception of Social Groups." American Sociological Review 68:103-127.
White, Douglas R., Jason Owen-Smith, James Moody, and Walter W. Powell. 2004. "Networks, Fields, and Organizations: Scale, Topology and Cohesive Embeddings." Computational and Mathematical Organization Theory 10:95-117.
Moody, James. 2004. "The Structure of a Social Science Collaboration Network: Disciplinary Cohesion from 1963 to 1999." American Sociological Review 69:213-238.

Formal definition of Structural Cohesion:
(a) A group’s structural cohesion is equal to the minimum number of actors who, if removed from the group, would disconnect the group.
Equivalently (by Menger’s Theorem):
(b) A group’s structural cohesion is equal to the minimum number of independent paths linking each pair of actors in the group.

• Networks are structurally cohesive if they remain connected even when nodes are removed
[Figure: four example networks with node connectivity 0, 1, 2, and 3]

Structural cohesion gives rise automatically to a clear notion of embeddedness, since cohesive sets nest inside of each other.
[Figure: example network with numbered nodes showing nested cohesive sets]

Project 90, Sex-only network (n=695)
[Figure] 3-Component (n=58)

IV Drug Sharing
Largest BC: 247
k > 4: 318
Max k: 12
Structural Cohesion simultaneously gives us a positional and subgroup analysis.
[Figure] Connected Bicomponents

Measuring Networks: Cohesive Sub Groups
A primary interest in Social Network Analysis is the identification of “significant social subgroups” – some smaller collection of nodes in the graph that can be considered, at least in some senses, as a “unit” based on the pattern, strength, or frequency of ties.
There are many ways to identify groups. They all insist on a group being in a connected component, but other than that the variation is wide.

Graph Theoretical Models
Start with a clique. A clique is defined as a maximal subgraph in which every member of the subgraph is connected to every other member of the subgraph. Cliques are collections of nodes where density = 1.0.
Properties of cliques:
• Density: 1.0
• Everyone connected to n-1 alters
• Distance between every pair is 1
• Ratio of within group ties to between group ties is infinite
• All triads are transitive

In practice, complete cliques are not very useful. They tend to overlap heavily and are limited in their size.
Graph theorists have thus relaxed the complete connectivity requirement (with varying degrees of success). See Moody & White (2003) for a discussion of these attempts.

Identifying Primary groups:
1) Measures of fit
To identify a primary group, we need some measure of how clustered the network is. Usually, this is a function of the number of ties that fall within groups relative to the number of ties that fall between groups.
2) Algorithmic approaches to maximizing (1)
Once we have such an index, we need a method for searching through the network to maximize the fit.
3) Generalized cluster analysis
In addition to maximizing a group function such as (1), we can use the relational distance directly, and look for clusters in the data. We next go over two different styles of cluster analysis.

Segregation Index
(Freeman, L. C. 1978. "Segregation in Social Networks." Sociological Methods and Research 6:411-30.)
Freeman asked how we could identify segregation in a social network. Theoretically, he argues, if a given attribute (group label) does not matter for social relations, then relations should be distributed randomly with respect to the attribute. Thus, the difference between the number of cross-group ties expected by chance and the number observed measures segregation.
$Seg = \frac{E(X) - X}{E(X)}$

Consider the (hypothetical) network below. There are two attributes in this network: people with blue eyes and brown eyes, and people who are square or not (they must be hip).
[Figure: hypothetical network]

Segregation Index
Mixing Matrix:
        Blue  Brown
Blue       6     17
Brown     17     16
Seg = -0.25

         Hip  Square
Hip       20       3
Square     3      30
Seg = 0.78

The segregation index is one metric used to identify groups.
Others include:
a) The ratio of in-group to out-group ties (NEGOPY, UCINET Factions)
b) Maximizing the probability of in-group contact (CliqueFinder)
c) The Segregation Matrix Index (SMI)
d) The dyadic factor loadings for overlapping groups (akin to a latent class model)
e) Minimizing the within-group distance
Once a metric has been chosen, some algorithm is needed to search through the graph to identify clusters. These algorithms range from very sophisticated “graph-intelligent” algorithms, such as NEGOPY, to simple cluster analysis of distance matrices. In most cases, you have to pre-set the number of groups to use (the exceptions are NEGOPY and CliqueFinder). Moody’s CROWDS algorithm also has automatic stopping criteria, but you have to give it starting values.

In practice, the different algorithms will give different results. Here, I compare the NEGOPY results to the RNM results. NEGOPY returned one large group; RNM found many smaller, denser groups. It’s usually a good idea to explore multiple solutions and algorithms.

Gagnon Prison Network
In practice, the different algorithms will give different results. Here, I compare NEGOPY, FACTIONS and RNM. Groups A and B are identical, C is close. F, E and D differ. It’s usually a good idea to explore multiple solutions and algorithms.
(all solutions constrained to 6 groups)

Measuring Networks: Role Positions
Overview
• Social life can be described (at least in part) through social roles.
• To the extent that roles can be characterized by regular interaction patterns, we can summarize roles through common relational patterns.
• Identifying these sets is the goal of block-model analyses.
Nadel: The Coherence of Role Systems
• Background ideas for White, Boorman and Breiger.
Social life as an interconnected system of roles
• Important feature: thinking of roles as connected in a role system = social structure
White, Harrison C.; Boorman, Scott A., and Breiger, Ronald L. Social Structure from Multiple Networks I. American Journal of Sociology. 1976; 81:730-780.
• The key article describing the theoretical and technical elements of block-modeling

Elements of a Role:
• Rights and obligations with respect to other people or classes of people
• Roles require a ‘role complement’: another person who the role-occupant acts with respect to
Examples: Parent - child, Teacher - student, Lover - lover, Friend - Friend, Husband - Wife, etc.
Nadel (following functional anthropologists and sociologists) defines ‘logical’ types of roles, and then examines how they can be linked together.

White et al: From logical role systems to empirical social structures
Start with some basic ideas of what a role is: an exchange of something (support, ideas, commands, etc.) between actors. Thus, we might represent a family as:
[Figure: a family with nodes H, W, C, C, C and three tie types: Romantic Love, Provides food for, Bickers with]
(and there are, of course, many other relations inside a family!)

The key idea is that we can express a role through a relation (or set of relations) and thus a social system by the inventory of roles. If roles equate to positions in an exchange system, then we need only identify particular aspects of a position. But what aspect?

Structural Equivalence
Two actors are structurally equivalent if they have the same types of ties to the same people.
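
The definition of structural equivalence above can be checked directly on an adjacency matrix. The sketch below is a minimal illustration under stated assumptions: the `feeds` matrix is a hypothetical single relation loosely modeled on the family example (H and W each provide food for three children), the function names are made up for this example, and entries involving the two actors being compared are skipped, which is one possible convention rather than the only one.

```python
def structurally_equivalent(matrix, i, j):
    """True if actors i and j send identical ties to, and receive identical
    ties from, every other actor. Entries involving i and j themselves are
    skipped (an assumption of this sketch)."""
    others = [k for k in range(len(matrix)) if k not in (i, j)]
    same_sent = all(matrix[i][k] == matrix[j][k] for k in others)
    same_received = all(matrix[k][i] == matrix[k][j] for k in others)
    return same_sent and same_received

def positions(matrix):
    """Greedy grouping: place each actor with the first group whose first
    member it is structurally equivalent to, else start a new group."""
    groups = []
    for actor in range(len(matrix)):
        for group in groups:
            if structurally_equivalent(matrix, group[0], actor):
                group.append(actor)
                break
        else:
            groups.append([actor])
    return groups

# Hypothetical "provides food for" relation; rows and columns are H, W, C, C, C.
feeds = [
    [0, 0, 1, 1, 1],  # H feeds the three children
    [0, 0, 1, 1, 1],  # W feeds the three children
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

print(structurally_equivalent(feeds, 0, 1))  # H vs. W
print(structurally_equivalent(feeds, 0, 2))  # H vs. a child
print(positions(feeds))
```

Under these assumptions the two parents fall into one position and the three children into another. With real data, exact matches like this are rare, which is why the later slides turn to graded similarity measures.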
Structural Equivalence
[Figure: A single relation]
[Figure: Graph reduced to positions]

Blockmodeling: basic steps
In any positional analysis, there are 4 basic steps:
1) Identify a definition of equivalence
2) Measure the degree to which pairs of actors are equivalent
3) Develop a representation of the equivalencies
4) Assess the adequacy of the representation

1) Identify a definition of equivalence
Structural Equivalence: Two actors are equivalent if they have the same type of ties to the same people.

Automorphic Equivalence: Actors occupy indistinguishable structural locations in the network. That is, they are in isomorphic positions in the network.
In general, automorphically equivalent nodes are equivalent with respect to all graph theoretic properties (i.e., degree, number of people reachable, centrality, etc.)
[Figure: Automorphic Equivalence]

Regular Equivalence: Regular equivalence does not require actors to have identical ties to identical actors or to be structurally indistinguishable.
Actors who are regularly equivalent have identical ties to and from equivalent actors.
If actors i and j are regularly equivalent, then for all relations and for all actors, if i → k, then there exists some actor l such that j → l and k is regularly equivalent to l.

There may be multiple regular equivalence partitions in a network, and thus we tend to want to find the maximal regular equivalence partition, the one with the fewest positions.

Role or Local Equivalence: While most equivalence measures focus on position within the full network, some measures focus only on the patterns within the local tie neighborhood.
These have been called ‘local role’ equivalence.

Note that:
Structurally equivalent actors are automorphically equivalent,
Automorphically equivalent actors are regularly equivalent.
Structurally equivalent and automorphically equivalent actors are role equivalent.
In practice, we tend to ignore some of these distinctions, as they get blurred quickly once we have to operationalize them in real-world graphs. It turns out that few people are ever exactly equivalent, and thus we approximate the links between the types.
In all cases, the procedure can work over multiple relations simultaneously.
The process of identifying positions is called blockmodeling, and requires identifying a measure of similarity among nodes.

Once you identify equivalent actors, block them in the matrix and reduce it, based on the number of ties in the cell of interest. The key values are a zero block (no ties) and a one-block (all ties present):

Adjacency matrix (each row is prefixed with its position, 1 to 6):
1 | . 1 1 1 0 0 0 0 0 0 0 0 0 0
2 | 1 . 0 0 1 1 0 0 0 0 0 0 0 0
3 | 1 0 . 1 0 0 1 1 1 1 0 0 0 0
3 | 1 0 1 . 0 0 1 1 1 1 0 0 0 0
4 | 0 1 0 0 . 1 0 0 0 0 1 1 1 1
4 | 0 1 0 0 1 . 0 0 0 0 1 1 1 1
5 | 0 0 1 1 0 0 . 0 0 0 0 0 0 0
5 | 0 0 1 1 0 0 0 . 0 0 0 0 0 0
5 | 0 0 1 1 0 0 0 0 . 0 0 0 0 0
5 | 0 0 1 1 0 0 0 0 0 . 0 0 0 0
6 | 0 0 0 0 1 1 0 0 0 0 . 0 0 0
6 | 0 0 0 0 1 1 0 0 0 0 0 . 0 0
6 | 0 0 0 0 1 1 0 0 0 0 0 0 . 0
6 | 0 0 0 0 1 1 0 0 0 0 0 0 0 .

Image matrix:
   1 2 3 4 5 6
1  0 1 1 0 0 0
2  1 0 0 1 0 0
3  1 0 1 0 1 0
4  0 1 0 1 0 1
5  0 0 1 0 0 0
6  0 0 0 1 0 0
Structural equivalence thus generates 6 positions in the network.

Once you partition the matrix, reduce it:
(Same adjacency matrix as above, now partitioned into three positions: rows 1-2, rows 3-6, and rows 7-14.)

Image matrix:
   1 2 3
1  1 1 0
2  1 1 1
3  0 1 0
Regular equivalence (here I placed a one in the image matrix if there were any ties in the ij block)

Operationally, you have to measure the similarity between actors. If two actors are structurally equivalent, then they will have identical ties to other people. Consider the example again, comparing actors C and D (the third and fourth rows of the adjacency matrix above):

Row   C  D  Match
1     1  1  1
2     0  0  1
3     .  1  .
4     1  .  .
5     0  0  1
6     0  0  1
7     1  1  1
8     1  1  1
9     1  1  1
10    1  1  1
11    0  0  1
12    0  0  1
13    0  0  1
14    0  0  1
Sum: 12
C and D match on all 12 other people, and are thus structurally equivalent.

If the model is going to be based on asymmetric or multiple relations, you simply stack the various relations, usually including both “directions” of asymmetric relations:
[Figure: the family example with nodes H, W, C, C, C and tie types Romantic Love, Provides food for, Bickers with]

Romance:
   H W C C C
H  0 1 0 0 0
W  1 0 0 0 0
C  0 0 0 0 0
C  0 0 0 0 0
C  0 0 0 0 0

Feeds:
   H W C C C
H  0 0 1 1 1
W  0 0 1 1 1
C  0 0 0 0 0
C  0 0 0 0 0
C  0 0 0 0 0

Bicker:
   H W C C C
H  0 0 0 0 0
W  0 0 0 0 0
C  0 0 0 1 1
C  0 0 1 0 1
C  0 0 1 1 0

Stacked (each row joins Romance | Feeds, transposed | Feeds | Bicker):
H  0 1 0 0 0 | 0 0 0 0 0 | 0 0 1 1 1 | 0 0 0 0 0
W  1 0 0 0 0 | 0 0 0 0 0 | 0 0 1 1 1 | 0 0 0 0 0
C  0 0 0 0 0 | 1 1 0 0 0 | 0 0 0 0 0 | 0 0 0 1 1
C  0 0 0 0 0 | 1 1 0 0 0 | 0 0 0 0 0 | 0 0 1 0 1
C  0 0 0 0 0 | 1 1 0 0 0 | 0 0 0 0 0 | 0 0 1 1 0

The metric used to measure structural equivalence by White, Boorman and Breiger is the correlation between each node’s set of ties.
For the example, this would be:

 1.00 -0.20  0.08  0.08 -0.19 -0.19  0.77  0.77  0.77  0.77 -0.26 -0.26 -0.26 -0.26
-0.20  1.00 -0.19 -0.19  0.08  0.08 -0.26 -0.26 -0.26 -0.26  0.77  0.77  0.77  0.77
 0.08 -0.19  1.00  1.00 -1.00 -1.00  0.36  0.36  0.36  0.36 -0.45 -0.45 -0.45 -0.45
 0.08 -0.19  1.00  1.00 -1.00 -1.00  0.36  0.36  0.36  0.36 -0.45 -0.45 -0.45 -0.45
-0.19  0.08 -1.00 -1.00  1.00  1.00 -0.45 -0.45 -0.45 -0.45  0.36  0.36  0.36  0.36
-0.19  0.08 -1.00 -1.00  1.00  1.00 -0.45 -0.45 -0.45 -0.45  0.36  0.36  0.36  0.36
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00

Another common metric is the Euclidean distance between pairs of actors, which you then use in a standard cluster analysis.

Measuring Networks: Role Positions
Automorphic and regular equivalence are more difficult to find, and require iteratively searching over possible class assignments for sets that have the same graph theoretic patterns. The usual approach is to start with a set of nodes defined as similar on a number of network measures, then look within these classes for automorphic equivalence classes.

A theoretically appealing method for finding role equivalence, a structure very similar to regular equivalence, uses the triad census. Each node is involved in (n-1)(n-2)/2 triads, and occupies a particular position in each of these triads.
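The matching and Euclidean distance comparisons of tie profiles described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example adjacency matrix is typed in by hand, actors are indexed 0 to 13 (so C and D are indices 2 and 3), the two actors being compared are dropped from each other's profiles, and the function names `profile`, `matches`, and `euclidean` are made up for this sketch.

```python
from math import sqrt

# Example adjacency matrix from the slides; "." marks the diagonal.
ROWS = [
    ".1110000000000",
    "1.001100000000",
    "10.10011110000",
    "101.0011110000",
    "0100.100001111",
    "01001.00001111",
    "001100.0000000",
    "0011000.000000",
    "00110000.00000",
    "001100000.0000",
    "0000110000.000",
    "00001100000.00",
    "000011000000.0",
    "0000110000000.",
]


def profile(i, j):
    """Ties of actors i and j to every third party k (k is neither i nor j)."""
    others = [k for k in range(len(ROWS)) if k not in (i, j)]
    ties_i = [int(ROWS[i][k]) for k in others]
    ties_j = [int(ROWS[j][k]) for k in others]
    return ties_i, ties_j


def matches(i, j):
    """Number of third parties to whom i and j have the same tie value."""
    ties_i, ties_j = profile(i, j)
    return sum(1 for a, b in zip(ties_i, ties_j) if a == b)


def euclidean(i, j):
    """Euclidean distance between the two tie profiles (0 means identical)."""
    ties_i, ties_j = profile(i, j)
    return sqrt(sum((a - b) ** 2 for a, b in zip(ties_i, ties_j)))


print(matches(2, 3), euclidean(2, 3))  # C versus D
print(matches(0, 1), euclidean(0, 1))  # first versus second actor
```

A full blockmodel would compute one of these measures for every pair and pass the resulting matrix to a clustering routine; that step is left out here.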
Measuring Networks: Role Positions
Moving from a similarity/distance matrix to a blockmodel: number of groups and determining blocks:

“An important decision in an analysis using CONCOR is how fine the partition should be; in other words, when should one stop splitting positions? Theory and the interpretability of the solution are the primary consideration in deciding how many positions to produce.” (W&F, p.378)

“In defining positions of actors, the ‘trick’ is to choose the point along the series that gives a useful and interpretable partition of the actors into equivalence classes.” (W&F p.383)

Measuring Networks: Role Positions
An example: Padgett, J. F. and Ansell, C. K. Robust action and the rise of the Medici, 1400-1434. American Journal of Sociology. 1993; 98:1259-1319.

“Political Groups” in the attribute sense do not seem to exist, so P&A turn to the pattern of network relations among families. This is the block reduction of the full 92 family network.

Modeling with Networks: Behaviors
There are three general approaches to modeling behaviors with network data:
1) Using network measures as variables to predict individual outcomes
2) Network autocorrelation / peer influence models
3) Dyad / QAP models of the similarity of actors and their joint network position

Modeling with Networks: Behaviors
The simplest way to use network data in research is to include the network measure as a covariate in a standard model:

Y = a0 + b1(netvars) + b2(other vars) + e

“netvars” most commonly include:
•Functions of each person’s direct contacts’ attributes
  •Such as: mean income of friends, proportion of friends who are employed, racial heterogeneity of the friends, etc.
•Structural indicators
  •Such as: centrality, dummies for group / role membership, etc.

These models are the only option for ego-network data, where information on network alters is collected from a single respondent’s (ego’s) report.
They can also be used with measures extracted from partial or complete network data, but the error term is, by definition, autocorrelated. Cases are not independent, but connected through the social relations.

Modeling with Networks: Behaviors
Network Autocorrelation models (aka Peer Influence models):

Friedkin, N. E. 1984. "Structural Cohesion and Equivalence Explanations of Social Homogeneity." Sociological Methods and Research 12:235-61.
Friedkin, N. E. 1998. A Structural Theory of Social Influence. Cambridge: Cambridge.
Friedkin, N. E. and E. C. Johnsen. 1990. "Social Influence and Opinions." Journal of Mathematical Sociology 15:193-205.
Friedkin, N. E. and E. C. Johnsen. 1997. "Social Positions in Influence Networks." Social Networks 19:209-22.

$Y = \alpha W Y + Xb + e$

Where W is a direct function of the adjacency matrix, and α is the estimated value of peer influence.

Modeling with Networks: Behaviors
There are two general ways to test for peer influence in an observed network. The first estimates the parameters (α and b) of the peer influence model directly; the second transforms the network into a dyadic model, predicting similarity among actors.

Peer influence model: $Y = \alpha W Y + Xb + e$

See:
Doreian, Patrick. “Maximum likelihood methods for linear models: Spatial Effects and Spatial Disturbances Terms.” Sociological Methods and Research. 1982; 10:243-269.
Gould, Roger V. Multiple Networks and mobilization in the Paris Commune, 1871. American Sociological Review. 1991; 56:716-729. (applied example)

Modeling with Networks: Behaviors
The basic model says that people’s opinions are a function of the opinions of others and their characteristics.

$Y = \alpha W Y + Xb + e$

WY is a simple vector which can be added to your model. That is, multiply Y by a W matrix and run the regression with WY as a new variable; the regression coefficient on WY is an estimate of α. This is what Doreian calls the QAD (“Quick and Dirty”) estimate of peer influence, and it is equivalent (under certain assumptions) to adding the mean of ego’s friends to the model.
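The quick-and-dirty estimate can be sketched in plain Python. This is a minimal illustration, not the procedure used for the results in these slides: the four-person network and the Y values are invented, there are no X covariates, W is assumed to be the row-normalized adjacency matrix, and the helper names (`row_normalize`, `peer_term`, `ols_slope`) are hypothetical.

```python
def row_normalize(adj):
    """Build W from a 0/1 adjacency matrix: zero diagonal, each row sums to 1."""
    w = []
    for i, row in enumerate(adj):
        vals = [0.0 if i == j else float(v) for j, v in enumerate(row)]
        total = sum(vals)
        w.append([v / total if total else 0.0 for v in vals])
    return w


def peer_term(w, y):
    """Compute WY: for each ego, the weighted mean of the alters' Y values."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


def ols_slope(x, y):
    """Slope from a one-regressor OLS of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Invented example: four actors, symmetric friendship ties, one outcome each.
ADJ = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
Y = [1.0, 2.0, 3.0, 5.0]

W = row_normalize(ADJ)
WY = peer_term(W, Y)          # mean Y among each actor's friends
ALPHA_QAD = ols_slope(WY, Y)  # rough peer influence estimate
print(WY, ALPHA_QAD)
```

With a row-normalized W, each element of WY is simply the mean of ego's friends, which is the equivalence noted above. A real analysis would add the X covariates to the regression.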
Modeling with Networks: Behaviors
The problem with the above regression is that cases are, by definition, not independent. In fact, the coefficient on WY is also known as the ‘network autocorrelation’ coefficient, since a ‘peer influence’ effect is an autocorrelation effect -- your value is a function of the people you are connected to.

In general, OLS is not the best way to estimate this equation. That is, QAD = Quick and Dirty, and your results will not be exact. In practice, the QAD approach (perhaps combined with a GLS estimator) results in empirical estimates that are “virtually indistinguishable” from MLE (Doreian et al, 1984).

The proper way to estimate the peer equation is to use maximum likelihood estimates, and Doreian gives the formulas for this in his paper. The other way is to use non-parametric approaches, such as the Quadratic Assignment Procedure, to estimate the effects.

Modeling with Networks: Behaviors
An empirical example: Peer influence in the OSU Graduate Student Network.

Each person was asked to rank their satisfaction with the program, which is the dependent variable in this analysis. I constructed two W matrices, one from HELP, the other from Best Friend. I treat relations as symmetric and valued, such that:

$W_{ijt} = \begin{cases} 1 & \text{if } A_{ijt} = 1 \text{ or } A_{jit} = 1 \\ 2 & \text{if } A_{ijt} = 1 \text{ and } A_{jit} = 1 \\ 0 & \text{otherwise} \end{cases} \qquad \sum_j W_{ij} = 1, \qquad W_{ii} = 0$

I also include Race (white/non-white), Gender, and Cohort Year as exogenous variables in the model.

Modeling with Networks: Behaviors
An empirical example: Peer influence in the OSU Graduate Student Network.
[Figure: Distribution of satisfaction with the department.]
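The symmetric, valued W defined for this example can be sketched in code. This is an illustration only: the three-person nomination matrix is invented, and `build_w` is a hypothetical helper, not part of any package mentioned in these slides. Adding A and its transpose gives 1 for a one-way nomination and 2 for a mutual one; each row is then scaled to sum to 1.

```python
def build_w(a):
    """Symmetric, valued, row-normalized W from a directed 0/1 matrix A."""
    n = len(a)
    raw = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # keep the diagonal at zero
                # 1 if either actor nominates the other, 2 if both do
                raw[i][j] = float(a[i][j] + a[j][i])
    w = []
    for row in raw:
        total = sum(row)
        # isolates keep a row of zeros instead of dividing by zero
        w.append([v / total if total else 0.0 for v in row])
    return w


# Invented nominations: 0 and 1 choose each other, 1 also chooses 2.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]
W_EXAMPLE = build_w(A)
for row in W_EXAMPLE:
    print(row)
```

In the example, actor 1's row puts twice as much weight on the mutual tie with actor 0 as on the one-way tie with actor 2.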
Modeling with Networks: Behaviors
Parameter Estimates

Variable     Parameter Estimate   Pr > |t|   Standardized Estimate
Intercept         2.60252          0.0931          0
FEMALE           -1.07540          0.0142         -0.25455
NONWHITE         -0.22087          0.5975         -0.05491
y00               0.93176          0.0798          0.21627
y99              -0.19375          0.7052         -0.04586
y98              -0.45912          0.4637         -0.08289
y97               0.60670          0.3060          0.11919
PEER_BF           0.23936          0.0002          0.42084
PEER_H            0.50668          0.0277          0.23321

Model R2 = .41, compared to .15 without the peer effects.

Modeling with Networks: Behaviors
Dyad QAP models
Another way to get at peer influence is not through the level of Y, but through the extent to which actors are similar with respect to Y. The model is now expressed at the dyad level as:

$Y_{ij} = b_0 + b_1 A_{ij} + \sum_k b_k X_k + e_{ij}$

Where Y is a matrix of similarities, A is an adjacency matrix, and Xk is a matrix of similarities on attributes.

Modeling with Networks: Behaviors
Dyad QAP models
[Data table: example input for nine nodes (NODE 1-9), with three 9 x 9 dyadic matrices: ADJMAT (adjacency), SAMERCE (same race), and SAMESEX (same sex).]

Modeling with Networks: Behaviors
Dyad QAP models

Y: 0.32 0.59 0.54 0.50 0.04 0.02 0.41 0.01 -0.17

Distance ($D_{ij} = |Y_i - Y_j|$):
.000 .277 .228 .181 .278 .298 .095 .307 .481
.277 .000 .049 .096 .555 .575 .182 .584 .758
.228 .049 .000 .047 .506 .526 .134 .535 .710
.181 .096 .047 .000 .459 .479 .087 .488 .663
.278 .555 .506 .459 .000 .020 .372 .029 .204
.298 .575 .526 .479 .020 .000 .392 .009 .184
.095 .182 .134 .087 .372 .392 .000 .401 .576
.307 .584 .535 .488 .029 .009 .401 .000 .175
.481 .758 .710 .663 .204 .184 .576 .175 .000

Modeling with Networks: Behaviors
Dyad QAP models

The REG Procedure
Model: MODEL1
Dependent Variable: SIM

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               4       0.90657        0.22664       9.29    <.0001
Error              31       0.75591        0.02438
Corrected Total    35       1.66248

Root MSE           0.15615    R-Square   0.5453
Dependent Mean     0.33161    Adj R-Sq   0.4866
Coeff Var         47.08929

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1        0.51931            0.05116        10.15     <.0001
NOM          1       -0.17054            0.05963        -2.86     0.0075
SAMERCE      1        0.05387            0.05916         0.91     0.3696
SAMESEX      1       -0.06535            0.05365        -1.22     0.2324
NCOMFND      1       -0.16134            0.03862        -4.18     0.0002

Modeling with Networks: Behaviors
Dyad QAP models
Like the basic peer influence model, cases in dyad models are not independent. However, the non-independence now comes from two sources: (1) the fact that the same person is represented in (n-1) dyads and (2) that i and j are linked through relations.

One of the best solutions to this problem is QAP, the Quadratic Assignment Procedure, a non-parametric procedure for significance testing. QAP runs the model of interest on the real data, then randomly permutes the rows/cols of the data matrix and estimates the model again. Repeating this many times generates an empirical distribution of the coefficients expected by chance, against which you then compare the observed values.

This is implemented in UCINET for regression, and in DAMN for logistic regression (J.L. Martin).

Modeling with Networks: Behaviors
Dyad QAP models
Procedure:
1. Calculate the observed association / model
2. For K iterations do:
   a) randomly sort one of the matrices
   b) recalculate the association / model
   c) store the outcome
3. Compare the observed outcome to the distribution of outcomes created by the random permutations.
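The three-step procedure can be sketched for the simplest case, a QAP correlation between two matrices. This is a rough illustration under stated assumptions, not UCINET's implementation: the two five-node matrices are invented, the association is a Pearson correlation over off-diagonal cells, rows and columns are permuted together with the same ordering, and the function names are hypothetical.

```python
import random
from math import sqrt


def offdiag(m):
    """Flatten a square matrix, skipping the diagonal."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def permute(m, order):
    """Reorder rows and columns of m with the same node ordering."""
    return [[m[i][j] for j in order] for i in order]


def qap_correlation(x, y, k=1000, seed=0):
    """Return (observed r, share of permuted r values at least as large)."""
    rng = random.Random(seed)
    observed = pearson(offdiag(x), offdiag(y))   # step 1
    order = list(range(len(x)))
    as_large = 0
    for _ in range(k):                           # step 2
        rng.shuffle(order)
        r = pearson(offdiag(permute(x, order)), offdiag(y))
        if r >= observed - 1e-12:                # small tolerance for float ties
            as_large += 1
    return observed, as_large / k                # step 3


# Invented example matrices for five actors.
X = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
Y = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
print(qap_correlation(X, Y, k=500, seed=1))
```

Permuting rows and columns together keeps the structure of X intact while breaking its alignment with Y, which is why the permuted values serve as a chance baseline. The same loop works for regression by replacing the correlation with the fitted coefficients.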
Modeling with Networks: Behaviors
Dyad QAP models
Comparing multiple networks: QAP

Modeling with Networks: Behaviors
Dyad QAP models

MULTIPLE REGRESSION QAP W/ MISSING VALUES
--------------------------------------------------------------------------------
# of permutations:        2000
Diagonal valid?           NO
Random seed:              533
Dependent variable:       EX_SIM
Expected values:          c:\moody\Classes\soc884\examples\UCINET\mrqap-predicted
Independent variables:    EX_NCOM
                          EX_ADJ
                          EX_SRCE
                          EX_SSEX

Number of valid observations among the X variables = 72
N = 72
Number of permutations performed: 1999

MODEL FIT
R-square   Adj R-Sqr   Probability   # of Obs
--------   ---------   -----------   --------
   0.545       0.525         0.029         72

REGRESSION COEFFICIENTS
               Un-stdized      Stdized                    Proportion   Proportion
Independent   Coefficient   Coefficient   Significance     As Large     As Small
-----------   -----------   -----------   ------------   ----------   ----------
Intercept        0.519314      0.000000          0.012        0.012        0.988
EX_NCOM         -0.161337     -0.541828          0.011        0.989        0.011
EX_ADJ          -0.170539     -0.381186          0.020        0.980        0.020
EX_SRCE          0.053864      0.124551          0.236        0.236        0.764
EX_SSEX         -0.065364     -0.151144          0.180        0.820        0.180

Note that the coefficient values are identical to the OLS estimates, but the p-values differ.

Modeling with Networks: Behaviors
Dyad QAP models
A substantive question raised with any kind of network autocorrelation model is whether observed associations between network structure and behaviors are due to selection or influence.

Theory is your best friend here, as there is no foolproof method to distinguish the two. However, recent work has made great progress using individual-level fixed effect models (sometimes random effects models), where the network features vary over time. This removes any stable characteristic that might account for selection into a particular group.
Modeling with Networks: Structure
Dyad QAP models
While the most common way to use QAP models is to predict the similarity on some substantive variable, one can just as easily predict the presence/absence of a relation given attribute similarity. This makes it possible to model the network itself, and ask questions about how particular structures form.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
A long research tradition in statistics and random graph theory has led to parametric models of networks. These are models of the entire graph, though as we will see they often work on the dyads in the graph to be estimated.

Substantively, the approach is to ask whether the graph in question is an element of the class of all random graphs with the given known elements (for example, all graphs with 5 nodes and 3 edges) or, put probabilistically, to ask for the probability of observing the current graph given the conditions.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
The earliest approaches are based on simple random graph theory, but there’s been a flurry of activity in the last 10 years or so.

Key references:
- Holland and Leinhardt (1981) JASA
- Frank and Strauss (1986) JASA
- Wasserman and Faust (1994) – Chap 15 & 16
- Wasserman and Pattison (1996)

Thanks to Mark Handcock for sharing some figures/slides about these models.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)

$p(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}$

Where:
θ is a vector of parameters (like regression coefficients)
z is a vector of network statistics, conditioning the graph
κ is a normalizing constant, to ensure the probabilities sum to 1.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
The simplest graph is a Bernoulli random graph, where each Xij is independent:

$p(X = x) = \frac{\exp\{\sum_{i,j} \theta_{ij} x_{ij}\}}{\kappa(\theta)}$

Where:
$\theta_{ij} = \mathrm{logit}[P(X_{ij} = 1)]$
$\kappa(\theta) = \prod_{i,j} [1 + \exp(\theta_{ij})]$

Note this is one of the few cases where κ(θ) can be written.
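The closed form for the Bernoulli normalizing constant can be checked by brute force on a tiny graph. The sketch below is illustrative only: the three-node directed graph and its dyad parameters are invented, and the names `kappa_closed_form`, `kappa_brute_force`, and `graph_probability` are chosen for this example. It sums the exponential term over every possible graph and compares the total with the product of [1 + exp(parameter)] over dyads.

```python
from itertools import product
from math import exp, isclose, prod


def kappa_closed_form(theta):
    """Product over dyads of [1 + exp(theta_ij)]."""
    return prod(1.0 + exp(t) for t in theta.values())


def kappa_brute_force(theta):
    """Sum of exp{sum_ij theta_ij * x_ij} over every possible graph x."""
    dyads = list(theta)
    total = 0.0
    for x in product((0, 1), repeat=len(dyads)):
        total += exp(sum(theta[d] * xij for d, xij in zip(dyads, x)))
    return total


def graph_probability(theta, ties):
    """Probability of the graph whose present ties are listed in `ties`."""
    numerator = exp(sum(theta[d] for d in ties))
    return numerator / kappa_closed_form(theta)


# Invented parameters for the six ordered dyads of a 3-node directed graph.
THETA = {
    (0, 1): -1.0, (1, 0): 0.5,
    (0, 2): 0.0, (2, 0): -0.3,
    (1, 2): 1.2, (2, 1): -2.0,
}

print(kappa_closed_form(THETA), kappa_brute_force(THETA))
print(graph_probability(THETA, [(0, 1), (1, 2)]))
```

With 6 dyads there are only 64 graphs to enumerate. The same enumeration is hopeless for realistic networks, which is the practical reason the normalizing constant becomes a problem for more complicated models.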
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
Typically, we add a homogeneity condition, so that all isomorphic graphs are equally likely. The homogeneous Bernoulli graph model:

$p(X = x) = \frac{\exp\{\theta \sum_{i,j} x_{ij}\}}{\kappa(\theta)}$

Where:
$\kappa(\theta) = [1 + \exp(\theta)]^{g}$

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
If we want to condition on anything much more complicated than density, the normalizing constant ends up being a problem. We need a way to express the probability of the graph that doesn’t depend on that constant. It turns out we can do this by conditioning on a ‘complement’ graph.

First some terms:
$X_{ij}^{+}$ = Sociomatrix with ij element forced to 1
$X_{ij}^{-}$ = Sociomatrix with ij element forced to 0
$X_{ij}^{c}$ = Sociomatrix with no tie between i and j

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
After some algebra:

$\omega_{ij} = \log\left[\frac{p(X_{ij} = 1 \mid X_{ij}^{c})}{p(X_{ij} = 0 \mid X_{ij}^{c})}\right] = \theta' [z(x_{ij}^{+}) - z(x_{ij}^{-})]$

Note that we can now model the conditional probability of the graph, as a function of a set of difference statistics, without reference to the normalizing constant. The model, then, simply reduces to a logit model on the dyads.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
Fitting p* models
I highly recommend working through the p* primer examples, which can be found at:
http://kentucky.psych.uiuc.edu/pstar/index.html
Including: A Practical Guide To Fitting p* Social Network Models Via Logistic Regression

The site includes the PREPSTAR program for creating the difference variables of interest.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
[Figure: sociomatrix x and graph for six actors in two positions, {1, 2, 3} and {4, 5, 6}.]

We can model this network based on parameters for overall degree of Choice (θ), Differential Choice Within Positions (θ_W), Mutuality (ρ), Differential Mutuality Within Positions (ρ_W), and Transitivity (τ_T). The vector of model parameters to be estimated is: θ = {θ, θ_W, ρ, ρ_W, τ_T}.
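The difference statistics that feed the dyad-level logit can be sketched as follows. This is a simplified illustration, not the PREPSTAR program: only two statistics are computed (number of ties and number of mutual dyads), the three-node graph is invented, and the function names are hypothetical. For each ordered pair, the tie is forced to 1, then forced to 0, and the statistics are differenced.

```python
def graph_stats(x):
    """Return [number of ties, number of mutual dyads] for a 0/1 matrix x."""
    n = len(x)
    ties = sum(x[i][j] for i in range(n) for j in range(n) if i != j)
    mutual = sum(x[i][j] * x[j][i] for i in range(n) for j in range(i + 1, n))
    return [ties, mutual]


def change_stats(x, i, j):
    """Statistics with the i->j tie forced to 1, minus the same with it forced to 0."""
    plus = [row[:] for row in x]
    plus[i][j] = 1
    minus = [row[:] for row in x]
    minus[i][j] = 0
    return [a - b for a, b in zip(graph_stats(plus), graph_stats(minus))]


def dyad_table(x):
    """One row per ordered dyad: (i, j, observed tie, change in ties, change in mutual)."""
    n = len(x)
    rows = []
    for i in range(n):
        for j in range(n):
            if i != j:
                d_ties, d_mutual = change_stats(x, i, j)
                rows.append((i, j, x[i][j], d_ties, d_mutual))
    return rows


# Invented directed cycle: 0 -> 1 -> 2 -> 0.
X_CYCLE = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
for row in dyad_table(X_CYCLE):
    print(row)
```

Each row of the resulting table is one case for a logistic regression, with the observed tie as the outcome and the change statistics as predictors, fit without an intercept.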
Modeling with Networks: Structure
Exponential Random Graph Models (p*)

proc logistic descending;
  model tie = l lw m mw tt / noint;
run;

L = Choice
LW = Within Group
M = Mutuality
MW = Mutual within Group
TT = Transitivity

Substantively, this graph is likely from the random class of graphs with similar mutuality and size.

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
One practical problem is that the resulting values are often quite correlated, making estimation difficult. This is particularly difficult with “star” parameters.

Correlations among the difference variables (p-value below each correlation):
         lw         m          mw         tt
lw    1.00000    0.58333    0.80178    0.15830
                 0.0007     <.0001     0.4034
m     0.58333    1.00000    0.80178   -0.02435
      0.0007                <.0001     0.8984
mw    0.80178    0.80178    1.00000   -0.11716
      <.0001     <.0001                0.5375
tt    0.15830   -0.02435   -0.11716    1.00000
      0.4034     0.8984     0.5375

Modeling with Networks: Structure
Exponential Random Graph Models (p*)
Parameters that are often fit include:
1) Expansiveness and attractiveness parameters (dummies for each sender/receiver in the network)
2) Degree distribution
3) Mutuality
4) Group membership (and all other parameters by group)
5) Transitivity / Intransitivity
6) K-in-stars, k-out-stars
7) Cyclicity

Modeling with Networks: Structure
Comparing to Random Graphs
A conceptual merge between random graph models and QAP models is to identify a sample of graphs from the universe you are trying to model. So, instead of estimating:

$p(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}$

generate X empirically, then compare z(x) to see how likely a measure on x would be given X. The difficulty, however, is generating X.

Modeling with Networks: Structure
Comparing to Random Graphs
The first option would be to generate every graph (up to isomorphism) that satisfies a given constraint. This is possible for small graphs, but the number gets large fast. For a network with 3 nodes, there are 16 possible directed graphs.
For a network with 4 nodes, there are 218, for 5 nodes 9,608, for 6 nodes 1,540,944, and so on…

So, the best approach is to sample from the universe, but, of course, if you had the universe you wouldn’t need to sample from it. How do you sample from a population you haven’t observed? Use a construction algorithm that generates a random graph with known constraints.

Modeling with Networks: Structure
Comparing to Random Graphs
Example: Bearman, Peter S., James Moody and Katherine Stovel (2004) “Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks” American Journal of Sociology 110:44-91
[Figure: Romantic Relations in Jefferson High]

Modeling with Networks: Structure
Comparing to Random Graphs
Simulate random networks with similar degree distribution.

Modeling with Networks: Structure
Comparing to Random Graphs
Simulated networks preserve observed degree, isolated dyad distribution, and four-cycle constraint.
[Figure: 4 examples from the simulated set]

Social Network Software
UCINET
•The standard network analysis program, runs in Windows
•Good for computing measures of network topography for single nets
•Input-Output of data is a special 2-file format, but is now able to read PAJEK files directly.
•Not optimal for large networks
•Available from: Analytic Technologies

Social Network Software
PAJEK
•Program for analyzing and plotting very large networks
•Intuitive Windows interface
•Used for most of the real data plots in this presentation
•Started mainly as a graphics program, but has expanded to a wide range of analytic capabilities
•Can link to the R statistical package
•Free
•Available from:

Social Network Software
Cyram Netminer for Windows
•Newest product, not yet widely used
•Price range depends on application
•Limited to smaller networks O(100)
http://www.netminer.com/NetMiner/home_01.jsp

Social Network Software
NetDraw
•Also very new, but by one of the best known names in network analysis software.
•Free
•Limited to smaller networks O(100)

Social Network Software
NEGOPY
•Program designed to identify cohesive sub-groups in a network, based on the relative density of ties.
•DOS based program, need to have data in arc-list format
•Moving the results back into an analysis program is difficult.
•Available from: William D. Richards http://www.sfu.ca/~richards/Pages/negopy.htm

SPAN - SAS Programs for Analyzing Networks (Moody, ongoing)
•is a collection of IML and Macro programs that allow one to:
  a) create network data structures from nomination data
  b) import/export data to/from the other network programs
  c) calculate measures of network pattern and composition
  d) analyze network models
•Allows one to work with multiple, large networks
•Easy to move from creating measures to analyzing data
•Available by sending an email to: Moody.77@sociology.osu.edu