Social Network Analysis Tutorial
Rob Cross
University of Virginia
robcross@virginia.edu

Social network analysis tutorial
• Planning and Administering a Network Analysis
• Visual Analysis of Social Networks
• Quantitative Analysis of Social Networks

Planning and administering a network analysis
• Selecting an Appropriate Group
• Survey Design
• Administering the Survey
• Formatting Data

Organizational Network Analysis Software
There are numerous network analysis software packages available. We use the following:
• UCINET: Windows-based tool used to manipulate and analyze the data. It includes a comprehensive range of network techniques. See www.analytictech.com
• NetDraw: Visualization software that creates pictures of networks. It can also incorporate attribute data into the diagrams. See www.analytictech.com
• Pajek: Sophisticated visualization software available from http://vlado.fmf.uni-lj.si
• Mage: Three-dimensional drawing tool available from ftp://152.174.194/pcprograms/Win95_98_2000/

An Overview of UCINET

Transferring Data from Excel

Transferring Excel Matrix Data into UCINET
Step 1. Copy data from Excel
Step 2. Paste into spreadsheet editor in UCINET
Step 3. Save as “info,” etc.

Transferring Attribute Data into UCINET
Step 1. Copy data from Excel
Step 2. Paste into spreadsheet editor in UCINET
Step 3. Save as “attrib”

Opening Data in NetDraw
Step 1. File > Open > Ucinet dataset > Network
Step 2. Choose network dataset (info.##h)

Opening Data in NetDraw
Step 1. Click - open folder icon
Step 2. Click - box
Step 3. Choose network dataset (info.##h), then click OK.

Dichotomizing in NetDraw
Step 1. Choose “>=” and “4”

Using Drawing Algorithm in NetDraw
Step 1. Choose
Step 2. Choose option on tool bar = option on tool bar

Using Attribute Data in NetDraw
Step 1.
Click - open folder icon A
Step 2. Click - box
Step 3. Choose attribute dataset (attrib.##h), then click OK.

Choosing Color Attribute in NetDraw
Step 1. Select “Nodes”
Step 2. Select “Region”
Step 3. Place a check mark in the color box

Selecting Nodes in NetDraw
Step 1. Default is all groups selected. To remove one group, e.g. group 2, remove the check from its box.

Selecting Egonets in NetDraw
Step 1. Layout > Egonets
Step 2. Choose egonet initials, e.g. BM

Changing the Size of Nodes in NetDraw
Step 1. Properties > Nodes > Size > Attribute-based
Step 2. Select attribute, e.g. gender

Changing the Shape of Nodes in NetDraw
Step 1. Properties > Nodes > Shape > Attribute-based
Step 2. Select attribute, e.g. hierarchy

Changing the Size of Lines in NetDraw
Step 1. Properties > Lines > Size > Tie strength
Step 2. Select minimum = 1 and maximum = 5

Changing the Color of Lines in NetDraw
Step 1. Properties > Lines > Color > Node attribute-based
Step 2. Select attribute, then choose within, between or both

Deleting Isolates in NetDraw
Step 1. Select Iso option on the toolbar

Combining Relations in NetDraw
Step 1. Properties > Lines > Boolean selection
Step 2. Select relations, e.g. info and value
Step 3. Select cut-off operators and values, e.g. >= 4

Resizing and Re-centering in NetDraw
Step 1. Layout > Move/Rotate
Step 2. Select “Center” option

Saving Pictures in NetDraw
Step 1. File > Save diagram as > Bitmap
Step 2. Choose file name, e.g. “infoge4region”

The information seeking and information giving networks are both loosely connected. This represents an opportunity to improve knowledge re-use and leverage throughout the group.
“From whom do you typically seek work-related information?”
Response                                                Density  Cohesion  Centrality
I do not typically seek information from this person    5%       n/a       15
I do typically seek information from this person        5%       2.6       12

“From whom do you typically give work-related information?”
Response                                                Density  Cohesion  Centrality
I do not typically give information to this person      5%       n/a       15
I do typically give information to this person          4%       2.6       13

Visual Data Display: Packing info in and allowing time for interpretation…
Information: “How often do you typically turn to this person for information to get your work done?” Network includes responses of often to continuously (4, 5 & 6).
Location legend: Location 1 through Location 12
Network Measures: Density = 3%, Cohesion = 4.0, Centrality = 3.1

Quantitative Analysis of Organizational Networks
• Measures of Network Connection
• Measures of Centrality
• Cross Boundary Analysis

Dichotomizing Valued Data
The survey data that we collect is usually valued data. Although we can use valued data in UCINET, we prefer to take different cuts of the data. For example, we may want to examine only the data where people responded “strongly agree” to a question. To do this we dichotomize the data, i.e. convert it to zeros and ones, where one means “strongly agree” and zero means any other response.
Step 1. Transform > Dichotomize
Step 2. Choose input dataset (info.##h)
Step 3. Choose cut-off operator and value (e.g. GE and 4)
Step 4.
Specify output data set (infoGE4.##h)

Measures of Network Connection
Density
• Shows overall level of connection within a network.
• We can also look at ties within and between groups.
Distance
• Shows average distance for people to get to all other people.
• Shorter distances mean faster, more certain, more accurate transmission / sharing.

Density
Number of ties, expressed as percentage of the number of pairs
Dense networks have more face-to-face relationships
• Low Density (25%): Avg. Dist. = 2.27
• High Density (39%): Avg. Dist. = 1.76

Quantitative Analysis: Density
Density of this network is 8%.
Step 1. Network > Cohesion > Density
Step 2. Input dataset “infoge4.##h”

Distance
Average number of steps to reach all network participants
Lower scores reflect a group better able to leverage knowledge
Examples shown: short average distance, long average distance

Quantitative Analysis: Distance
Average Distance is 3.5
Step 1. Network > Cohesion > Distance
Step 2. Input dataset “infoge4.##h”

Measures of Centrality
• Degree Centrality: How well connected each individual is.
• Betweenness Centrality: Extent to which individuals lie along short paths.
• Closeness Centrality: How far a person is from all others in the network.
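The dichotomizing, density and distance steps described above can also be sketched outside UCINET. The following is a minimal illustration in plain Python, not UCINET's own implementation. The 4-person survey matrix is hypothetical, density is assumed to be ties divided by ordered pairs (directed data, diagonal ignored), and average distance is assumed to be taken only over pairs that can reach each other.

```python
from collections import deque


def dichotomize(matrix, cutoff=4):
    """Return a 0/1 matrix: 1 where the valued response is >= cutoff (the "GE 4" cut)."""
    return [[1 if value >= cutoff else 0 for value in row] for row in matrix]


def density(adj):
    """Ties present divided by the number of ordered pairs (diagonal ignored)."""
    n = len(adj)
    ties = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))


def distances_from(adj, source):
    """Breadth-first search: steps from source to every person it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, tie in enumerate(adj[u]):
            if tie and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(adj):
    """Mean shortest-path length over ordered pairs that are connected by some path."""
    total = count = 0
    for s in range(len(adj)):
        for t, d in distances_from(adj, s).items():
            if t != s:
                total += d
                count += 1
    return total / count if count else float("nan")


# Hypothetical survey: row i holds person i's 1-5 ratings of the others (0 = self).
survey = [
    [0, 5, 2, 1],
    [4, 0, 4, 1],
    [1, 3, 0, 5],
    [2, 1, 4, 0],
]
info_ge4 = dichotomize(survey, cutoff=4)
print("density:", density(info_ge4))
print("average distance:", average_distance(info_ge4))
```

Changing `cutoff` gives the other "cuts" of the data mentioned in the text, for example `cutoff=5` to keep only the strongest responses.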
Degree Centrality
How well connected each individual is
Technical definition: Number of ties a person has
• Communication Network: degree of X is 7
• Seek Advice Network: in-degree of Y is 5

Closeness Centrality
How far a person is from all others in the network
Index of how quickly information can flow to that person
Technical definition: Total number of links along shortest paths from the individual to each other individual
• Closeness of F is 13

Betweenness Centrality
Extent to which individuals lie along short paths
Index of potential to play brokerage, liaison or gatekeeping roles
Technical definition: Number of times that a person lies along the shortest path between two others, adjusted for number of alternative shortest paths
• Betweenness of h is 28.33

Without the twelve most central people the network is 26% less well connected, reflecting a vulnerability in the group
“From whom do you typically seek work-related information?”
Responses of “I do typically seek information from this person”
• Network Measures: Density = 5%, Cohesion = 2.6, Centrality = 12
• Without 12 central people: Density = 3%, Cohesion = 2.8, Centrality = 9

Pulling People Dynamically From the Network…

Quantitative Analysis: Degree Centrality
Step 1. Network > Centrality > Degree
Step 2. Input dataset “infoge4.##h”
Step 3. Choose whether to treat data as symmetric. If you choose “no” it will calculate separate figures for the people you go to and the people that go to you.
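The degree, closeness and betweenness definitions given earlier can be worked through on a small example. This is a sketch in plain Python under stated assumptions: the five-person undirected network is hypothetical, "closeness" follows the tutorial's technical definition (total links along shortest paths, so lower means closer), and betweenness uses the standard Brandes accumulation, which may differ in normalization from what UCINET reports.

```python
from collections import deque


def bfs_counts(graph, s):
    """BFS from s: distances, shortest-path counts, predecessors, visit order."""
    dist = {s: 0}
    sigma = {v: 0 for v in graph}  # number of shortest paths from s to v
    sigma[s] = 1
    preds = {v: [] for v in graph}
    order = []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
                preds[v].append(u)
    return dist, sigma, preds, order


def degree(graph):
    """Number of ties each person has."""
    return {v: len(neighbours) for v, neighbours in graph.items()}


def closeness_total(graph):
    """Total links along shortest paths from each person to everyone reachable."""
    return {v: sum(bfs_counts(graph, v)[0].values()) for v in graph}


def betweenness(graph):
    """Times a person lies on shortest paths between two others,
    weighted by the share of alternative shortest paths (undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        _, sigma, preds, order = bfs_counts(graph, s)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical network: a square a-b-d-c-a with e attached to d.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
print(degree(graph))
print(closeness_total(graph))
print(betweenness(graph))
```

For directed survey data, in-degree and out-degree would instead be the column and row sums of the dichotomized matrix.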
Quantitative Analysis: Degree Centrality
In-degree for HA is 7
Average in-degree is 3.7
In-degree Network Centralization is 12%

Opportunities exist to re-distribute relational load. Focus on ways to de-layer those in the top right quadrant (info access, decision rights, role) while also better leveraging those in the bottom quadrant
“From whom do you typically seek work-related information?”
[Scatter plot. X axis: # People Each Person Seeks Information From (0 to 90). Y axis: # People Receives Information From (0 to 90). Quadrant labels: Integrators, High Info Sources, High Info Seekers.]
* Calculations based on people who responded to the survey only

Opportunities exist to re-distribute relational load. Focus on ways to de-layer those in the top quadrant (info access, decision rights, role) while also better leveraging those in the bottom quadrant
[Scatter plot. X axis: # People Each Person gives Information To (0 to 50). Y axis: # People Receives Information From (0 to 50). Quadrant labels: Integrators, High Info Sources, High Info Seekers. Legend: BKA/BA/Research Analyst, Assoc/Know. Assoc, Specialist/Sr. Spec, Manager, EM/PKM, Assoc Principal, Partner, External, Admin/Assistant.]

Predicting Satisfaction
Social Network Level of Satisfaction: Neutral, Satisfied, Very Satisfied
• There is a statistically significant relationship between Social OutDegree and Level of Satisfaction. (0.022)
• Correlation: 0.375

Showing performance implications can quickly get people’s attention…
HelpOut  HelpIn  KnowOut  KnowIn  KnbefOut  KnbefIn  SocOut  SocIn  Sat
10       13      36       30      34        30       25      24     3
10       14      16       32      26        24       27      35     3
0        2       6        4       3         1        6       5      3
1        6       17       26      22        22       15      17     3
0        3       10       6       4         6        0       3      3
12       5       31       16      22        18       22      19     4
0        5       3        19      23        26       3       12     4
3        6       28       30      11        15       25      25     4
5        8       14       19      12        15       16      19     4
16       20      30       39      34        34       38      37     4
8        10      34       36      29        29       19      29     4
19       15      42       35      40        37       22      22     4
7        10      33       31      22        21       34      34     4
53       31      38       37      34        33       22      28     4
13       8       34       29      10        7        34      30     4
23       18      38       34      27        28       29      28     4
9        9       26       19      14        14       28      23     5
11       13      39       31      15        18       43      36     5

Cross-boundary Analysis
Density across boundaries: How connected are groups within themselves and with other pre-defined groups. This view can be used for different boundaries. We have used the following in our research:
• Function or other designation of skill or knowledge.
• Geographic location (even if only different floors).
• Hierarchical level.
• Time in organization or time in department.
• Personality traits.
• Gender (interesting though may be inflammatory).
Brokers: Which individuals are the links between other groups. Brokers can be beneficial conduits of information but they can also hold up the flow of information.

Cross-boundary Analysis
Information Network: Density as related to practice
Please indicate how often you have turned to this person for information or advice on work-related topics in the past three months (response of often or very often).
Healthcare Government IT Oil & Gas Pharmaceuticals Industrial
Healthcare Government 17% 0% 0% 17% 0% 0% 4% 0% 35% 0% 1% 9%
IT 0% 0% 0% 0% 0% 9%
Oil & Gas Pharmaceuticals Industrial 7% 38% 0% 0% 0% 10% 0% 0% 6% 19% 3% 8% 1% 49% 0% 12% 1% 8%

Density Across Practice
Tip: Col 3 is the column that includes the practice attribute. You can select different columns for different attributes.
Step 1. Network > Cohesion > Density
Step 2. Input dataset “infoge4.##h”
Step 3. Row Partitioning “Attrib col 3”
Step 4. Column Partitioning “Attrib col 3”

Broker Categories
• Coordinator - This person connects people within their group.
• Gatekeeper - This person is a buffer between their own group and outsiders. Influential in information entering the group.
• Representative - This person conveys information from their group to outsiders. Influential in information sharing.

Quantitative Analysis: Broker Metrics
Tip: Col 2 is the column that includes the gender attribute. You can select different columns for different attributes.
Step 1. Network > Ego networks > Brokerage
Step 2. Input dataset “infoge4.##h”
Step 3. Partition vector “attrib col 2”

Additional Quantitative Analysis
• Symmetrization & Verification
• Scatter Plots
• Combining Networks
• QAP Correlation and Regression

Symmetrizing Data
Bill says he communicated with John last week, but John doesn’t mention communicating with Bill.
Three options:
• take the conservative option, and put no tie between John and Bill (minimum)
• take the liberal option, and put a tie between John and Bill (maximum)
• take the average, assigning a tie strength of 0.5 for the relationship between John and Bill (average)

Symmetrizing Data (Continued)
Tip: See previous slide for how to choose the most applicable symmetrizing method.
Step 1.
Transform > Symmetrize
Step 2. Input dataset “infoge4.##h”
Step 3. Symmetrizing method “maximum”
Step 4. Output dataset “Syminfoge4.##h”

Verification of Asymmetric Data
You have both “Give information to” and “Get information from” networks.
If A says they give info to B, then B must say that they get info from A.
Tip: The new matrix “newinfo” can now be used for various visual and quantitative analysis.
Step 1. Tools > Matrix algebra
Step 2. In the Enter Command box type “newinfo = average(transpose(infofrom),infoto)”
Step 3. Enter

Scatterplots
Step 1. Create attribute file in the spreadsheet editor in UCINET. Each column is taken from the In-degree numbers in the Degree Centrality function.
Step 2. Save as “Indegree”

Scatterplots (Continued)
Step 1. Tools > Scatterplot
Step 2. File name “Indegree”
Step 3. Choose X and Y axis
Step 4. To move initials - point and click
Step 5. To save - File > Save as

Combining Networks
In the picture to the left you can see the information network. In the picture below is the combined information and value network.

Combining Networks (Continued)
Tip: The new matrix “infovalue” can now be used for various visual and quantitative analysis.
Step 1. Tools > Matrix Algebra
Step 2. In the Enter Command box type “infovalue = mult(infoge4,valuege4)”

QAP Correlation
Step 1. Tools > Testing Hypothesis > Dyadic (QAP) > QAP Correlations
Step 2. 1st Data Matrix “InfoGE4”
Step 3. 2nd Data Matrix “ValueGE4”

QAP Regression
Adjusted R-Square of 0.214 indicates a moderate relationship between the two social relations. The probability of 0.000 indicates that it is statistically significant.
Step 1. Tools > Testing Hypothesis > Dyadic (QAP) > QAP Regression > Original (Y-permutation) method
Step 2. Dependent variable “InfoGE4”
Step 3. Independent variable “ValueGE4”
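
The symmetrizing, verification, combining and QAP correlation steps can be illustrated with a short sketch in plain Python. It is an approximation of the ideas, not UCINET's code. The two 4-person matrices are hypothetical, the function names are made up for this example, and the QAP p-value is assumed to be the one-tailed share of random relabelings whose correlation is at least as large as the observed one.

```python
import random


def symmetrize(adj, method="maximum"):
    """Resolve disagreements between i->j and j->i by minimum, maximum or average."""
    pick = {"minimum": min, "maximum": max,
            "average": lambda a, b: (a + b) / 2}[method]
    n = len(adj)
    return [[pick(adj[i][j], adj[j][i]) for j in range(n)] for i in range(n)]


def average_with_transpose(info_from, info_to):
    """Mirrors the idea of newinfo = average(transpose(infofrom), infoto)."""
    n = len(info_to)
    return [[(info_from[j][i] + info_to[i][j]) / 2 for j in range(n)]
            for i in range(n)]


def multiply(a, b):
    """Element-wise product: a tie survives only if present in both relations."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def off_diagonal(m, order=None):
    """Flatten the off-diagonal cells, optionally relabeling people by `order`."""
    n = len(m)
    if order is None:
        order = list(range(n))
    return [m[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def qap_correlation(a, b, permutations=2000, seed=0):
    """Observed correlation plus a permutation p-value (rows and columns of `a`
    are shuffled together so the network structure is preserved)."""
    rng = random.Random(seed)
    observed = pearson(off_diagonal(a), off_diagonal(b))
    n = len(a)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if pearson(off_diagonal(a, order), off_diagonal(b)) >= observed:
            hits += 1
    return observed, hits / permutations


# Hypothetical dichotomized networks for four people.
info = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
value = [[0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 1, 0]]

print(symmetrize(info, "maximum"))
print(multiply(info, value))
print(qap_correlation(info, value, permutations=500))
```

With only four people there are just 24 possible relabelings, so the p-value here is coarse. The example shows the mechanics only.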