SOC 206: INTRODUCTION TO NETWORK ANALYSIS I

U.S. interstate highway system

The switch yard at New York's Niagara Project became useless metalwork when a blackout struck the eastern United States.
http://www.sfgate.com/cgi-bin/object/article?f=/c/a/2003/08/15/MN191082.DTL&o=0

Protein networks in yeast
Network of physical interactions between nuclear proteins [...] consisting of all proteins that are known to be localized in the yeast nucleus [...], and which interact with at least one other protein in the nucleus. This subset consists of 318 interactions between 329 proteins. Note that most neighbors of highly connected nodes have rather low connectivity.
Maslov and Sneppen 2002
http://arxiv.org/abs/condmat/0205380

Spread of TB
Black nodes are persons with clinical disease (and are potentially infectious), pink nodes represent exposed persons with incubating (or dormant) infection who are not infectious, and green nodes represent exposed persons with no infection who are not infectious. The infection status is unknown for the grey nodes. Unfortunately the 'social butterfly' in this community, the black node in the center of the graph, is also the most infectious -- a super spreader.
http://www.orgnet.com/contagion.html

Scientific collaboration
The largest component of the Santa Fe Institute collaboration network, with the primary divisions detected by our algorithm indicated by different vertex shapes.
http://www.pnas.org/content/99/12/7821.full.pdf+html

The Internet
http://www.lumeta.com/research/

http://iserp.columbia.edu/files/iserp/2002_04.pdf
also as Bearman et al., Chains of Affection, AJS 2004

Management hierarchy of a major corporation and decision-making conversations
What do the decision-making links reveal about this organization?

Some advice flows along formal ties [within the hierarchy], while other advice flows along informal ties [outside of the hierarchy].

There is a strong triangle of input and feedback amongst Directors 2 and 3 and the General Manager. These strong, trusting ties have grown and solidified over many years of working together.

Director 1 is new to the organization. Manager 12 was hoping to get this position, but Corporate strongly pushed for Director 1. Notice that Manager 12 is still locally influential in the decision-making network. Director 1 does not include input from direct reports in decision-making [remember: A --> B means that A seeks out B]!

Director 4 is about to retire. He used to run this division when it was much smaller. Unlike Director 1, Director 4 does include inputs from his staff.

The decision-making patterns in the departments of Directors 2 and 3 are quite different from the pattern of links in the departments of Directors 1 and 4. Directors 2 and 3 seek information from all levels of the organization -- their departments show both vertical and horizontal flows. Several managers in these departments [23, 24, 34, and 35] are boundary spanners -- connecting to others outside of their immediate group. Departments 2 and 3 are an example of participatory decision-making -- including inputs from up and down the hierarchy, as well as inside and outside the department.

Who do you see as the most influential person[s] in shaping decisions in this organization?
http://www.orgnet.com/decisions.html

Interlocking Directorates in the Corporate Community
http://sociology.ucsc.edu/whorulesamerica/power/corporate_community.html

Political books purchased in August 2008 at Amazon.com
http://www.orgnet.com/divided.html
Based on the pattern of connections between the books in the map above, the most influential political books at the end of the summer of 2008 are What Happened (White House spokesman Scott McClellan's tell-all) and The Post American World (Fareed Zakaria's book on the rise of regional powers). Neither addressed the ongoing election.
Mark Lombardi: Global (Conspiracy) Networks

Social Network Analysis of the 9-11 Terrorist Network
http://www.orgnet.com/hijackers.html

http://socialsim.wordpress.com/2007/03/01/another-fabulous-network-image-academy-award-thanks/

World Trade in 1981 and 1992
Lothar Krempel. The structure of world trade between 28 OECD countries in 1981 and 1992. The size of the nodes gives the volume of flows in dollars (imports and exports) for each country. The size of the links stands for the volume of trade between any two countries. Colors give the regional memberships in different trade organisations: EC countries (yellow), EFTA countries (green), USA and Canada (blue), Japan (red), East Asian countries (pink), Oceania (Australia, New Zealand) (black).
http://www.mpi-fg-koeln.mpg.de/~lk/netvis/trade/WorldTrade.html

Social Network Analysis
• A network is a set of objects/nodes and a set of connections/ties between them
• In a social network, nodes can be people, groups, organizations, countries, physical or cultural objects created or used by people, people's thoughts or activities, etc.; just about anything
• Explanations based on network ties are usually categorized as the network approach or structural approach
• SNA is mathematical, but not necessarily statistical
• SNA is not just about methods, it's theory too

Barabási: Scale-Free Networks
• Scale-free networks obey the power law
• The power law posits that the distribution of links in a network is highly skewed: a few nodes have a great number of ties while the rest have few
• In a scale-free network new nodes form ties by preferential attachment
  – i.e. those with more connections will get even more
  – This is also known as the Matthew principle (Merton) (see also "rich get richer," "cumulative advantage," "increasing returns to scale," "network externalities")
• On the internet, the number of hyperlinks follows the power law.
  – The more a site is linked, the more new links it will attract. Hence you have a few giga sites like Google and millions of sites with only a few links to them.
• Some other examples:
  – Protein-to-protein interaction networks
  – Sexual partners in humans
  – Scientific citation networks
  – Semantic networks

Small World Studies
• Milgram (1967) gave a letter to people in Nebraska and Kansas and asked them to get it, through personal acquaintances, to a person in Massachusetts they did not know. The average number of steps was 5, the maximum was 12, and 25% of the letters arrived.
• Suppose each person knows 100 people (including superficial acquaintances), so each person has a degree of 100. Suppose there is no clustering. This person will have access to 100*100 = 10,000 people in the second step, 100*100*100 = 1 million in the third, 100^4 = 100 million in the fourth, and 100^5 = 10 billion in the fifth.
• Example of no clustering if everyone has only 4 friends (the average degree is 4 with a variance of 0)
http://smallworld.columbia.edu/

Six Degrees of Kevin Bacon or Who is the Center of the Hollywood Universe?
• 800,000 people in the Internet Movie Database
• Kevin Bacon's number (average chain length) is 2.946
• Sean Connery's number is 2
• Charlie Chaplin's number is 3
• Jean-Luc Godard's number is 2
• My father-in-law (Janos Hersko) has number 3
• Average for all 800,000 people is 9.200
• http://oracleofbacon.org/center.html

Bacon Number    # of people
0               1
1               1806
2               145024
3               395126
4               95497
5               7451
6               933
7               106
8               13

Vertex = nodes/objects, edge = ties
Newman 2003

Transitivity
[Figure: triads on nodes A, B, and C; one panel is labeled "Forbidden Triad"]

Granovetter: The Strength of Weak Ties

Burt: Structural Holes
• "The BEFORE network contains 5 primary contacts and reaches a total of 15 people. However, there are only two nonredundant contacts in the network. Contacts 2 and 3 are redundant in the sense of being connected with each other and reaching the same people. The same is true for contacts 4 and 5. Contact 1 is not connected directly to contact 2, but he reaches the same secondary contacts; thus contacts 1 and 2 provide redundant network benefits. Illustrating the other extreme, contacts 3 and 5 are connected directly, but they are nonredundant because they reach separate clusters of secondary contacts. In the AFTER network, contact 2 is used to reach the first cluster in the BEFORE network, contact 4 is used to reach the second cluster. The time and energy saved by withdrawing from relations with the other three primary contacts is reallocated to primary contacts in new clusters. The BEFORE and AFTER networks are both maintained at a cost of five primary relationships, but the AFTER network is dramatically richer in structural holes, and so benefits." (Burt, Structural Holes, pp. 22-3)

Robust Action and the Rise of the Medici
• Padgett and Ansell argue that the Medicis were powerful because they could use their membership in overlapping social networks strategically.
• Power comes from not being locked into a single network or identity but from cultivating ambiguity by belonging to many networks. Multiple networks also deliver more resources. Both ambiguity and multiple resources lead to more discretion and power.
• They could also attain a central position where others had to communicate through them.

Putnam: Bowling Alone
Social capital is declining:
• Political participation is declining
• Participation in religious groups is declining
• Labor union membership is declining
• Participation in voluntary organizations is declining
• Family ties are looser
• Less contact with neighbors

Wellman: Cyberplace
• Larger volume and higher speed of information transfer
• Portability of wireless technology
• Globalized connectivity
• Personalization
• Networked individualism

Cognitive Maps (Carley and Palmquist 1992)
The figure is a graphic illustration of the complete map extracted from the complete interview with a student at the beginning of the term […]. All concepts in the map are listed in a circle. The relationship between two concepts is denoted by a line. This map represents the student's conception of research writing at the beginning of the term, and it illustrates that those concepts about which the student has the most information at the beginning are fact, research, topic, and writing. Tracing through some of the relationships (represented by lines) between concepts reveals that in the student's view, at the beginning of the term, writing a paper involves having an opinion that is based on fact which can be found through research.

This figure is a graphic illustration of the map extracted from an interview with the same student later in the term. This interview shows the student's conception of research writing at the end of the term. A comparison of Figure 4 and Figure 5 shows that the student's conception has shifted over time. For example, many of the concepts used by the student to describe research writing have changed and, for those concepts that are retained, their relative semantic importance may have changed (more important, more relationships, more lines). From the beginning to the end of the term, in the student's mental model of research writing, the concept information has grown in importance (more lines in Figure 5 than Figure 4) but the concept outline has decreased in importance to the extent that it does not even appear in the later map. Once again tracing through some of the relationships between concepts reveals that in the student's view, at the end of the term, writing a paper involves having information that depends on facts and a plan that is original and guides research.
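
The comparison of the two cognitive maps above rests on counting how many lines touch each concept. A minimal sketch of that count follows. The edge lists are an assumption: they are read off only the two "tracing through" sentences above, not the actual figures, so they are a partial and hypothetical stand-in for the real maps. The function name concept_degrees is also illustrative.

```python
from collections import Counter


def concept_degrees(edges):
    """Count how many lines (relationships) touch each concept."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Hypothetical partial maps, taken from the summary sentences only:
# "writing involves an opinion based on fact, found through research"
start_map = [("writing", "opinion"), ("opinion", "fact"), ("fact", "research")]
# "writing involves information that depends on facts, and a plan
#  that is original and guides research"
end_map = [
    ("writing", "information"),
    ("information", "fact"),
    ("writing", "plan"),
    ("plan", "original"),
    ("plan", "research"),
]

start_deg = concept_degrees(start_map)
end_deg = concept_degrees(end_map)

# Concepts that disappear from, or newly appear in, the later map
dropped = sorted(set(start_deg) - set(end_deg))
added = sorted(set(end_deg) - set(start_deg))

# Change in number of lines per concept (a Counter returns 0 for absent concepts)
all_concepts = sorted(set(start_deg) | set(end_deg))
changes = {c: end_deg[c] - start_deg[c] for c in all_concepts}

print("dropped:", dropped)
print("added:", added)
print("change in lines:", changes)
```

With the full edge lists from the two figures, the same count would show which concepts gained or lost lines over the term.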
Network Data
• Types of questions asked:
  – Structural variables – questions about ties/connections
  – Compositional variables – questions about characteristics/attributes of the nodes/actors
• When analyzing networks researchers often do not have representative samples of individuals
• Often creates human subject concerns
• Specifying boundaries of the population to study:
  – Realist approach – actors themselves decide on membership in a network by answering name-generating questions
  – Nominalist approach – a list is constructed by a researcher based on theoretical concerns

Snowball Sampling

Types of Network Data: One-Mode Network
• Actor-to-actor network
• Actor attributes – characteristics of an actor
• Actors: people, groups, organizations, communities, nations
• Relations: interactions, transfer of resources or information, movement, formal roles, kinship
  – E.g. network data representing friendships among students in a high school

This is simple one-mode network data, where the relationships are binary (yes/no) and asymmetric, and we know one attribute about everyone (race).

                 Student 1 (W)   Student 2 (H)   Student 3 (B)   ...   Student N (W)
Student 1 (W)    X               1               1               ...   1
Student 2 (H)    1               X               0               ...   0
Student 3 (B)    1               0               X               ...   1
...              ...             ...             ...             ...   ...
Student N (W)    1               0               1               ...   X

High school friendship (color is for race)
http://www.soc.washington.edu/users/stovel/Chains.pdf

Types of Network Data: Dyadic Two-Mode (Bipartite) Network
• Two sets of actors with connections only between the sets
[Figure labels: Lab1, Lab2; Corporations, Nonprofits]

High school dating

           Boy 1   Boy 2   Boy 3   ...   Boy M
Girl 1     0       0       1       ...   0
Girl 2     1       0       0       ...   0
Girl 3     1       0       0       ...   1
...        ...     ...     ...     ...   ...
Girl N     1       0       1       ...   0

http://www.soc.washington.edu/users/stovel/Chains.pdf

Types of Network Data: Two-Mode, Affiliation Network
• Actor-to-event network
• Actors are the first mode, events are the second mode
• Events are activities or groups that actors may participate in or be affiliated with
• Events: social functions, clubs, voluntary organizations, agreements and treaties for countries, etc.
• Attributes are recorded for both actors and events

Breiger: Duality of Persons and Groups

McPherson: Hypernetwork Sampling
Sample of Individuals    Sample of Organizations

             Org 1   Org 2   Org 3   Org 4   Org 5   Org 6   Org 7
Person 1     0       0       0       0       0       0       0
Person 2     0       0       0       0       0       0       0
Person 3     0       0       0       0       0       0       0
Person 4     0       0       0       0       0       0       0
Person 5     0       0       0       1       0       0       0
Person 6     0       1       0       0       0       0       1
Person 7     0       0       0       0       0       0       0
Person 8     0       0       0       0       1       0       0
Person 9     0       0       0       0       0       0       0
Person 10    0       0       0       0       0       0       0

Types of Network Data: Ego-Centered Network (GSS 1985)
• Also called personal network
• Centered on respondent
• Often used in surveys with representative samples
• Ego – the focal actor
• Alter – actors tied to an ego
• Attributes are recorded for both ego and alters
• Information on alters' contacts with each other can be collected
  – In most surveys, ego's connections to alters are ignored

Quantifying Relationships
• Direction:
  – directed vs. symmetric (reciprocal) ties
• Level of measurement:
  – dichotomous vs. valued data
• Sign:
  – positive vs. negative

Question Formats: Roster vs. Free Recall
Q1. This is a list of students taking Sociology 101 with you. Please circle your own name. Please also mark with an X the people with whom you interact outside of class.
Q2. Please think of up to three people you usually go to for advice about your life and answer a few questions about them.

Question Formats: Free vs. Fixed Choice
Q1. Please think of the people you usually go to for advice about your life. Please write down their initials and answer a few questions about them.
Q2. Please think of up to three people you usually go to for advice about your life and answer a few questions about them.
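
The distinctions above (free vs. fixed choice, directed vs. symmetric, dichotomous data) determine what the resulting data matrix looks like. The sketch below illustrates this under stated assumptions: the respondent names and their answers are invented for illustration, and the helper names nominations_to_matrix and symmetrize are hypothetical, not taken from any package.

```python
def nominations_to_matrix(names, nominations, max_choices=None):
    """Build a directed, dichotomous adjacency matrix from name-generator answers.

    matrix[i][j] == 1 means respondent i named person j.
    max_choices=None mimics free choice; max_choices=k mimics a fixed-choice
    question by keeping only the first k names each respondent listed.
    """
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    matrix = [[0] * n for _ in range(n)]
    for ego, alters in nominations.items():
        if max_choices is not None:
            alters = alters[:max_choices]
        for alter in alters:
            if alter != ego:  # ignore self-nominations
                matrix[index[ego]][index[alter]] = 1
    return matrix


def symmetrize(matrix, rule="either"):
    """Turn directed ties into symmetric ones.

    rule="either": a tie exists if at least one person named the other.
    rule="both":   a tie exists only if the nomination is reciprocated.
    """
    n = len(matrix)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if rule == "both":
                result[i][j] = matrix[i][j] * matrix[j][i]
            else:
                result[i][j] = max(matrix[i][j], matrix[j][i])
    return result


# Hypothetical answers to "who do you go to for advice?"
names = ["Ana", "Ben", "Cy", "Dee"]
nominations = {
    "Ana": ["Ben", "Cy", "Dee"],
    "Ben": ["Ana"],
    "Cy": ["Ana", "Dee"],
    "Dee": [],
}

free_choice = nominations_to_matrix(names, nominations)
fixed_choice = nominations_to_matrix(names, nominations, max_choices=2)
reciprocated = symmetrize(free_choice, rule="both")
any_tie = symmetrize(free_choice, rule="either")
```

In this toy data the fixed-choice version drops Ana's third nomination, which shows how a cap on the number of names can truncate the degree of well-connected respondents.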
Question Formats: Ratings vs. Complete Rankings
Q1. On a scale from 1 to 5 where 1 is "not important at all" and 5 is "very important," please tell us how important this person is to you.
Q2. Please rank these persons in order of importance to you, with person number one being the most important.

Summarizing Network Data
• Actor
• Dyad
• Triad
• Subgroup
• Set of actors
• Entire network

Questionnaire
Questions about graduate students and relations related to the studies:
• How frequently do you talk to each person on this list (in person or on the phone)?
• With whom do you hang out discussing or debating sociological ideas?
• Who do you go to when you need help with your class work, paper, presentation or research?
• Who do you go to for advice or information on matters related to your studies (for example, who to choose as your advisor, which classes to take, which conference to attend, etc.)?
• Who have you collaborated with on a class project, paper, conference presentation, or writing an article?
• Who do you go to when you face a stressful situation related to your graduate studies and want to talk to someone about it?
• If you receive good news related to your studies or professional career, who do you tell it to first?
Questions about graduate students and relations beyond graduate studies:
• Who do you go to for help when you need a $10 loan, a ride to a doctor, etc.?
• Who do you go to when you face a stressful situation not related to your graduate studies and want to talk to someone about it?
• Who do you go to for advice or information when making life decisions not related to your studies?
• Who do you usually hang out with outside of the department socially (for example, visit each other for dinner or go to concerts, clubs, or parties together, etc.)?

Questionnaire
Questions about discussion/reading groups and classes:
• Which of the following discussion/reading groups are you a regular participant of?
• Which of the following classes have you taken this academic year?
Questions about faculty:
• Which faculty members have you been in contact with beyond your class work this academic year (you worked for them as TA or RA, you joined their project, they are on your dissertation committee, etc.)?

Questionnaire
Personal network questions:
• Please write down the initials of up to three people whom you consider to be the most important people in your life and answer a few questions about them:
• What is this person's gender?
• What is this person's age?
• How is this person related to you? (Please check all that apply)
• Does this person live in San Diego?
• How frequently do you see this person in person?
• How often do you talk to this person on the phone?
• How frequently do you email or text-message this person?
• Do you go to this person for help when you need a loan, a ride to a doctor, etc.?
• Do you go to this person when you face a stressful situation and want to talk to someone about it?
• Do you go to this person for advice or information when making life decisions?
• Do you spend your leisure time with this person (for example, visit each other for dinner or go to concerts, clubs, and parties together, etc.)?
• Which of the named people know each other, and meet and talk to each other even when you are not around?

Questionnaire
• Questions about the program
• Questions about satisfaction with the graduate school experience
• Demographic questions

Academic Advice Network

Student by Faculty Network

Student by Courses Network
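
The student-by-faculty and student-by-courses networks are two-mode data, so they can be converted into one-mode networks in the way Breiger's duality suggests: multiplying the membership matrix by its transpose gives person-to-person ties (shared groups), and the reverse product gives group-to-group ties (shared members). The sketch below applies this to the person-by-organization table from the hypernetwork sampling slide; the variable and function names are illustrative, and plain Python lists are used instead of a matrix library to keep it self-contained.

```python
# Rows: Person 1..10, columns: Org 1..7 (values from the hypernetwork sampling table)
membership = [
    [0, 0, 0, 0, 0, 0, 0],  # Person 1
    [0, 0, 0, 0, 0, 0, 0],  # Person 2
    [0, 0, 0, 0, 0, 0, 0],  # Person 3
    [0, 0, 0, 0, 0, 0, 0],  # Person 4
    [0, 0, 0, 1, 0, 0, 0],  # Person 5
    [0, 1, 0, 0, 0, 0, 1],  # Person 6
    [0, 0, 0, 0, 0, 0, 0],  # Person 7
    [0, 0, 0, 0, 1, 0, 0],  # Person 8
    [0, 0, 0, 0, 0, 0, 0],  # Person 9
    [0, 0, 0, 0, 0, 0, 0],  # Person 10
]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


def project(matrix):
    """Return matrix times its transpose.

    Cell (i, j) counts the columns in which rows i and j both have a 1;
    the diagonal counts how many 1s each row has.
    """
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in matrix]
        for row_i in matrix
    ]


# Person-to-person: number of organizations two people share
person_by_person = project(membership)

# Organization-to-organization: number of members two organizations share
org_by_org = project(transpose(membership))

print("Organizations Person 6 belongs to:", person_by_person[5][5])
print("Members shared by Org 2 and Org 7:", org_by_org[1][6])
```

In this small table no two people share an organization, so the person-to-person matrix has zeros off the diagonal, while Org 2 and Org 7 are linked through Person 6.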