Analyzing Virtual Worlds
Next Step in the Evolution of Social Science Research
Dmitri Williams, USC
Noshir Contractor, Northwestern University
Jaideep Srivastava, University of Minnesota
Marshall Scott Poole, University of Illinois at Urbana-Champaign

Overview
◦ Brief Intros of PIs
◦ Origins & EQ2
◦ Goals & Research Agenda
◦ Data and methods
◦ Database structure/CS & Informatics Challenges
◦ Early findings: Census, gender, role play, proximity, network signatures
◦ Case study with full metrics: Economics
◦ Upcoming/in progress: Problematic uses, social capital, team performance, group metrics

The PIs
◦ Noshir Contractor, Northwestern
◦ Marshall Scott Poole, University of Illinois at Urbana-Champaign
◦ Jaideep Srivastava, University of Minnesota
◦ Dmitri Williams, University of Southern California

Multidimensional Networks in Virtual Worlds
Multiple Types of Nodes and Multiple Types of Relationships
SONIC Advancing the Science of Networks in Communities

Any churn, customer service or big-picture material from Nosh, plus referencing the larger issues and the Bainbridge session to follow.

Support
U.S. National Science Foundation
◦ Small Grants for Exploratory Research (SGER)
◦ Information and Intelligent Systems (IIS)
Army Research Institute
SONY Online Entertainment (SOE)
National Center for Supercomputing Applications, UIUC
Annenberg School for Communication, USC

Origins
Prior work: experimental and survey
Begging for server-side data
EQ2: Some basics:
◦ 175,000+ players
◦ Dozens of servers worldwide
◦ Successor to EverQuest
http://www.youtube.com/watch?v=G3VamwcAlzs

The generalizability issue
Is one title like the next?

Elements of the game
Basics
◦ Fantasy based
◦ Create character
◦ Never-ending quest for advancement and exploration
◦ Underlying storyline: conflict between good and evil factions
◦ Band with other characters
◦ Quests to kill monsters
◦ Craftwork
◦ Socializing

Why study these things?
MMOGs are of interest in their own right
◦ Psychological, social, economic impacts
◦ Number of users now around 45 million
MMOGs may be a mirror of the real world
◦ Networks
◦ Economics
◦ Group Processes
◦ Conversation
◦ Conflict
◦ Learning

Research framework
When does the virtual map on to the real?
Group size: Individual; Dyads; Small groups; Large groups; Communities; Societies
Traditional controls and independent variables: Psychological profile; Motivations; Demographics; Communications medium; Network-level variables
Contextual and social architecture factors: World size; Persistence; Competitive vs. Collaborative; Role play; Sandbox vs. linear; Representation; Interaction affordances
Directionality: Online to offline; Offline to online; Endogenous

Data & Methods
Previous work…
◦ Experiments
◦ Ethnographies
◦ Surveys in separate sites
VWE Project…
◦ Proprietary in-game player data
◦ Survey of players from game

Who Plays, How Much, And Why?
Replacing self-selected survey work, adding basic use patterns
It ain’t a bunch of kids: Average age is 31.16 (US population median is 35)
More players in their 30s than in their 20s.

How much do they play?
Mean is 25.86 hours/week
Compares to US mean of 31.5 hours/week for TV (Hu et al, 2001)
• From prior experimental work, MMO play eats into entertainment TV and going out, not news
• So much for kids being the ones with the free time.

Some oddball findings
EQ2 players have lower Body Mass Index than US population, i.e. they are physically healthier
EQ2 players have higher levels of depression, lower levels of anxiety than US population
Healthiest group vs. US population is older women
Women are:
- More intense players in terms of time and commitment
- Older (33 vs. 30)
- More likely to be disabled
- Disproportionately likely to be bisexual (14% vs. 2.8% for US) or physically impaired (14.4% vs. 7.3% for US)

Gender-based tests
Gender role theory: Boys and girls are socialized early on, and thus have clear role expectations for their behaviors and identities
Hypotheses based on social categories and gendered expectations
Men were indeed the majority of players (80% vs. 20%), and played to compete, whereas women played to socialize. However . . .

Who’s Hardcore?
Men play more other games, but it was the women who were the most dedicated, intense, and satisfied EQ2 players
Women: 29.32 hours/week
Men: 25.03 hours/week
Likelihood of quitting: “no plans to quit”: women 48.66%, men 35.08%

The self-reported play issue
Actual play was:
Women: 29.32 hours/week
Men: 25.03 hours/week
Self-reported play was:
Women: 26.03 hours/week (about 3 hours less than actual)
Men: 24.10 hours/week (about 1 hour less than actual)
Implications for prior research!

Playing with a partner: Apparently not good for the gander!

Role Players: Basic theory and hypotheses
Theory base: Goffman, Meyrowitz, Turkle
How common is it, why do they do it, and who are they?
It’s not that common, despite the name of the game. Only 5% were dedicated role players
Role players had significantly worse psychological profiles, e.g. depression, loneliness, addictions, learning disabilities
One possible reason for play: to be accepted or to experiment when it’s not possible to do so offline
Evidence comes from the profile: marginalized populations role play more

Economics: a first test of mapping
Do players behave in virtual worlds as we expect them to in the actual world?
Economics is an obvious dimension to test
In the real world, perfect aggregate data are hard to get

Results
GDP and price levels are robust but comparatively unstable
[Figure: GDP and Prices on Antonia Bayle, January to May. Axes: GDP (Gold) and Prices (January = 100). Series: Nominal GDP, Price Level (January = 100)]

Results
The instability is explicable through the Quantity Theory of Money: a rapid influx of money, combined with a drop in population . . .
[Figure: Change in Money Supply and Population on Antonia Bayle, February to May. Axes: Change in Money (000 Gold) and Change in Accounts. Series: Change in Money Supply (000 Gold), Change in Active Accounts]

Results
. . . dramatically boosted prices: More evidence that this behaves like a real economy.
[Figure: Price Level on Antonia Bayle, February to June. Y-axis: Percent Change in Price Level]

Results
A new server’s GDP and price level rapidly converge to those of an existing server (replicability). Lessig and the mapping idea.
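The Quantity Theory explanation given for the price instability can be written out with the standard equation of exchange. This is a textbook sketch rather than the project's own derivation: the symbols M (money supply), V (velocity of money), P (price level), and Q (real output) are conventional notation and do not appear on the slides.

```latex
M V = P Q
\quad\Longrightarrow\quad
\frac{\Delta P}{P} \approx \frac{\Delta M}{M} + \frac{\Delta V}{V} - \frac{\Delta Q}{Q}
```

If velocity is roughly stable, and if a drop in active accounts is assumed to mean less real output, then a rising M and a falling Q both push P upward, which is the direction of the price change reported for Antonia Bayle.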
[Figure: GDP and Prices, New and Mature Servers, January to May. Axes: GDP (000) and Price Level. Series: Real GDP and Price Level for Antonia Bayle and for Nagafen, all with base Antonia Bayle January]

Trust in Virtual Worlds

Trust in Virtual Worlds
Trust has become problematic in the late 20th and early 21st centuries
◦ Modernity and Self Identity (Giddens): Institutions that once provided a foundation for trust are changing rapidly, creating a sense of discontinuity and insecurity
◦ Bowling Alone (Putnam and Oldenburg): Sense of civil society and community that provided basis for trust has eroded
◦ No Sense of Place (Meyrowitz): Media connect us to others who are not part of our communities: can we trust them?

Trust in Virtual Worlds
Trust and feeling part of a community are very important in the early 21st century
Online worlds offer potential for community
Community depends on trustworthiness of online worlds
Do participants have a sense of trust in EQ2?
Does trust work the same way in EQ2 as in RL?

Trust in Virtual Worlds
Sources of Trust
1. Personality: Generalized Trust (Rotter)
2. Institutions: Within Online World (e.g. guilds in EQ2)
3. Communication:
◦ Amount
◦ Self-disclosure
◦ Private versus public messages
◦ Modality

Trust in Virtual Worlds
H1: Levels of generalized trust will be positively related to online trust
H2a: Players trust their guild members more than others in the game
H2b: MMO players trust others in the game more than others online

Trust in Virtual Worlds
H3a: Amount of small talk is positively related to trust of guildmates and others in the MMO
H3b: Self-disclosure is positively related to trust of guildmates and others in the MMO
H3c: The higher the proportion of all messages that a player publicly sends to a guild, the more this player trusts guild members.
H3d: People who use voice chat in the game more frequently are more likely to trust others a) in their guild and b) in the game, compared to people who use voice chat less frequently.

Trust in Virtual Worlds
Measures: Survey:
Generalized Trust: “Generally speaking, would you say that most people in everyday life can be trusted or that you can’t be too careful in dealing with people?” [Five Point Likert Scale]
Trust in Others Online: Same question rephrased to refer to others online
Trust in Others in EQ2: Same question rephrased to refer to EQ2 players
Trust in Others in My Guild: Same question rephrased to refer to members of guild

Trust in Virtual Worlds
Measures: Survey:
Self-disclosure: index of subjects’ responses to questions about a variety of relatively sensitive topics they might have discussed while playing. These covered how much they talked about dating or married life, parenting, sex, religion, and politics [Five Point Likert Scale]
Small Talk: “When you are online talking to other players, how much ‘small talk’ (e.g. talk about the weather, short conversations about how you are doing, etc.) goes on?” [Scale: “None at all”, “Very little”, “Some”, “A lot”, “I am not in a guild”, and “Don’t Know”].
Voice Chat Use: “How often do you use a voice system (e.g. TeamSpeak, Ventrilo) to talk to other players?” [Scale: “Never”, “Rarely, less than once a month”, “Once a week”, “About once a day, but not always”, and “Always”]

Trust in Virtual Worlds
Measures: Behavioral
Public and Private Messages: The number of private (to individual players) and public (to guild) messages each player sent was derived from Sony’s databases. The proportion of all messages that players publicly sent to a guild was calculated by dividing the number of public guild messages by the sum of the number of private messages and public guild messages. Higher values mean that a player sent a higher proportion of public messages.

Trust in Virtual Worlds
Measures: Control Variables
◦ Generalized Trust
◦ Gender
◦ Amount of time used email in typical week (self-report)
◦ Amount of time used internet in typical week (self-report)
◦ Amount of time per week play EQ2 (self-report)

Trust in Virtual Worlds
Findings:
H1: Levels of generalized trust will be positively related to online trust
Supported: Correlations of generalized trust with other forms of trust (online, EQ2, guild) are significant
H2a: Players trust their guild members more than others in the game
Supported: Trust of guild members is significantly greater than trust of others in EQ2
H2b: MMO players trust others in the game more than others online
Supported: Trust of others in EQ2 is significantly higher than trust of others online.

Trust in Virtual Worlds
H3a: Amount of small talk is positively related to trust of guildmates and others in the MMO
Supported: Small talk significantly predicted trust of guildmates (medium effect size) and others in EQ2 (small effect size); coefficients in the predicted direction.
H3b: Self-disclosure is positively related to trust of guildmates and others in the MMO
Supported: Self-disclosure significantly predicted trust of guildmates (small effect size) and others in EQ2 (small effect size); coefficients in predicted direction.
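The behavioral measure described above (public guild messages divided by the sum of private and public guild messages) can be sketched in a few lines. This is a minimal illustration, not the project's code: the function name, the message counts, and the choice to return `None` for players who sent no messages are all assumptions.

```python
from typing import Optional


def public_message_proportion(public_guild_msgs: int, private_msgs: int) -> Optional[float]:
    """Proportion of a player's messages that were sent publicly to the guild.

    Follows the stated definition: public / (private + public).
    """
    total = public_guild_msgs + private_msgs
    if total == 0:
        # Assumption: the proportion is undefined for players with no messages.
        return None
    return public_guild_msgs / total


# Hypothetical counts for one player: 30 guild messages, 10 private messages.
print(public_message_proportion(30, 10))  # 0.75
```

Higher values mean a larger share of the player's messages went to the guild channel, which is the quantity H3c relates to trust in guild members.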
Trust in Virtual Worlds
H3c: The higher the proportion of all messages that a player publicly sends to a guild, the more this player trusts guild members.
Supported: Players who sent a greater proportion of public guild messages reported higher levels of trust in guild members as well as other players.
H3d: People who use voice chat in the game more frequently are more likely to trust others a) in their guild and b) in the game, compared to people who use voice chat less frequently.
Not Supported

Trust in Virtual Worlds
Conclusions:
◦ Trust in the online world exhibits patterns of results similar to what would be expected in RL
◦ Institutions like guilds serve an important function in MMOs in that they provide a basis for trust
◦ MMOs engender trust and may be a basis for community
◦ Communication has a role in generating trust in online worlds

Expertise in Virtual Worlds

Expertise in Online Worlds
Important factor influencing effectiveness and accomplishment in the MMO
In MMOs there are precise metrics defining expertise and changes in expertise.
What can expertise in the MMO tell us about expertise in RL?

Expertise in Online Worlds
Expertise = f(Motivation, Effort, Time, Personal Characteristics)

Expertise in Online Worlds
RQ1: What is the relationship between age, gender and game expertise?
H1a: Players who are high on achievement motivation in game playing will have higher levels of expertise.
H1b: Players who are high on social motivation will have lower levels of expertise.
H1c: Players who are high on immersion motivation will have higher levels of expertise.

Expertise in Online Worlds
H2a: Players who spend more time playing the game will have higher levels of expertise.
H2b: Players who focus their game time on completing quests will have higher levels of expertise.
H3a: Players who are more extroverted will have higher levels of expertise.
H3b: Players who are more aggressive will have higher levels of expertise in the game.

Expertise in Online Worlds
Measures:
Expertise: Sum of each character’s statistics on strength, agility, intellect, wisdom, and stamina.
Age and gender: From the survey data; the age range was coded from 12 to 65, with “12” indicating “under 13” and “65” indicating “65 and above.”
Play time: Total number of seconds spent on primary character. The value was log-transformed to reduce positive skew.

Expertise in Online Worlds
Measures:
Player motivation: The measures for player motivations were adapted from those in Yee (2007), with one additional item on chatting, which yielded 11 items in total. A principal component analysis with varimax rotation revealed three major player motivations, which were in agreement with the findings by Yee (2007).
◦ Achievement motivation focuses on advancement in the game, understanding game mechanics, and competition with other players.
◦ Social motivation focuses on socialization, building long-term meaningful relationships, and participation in team work.
◦ Immersion motivation focuses on exploration in the game world, role-playing, customization of characters, and escapism.
The average score for each major motivation was calculated and included as a predictor variable in the final model.

Expertise in Online Worlds
Measures:
Game focus: Amount of time players spend doing quests as opposed to other game activities such as utilizing trade skills or socializing with other players. It was calculated by dividing the total number of quests that players completed by the log-transformed play time.
Aggression: The AQ Physical Aggression subscale. On 9 statements, survey participants were asked to indicate the degree of their agreement on a 7-point scale ranging from “Strongly Disagree” (1) to “Strongly Agree” (7). An illustrative statement involving physical aggression was “If somebody hits me, I hit back.” The average score across the 9 items was calculated as the players’ score on physical aggression.
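The expertise and game-focus measures defined above are simple enough to sketch directly. The following is an illustration under stated assumptions, not the authors' pipeline: the dictionary keys and function names are hypothetical, and the logarithm base is assumed to be natural because the slides do not state it.

```python
import math

# The five character statistics named in the expertise measure.
STATS = ("strength", "agility", "intellect", "wisdom", "stamina")


def expertise(character: dict) -> float:
    """Expertise as the sum of the character's five statistics."""
    return sum(character[stat] for stat in STATS)


def game_focus(quests_completed: int, play_seconds: float) -> float:
    """Quests completed divided by log-transformed play time.

    Assumption: natural log; the source does not give the base.
    """
    if play_seconds <= 1:
        # log(play_seconds) would be zero or negative, so the ratio is not meaningful.
        raise ValueError("play_seconds must be greater than 1")
    return quests_completed / math.log(play_seconds)


# Hypothetical character and play history.
hero = {"strength": 120, "agility": 95, "intellect": 60, "wisdom": 70, "stamina": 110}
print(expertise(hero))              # 455
print(game_focus(250, 1_800_000))   # quests per unit of log play time
```

Dividing by the log of play time rather than raw seconds keeps the ratio from collapsing toward zero for very heavy players, which is one plausible reading of why the transformed value was used.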
Expertise in Online Worlds
Measures:
Extroversion: Adapted from Bendig (1962). Survey respondents were asked to judge how much they agreed with 10 statements on a 7-point scale ranging from “Strongly Disagree” (1) to “Strongly Agree” (7). One example item was “I like to mix socially with people.” The average score across the 10 items was calculated as the players’ score on extroversion.

Expertise in Online Worlds
Results:
RQ1: What is the relationship between age, gender and game expertise?
Age is positively related to expertise; gender shows no relationship
H1a: Players who are high on achievement motivation in game playing will have higher levels of expertise.
Supported: Achievement motivation significantly predicted expertise; coefficient was in expected direction (small effect size)

Expertise in Online Worlds
Results:
H1b: Players who are high on social motivation will have lower levels of expertise.
Not Supported: Social motivation significantly predicts expertise; coefficient is positive, not negative as expected (small effect size)
H1c: Players who are high on immersion motivation will have higher levels of expertise.
Not Supported: Immersion motivation significantly predicts expertise; coefficient is negative, not positive as expected (small effect size)

Expertise in Online Worlds
H2a: Players who spend more time playing the game will have higher levels of expertise.
Supported: Time spent playing significantly predicted expertise; coefficient was in predicted direction (large effect size)
H2b: Players who focus their game time on completing quests will have higher levels of expertise.
Supported: Focus on quests significantly predicted expertise; coefficient was in predicted direction (large effect size)

Expertise in Online Worlds
H3a: Players who are more extroverted will have higher levels of expertise.
Supported: Extroversion significantly predicted expertise; coefficient in predicted direction (small effect size)
H3b: Players who are more aggressive will have higher levels of expertise in the game.
Supported: Aggression significantly predicted expertise; coefficient in predicted direction (small effect size)

WHY DO WE CREATE NETWORKS IN VIRTUAL WORLDS?

Monge, P. R. & Contractor, N. S. (2003). Theories of Communication Networks. New York: Oxford University Press.

Social Drivers: Why do we create and sustain networks?
◦ Theories of self-interest
◦ Theories of social and resource exchange
◦ Theories of mutual interest and collective action
◦ Theories of contagion
◦ Theories of balance
◦ Theories of homophily
◦ Theories of proximity
◦ Theories of coevolution
Sources:
Contractor, N. S., Wasserman, S. & Faust, K. (2006). Testing multi-theoretical multilevel hypotheses about organizational networks: An analytic framework and empirical example. Academy of Management Review.
Monge, P. R. & Contractor, N. S. (2003). Theories of Communication Networks. New York: Oxford University Press.
“Structural signatures” of MTML
[Figure: network diagrams illustrating the structural signatures of Theories of Self-interest, Theories of Exchange, Theories of Balance, Theories of Collective Action, Theories of Homophily, and Theories of Cognition (Novice, Expert)]

Statistical “MRI” for Structural Signatures
p*/ERGM: Exponential Random Graph Models
Statistical “Macro-scope” to detect structural motifs in observed networks
Move from exploratory to confirmatory network analysis to understand multitheoretical multilevel motivations for why we create social networks

Four Types of Relations in EQ2
◦ Partnership: Two players play together in combat activities
◦ Instant messaging: Two players exchange messages through the Sony universal chat system
◦ Player trade: Players meet “face-to-face” in EQ2 and one gives items to another
◦ Mail: One player sends a message and/or items to others by in-game mail

 | Interpersonal interaction | Transactional interaction
Synchronous | Partnership, Instant messaging | Player trade
Asynchronous | | Mail

Hypotheses - Network
H1: (Sparsity) Individuals are not likely to engage in interaction randomly in a virtual world.
H2: (Popularity) Individuals with many interactions are more likely to engage in interaction than those who have few interactions.
H3: (Transitivity) Two individuals who both interact with a third party are more likely to interact with each other than two individuals who have no partners in common.

Hypotheses - Proximity
H4: (Geographical) Individuals who are proximate in geographical distance are more likely to engage in interaction than those who are not proximate.
H4a: (Short distance) Individuals who are in close proximity are substantially more likely to engage in interaction than those who are at medium or low proximity.
[Figure: Probability to engage in interaction vs. Distance between players (Km), shown on a linear distance axis (0 to 3000) and on a logarithmic distance axis (1 to 10000)]
H4b: (Interaction types) Individuals who are proximate are more likely to engage in interpersonal interactions (i.e. partnership and instant messaging) than transactional interactions (i.e. trade or mail).

Hypotheses – Proximity (cont.)
H5: (Temporal) Individuals who are proximate in time zone are more likely to engage in interaction than those who have bigger time differences.
H6: (Synchronization) Individuals who are proximate in time zone are more likely to engage in synchronous interactions than asynchronous interactions.

Hypotheses - Homophily
H7: (Gender) Individuals of the same gender are more likely to engage in interaction than those of opposite genders.
H8: (Age) Individuals who have smaller age differences are more likely to engage in interaction than those who have bigger differences.
H9: (Experience) Players who have similar years of game experience are more likely to engage in interaction than those who have bigger differences.
Data Description
3140 players from Aug 25 to Aug 31 2006, in Antonia Bayle
◦ 2998 US, 142 CA; 2447 male, 693 female
Demographic information
◦ Gender, age, and account age (years played Sony games)
◦ Zip code, state, and country

Descriptive Statistics
Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max.
Age | 5.668 | 25.410 | 31.350 | 32.340 | 37.550 | 71.040
AcctAge | 0.003 | 1.014 | 1.803 | 2.451 | 3.275 | 9.400
Latitude | 18.27 | 33.88 | 38.89 | 38.28 | 42.11 | 71.29
Longitude | -158.00 | -112.00 | -88.38 | -94.19 | -80.58 | 50.71

Network | Nodes | Edges | Density | Degree (mean) | Degree (max) | Diameter | Centralization (degree)
Partner | 1924 | 1789 | 0.097% | 1.860 | 14 | 24 | 0.63%
Instant messaging | 548 | 517 | 0.34% | 1.887 | 10 | 27 | 1.49%
Trade | 2456 | 3812 | 0.13% | 3.104 | 24 | 19 | 0.85%
Mail | 2090 | 3120 | 0.14% | 2.986 | 84 | 19 | 3.83%

[Figure: network plots of the Partnership, Instant messaging, Trade, and Mail relations. Node color: black = male, red = female]

Construct Physical Proximity
Zip code -> latitude/longitude, time zone
◦ U.S. ZIPList5 and Canada Geocode from ZipInfo.com
Latitude/longitude -> geographical distance
◦ Spherical law of cosines: $r = \arccos\big(\sin(\mathrm{lat}_1)\sin(\mathrm{lat}_2) + \cos(\mathrm{lat}_1)\cos(\mathrm{lat}_2)\cos(\mathrm{long}_2 - \mathrm{long}_1)\big) \times 6371$ Km
Distance category
◦ short (0 < r < 50 Km), intermediate (50 < r < 800 Km) and long (r > 800 Km) distances

Analysis
Network signatures
◦ Sparsity: number of edges
◦ Popularity: geometrically weighted degree (gwdegree)
◦ Transitivity: geometrically weighted edgewise shared partner (gwesp)
Homophily measures
◦ Gender homophily: gender matching (1 for M-M or F-F; 0 for M-F match)
◦ Age homophily: age difference (absolute value in years)
◦ Experience homophily: account age difference (absolute value in years)
Proximity measures
◦ Geographical proximity: physical distance (absolute value in kilometers)
◦ Temporal proximity: time zone difference (absolute value in hours)

p*/Exponential Random Graph Models
Analyzing network data with interdependencies: endogenous correlation among the relations
ERGMs are a class of stochastic models: $P(Y = y) = \frac{1}{k(\theta)} \exp\big(\theta^{T} g(y)\big)$
◦ g(y) is a vector of network statistics such as network structural measures and node attributes
◦ θ is a vector of coefficients
◦ k(θ) is the normalizing constant that makes the probabilities sum to 1 over all possible networks
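The distance construction described above (spherical law of cosines with an Earth radius of 6371 Km, followed by the short/intermediate/long cut-offs) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and because the source gives open intervals, the handling of the exact boundary values 50 and 800 Km is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # radius used in the source formula


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Distance in Km between two points given in decimal degrees,
    computed with the spherical law of cosines."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1]: rounding error can push the value slightly outside
    # the domain of acos. Note this formula loses precision for very small distances.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle) * EARTH_RADIUS_KM


def distance_category(r_km: float) -> str:
    """Bin a distance using the 50 Km and 800 Km cut-offs.

    Assumption: boundary values fall into the larger category.
    """
    if r_km < 50:
        return "short"
    if r_km < 800:
        return "intermediate"
    return "long"


# Hypothetical pair of players, roughly Chicago and Los Angeles.
r = great_circle_km(41.88, -87.63, 34.05, -118.24)
print(round(r), distance_category(r))
```

In the models, the continuous distance enters as Log(Distance), while the categories feed the separate short-distance and intermediate-distance terms.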
Frank & Strauss, 1986; Pattison & Wasserman, 1999; Robins, Pattison, & Wasserman, 1999; Wasserman & Pattison, 1996; Hunter, 2007; Robins, Snijders, Wang, Handcock, & Pattison, 2007

Estimation Methods
p*/ERGM model variables
◦ Non-directed network: dichotomized partner, IM, trade, and mail relations
◦ Network structures: edges, gwdegree, and gwesp
◦ Dyadic attributes: geographical distance and time zone difference between players
◦ Actor attributes: gender, age, and account age
Estimation tools
◦ Statnet version 2.1, R-2.8.0
◦ Social Sciences Computing Cluster (SSCC) at NU

In the tables below, each cell gives the coefficient with its standard error in parentheses. Signif. codes: 0 < *** < 0.001 < ** < 0.01 < * < 0.05 < + < 0.1. Where a row has fewer values than model columns and the column alignment could not be recovered, the values are listed in their original order.

Results 0: Baseline Model
Hypotheses | Variables | Partner Model P0 | IM Model I0 | Trade Model T0 | Mail Model M0
H1: Sparsity | Edges | -8.72***(.28) | -7.12***(.20) | -6.02***(.03) | -5.94***(.03)
H2: Popularity | GWDegree | 1.07***(.18) | 1.36***(.20) | -1.12***(.06) | -1.27***(.06)
H3: Transitivity | GWESP | 1.49***(.03) | 1.25***(.005) | |
 | Log likelihood | -13643.56 | -3137.08 | -29107.26 | -23514.92
 | Degeneracy value | 0.092 | 0.134 | 0.032 | 0.006
H1 supported: Individuals are not likely to engage in interaction randomly.
H2 partially supported: High-degree individuals are more likely to engage in partner and IM relations, but not trade and mail.
H3 supported: If two individuals have shared partners, they are more likely to engage in partner and IM relations.

Results 1: Homophily Model
Hypotheses | Variables | Partner Model P1 | IM Model I1 | Trade Model T1 | Mail Model M1
H1: Sparsity | Edges | -8.587***(.28) | -6.687***(.20) | -5.780***(.03) | -5.525***(.03)
H2: Popularity | GWDegree | 1.253***(.18) | 1.407***(.20) | -0.905***(.06) | -1.145***(.06)
H3: Transitivity | GWESP | 1.454***(.04) | 1.237***(.04) | |
H7: Gender homophily | Same Gender | -0.152***(.02) | -0.148***(.03) | -0.061***(.01) | -0.059***(.01)
H8: Age homophily | Age Difference | -0.035***(.001) | -0.026***(.002) | -0.025***(.0007) | -0.031***(.001)
H9: Experience homophily | AcctAge Difference | -0.115***(.005) | -0.063***(.01) | -0.144***(.003) | -0.086***(.003)
H4,4a,4b: Geographical proximity | Log(Distance) | | | |
H5,6: Temporal proximity | Timezone difference | | | |
Control | Female | -0.201***(.02) | -0.080* (.04) | -0.041***(.01) | 0.024** (.009)
Control | Age | 0.005***(.0002), -0.002***(.0005) [two values; columns not recoverable]
Control | AcctAge | 0.007** (.002), 0.038***(.005), 0.032***(.001) [three values; columns not recoverable]
 | Log likelihood | -13547.59 | -3123.28 | -29001.57 | -23386.54
 | Degeneracy value | 0.478 | 0.672 | 2.120 | 5.013
H7 not supported: Individuals of the same gender are NOT likely to engage in interaction.
H8, H9 supported: Individuals with similar age and experience are more likely to engage in interaction.

Results 1a: Gender Detail
Hypotheses | Variables | Partner Model P1a | IM Model I1a | Trade Model T1a | Mail Model M1a
H1: Sparsity | Edges | -8.716***(.28) | -6.837***(.21) | -5.820***(.03) | -5.478***(.03)
H2: Popularity | GWDegree | 1.223***(.18) | 1.381***(.21) | -0.901***(.05) | -1.154***(.06)
H3: Transitivity | GWESP | 1.481***(.05) | 1.172***(.04) | |
H7: Gender homophily | Male-male match | 0.041* (.02) | -0.067+ (.03) | -0.019* (.009) | -0.114***(.01)
H7: Gender homophily | Female-female match | -0.369***(.06) | -0.209* (.10) | -0.115** (.04) | -0.038 (.04)
H8: Age homophily | Age Difference | -0.035***(.001) | -0.025***(.002) | -0.025***(.001) | -0.030***(.0006)
H9: Experience homophily | AcctAge Difference | -0.115***(.006) | -0.057***(.01) | -0.144***(.003) | -0.085***(.003)
H4,4a,4b: Geographical proximity | Log(Distance) | | | |
H5,6: Temporal proximity | Timezone difference | | | |
Control | Female | | | |
Control | Age | 0.005***(.0002), -0.002** (.0005) [two values; columns not recoverable]
Control | AcctAge | 0.007** (.002), 0.037***(.005), 0.033***(.001) [three values; columns not recoverable]
 | Log likelihood | -13547.45 | -3122.35 | -29001.21 | -23387.15
 | Degeneracy value | 1.370 | 1.503 | 2.520 | 6.672
Special behavior in MMORPG: Females have a strong tendency to interact with male partners.

Results 2: Temporal Proximity
Hypotheses | Variables | Partner Model P2 | IM Model I2 | Trade Model T2 | Mail Model M2
H1: Sparsity | Edges | -6.435***(.03) | -6.539***(.20) | -5.597***(.04) | -5.218***(.03)
H2: Popularity | GWDegree | 1.393***(.19), -0.884***(.08), -1.274***(.05) [three values; columns not recoverable]
H3: Transitivity | GWESP | 1.393***(.07), 1.224***(.04) [two values; columns not recoverable]
H7: Gender homophily | Same Gender | -0.159***(.02) | -0.134***(.04) | -0.068***(.01) | -0.062***(.010)
H8: Age homophily | Age Difference | -0.034***(.001) | -0.026***(.002) | -0.026***(.001) | -0.030***(.001)
H9: Experience homophily | AcctAge Difference | -0.115***(.005) | -0.059***(.01) | -0.147***(.003) | -0.085***(.003)
H4,4a,4b: Geographical proximity | Log(Distance) | | | |
H5,6: Temporal proximity | Timezone difference | -0.226***(.01) | -0.111** (.04) | -0.173***(.02) | -0.153***(.005)
Control | Female | -0.165***(.02) | -0.079* (.03) | -0.040***(.01) | 0.029** (.01)
Control | Age | 0.005***(.0002), -0.002***(.0004) [two values; columns not recoverable]
Control | AcctAge | 0.012***(.002), 0.035***(.005), 0.032***(.001) [three values; columns not recoverable]
 | Log likelihood | -13582.8 | -3120.64 | -28921.29 | -23271.95
 | Degeneracy value | 0.427 | 1.224 | 2.850 | 5.857
H5 supported: Time zone difference reduces the likelihood of interaction, e.g. individuals in the same time zone are 1.25 times more likely to be partners than individuals with a one-hour difference.
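The "1.25 times more likely" reading of the time zone result follows from exponentiating the coefficient, since an ERGM coefficient is a conditional log-odds of a tie. A minimal sketch of that arithmetic, using the Partner-model time zone coefficient (-0.226) from the table; the function name is hypothetical, and treating the odds ratio as "times more likely" follows the slide's own wording:

```python
import math


def odds_multiplier(coef: float, delta: float = 1.0) -> float:
    """Factor by which the conditional odds of a tie change when the
    covariate changes by `delta`, holding the rest of the network fixed."""
    return math.exp(coef * delta)


TZ_COEF_PARTNER = -0.226  # Timezone difference, Partner Model P2

# Each extra hour of time zone difference multiplies the odds by exp(-0.226),
# so the same-zone vs. one-hour-apart ratio is the reciprocal, exp(0.226).
same_zone_vs_one_hour = 1.0 / odds_multiplier(TZ_COEF_PARTNER, 1.0)
print(round(same_zone_vs_one_hour, 2))  # 1.25
```

The same function applies to any other coefficient in these tables once the size of the covariate change (`delta`) is fixed.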
Results 3: Geographical Proximity

Hypotheses | Variables | Partner Model P3 | IM Model I3 | Trade Model T3 | Mail Model M3
H1: Sparsity | Edges | -1.759***(.04) | -5.251***(.22) | -1.848***(.07) | -2.211***(.03)
H2: Popularity | GWDegree | 1.958***(.02); -0.710***(.01); -1.209***(.05)
H3: Transitivity | GWESP | 1.027***(.08); 1.073***(.05)
H7: Gender homophily | Same Gender | -0.196***(.02) | -0.172***(.03) | -0.062***(.01) | -0.074***(.007)
H8: Age homophily | Age Difference | -0.030***(.002) | -0.025***(.002) | -0.023***(.001) | -0.029***(.001)
H9: Experience homophily | AcctAge Difference | -0.107***(.003) | -0.050***(.01) | -0.115***(.003) | -0.084***(.003)
H4,4a,4b: Geographical proximity | Log(Distance) | -1.642***(.007) | -0.999***(.07) | -1.315***(.03) | -1.065***(.002)
H5,6: Temporal proximity | Timezone difference |
Control | Female | -0.217***(.02) | -0.086** (.03) | -0.054***(.01) | 0.035***(.01)
Control | Age | 0.003***(.0004); -0.003***(.0004)
Control | AcctAge | 0.011***(.003); 0.026***(.005)
Log likelihood | | -12562.75 | -3245.12 | -27563.39 | -22512.26
Degeneracy value | | 2.65 | 1.992 | 5.171 | 4.851

Rows with fewer than four values are listed in source order; the slide text does not preserve which model each value belongs to.
Signif. codes: 0 < *** < 0.001 < ** < 0.01 < * < 0.05 < + < 0.1

Supported: Distance reduces the likelihood of interaction, e.g. individuals 10 km away from each other are 5 times more likely to be partners than those 100 km apart (exp(1.642) ≈ 5.2 per tenfold increase in distance, consistent with a base-10 log). Distance has a stronger impact on partner than IM.

Results 3a: Detail Distance

Hypotheses | Variables | Partner Model P3a | IM Model I3a | Trade Model T3a | Mail Model M3a
H1: Sparsity | Edges | -6.799***(.03) | -7.592***(.53) | -6.053***(.08) | -5.547***(.03)
H2: Popularity | GWDegree | 1.430***(.33); -0.793***(.10); -1.231***(.05)
H3: Transitivity | GWESP | 1.051***(.08); 1.058***(.04)
H7: Gender homophily | Same Gender | -0.176***(.02) | -0.170***(.03) | -0.066***(.01) | -0.059***(.01)
H8: Age homophily | Age Difference | -0.032***(.001) | -0.026***(.002) | -0.025***(.001) | -0.030***(.001)
H9: Experience homophily | AcctAge Difference | -0.110***(.005) | -0.044***(.01) | -0.145***(.004) | -0.084***(.003)
H4a: Short distance | Short distance | 3.332***(.05) | 2.493***(.17) | 3.090***(.16) | 2.773***(.04)
 | Intermediate distance | 0.215***(.03) | 0.151 (.11) | 0.218* (.09) | 0.162***(.02)
Control | Female | -0.201***(.02) | -0.094***(.03) | -0.042** (.02) | 0.029***(.008)
Control | Age, AcctAge | 0.004***(.0002); -0.002***(.0004); 0.012***(.002); 0.028***(.004); 0.032***(.002)
Log likelihood | | -12783.63 | -3237.30 | -27781.24 | -22668.67
Degeneracy value | | 3.319 | 0.600 | 8.883 | 7.596

Rows with fewer than four values are listed in source order; the slide text does not preserve which model each value belongs to.
Signif. codes: 0 < *** < 0.001 < ** < 0.01 < * < 0.05 < + < 0.1

Short distance has the strongest impact compared to intermediate and long distance, e.g. individuals living within 50 km are 22.6 times more likely to be partners than those who live between 50 and 800 km (exp(3.332 - 0.215) ≈ 22.6).

Hypotheses Tested

Hypotheses | Partnership | IM | Trade | Mail
H1 Sparsity | Yes | Yes | Yes | Yes
H2 Popularity | Yes | Yes | No | No
H3 Transitivity | Yes | Yes | N/A | N/A
H4 Geographic proximity | Yes | Yes | Yes | Yes
H4a Short distance | Yes | Yes | Yes | Yes
H4b Interaction types | High | Low | Medium | Medium
H5 Temporal proximity | Yes | Yes | Yes | Yes
H6 Synchronization | High | Low | Medium | Medium
H7 Gender homophily | No | No | No | No
H8 Age homophily | Yes | Yes | Yes | Yes
H9 Experience homophily | Yes | Yes | Yes | Yes

Results and Discussion
Sparsity and transitivity exist in all online relations.
Homophily of age and game experience is supported in all four relations.
Distance matters, but short distances are more important.
Time zone has a bigger impact on partner and trade relations than on mail and IM relations.
Gender homophily is not supported in any relation, and female players are more likely to interact with male players.

VWE COMPUTATIONAL CHALLENGES
Jaideep Srivastava, Muhammad A Ahmad, Nishith Pathak

Outline
Data Extraction and Inference
Algorithm Development Issues
Computational Metrics
Current Work and Future Directions

Data Extraction & Inference
VWE data:
◦ 23 tables, 500 classifications of actions
◦ 3 Terabytes of data
◦ Data captured at the level of actions
Analysis & extraction of higher order structures
◦ Social Groups
◦ Reconstruction of Events
◦ Group formation and dissipation

Computational Challenges
Challenges in large scale data handling
Need to develop
◦ One pass algorithms
◦ Incremental Algorithms

Computational Metrics
Quantitative measurement of
◦ Trust, Performance, Expertise
Quantifying subjective qualities
Inference through
◦ Group membership
◦ Histories of interaction
◦ Co-location and other spatio-temporal analysis
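As an illustration of the one-pass, incremental requirement and of inference from histories of interaction, here is a minimal sketch in Python. It is not the project's code: the record layout (timestamp, actor, target, action type), the class name InteractionCounter, and the helper scan are all hypothetical, since the slides do not describe the actual VWE tables or action codes.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# Hypothetical record layout: (timestamp, actor_id, target_id, action_type).
# The real VWE schema is not given in the slides; this is an assumption.
Action = Tuple[int, str, str, str]


class InteractionCounter:
    """Per-pair interaction counts that are updated one record at a time."""

    def __init__(self) -> None:
        self.pair_counts = defaultdict(int)  # undirected pair -> interactions seen
        self.degree = defaultdict(int)       # player -> number of distinct partners

    def update(self, action: Action) -> None:
        _, actor, target, _ = action
        if actor == target:
            return  # ignore self-directed actions
        pair = tuple(sorted((actor, target)))
        if self.pair_counts[pair] == 0:
            # First time this pair is seen: each endpoint gains one partner.
            self.degree[actor] += 1
            self.degree[target] += 1
        self.pair_counts[pair] += 1


def scan(actions: Iterable[Action]) -> InteractionCounter:
    """Single pass over the stream; no record is read twice or kept in memory."""
    counter = InteractionCounter()
    for action in actions:
        counter.update(action)
    return counter


# Toy log for illustration only (not project data).
log = [
    (1, "p1", "p2", "trade"),
    (2, "p2", "p1", "mail"),
    (3, "p1", "p3", "trade"),
]
result = scan(log)
```

Because update touches only the record it is given, new log data can be appended later by calling update again without rescanning earlier records. Memory grows with the number of distinct player pairs rather than with the number of logged actions.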
Inferring relations from data

Applications
Why do we need to develop such metrics?
The availability of massive amounts of data makes quantification of otherwise subjective tasks possible
Hypothesis testing
◦ e.g., High performance groups and individuals tend to group together
Quantifying and measuring performance of different tasks

Difficulty Indexes
Factors affecting performance and assessment of task difficulty
◦ Player Level
◦ Player Effective Level
◦ Group Level
◦ Group Size
◦ Also, task difficulty will play a large role

Monster Composite Difficulty Index
[Figure: two charts, each titled "Composite Difficulty vs Avg Pts"; x-axis: Composite Difficulty, y-axis: Avg Pts]

Experience Rate
Experience gained does not capture everything about performance
Issues
◦ Relative difficulty varies for player levels
◦ Players can “mentor down”
Proposed metric
DP = Ding points, i.e., points needed to move from one level to another.
N = Maximum levels

Experience Rate
[Figure: histogram titled "Distribution of Experience Rate"; x-axis: Experience Rate, y-axis: Frequency]

General Performance Rate
Performance of players varies:
◦ Size of groups
◦ Type of players grouped with
◦ Difficulty of task
◦ History of play

Churn Prediction Problem
Given player attributes and history of play
◦ Predict which players are likely to quit
Base model: Regression
Proposed models: Network models
◦ Diffusion models
◦ Other classification models

Pattern Mining for Studying Group Evolution
What are the frequent configurations of changes in an evolving social network?
How stable are groupings of users across an extended period of time?
Are there any behavioral patterns that trigger changes in the topology of the network?

Conclusions
A wide range of findings about the game world, ranging from economics to network signatures.
New linkages between computer science and social science
Developing new metrics and new algorithms
Began to explore the potential of mapping
Several more studies in process: problematic use, flow, social capital formation
Still only using about 1/10th of the data