Pajek Workshop
Vladimir Batagelj, Andrej Mrvar, Wouter de Nooy
Sunbelt XXIV, Portorož, 2004

Today’s Program
• Introduction to Pajek and social network analysis
• Analysing large networks with Pajek and fine-tuning layouts
• Discussion and questions

PART 1
Exploratory Network Analysis with Pajek
(Published by Cambridge University Press, October 2004)
W. de Nooy, A. Mrvar, V. Batagelj

Overview
• Network data
• Vertex attributes and properties
• Cohesive subgroups:
  – in simple networks
  – in signed networks
  – in valued networks
• Brokerage:
  – centrality
  – structural holes
  – brokerage roles
• Ranking:
  – prestige
  – acyclic networks
• Blockmodeling
• Networks and time
  – repeated measurement
  – diffusion
  – genealogies, citations
• Network analysis and statistics
• Building your own

Network data
• Opening a network in Pajek
• Drawing a network in Pajek
  – Energizing the layout
  – Selecting display options
  – Exporting the sociogram
• Pajek network data
  – Structure
  – Store & export from Access
• Example: World trade relations
  – Imports_manufactures.net

Vertex attributes and structural properties
• Types of data objects
  – Partitions: discrete properties
  – Clusters: 1 class from a partition
  – Vectors: continuous (numeric) properties
  – Hierarchies: nested classification
  – Permutations: reordering (renumbering)
• Visualizing partitions and vectors
• Menu structure
• Pajek project file

Cohesive subgroups in simple networks
• Connectivity
• Example: Attiro.paj
• Measures:
  – Components: weak and strong
  – k-cores
  – Cliques, complete subnetworks
• Analytic strategy

[Flowchart: analytic strategy for cohesive subgroups in simple networks]
1. Is the network directed? (Info>Network>General)
  – no: go to 2a
  – yes: go to 2b
Undirected networks:
2a. Find components (Net>Components>Weak). Do components identify subgroups?
  – yes: Finish: subgroups are classes in the components partition.
  – no: go to 3a
3a. Find k-cores (Net>Partitions>Core>Input). Do k-cores contain subgroups?
  – yes: go to 4a
4a. Remove vertices of the lowest k-cores (Operations>Extract from Network>Partition)
5a. Find overlapping complete subnetworks (select Nets>Find Fragment (1 in 2)>Options>Extract Subnetwork and execute Nets>Find Fragment (1 in 2)>Find). Do subnetworks identify subgroups?
  – no: Finish: subgroups not found.
Directed networks:
2b. Find weak components (Net>Components>Weak). Do components identify subgroups?
  – yes: Finish: subgroups are classes in the components partition.
  – no: go to 3b
3b. Find strong components (Net>Components>Strong). Do components identify subgroups?
  – yes: Finish: subgroups are classes in the components partition.
  – no: go to 4b
4b. Find overlapping complete subnetworks (select Nets>Find Fragment (1 in 2)>Options>Extract Subnetwork and execute Nets>Find Fragment (1 in 2)>Find). Do subnetworks identify subgroups?
  – no: go to 5b
5b. Symmetrize the network (Net>Transform>Arcs->Edges>All)
5.5 Find components (see 2a.)

Cohesive subgroups in signed networks
• Balanced clusters
• Example: Sampson.paj
• Using line values & signs in layout
• Optimization approach
  – Set parameters
  – Search optimal solution
  – Repeat many times
• Stepping through partitions

Cohesive subgroups in valued networks
• Cohesion by strong or multiple ties
• Example: interlocking directorates in Scottish banking (circa 1900)
  Scotland.paj
• Transform 2-mode into 1-mode network
• Measure:
  – m-core (valued core)
• SVG output

Centrality
• Centrality and centralization
• Undirected networks (Knoke & Burt, 1983)
• Example: Strike.paj
  – Degree
  – Closeness
  – Betweenness

Brokerage
• The flow of information
• Example: Strike.paj
• Overall network structure:
  – Bridges
  – Cut-vertices or articulation points
  – Bi-components
• Investigating the ego-network:
  – Structural holes
  – Brokerage roles

5 Brokerage roles
[Figure: five triads of vertices u, v, and w, one per brokerage role: coordinator, itinerant broker, representative, gatekeeper, liaison]

Prestige
• Asymmetric choices
• Example: SanJuanSur2.paj
• Measures:
  – Popularity: indegree
  – Input domain: direct and indirect nominations
  – Proximity prestige: size of domain divided by the average distance within the domain
• Structural and social prestige

Ranks: acyclic networks
• Discrete ranks or levels
• Example: student_government.paj
• Local network structure:
  – Triadic analysis and the triad census
• Overall network structure:
  – Strong components and ranks
  – Symmetric-acyclic decomposition

Balance-theoretic models
• Balance
  – Ties within a cluster: symmetric ties within a cluster, no ties between clusters; max 2 clusters
  – Ties between ranks: none
  – Permitted triads: 102, 300
• Clusterability
  – Ties within a cluster: idem, no restriction on the number of clusters
  – Ties between ranks: idem
  – Permitted triads: idem + 003
• Ranked Clusters
  – Ties within a cluster: idem
  – Ties between ranks: asymmetric ties from each vertex to all vertices on higher ranks
  – Permitted triads: idem + 021D, 021U, 030T, 120D, 120U
• Transitivity
  – Ties within a cluster: idem
  – Ties between ranks: null ties may occur between ranks
  – Permitted triads: idem + 012
• Hierarchical Clusters
  – Ties within a cluster: asymmetric ties within a cluster allowed provided that they are acyclic
  – Ties between ranks: idem
  – Permitted triads: idem + 120C, 210
• No balance-theoretic model (‘forbidden’)
  – Triads: 021C, 111D, 111U, 030C, 201

Triad types and models
• Balance: 3 - 102, 16 - 300
• Clusterability: 1 - 003
• Ranked Clusters: 4 - 021D, 5 - 021U, 9 - 030T, 12 - 120D, 13 - 120U
• Transitivity: 2 - 012
• Hierarchical Clusters: 14 - 120C, 15 - 210

Blockmodeling
• Matrix and permutation for visualization
• Blockmodel
  – Partition of vertices into classes (positions)
  – Image matrix of relations among blocks
• Types of blockmodels
  – Cohesive subgroups
  – Center-periphery structure
  – Ranks
• Types of equivalence:
  – Structural equivalence: hierarchical clustering
  – Regular equivalence

Cohesive subgroups
[Figure: sociogram with vertices Domingo, Carlos, Alejandro, Eduardo, Frank, Hal, Karl, Bob, Ike, Gill, Lanny, Mike, John, Xavier, Utrecht, Norm, Russ, Quint, Wendle, Ozzie, Ted, Sam, Vern, Paul]

Image matrix
Class             Spanish    English – young   English – old
Spanish           Complete   Empty             Empty
English – young   Empty      Complete          Empty
English – old     Empty      Empty             Complete

Blockmodel types
[Figure: image matrices for three blockmodel types: Cohesion, Center-periphery, Ranking]

Regular equivalence and errors
[Figure: matrix with rows pminister, minister1 to minister7, and advisor1 to advisor3, and column blocks pminister, ministers, advisors; cells marked X]

Networks and time
• Longitudinal network: a network measured at different time points
  – Example: Sampson.paj
• Diffusion: vertex property changing over time, e.g., adoption
  – Example: ModMath.paj
• Descent: a relation spanning time
  – Genealogies: descent by birth; structural relinking
  – Citations: descent of ideas; main path analysis
  – Example: Gondola_Petrus.ged, centrality_literature.paj

Genealogies
• Data format: GEDCOM 5.5 standard
  www.gendex.com/gedcom55/55gcint.htm
• Software:
  – Genealogical Information Manager
    www.mindspring.com/~dblaine/gimhome.html
  – Personal Ancestral File
    www.familysearch.org

Networks and statistics
• Statistical relations among properties of vertices: partitions and vectors
• Example: social and structural prestige (Ch. 9)
• In Pajek: discrete (Cramér’s V, Rajski, rank correlation) and continuous (Pearson correlation, regression)
• Pajek to R: see afternoon session
• Pajek to other statistics software: paste numbers from partition or vector into statistics software datasheet

Building your own
• Macro: sequences of commands performed on selected data objects
• Example: exposure in a diffusion network
• Macro commands:
  – Record
  – Add message: add comment
  – Play

Relations among chapters
[Figure: diagram linking the chapters]
• Ch.1 - Looking for social structure
• Ch.2 - Attributes and relations
• Ch.3 - Cohesive subgroups
• Ch.4 - Sentiments and friendship
• Ch.5 - Affiliations
• Ch.6 - Center and periphery
• Ch.7 - Brokers and bridges
• Ch.8 - Diffusion
• Ch.9 - Prestige
• Ch.10 - Ranking
• Ch.11 - Genealogies and citations
• Ch.12 - Blockmodels
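
Sketch: components and k-cores outside Pajek
For readers who want to see what the first measures for undirected simple networks (components, then k-cores) compute, here is a minimal Python sketch. It is an illustration only and not Pajek's own implementation. The function names and the six-vertex sample network are hypothetical. The parser assumes a .net file with a *Vertices section followed by *Edges or *Arcs lines of the form "u v", ignores line values, and reads every line as undirected.

```python
# Minimal sketch, not Pajek's implementation. All names and the sample
# network below are hypothetical and chosen for illustration only.

SAMPLE_NET = """*Vertices 6
1 "a"
2 "b"
3 "c"
4 "d"
5 "e"
6 "f"
*Edges
1 2
1 3
2 3
3 4
5 6
"""


def parse_pajek_net(text):
    """Return (labels, adjacency), reading all lines as undirected.

    Only *Vertices, *Edges and *Arcs sections are handled (assumption).
    """
    labels, adj, section = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            section = line.split()[0].lower()
            continue
        parts = line.split()
        if section == "*vertices":
            vid = int(parts[0])
            # Use the quoted label if present, else fall back to the id.
            labels[vid] = line.split('"')[1] if '"' in line else str(vid)
            adj.setdefault(vid, set())  # keeps isolated vertices
        elif section in ("*edges", "*arcs"):
            u, v = int(parts[0]), int(parts[1])
            if u != v:  # skip loops: simple network
                adj.setdefault(u, set()).add(v)
                adj.setdefault(v, set()).add(u)
    return labels, adj


def components(adj):
    """Connected components, found by depth-first search."""
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        result.append(comp)
    return result


def core_numbers(adj):
    """Highest k such that the vertex belongs to a k-core.

    Repeatedly removes a vertex of minimum remaining degree.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining, core, k = set(adj), {}, 0
    while remaining:
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


if __name__ == "__main__":
    labels, adj = parse_pajek_net(SAMPLE_NET)
    print(components(adj))
    print(core_numbers(adj))
```

In the sample network, vertices 1, 2, and 3 form a triangle and therefore a 2-core, while vertex 4 and the pair 5, 6 only reach the 1-core. The resulting dictionary plays the role that a partition plays in Pajek: one class number per vertex.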