What are networks? What […]?

Brian D. Fath
Towson University, Towson, MD, USA
International Institute for Applied Systems Analysis, Laxenburg, Austria

A new paradigm

Environmental concerns have become of paramount importance. Certain global problems may soon be irreversible (for example, deforestation, extinction, soil loss, climate change): we can't turn back the clock.

These are systemic problems that can't be understood in isolation; they are interconnected and interdependent.

Current problem: Pest management

The conventional response to a crop pest is to spray a pesticide designed to kill that insect. Imagine a perfect pesticide that kills all target insects and has no side effects on air, water, or soil. Is using this pesticide likely to make the farmer better off?

Representing the thinking used by those applying the pesticides would look like this:

[Figure: diagram not recovered]

Unfortunately, what frequently happens is that in following years the problem of crop damage gets worse and worse, and the pesticide that formerly seemed so effective no longer seems to help.

For example, the pest may have been controlling another insect population, either by predation or by competition. The effective pesticide eliminates the control that the target insects were applying to the population of the other insects. The non-target insect populations then explode and cause more damage than the insects killed by the pesticide did.

In other words, the action intended to solve the problem actually makes it worse, because its unintended side effects change the system in ways that end up exacerbating the problem.

Studies suggest a majority of the 25 insects that cause the most crop damage became problems because of this cycle.

Many important problems today are complex, involve multiple actors, and are at least partly the result of past actions that were taken to alleviate them.
Dealing with such problems is difficult, and the results of conventional solutions are often poor enough to create discouragement about the prospects of ever addressing them effectively.

If everything is connected to everything else, then how can we ever know anything?

There is a need for scientific methodologies that deal with whole systems. Systems modelling and network analysis are such approaches.

System theory: Core Assumptions and Statements

System theory is the transdisciplinary study of the abstract organization of phenomena, independent of their substance, type, or spatial or temporal scale of existence. It investigates both the principles common to all complex entities and the (usually mathematical) models which can be used to describe them.

History of Systems theory

• Ludwig von Bertalanffy, biologist (1940s) (General Systems Theory, 1968)
• Ross Ashby (Introduction to Cybernetics, 1956)
• Jay Forrester founded System Dynamics in 1956: a way of testing new ideas about systems, in the same way we test ideas in engineering
• Club of Rome: think tank that developed world models
• Donella Meadows et al., “Limits to Growth”
• George Klir (Facets of Systems Science, 1991) discusses conceptual foundations and philosophy (e.g. Bunge, Bohm and Laszlo)
• Fritjof Capra “popularized” systems theory ideas through mass media books and application to social systems

von Bertalanffy was both reacting against reductionism and attempting to revive the unity of science.

The approach of systems thinking is fundamentally different from that of traditional forms of analysis, which focus on separating the individual pieces of the study object.

Rather than reducing an entity to the properties of its parts, systems theory focuses on the arrangement of and relations between the parts which connect them into a whole.
This sometimes results in strikingly different conclusions, especially when what is being studied is dynamically complex or has a great deal of feedback from internal or external sources.

Investigating Biological Systems

• Haeckel, 1866: “Ecology” (oikos), the study of the Earth household
• von Uexkull: Umwelt means "environment" or "surrounding world"
• Lotka: energy flow in ecology
• Elton: feeding relations
• Tansley, 1935: coined the term “ecosystem”
• Lindeman, 1942: trophic dynamic concept
• Vernadsky: Biosphere, life partly creates and partly controls the planetary environment
• Lovelock and Margulis: Gaia

A system can be said to consist of four things:
1. the parts or elements of the system. These may be physical or abstract or both, depending on the nature of the system.
2. the qualities or properties (attributes) of the system and its objects.
3. internal relationships among its objects.
4. an environment in which the system exists.

A system, then, is a set of things that affect one another within an environment and form a larger pattern that is different from any of the parts.

Systems as Networks

Introduction to Networks

Fundamental Concepts in Network Analysis

Network analysis is concerned with understanding the linkages among actors/objects and their implications.

Actor/Object: a discrete individual or collective unit (people, departments, nations, corporate sectors, species, trophic groups, cells, organelles).
Connections/Ties: links between two actors/agents.

Transaction: exchange of material or information
• transfer of material resources (financial, energetic)
• movement (migration)
• behavioral interaction (talking, messaging)

Pattern: structure of organization
• evaluation of one person by another (friendship, respect)
• association or affiliation (social groups, trophic groups)
• physical connection (road, river, bridge)
• formal relation (authority)
• biological relation (kinship)

Example networks (figures not reproduced):
• Communicable disease: syphilis outbreak in Rockdale County, Georgia, 1996
• Terrorism network
• High school friendship
• High school dating
• The Internet
• Ecological food web

Oyster Reef Model (Dame and Patten 1981)

[Figure: flow diagram of six compartments; the labels recovered from the figure are listed below]

x:  x1 (Filter Feeders) = 2000.00    x2 (Deposited Detritus) = 1000.00    x3 (Microbiota) = 2.4121
    x4 (Meiofauna) = 24.1214         x5 (Deposit Feeders) = 16.2740       x6 (Predators) = 69.2367
z:  z1 = 41.4697
y:  y1 = 25.1646    y2 = 6.1759    y3 = 5.7600    y4 = 3.5794    y5 = 0.4303    y6 = 0.3594
f:  f21 = 15.7915   f61 = 0.5135   f32 = 8.1721   f42 = 7.2745   f52 = 0.6431   f43 = 1.2060
    f53 = 1.2060    f24 = 4.2403   f54 = 0.6609   f25 = 1.9076   f65 = 0.1721   f26 = 0.3262

Network analysis is a tool that allows you to formally (i.e., not just intuitively) investigate and interpret systems.

A big part of this class will be learning to recognize, construct, analyze and interpret (socio)-ecological networks!

What are network data?

Boundary specification
• What is your population? You must have a finite set of actors (company, sports league, ecosystem, group).
• Who are the relevant actors? Identify the population.
• How are they connected? Identify the connections. Be consistent.
• How do you get the data?
Data measurement and collection
• Questionnaires
• Interviews
• Observations
• Archival records
• Experiments
• Other techniques

Examples:
• Ecosystems (from field or from literature)
• Economies
• Employment
• Kinship
• Social relations
• Sports leagues

Let’s construct a network of the students in the class…

Notation for network data: Graphs

Why graphs? A graph is a model of the system. A model is a simplified representation.

Graphs provide:
• a common vocabulary
• known mathematical operations
• the ability to prove theorems about graphs, and hence about representations of network structures

A graph consists of two sets of information: a set of nodes, $X = \{x_1, x_2, \dots, x_n\}$, and a set of lines, $\mathcal{L} = \{l_1, l_2, \dots, l_L\}$, between pairs of nodes. There are $n$ nodes and $L$ lines.

Graph: undirected pairwise connection (“is kin to”, “lives near”, “works with”). No direction is implied.

Two nodes $x_i$ and $x_j$ are adjacent if the line $l_k = (x_i, x_j)$ is in the set of lines $\mathcal{L}$.

Each line is an unordered pair of distinct nodes, $l_k = (x_i, x_j)$. Since it is unordered, $l_k = (x_i, x_j) = (x_j, x_i)$.

[Figure: nodes x_i and x_j joined by line l_k]

Loop: a single edge starting and ending on the same node.
Simple graph: a graph with no multiple edges or loops.

Special cases: a trivial graph is one with only one node. An empty graph is one with no lines.

Example: “lives near”

Node   Actor     Connection (lives near)
x1     Allison   Ross, Sarah
x2     Drew      Eliot
x3     Eliot     Drew
x4     Keith     Ross, Sarah
x5     Ross      Allison, Keith, Sarah
x6     Sarah     Allison, Keith, Ross

[Figure: graph of the six actors with lines l1 to l6]

l1 = (x1, x6)    l2 = (x1, x5)    l3 = (x2, x3)
l4 = (x4, x5)    l5 = (x4, x6)    l6 = (x5, x6)

The degree of a node, $d(x_i)$, is the number of nodes that are adjacent to it. Degree ranges from 0 to $n - 1$, and each node can have its own degree.

Mean nodal degree is a statistic that reports the average degree of the nodes in the graph.
\[ \bar{d} = \frac{\sum_{i=1}^{n} d(x_i)}{n} = \frac{2L}{n} \]

Example (“lives near” graph):

Node          Degree
x1 Allison    d(x1) = 2
x2 Drew       d(x2) = 1
x3 Eliot      d(x3) = 1
x4 Keith      d(x4) = 2
x5 Ross       d(x5) = 3
x6 Sarah      d(x6) = 3

Total = 12, n = 6, so the mean nodal degree is $\bar{d} = 12/6 = 2$ (equivalently $2L/n = 2(6)/6 = 2$).

If all node degrees are equal, the graph is said to be d-regular, a measure of uniformity.

If the graph is not d-regular, the variance of the degrees is calculated as:

\[ S_D^2 = \frac{\sum_{i=1}^{n} \left( d(x_i) - \bar{d} \right)^2}{n} \]

For the example:

\[ S_D^2 = \frac{(2-2)^2 + (1-2)^2 + (1-2)^2 + (2-2)^2 + (3-2)^2 + (3-2)^2}{6} = \frac{0 + 1 + 1 + 0 + 1 + 1}{6} = \frac{4}{6} \approx 0.667 \]

Which has a higher nodal degree standard deviation?

[Figure: two graphs, A and B, each on the five nodes X1 to X5]

Graph Density: the proportion of possible lines that are actually present in the graph.

Since there are $n$ nodes, and excluding loops, there are $n(n-1)/2$ possible lines in the graph.

\[ \Delta = \frac{L}{n(n-1)/2} = \frac{2L}{n(n-1)} \]

Relation between density and mean degree: combining the equations (using $\bar{d} = 2L/n$) gives

\[ \Delta = \frac{\bar{d}}{n-1} \]

For the “lives near” graph:

\[ \Delta = \frac{2L}{n(n-1)} = \frac{2(6)}{6(5)} = \frac{12}{30} = 0.40 \]

If all lines are present, the graph is called a complete graph. It is denoted $K_n$ and has $n(n-1)/2$ undirected edges.

Example: Florentine Families

$\bar{d} = 2.5$; $S_D^2 = 2.120$; $\Delta = 0.1667$

No.   Family       Nodal degree
1     Acciaiuol    1
2     Albizzi      3
3     Barbadori    2
4     Bicheri      3
5     Castellan    3
6     Ginori       1
7     Guadagni     4
8     Lambertes    1
9     Medici       6
10    Pazzi        1
11    Perruzi      3
12    Pucci        0
13    Ridolfi      3
14    Salviati     2
15    Strozzi      4
16    Tornabuon    3

Walk, trail, and path

A walk is a sequence of nodes and lines, starting and ending with nodes.

The length of a walk is the number of occurrences of lines in it. If a line is included more than once in the walk, it is counted each time it occurs.

A trail is a walk in which all lines are distinct.

A path is a walk in which all nodes and all lines are distinct.

If there is a path between two nodes $x_i$ and $x_j$, then $x_i$ and $x_j$ are said to be reachable.
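Returning to the “lives near” graph used for the degree, variance, and density calculations, the same numbers can be checked with a short script. This is a minimal sketch and not part of the original slides: the function name `degree_stats` and the variable names are illustrative, and the graph is assumed to be the six-node example with lines l1 to l6 as listed for it.

```python
# "Lives near" example: six actors and the six undirected lines l1..l6.
# Assumption: the line list below transcribes l1..l6 from the example.
nodes = ["Allison", "Drew", "Eliot", "Keith", "Ross", "Sarah"]
lines = [
    ("Allison", "Sarah"),  # l1 = (x1, x6)
    ("Allison", "Ross"),   # l2 = (x1, x5)
    ("Drew", "Eliot"),     # l3 = (x2, x3)
    ("Keith", "Ross"),     # l4 = (x4, x5)
    ("Keith", "Sarah"),    # l5 = (x4, x6)
    ("Ross", "Sarah"),     # l6 = (x5, x6)
]


def degree_stats(nodes, lines):
    """Return (degree per node, mean degree, degree variance, density)
    for a simple undirected graph given as a node list and a line list."""
    degree = {x: 0 for x in nodes}
    for a, b in lines:
        # Each undirected line adds one to the degree of both endpoints.
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    num_lines = len(lines)
    mean = sum(degree.values()) / n  # equals 2L / n
    variance = sum((d - mean) ** 2 for d in degree.values()) / n
    density = 2 * num_lines / (n * (n - 1))  # L over n(n-1)/2 possible lines
    return degree, mean, variance, density


degree, mean_degree, var_degree, density = degree_stats(nodes, lines)
print("degrees:", degree)
print("mean nodal degree:", mean_degree)
print("degree variance:", round(var_degree, 3))
print("density:", density)
```

The variance here divides by n, matching the formula on the slide, rather than by n − 1 as a sample variance would.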
A graph is connected if there is a path between every pair of nodes, i.e., all nodes are reachable.

Distance and Diameter

The distance, $d(i,j)$, is the length of the shortest path between a pair of nodes.

The diameter of a connected graph is the largest distance between any pair of nodes.

Graph vs. Subgraph

Node-generated and line-generated subgraphs: select nodes or lines to generate a subgraph.

Maximal connected subgraphs in a graph are called components.

Graph Connectivity

Cutpoint: a node $x_i$ is a cutpoint if the number of components in the graph that contains $x_i$ is fewer than the number of components in the subgraph that results from deleting $x_i$ from the graph.

Bridge: analogous to a cutpoint. A bridge is a line that is critical to the connectedness of the graph.

A vulnerable graph is one that is more likely to become disconnected if a few nodes or lines are removed.

Florentine example revisited

Cutpoints:
• Albizzi
• Guadagni
• Medici
• Salviati

Bridges:
• Albizzi-Ginori
• Guadagni-Lambertes
• Medici-Salviati
• Pazzi-Salviati
• Medici-Acciaiuol

Isomorphic graphs: two graphs are isomorphic if there is a one-to-one mapping between their nodes that preserves the adjacency of the nodes. If two graphs are isomorphic, then they are identical on all graph theoretic properties.

[Figure: two isomorphic graphs, each on the nodes x1, x2, x3, x4]

Cyclic and acyclic graphs

A graph that is connected and acyclic is called a tree. A disconnected graph with no cycles is called a forest.

DIRECTED GRAPHS

Many connections are directional, meaning they are oriented from one actor to another.

A directed graph, or digraph, has a set of nodes and a set of arcs. Each arc is an ordered pair of distinct nodes.

The arc $\langle x_i, x_j \rangle$ is directed from $x_i$ (the origin or sender) to $x_j$ (the terminus or receiver).

In $\langle x_i, x_j \rangle$, node $x_i$ is adjacent to $x_j$, and node $x_j$ is adjacent from $x_i$.

The arc is represented by an arrow.

Three types of directed dyads

1.
Null dyads have no arcs in either direction between the two nodes.
2. An asymmetric dyad has an arc going in one direction or the other, but not both.
3. A mutual or reciprocal dyad has two arcs, one going in one direction and the other going in the opposite direction.

[Figure: null, asymmetric, and mutual dyads drawn between nodes x1 and x2]

Example: “likes at beginning of year”

Node   Actor     Connection (likes at beginning of year)
x1     Allison   Drew, Ross
x2     Drew      Eliot, Sarah
x3     Eliot     Drew
x4     Keith     Ross
x5     Ross      Sarah
x6     Sarah     Drew

[Figure: digraph of the six actors]

Indegree, $d_I(x_i)$, is the number of nodes that are adjacent to $x_i$, i.e., the number of arcs terminating at $x_i$.

Outdegree, $d_O(x_i)$, is the number of nodes that are adjacent from $x_i$, i.e., the number of arcs originating at $x_i$.

Outdegrees are a measure of expansiveness. Indegrees are a measure of receptivity or popularity.

Mean indegree and outdegree

\[ \bar{d}_I = \frac{\sum_{i=1}^{n} d_I(x_i)}{n}, \qquad \bar{d}_O = \frac{\sum_{i=1}^{n} d_O(x_i)}{n}, \qquad \bar{d}_I = \bar{d}_O = \frac{L}{n} \]

Variance of indegree and outdegree

\[ S_{D_I}^2 = \frac{\sum_{i=1}^{n} \left( d_I(x_i) - \bar{d}_I \right)^2}{n}, \qquad S_{D_O}^2 = \frac{\sum_{i=1}^{n} \left( d_O(x_i) - \bar{d}_O \right)^2}{n} \]

These measure how unequal the actors in a network are with respect to originating or receiving connections.

Types of nodes in a directed graph
• Isolate if $d_I(x_i) = d_O(x_i) = 0$
• Transmitter if $d_I(x_i) = 0$ and $d_O(x_i) > 0$
• Receiver if $d_I(x_i) > 0$ and $d_O(x_i) = 0$
• Carrier or ordinary if $d_I(x_i) > 0$ and $d_O(x_i) > 0$

Density of a directed graph:

\[ \Delta = \frac{L}{n(n-1)} \]

Distance and Diameter of a digraph

The distance is the length of the shortest path from $x_i$ to $x_j$. The diameter is the longest distance between any pair of nodes.

Valued graphs and valued directed graphs

Weighted graphs: the values can represent frequency of interaction, dollar amount of exchange, or energy flow in an ecosystem.

A special case is the set of graphs whose values are probabilities. These graphs are known as Markov chains, and their corresponding matrices are referred to as transition matrices.

“For the last thirty years, empirical social research has been dominated by the sample survey.
But as usually practiced, using random samplings of individuals, the survey is a sociological meatgrinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it. It is a little like a biologist putting his experimental animals through a hamburger machine and looking at every hundredth cell through a microscope; anatomy and physiology get lost, structure and function disappear, and one is left with cell biology… If our aim is to understand people’s behavior rather than simply record it, we want to know about primary groups, neighborhoods, organizations, social circles, and communities; about interaction, communication, role expectations, and social control.”
Barton 1968, reprinted in Freeman 2004.

Freeman defines Social Network Analysis as a defined paradigm of research having the following features:
1. Social network analysis is motivated by a structural intuition based on ties linking social actors,
2. It is grounded in systematic empirical data,
3. It draws heavily on graphic imagery, and
4. It relies on the use of mathematical and/or computational models.

Graph information can be expressed as a matrix, which is useful for presenting, manipulating, and analyzing data.

Adjacency Matrix, $A = (a_{ij})$: rows and columns are labeled by the nodes, with $a_{ij} = 1$ if and only if $x_i$ and $x_j$ are adjacent, and $a_{ij} = 0$ otherwise.

For a graph with no loops, the adjacency matrix must have 0s on the diagonal.
In undirected graphs the adjacency matrix is symmetric: $a_{ij} = a_{ji}$.

Example: “lives near” (undirected)

Node   Actor     Connection (lives near)
x1     Allison   Ross, Sarah
x2     Drew      Eliot
x3     Eliot     Drew
x4     Keith     Ross, Sarah
x5     Ross      Allison, Keith, Sarah
x6     Sarah     Allison, Keith, Ross

With rows and columns in the order Allison, Drew, Eliot, Keith, Ross, Sarah:

\[ A = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \end{bmatrix} \]

Example: “likes at beginning of year” (directed)

Node   Actor     Connection (likes at beginning of year)
x1     Allison   Drew, Ross
x2     Drew      Eliot, Sarah
x3     Eliot     Drew
x4     Keith     Ross
x5     Ross      Sarah
x6     Sarah     Drew

With rows (senders) and columns (receivers) in the order Allison, Drew, Eliot, Keith, Ross, Sarah:

\[ A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix} \]

Matrix Vocabulary

Size (or order) is defined by the number of rows and columns in the matrix. Adjacency matrices have the same number of rows and columns and thus are square.

Each entry in a matrix is called a cell or element.

Main diagonal: consists of the entries in which the row and column index are the same ($a_{ii}$).

A symmetric matrix is one with $a_{ij} = a_{ji}$ for all $i, j$.

Matrix addition is possible if the matrices are the same size: $Z = X + Y$, where $z_{ij} = x_{ij} + y_{ij}$.

Matrix multiplication is used to study walks and reachability: $Z = YW$. The number of columns of $Y$ must equal the number of rows of $W$.

The identity matrix ($I$) is defined such that $IX = X$:

\[ I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]

Powers of a matrix

$XX = X^2$, $XX^2 = X^3$, $XX^3 = X^4$. In general, $X^m$ ($X$ to the $m$th power) is the matrix product of $m$ copies of $X$.

Powers of a matrix!!

Entry $(i, j)$ of the matrix $X^m$ gives exactly the number of walks of length $m$ from node $x_i$ to node $x_j$.
• $X^1$ gives the direct walks.
• $X^2$ gives the walks that take two steps.
• $X^3$ gives the walks that take three steps, etc.

Notice that some elements which were zero originally get filled in. In other words, we have a way to identify the indirect walks ($m > 1$) in the matrix, and hence in the graph.
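The matrix statements above can be tried out on the two six-actor examples. The following is a minimal sketch in plain Python, not part of the original slides: the helper names `mat_mul` and `mat_pow` and the constants `LIVES_NEAR` and `LIKES` are illustrative, and the matrices are assumed to be the two adjacency matrices shown for the “lives near” and “likes at beginning of year” relations.

```python
def mat_mul(X, Y):
    """Return the matrix product XY for matrices stored as lists of rows."""
    inner = len(Y)
    if any(len(row) != inner for row in X):
        raise ValueError("columns of X must equal rows of Y")
    return [[sum(X[i][k] * Y[k][j] for k in range(inner))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_pow(X, m):
    """Return X^m for an integer m >= 1 by repeated multiplication."""
    result = X
    for _ in range(m - 1):
        result = mat_mul(result, X)
    return result


# Row/column order: Allison, Drew, Eliot, Keith, Ross, Sarah.
# Assumption: these transcribe the two adjacency matrices from the examples.
LIVES_NEAR = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]
LIKES = [
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0],
]

n = len(LIVES_NEAR)

# An undirected relation gives a symmetric matrix; a directed one need not.
lives_near_symmetric = all(LIVES_NEAR[i][j] == LIVES_NEAR[j][i]
                           for i in range(n) for j in range(n))
likes_symmetric = all(LIKES[i][j] == LIKES[j][i]
                      for i in range(n) for j in range(n))

# In a digraph matrix, row sums are outdegrees and column sums are indegrees.
out_degree = [sum(row) for row in LIKES]
in_degree = [sum(col) for col in zip(*LIKES)]

# Entry (i, j) of the square counts walks of length 2 from node i to node j.
walks2 = mat_pow(LIVES_NEAR, 2)

print("lives near symmetric:", lives_near_symmetric)
print("likes symmetric:", likes_symmetric)
print("outdegrees:", out_degree)
print("indegrees:", in_degree)
print("Allison-Keith walks of length 2:", walks2[0][3])
```

Allison and Keith are not adjacent in “lives near”, yet the squared matrix has a nonzero entry for them because both live near Ross and Sarah; this is the "zero elements get filled in" effect. Allison and Drew stay at zero because they sit in different components.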
Example 1: digraph

[Figure: digraph on the three nodes x1, x2, x3]

\[ A = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{bmatrix} \]

Higher Order (Indirect) Pathways

$A^m$, where $m > 1$. What happens to the entries $a_{ij}$ as $m \to \infty$?

\[ A^2 = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix} \quad A^3 = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \quad A^4 = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 2 & 1 & 1 \end{bmatrix} \quad A^5 = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 1 & 1 \\ 2 & 1 & 2 \end{bmatrix} \]

Powers of a matrix!!

If, over all walk lengths, there is a way to get between two nodes, then they are reachable. One can therefore sum the powers of the matrix to see whether there are any gaps in the connectedness:

\[ X^{[R]} = X + X^2 + X^3 + \dots + X^{n-1} \]

Two nodes $x_i$ and $x_j$ are reachable if and only if $x_{ij}^{[R]} \geq 1$, and not reachable if it is 0.

MATRIX CALCULATIONS PRACTICE:
1. BY HAND
2. WITH MATLAB

Oyster reef model

[Figure: oyster reef flow diagram (Dame and Patten 1981), repeated from the earlier slide. Compartments: 1 Filter Feeders, 2 Deposited Detritus, 3 Microbiota, 4 Meiofauna, 5 Deposit Feeders, 6 Predators]

\[ A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix} \]

\[ A^2 = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 2 & 0 & 1 & 2 & 1 \\ 0 & 2 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix} \qquad A^3 = \begin{bmatrix} 0 & 2 & 1 & 2 & 3 & 1 \\ 0 & 4 & 2 & 2 & 3 & 2 \\ 0 & 2 & 2 & 2 & 2 & 1 \\ 0 & 3 & 1 & 2 & 3 & 1 \\ 0 & 2 & 1 & 2 & 3 & 1 \\ 0 & 2 & 0 & 1 & 2 & 1 \end{bmatrix} \]

\[ A^4 = \begin{bmatrix} 0 & 6 & 2 & 3 & 5 & 3 \\ 0 & 7 & 4 & 6 & 8 & 3 \\ 0 & 5 & 2 & 4 & 6 & 2 \\ 0 & 6 & 3 & 4 & 6 & 3 \\ 0 & 6 & 2 & 3 & 5 & 3 \\ 0 & 4 & 2 & 2 & 3 & 2 \end{bmatrix} \qquad A^5 = \begin{bmatrix} 0 & 11 & 6 & 8 & 11 & 5 \\ 0 & 17 & 7 & 11 & 17 & 8 \\ 0 & 12 & 5 & 7 & 11 & 6 \\ 0 & 13 & 6 & 9 & 13 & 6 \\ 0 & 11 & 6 & 8 & 11 & 5 \\ 0 & 7 & 4 & 6 & 8 & 3 \end{bmatrix} \]

\[ A^{10} = \begin{bmatrix} 0 & 519 & 241 & 353 & 518 & 242 \\ 0 & 759 & 354 & 519 & 760 & 353 \\ 0 & 518 & 241 & 354 & 519 & 241 \\ 0 & 595 & 277 & 406 & 595 & 277 \\ 0 & 519 & 241 & 353 & 518 & 242 \\ 0 & 354 & 165 & 241 & 353 & 165 \end{bmatrix} \]

\[ A^{20} = \begin{bmatrix} 0 & 1083304 & 504356 & 739169 & 1083304 & 504355 \\ 0 & 1587660 & 739168 & 1083304 & 1587660 & 739169 \\ 0 & 1083305 & 504355 & 739168 & 1083304 & 504356 \\ 0 & 1243524 & 578949 & 848491 & 1243524 & 578949 \\ 0 & 1083304 & 504356 & 739169 & 1083304 & 504355 \\ 0 & 739168 & 344136 & 504356 & 739169 & 344135 \end{bmatrix} \]

THANK YOU FOR YOUR ATTENTION
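For the matrix calculations practice, the powers and the reachability sum X + X^2 + … + X^(n−1) can also be computed with a few lines of code instead of by hand. The sketch below uses plain Python rather than MATLAB and is not part of the original slides: the helper names (`mat_mul`, `mat_add`, `mat_pow`, `reachability_sum`) are illustrative, and the two matrices are assumed to be the three-node digraph example and the oyster reef adjacency matrix.

```python
def mat_mul(X, Y):
    """Return the matrix product XY for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_add(X, Y):
    """Return the element-wise sum X + Y of two same-size matrices."""
    return [[a + b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(X, Y)]


def mat_pow(X, m):
    """Return X^m for an integer m >= 1 by repeated multiplication."""
    result = X
    for _ in range(m - 1):
        result = mat_mul(result, X)
    return result


def reachability_sum(X):
    """Return X + X^2 + ... + X^(n-1), where n is the number of nodes."""
    n = len(X)
    total = X
    power = X
    for _ in range(n - 2):
        power = mat_mul(power, X)
        total = mat_add(total, power)
    return total


# Assumption: transcribed from the three-node digraph example.
DIGRAPH = [
    [0, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
]

# Assumption: transcribed from the oyster reef adjacency matrix, with
# compartments ordered 1 Filter Feeders ... 6 Predators.
OYSTER = [
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0],
]

reach_digraph = reachability_sum(DIGRAPH)
reach_oyster = reachability_sum(OYSTER)

print("digraph A^5:", mat_pow(DIGRAPH, 5))
print("digraph reachability sum:", reach_digraph)
print("oyster A^10, first row:", mat_pow(OYSTER, 10)[0])
print("oyster reachability sum, first row:", reach_oyster[0])
```

In the digraph every off-diagonal entry of the sum is positive, so each node can reach every other node. In the oyster reef matrix the first column stays zero in every power; if rows are read as sources and columns as destinations (an assumption, though it agrees with flows such as f21 in the figure), this says that no compartment passes flow to the Filter Feeders.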