Longitudinal Network Analysis

Content
• Basic model assumptions in SIENA
• Exercise
• Interpretation of results
• Improvement of model
• Further relevant topics

Basic model assumptions in SIENA
What assumptions does SIENA make, and can you apply SIENA to test your hypotheses?

Let's start with a small movie…
• Communication in a classroom with two teachers
• All communication was observed.

Model assumptions 1 / 2
• Most observed networks in the social sciences are censored data. They describe the current state of a network, but this state is the outcome of unobserved processes and is subject to further change.
  – Plausible for many networks: friendship, trust, exchange.
  – Plausible for your network?
• Networks change in micro steps. Micro steps of actors in the network account for large changes in the observed networks (t1, t2, t3, …, tn).
• Markov process: For any point in time, the current state of the network determines probabilistically its further evolution, and there are no additional effects of the earlier past. All relevant information is therefore assumed to be included in the current state.
  – Social network change as an endogenous process: the social network is the social context that influences the probabilities of its own change.
  – But: one can include exogenous effects: constant covariates or changing covariates.

Model assumptions 2 / 2
• Actors control their outgoing ties = actor-based model: Actors change their outgoing ties on the basis of their own and others' attributes, their position in the network, and their perceptions of the rest of the network.
  – This is more plausible for directed graphs. In undirected graphs one actor will take the initiative.
  – Options for actors are to create a tie, withdraw a tie, or do nothing.
  – Theoretical problem of limited information: Can you justify that actors are aware of others' attributes or even the wider network (e.g., actors at distance two)?
• Bringing the individual back in (Kilduff & Krackhardt, 1994).
• Structural individualism (Udehn, 2002; Hedström, 2005)
• No more than one tie can change at a time:
  – Sequential change. Denies the possibility of coordinated action.

The stochastic estimation processes
1. Estimation of the frequency of opportunities an actor gets to change a tie (not doing anything is also a choice an actor can make)
  – How many micro steps does the model require to arrive at the observed network?
2. Estimation of user-specified effects on the probabilities of tie change
  – E.g., does an actor's attribute or the tendency to reciprocate help explain the observed network?
  – Parameter estimates are based on a simulation that uses t1 as the starting point to predict the subsequent observed network -> conditional method of moments estimation.
  – t1 is not modeled but only used as input.
3. The estimation process takes time! Depending on your network size and the number of parameters, it can take several hours to run one estimation.
  – Use fast computers
  – Use multiple computers
  – Define your models well

Defining a model
• Check whether the basic assumptions of SIENA agree with your model! If not, try another method. Social network analysis is (or should be) driven by theory, not by the method!
• The objective function is the part of SIENA that allows you to define how you expect actors to form ties in a network.
  – It is the rule of network behavior we assume in our theory.
  – As in linear statistical models, the objective function that governs the probability of changing a tie is a linear combination of effects specified by the user according to a theoretical model.
  – If an effect is estimated to be positive, an actor tends to make a choice that leads to a network state where the corresponding effect is higher. The converse applies when the effect is negative. If the parameter is estimated to be zero, the effect is irrelevant for actors' choices.
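
To make the role of the objective function more concrete, here is a minimal sketch in Python. It is not SIENA code. It assumes the multinomial-logit choice rule of stochastic actor-based models, in which the probability of a micro step is proportional to the exponential of the objective function for the resulting network state. The weights, the option labels, and the effect statistics are hypothetical toy values, expressed as changes relative to doing nothing.

```python
import math


def objective(weights, stats):
    """Linear combination of effect statistics (the objective function)."""
    return sum(weights[name] * stats[name] for name in weights)


def choice_probabilities(weights, options):
    """Multinomial-logit probabilities over the candidate micro steps."""
    scores = {label: objective(weights, stats) for label, stats in options.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    expo = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(expo.values())
    return {label: value / total for label, value in expo.items()}


# Hypothetical weights and toy statistics; assumptions for illustration only.
weights = {"reciprocity": 0.8, "homophily": 0.5}
options = {
    "do nothing": {"reciprocity": 0, "homophily": 0},
    "reciprocate tie": {"reciprocity": 1, "homophily": 0},
    "tie to similar actor": {"reciprocity": 0, "homophily": 1},
}

probs = choice_probabilities(weights, options)
best = max(probs, key=probs.get)  # the most likely micro step
for label, p in probs.items():
    print(f"{label}: {p:.3f}")
```

Note that the choice is probabilistic: the option with the highest objective function value is the most likely one, but the other options keep a nonzero probability.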
Example:
• Objective function: 0.8 * reciprocity + 0.5 * homophily
• Imagine the given function and that it is the middle actor's turn to make a choice. What will the choice be?

Time for an example:
• Objective function: 0.8 * reciprocity + (-0.5) * homophily
• Imagine the given function and that it is the middle actor's turn to make a choice. What will the choice be?

Basic effects: Effects you should consider controlling for
• Outdegree effect: Actors' basic tendency to form ties. If negative (usually the case), it indicates that actors are generally reluctant to form ties. If positive, it indicates that actors form ties no matter what.
• Reciprocity effect: Seems to be a basic feature of social structure (Gouldner, 1960; Wasserman & Faust, 1994)
• Transitivity: Also seems to be a basic feature of social networks (Davis, 1970; Holland & Leinhardt, 1970) (…but might not be well understood?)
In general, however, the choice of effects should be theoretically driven.

A typology of effects
• Degree-related effects, e.g., indegree popularity: endogenous effects
• Triadic effects, e.g., transitivity: endogenous effects
• Covariate effects, e.g., alter/receiver effect: can be both endogenous and exogenous effects

Getting SIENA started

A guideline for applying SIENA
• Data requirements: Panel waves >= 2; preferably > 3.
  – When the number gets high (let's say > 5), check whether effects are homogeneous or whether they change with time. See SIENA manual section 6.6.1.
• Advisable to have at least 20 actors. Technically, the number of actors is only restricted by your computer's working memory and its speed, and by the time you have to finish your thesis.
  – The larger the network, the more difficult it is to assume that each actor is a potential partner for any other actor in the network. Can you assume that for your data?
• Design your data collection in a way that you capture enough changes between ties.
  – A minimum of 40 changes cumulated over all successive panel waves is desired.
  – But you also don't want to have too many changes, because that could imply that your observations were too far apart and that you lost valuable information along the way.
• No less than 80% response. But actors may enter and leave the network -> see “composition change” in SIENA manual section 5.7 for an elegant solution.

Running SIENA
• This is done in 5 steps:
  1. Data
  2. Transformation
  3. Selection
  4. Model
  5. Results
• Example here: van de Bunt friendship data. Available at the SIENA homepage.

0. Getting started
• To start a new project, choose “Start with new project” (who would not have guessed that?)

0. Getting started
When running SIENA for the first time you need to define the directories where you store your files. After a first-time installation StocNet might ask you to do this right away.

1. Data
• Network and covariate data have to be entered in .txt or .DAT files and have to be tab separated (consult SIENA manual section 5 for other options).
• There cannot be blanks.
  – Network files are adjacency matrices.
  – Covariates are columns with as many rows as actors. Actors have the same order as in the network.
• Changing covariates: When the number of observations is m, you need to include m-1 columns.
  – Example: The first column contains the covariate at T1, which is then used to predict T2. The second column contains the covariate at T2, which is used to predict T3, etc.
• Constant covariates: A file can contain multiple constant covariates (e.g., demographics of an actor). E.g., first column age, second column gender, etc.
• Dyadic variables: Enter in the same format as network variables (adjacency matrix). Values between 0 and 255. Only integers.
  – For changing dyadic variables you need m-1 variables.

1. Data
  – Usually, you can copy/paste from Ucinet/Excel into .txt files and the result is automatically tab separated.
  – When using large data sets (several hundreds of actors) you might run into trouble when using Microsoft Notepad to manage .txt files.
Use some other software (e.g., EditPad Lite; it's free).

1. Data
1. Click “Add” to add network files in successive order
2. Choose the file (which you should already have stored in your dedicated network folder)
3. Click into the box to change the name of the network (e.g., 1, 2, 3, 4, 5, …)

1. Data
1. Click “Add” to add covariate files in any order
2. Choose the file (which you should already have stored in your dedicated folder)
3. Click into the boxes to change the names of the file and the covariates it contains. Click apply

2. Transformation
• SIENA can only handle dichotomized networks (0 & 1)
  – Work in progress: valued graphs might be possible in the near future
• Indicate missing values

2. Transformation
1. Choose all the networks and define the missing values (see data description)

2. Transformation: Networks
1. Choose all the networks and click on “Recode”
2. Recode the network into 1 and 0 (see data description of example: 1 & 2 -> 1; 3, 4, 5 -> 0)

2. Transformation: Attributes
1. Choose the covariate you want to recode and click on “Recode”
2. Recode the covariate into 1 and 0 (see data description: 1 & 2 -> 1; 3, 4, 5 -> 0)

3. Selection
• Here you can remove actors from the analysis.
• Removal does not reflect social reality. Removing an actor in reality would probably affect the whole network.
• If you want to test differences in effects for a range of actors, use covariates (e.g., dummies).

4. Model: Data specification
• The networks
• Dyadic variables
• Constant covariates: e.g., gender
• Changing covariates:
  – If endogenous, they should be modeled as a dependent variable (e.g., individuals' performance)
  – If exogenous, they are treated as independent changing covariates
• Composition change. See manual section 5.7

4. Model: Data selection
1. Click on “data selection”
2. Put networks (and dyadic covariates) in successive order into the box.
3. Put the file containing gender, program, and smoking into the constant covariate box

4. Model: Model specification
1. Click on “Model specification”
2. Choose the effects you want to test according to your theory. Check under “v”
3. Click “ok” and then “run” to start the estimation process

5. Results
1. Scroll to the bottom of the results section or click on “full report” and go to the end of the report
2. Results of the last estimation process can be found here. New ones are placed beneath.
3. Results will be deleted if you enter new data in the “data specification” section

5. Results
• Don't jump to the parameter estimates first!
• First, check if convergence is good, that is, whether your model describes your observed data well. If not, you cannot trust the parameter estimates!
  – Good convergence is indicated by t-ratios close to zero
  – t-ratios below 0.1 in absolute value indicate convergence.
• Check rate parameters: These reflect the unobserved changes an actor makes between two observations. You have to decide what is reasonable.
  – Remember: An actor can also decide to do nothing
• Significance of parameters / t-test: Divide the parameter estimate by its standard error and check in a t-test table whether the value is significant for an unlimited number of degrees of freedom.
  – An absolute value above 1.96 is significant at p < .05, two-tailed
• Parameter estimates above 2, and certainly above 5, are doubtful.
• Check the covariance/correlation matrix.
  – High correlations are not problematic in themselves but might explain high standard errors in your parameters. In this case, you might exclude one of the two variables and re-run the estimation.

Improving your model:
• Bad convergence is probably due to a mis-specified model.
  – You can add/remove effects. But again: This should be guided by theory.
  – Increase the multiplication factor, e.g., to 10, then 15, then 20, etc.
  – Decrease the initial value of the gain parameter, e.g., to 0.01, then 0.001, etc.
  – Increase the number of iterations (enough time to get a coffee…)
    • Should be 2000 for results to be reported.
  – One change at a time!
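
The two checks described under Results (convergence t-ratios below 0.1 in absolute value, then the parameter estimate divided by its standard error compared with 1.96) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of SIENA. The function names are hypothetical and all numbers are made up.

```python
def convergence_ok(t_ratios, threshold=0.1):
    """True when every convergence t-ratio is below the threshold in absolute value."""
    return all(abs(t) < threshold for t in t_ratios.values())


def significance(estimates, std_errors, critical=1.96):
    """Per effect: (estimate / standard error, flag for two-tailed p < .05)."""
    result = {}
    for name, estimate in estimates.items():
        t_value = estimate / std_errors[name]
        result[name] = (t_value, abs(t_value) > critical)
    return result


# Made-up numbers for illustration only; not output from any real SIENA run.
t_ratios = {"outdegree": 0.03, "reciprocity": -0.05, "transitivity": 0.08}
estimates = {"outdegree": -1.5, "reciprocity": 1.2, "transitivity": 0.2}
std_errors = {"outdegree": 0.3, "reciprocity": 0.4, "transitivity": 0.25}

if convergence_ok(t_ratios):
    # Only look at the estimates once convergence is acceptable.
    for name, (t_value, is_significant) in significance(estimates, std_errors).items():
        print(f"{name}: t = {t_value:.2f}, significant = {is_significant}")
else:
    print("Convergence is not good enough; do not interpret the estimates.")
```

The order of the two steps mirrors the advice above: the significance test is only run after the convergence check has passed.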
Improving your model:
Model -> Model specification -> Tab: Options

More advanced issues
• Evolution of covariates: Influence of ties or influence within networks
• Multilevel
• Endowment effects
• Interactions (between parameters, with time, rate parameter)
• Score type test -> SIENA manual section 9.1

Multilevel Analysis (SIENA manual section 14)

Meta Analysis
• Parameters are not constrained within networks
• Networks need to be of sufficient size
• Can differ in number of observation moments
• Preferred method because it makes less strict assumptions

Multi-Group Option
• Only rate parameters are not constrained within networks
• Networks are combined and, therefore, yield higher power
• Can differ in number of observation moments
• Preferred above the “structural zero” approach with dummies because it takes less time in SIENA

Structural Zeros
• All parameters are the same for all networks
• Networks are combined and, therefore, yield higher power
• Need to have the same number of observation moments
• Least preferred

* If one interacts sub-group dummies with rate parameters, the results are the same as in the multi-group option.

Useful information sources
• SIENA homepage http://stat.gamma.rug.nl/siena.html
• Yahoo StocNet user group http://tech.groups.yahoo.com/group/stocnet/

Notes
• The literature on SIENA spends some effort on explaining relative effect size. However, in the social sciences we are generally interested in the significance of an effect and not in its relative effect size, because adding or removing an effect would change the relative effect size… and how do we know we added all the effects that “truly” predict the network?