ARGUMENT 'FUN' IS MISSING, WITH NO DEFAULT: AN R WORKSHOP

Outline
• The R "sales pitch"
• R Basics
• Data Management
• Descriptive Statistics in R
• Inferential Statistics in R
  – General Linear Model
  – Generalized Linear Model
  – Hierarchical Linear Modeling
  – Latent Variable Modeling

Why Should I Use R?
• Free 99
• It's as powerful as SAS and as user friendly as SPSS…really…
• You ain't cool unless you use R
• It's free…seriously

R Basics
• Do not write code directly into the R interface!
• #Comment #StatsAreCool #Rarrrgh
  – Yes, the # lets you add comments to your code
• R is case sensitive
  – A ≠ a
• <- is the assignment operator
  – A <- 3; a <- 4

R Basics
• Creating objects in R
  – Creating a scalar
      X <- 2
  – Creating a vector
      X <- c(2,2,4,5)
  – Creating a matrix
      X <- matrix(c(1,1,2,2,3,3),nrow=2,ncol=3)
      Y <- matrix(c(1,1,1,1,1,1),nrow=3,ncol=2)
  – Creating a dataframe
      A <- c(1,2,3,4)
      B <- c('T','F','T','F')
      ds <- data.frame(A,B)

R Basics
• Arithmetic
    2 + 2; 2 - 2; 2 * 3; 2 / 3
• Boolean Operators
    2 > 3; 3 < 6; 4 == 4
• Matrix Algebra
    X %*% Y
    t(X)
    ginv(X)

R Basics
• Packages in R
  – Like SPSS modules, but free…
  – Upside: Thousands of packages to do just about anything
  – Downside: Placing your trust in freeware…which I'm fine with, but some aren't
      library(MASS)
      ginv(X)

I'm an import-exporter: Database Management
• Importing from a text file
    Dataset <- read.table('filelocation.txt')
• Importing from a csv file
    Dataset <- read.csv('filelocation.csv')
• Foreign package to read SPSS data files
    library(foreign)
    Dataset <- read.spss('filelocation.sav', to.data.frame=TRUE)

Database Management
• Exporting R dataframes to csv
    write.csv(dataframe, 'filelocation.csv')
• Exporting R dataframes to a text file
    write.table(dataframe, 'filelocation.txt')
• Variables in a dataframe
  – Adding: ds$C <- c(4,3,2,1)
  – Deleting: ds <- ds[,-3]
  – Referencing: ds$A or ds[,1]

Database Management
• Indexing Dataframes
  – ds[,2] gives you column 2 of ds
  – ds[1,] gives you row 1 of ds
  – ds[2,2] gives you row 2, column 2 of ds

Descriptive Statistics
• Measures of central tendency
  – Mean: mean(X)
  – Median: median(X)
  – Mode: table(X) (a little roundabout, but oh well)
• Measures of dispersion
  – Variance: var(X)
  – Standard deviation: sd(X)

Descriptive Statistics
• Measures of Covariation
  – cov(X,Y): Covariance
  – cor(X,Y): Correlation

Caution!
• I will not be talking about any of the theoretical underpinnings as to when or why you should use one statistical method over another.
• We'll just be doing some PnP statistics…

General Linear Model
• Read Edwards & Lambert, 2007
[Path diagram with variables X, M, Y, and Z]

Generalized Linear Model
• Uses the generalized linear modeling function glm()
• Can handle DVs that are binomial, Poisson, or Gaussian (multinomial DVs need a different function, e.g. multinom() in the nnet package)
    glm(y ~ x1 + x2, family=binomial, data=LRDS)

Hierarchical Linear Model
• HLM allows you to look at between- and within-group variation
  – Employees nested within organizations
  – Repeated measures nested within an individual
• Variance Components Analysis

Latent Variable Modeling
• First we have to set up a measurement model:
[Diagram: latent variables LV1, LV2, and LV3 with indicators X1-X4 and Y1-Y8]

Latent Variable Modeling
• Then we have to set up the structural model:
[Diagram: structural paths among LV1, LV2, and LV3]
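
Runnable recap (a sketch, not workshop output)
The snippets from the R Basics, Database Management, Descriptive Statistics, and Generalized Linear Model slides can be strung together roughly as below. Several things here are assumptions made only for illustration: the vector is named v so it does not overwrite the matrix X, and the LRDS data frame is simulated with a fixed seed because the slides do not supply the real data. Treat it as a minimal sketch of how the pieces fit, not as the workshop's own script.

```r
# Recap sketch: object names follow the slides; all data values are illustrative.
library(MASS)  # provides ginv()

# Matrices from the "Creating objects" slide (matrix() fills column by column)
X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)
Y <- matrix(c(1, 1, 1, 1, 1, 1), nrow = 3, ncol = 2)

XY <- X %*% Y   # matrix product: 2x3 times 3x2 gives 2x2
Xt <- t(X)      # transpose: 3x2
Xg <- ginv(X)   # Moore-Penrose generalized inverse: 3x2

# Data frame from the slides, then add and drop a column
A <- c(1, 2, 3, 4)
B <- c('T', 'F', 'T', 'F')
ds <- data.frame(A, B)
ds$C <- c(4, 3, 2, 1)   # adding a variable
ds <- ds[, -3]          # deleting the third column again

# Descriptive statistics on a vector (named v here, an assumption,
# so the matrix X above stays intact)
v <- c(2, 2, 4, 5)
v_mean   <- mean(v)
v_median <- median(v)
v_var    <- var(v)
v_sd     <- sd(v)

# Mode via table(): pick the value(s) with the highest count
counts <- table(v)
v_mode <- as.numeric(names(counts)[counts == max(counts)])

# Logistic regression with glm(); LRDS is SIMULATED here (assumption),
# standing in for the data set named on the slide
set.seed(1)
LRDS <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
LRDS$y <- rbinom(100, size = 1,
                 prob = plogis(0.5 * LRDS$x1 - 0.5 * LRDS$x2))
fit <- glm(y ~ x1 + x2, family = binomial, data = LRDS)
print(summary(fit))
```

Because table() returns counts for every distinct value, the mode line returns all tied values when there is more than one mode.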