ARGUMENT 'FUN' IS MISSING, WITH NO DEFAULT: AN R WORKSHOP
Outline





The R “sales pitch”
R Basics
Data Management
Descriptive Statistics in R
Inferential Statistics in R
  – General Linear Model
  – Generalized Linear Model
  – Hierarchical Linear Modeling
  – Latent Variable Modeling
Why Should I Use R?


• Free 99
• It’s as powerful as SAS and as user friendly as SPSS…really…
• You ain’t cool unless you use R
• It’s free…seriously
R Basics
• Do not write code directly into the R interface!
• Yes, the # lets you add comments to your code
  – #Comment #StatsAreCool #Rarrrgh
• R is case sensitive
  – A≠a
• <- is the assignment operator
  – A <- 3; a <- 4
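The three points above fit in a few lines. This is just a small sketch to paste into a script file; the values are the ones from the slide.

```r
# Everything after a '#' is a comment and is ignored by R
A <- 3        # assign 3 to upper-case A
a <- 4        # lower-case a is a completely separate object
same <- A == a  # compare the two values
print(same)     # FALSE: A holds 3 and a holds 4
```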
R Basics
• Creating objects in R
  – Creating a scalar
    • X <- 2
  – Creating a vector
    • X <- c(2,2,4,5)
  – Creating a matrix
    • X <- matrix(c(1,1,2,2,3,3),nrow=2, ncol=3)
    • Y <- matrix(c(1,1,1,1,1,1),nrow=3,ncol=2)
  – Creating a dataframe
    • A <- c(1,2,3,4)
    • B <- c('T','F','T','F')
    • ds <- data.frame(A,B)
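It helps to look at what you just made. A short sketch using the objects from the slide; note that matrix() fills column by column unless told otherwise, and that 'T' and 'F' in quotes are text, not logical values.

```r
# Matrix from the slide: values are filled down the columns
X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)
print(X)        # each row reads 1 2 3
print(dim(X))   # 2 rows, 3 columns

# byrow = TRUE fills across the rows instead
Xrow <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3, byrow = TRUE)

# Dataframe from the slide
A <- c(1, 2, 3, 4)
B <- c('T', 'F', 'T', 'F')   # quoted, so these are strings
ds <- data.frame(A, B)
str(ds)         # compact summary of columns and their types
```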
R Basics

• Arithmetic
  – 2 + 2; 2-2; 2*3; 2/3
• Boolean Operators
  – 2 > 3; 3 < 6; 4 == 4
• Matrix Algebra
  – X%*%Y
  – t(X)
  – ginv(X)
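A runnable sketch of the operators above, reusing the X and Y matrices defined earlier in the workshop. ginv() is left out here because it needs a package to be loaded first.

```r
# Arithmetic and comparisons
four <- 2 + 2
is_bigger <- 2 > 3     # FALSE
is_equal <- 4 == 4     # TRUE

# Matrices from the earlier slide
X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)
Y <- matrix(c(1, 1, 1, 1, 1, 1), nrow = 3, ncol = 2)

P <- X %*% Y   # matrix product: (2 x 3) times (3 x 2) gives 2 x 2
Xt <- t(X)     # transpose: 3 x 2

print(P)
print(dim(Xt))
```

Using * instead of %*% would attempt element-by-element multiplication, which fails here because X and Y have different shapes.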
R Basics

• Packages in R
  – Like SPSS modules, but free…
  – Upside: Thousands of packages to do just about anything
  – Downside: Placing your trust in freeware…which I’m fine with, but some aren’t
• library(MASS)
  – ginv(X)
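The usual workflow is install once, then load in every session. A minimal sketch, assuming MASS is already available (it normally ships with R as a recommended package, so the install line is commented out):

```r
# install.packages("MASS")   # one-time download from CRAN, if it is missing
library(MASS)                # load the package for this session

X <- matrix(c(1, 1, 2, 2, 3, 3), nrow = 2, ncol = 3)
G <- ginv(X)                 # generalized (Moore-Penrose) inverse, 3 x 2

# The package::function form calls ginv() without loading MASS first
G2 <- MASS::ginv(X)

# A generalized inverse satisfies X %*% G %*% X == X (up to rounding)
print(all.equal(X %*% G %*% X, X))
```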
I’m an import-exporter: Database Management
• Importing from a text file
  – Dataset <- read.table('filelocation.txt')
• Importing from a csv file
  – Dataset <- read.csv('filelocation.csv')
• Foreign package to read SPSS data files
  – library(foreign)
  – Dataset <- read.spss('filelocation.sav', to.data.frame=TRUE)
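To try the import functions without a file on disk, both accept a text = argument. The sketch below uses made-up column names (id, score) purely for illustration. One difference worth knowing: read.csv() assumes a header row, while read.table() only does so if you pass header = TRUE.

```r
# Hypothetical comma-separated content, standing in for a .csv file
csv_text <- "id,score\n1,2.5\n2,3.0\n3,4.5"
Dataset <- read.csv(text = csv_text)

# Hypothetical whitespace-separated content, standing in for a .txt file
txt_text <- "id score\n1 2.5\n2 3.0\n3 4.5"
Dataset2 <- read.table(text = txt_text, header = TRUE)

str(Dataset)    # 3 rows, 2 columns
str(Dataset2)   # same data, read from the text-file layout
```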
Database Management

• Exporting R dataframes to csv
  – write.csv(dataframe, 'filelocation.csv')
• Exporting R dataframe to text file
  – write.table(dataframe, 'filelocation.txt')
• Variables in a dataframe
  – Adding: ds$C <- c(4,3,2,1)
  – Deleting: ds <- ds[,-3]
  – Referencing: ds$A or ds[,1]
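Here is a round trip that strings these steps together: add a column, drop one, export, and read the file back. It is only a sketch; the temporary file from tempfile() stands in for 'filelocation.csv', and row.names = FALSE is my addition so that the row numbers are not written as an extra column.

```r
ds <- data.frame(A = c(1, 2, 3, 4), B = c('T', 'F', 'T', 'F'))

ds$C <- c(4, 3, 2, 1)    # adding: ds now has columns A, B, C
ds$D <- ds$A + ds$C      # a second new column built from existing ones
ds <- ds[, -3]           # deleting: drops the third column (C)
print(names(ds))         # "A" "B" "D"

# Export to a temporary csv file, then import it again
out_file <- tempfile(fileext = ".csv")
write.csv(ds, out_file, row.names = FALSE)
back <- read.csv(out_file)

print(back$A)            # referencing by name
print(back[, 1])         # referencing by position, same column
```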
Database Management

• Indexing Dataframes
  – ds[,2] gives you column 2 of ds
  – ds[1,] gives you row 1 of ds
  – ds[2,2] gives you row 2 column 2 of ds
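The same three indexing calls on the ds dataframe from earlier, as a quick sketch. The pattern is always ds[row, column], and leaving one side blank means "all of them".

```r
ds <- data.frame(A = c(1, 2, 3, 4), B = c('T', 'F', 'T', 'F'))

col2 <- ds[, 2]    # every row of column 2 (the B column), as a vector
row1 <- ds[1, ]    # every column of row 1, still a dataframe
cell <- ds[2, 2]   # the single value in row 2, column 2

print(col2)
print(row1)
print(cell)
```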
Descriptive Statistics

• Measures of central tendency
  – Mean: mean(X)
  – Median: median(X)
  – Mode: table(X) (A little roundabout, but oh well)
• Measures of dispersion
  – Variance: var(X)
  – Standard deviation: sd(X)
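A sketch of all five on the vector from the R Basics slide. Base R has no function that returns the statistical mode, so the last step pulls the most frequent value out of table(X); that bit is one way to do it, not the only one.

```r
X <- c(2, 2, 4, 5)

m <- mean(X)      # 3.25
med <- median(X)  # 3

# Mode: count each value, then keep the value(s) with the highest count
counts <- table(X)
mode_value <- names(counts)[counts == max(counts)]   # "2" (returned as text)

v <- var(X)       # sample variance (divides by n - 1): 2.25
s <- sd(X)        # sample standard deviation: 1.5

print(c(mean = m, median = med, var = v, sd = s))
print(mode_value)
```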
Descriptive Statistics

• Measures of Covariation
  – cov(X,Y): Covariance
  – cor(X,Y): Correlation
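A quick sketch with two made-up vectors (the Y values are invented for illustration). It also checks the textbook link between the two: the correlation is the covariance divided by the product of the standard deviations.

```r
X <- c(2, 2, 4, 5)
Y <- c(1, 3, 4, 6)   # hypothetical values, same length as X

cv <- cov(X, Y)      # sample covariance
r <- cor(X, Y)       # Pearson correlation by default

# Correlation is the covariance rescaled by both standard deviations
r_by_hand <- cv / (sd(X) * sd(Y))

print(cv)
print(r)
print(all.equal(r, r_by_hand))
```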
Caution!
I will not be talking about any of the theoretical
underpinnings as to when or why you should use one
statistical method over another.
We’ll just be doing some PnP statistics…
General Linear Model

• Read Edwards & Lambert, 2007
[Figure: path diagram with variables X, M, Z, and Y]
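The slide does not show code, so here is a bare-bones sketch of fitting a general linear model with lm(), R's base function for it. The data are simulated, the coefficients are arbitrary, and the variable names are only borrowed from the diagram; this is not the Edwards & Lambert (2007) procedure, just the basic lm() call.

```r
set.seed(42)   # make the simulated data reproducible
n <- 100

# Simulated data; the effect sizes here are arbitrary assumptions
d <- data.frame(X = rnorm(n), Z = rnorm(n))
d$M <- 0.5 * d$X + rnorm(n)
d$Y <- 0.3 * d$X + 0.4 * d$M + rnorm(n)

# Outcome on the left of ~, predictors on the right
fit <- lm(Y ~ X + M + Z, data = d)

summary(fit)   # coefficients, standard errors, R-squared
coef(fit)      # just the estimated coefficients
```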
Generalized Linear Model

• Uses the generalized linear modeling function
  – glm()
  – Can handle DVs that are binomial, Poisson, or Gaussian; multinomial DVs need a separate function such as multinom() in the nnet package
  – glm(y ~ x1 + x2, family=binomial, data=LRDS)
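To make the glm() call above runnable, the sketch below simulates a stand-in for LRDS. The real workshop dataset is not shown on the slide, so the data and the coefficients used to generate them are assumptions.

```r
set.seed(1)   # reproducible simulated data
n <- 200

# Simulated stand-in for the LRDS dataset
LRDS <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
p <- plogis(-0.5 + 1.0 * LRDS$x1 - 0.5 * LRDS$x2)   # true probabilities
LRDS$y <- rbinom(n, size = 1, prob = p)             # 0/1 outcome

# Logistic regression: binomial family with the default logit link
fit <- glm(y ~ x1 + x2, family = binomial, data = LRDS)

summary(fit)     # coefficients are on the log-odds scale
exp(coef(fit))   # exponentiate to get odds ratios
```

Swapping family=binomial for family=poisson gives a Poisson regression for count outcomes with the same formula syntax.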
Hierarchical Linear Model

• HLM allows you to look at between- and within-group variation
  – Employees nested within organizations
  – Repeated measures nested within an individual
  – Variance Components Analysis
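The slide does not name a package, so this sketch assumes nlme (it normally ships with R; lme4 is a common alternative). It simulates employees nested within organizations, fits a random-intercept model, and splits the variance into between- and within-organization parts. All data and variable names are made up.

```r
library(nlme)
set.seed(1)

n_org <- 20   # hypothetical number of organizations
n_emp <- 10   # hypothetical employees per organization

org <- factor(rep(1:n_org, each = n_emp))
org_effect <- rnorm(n_org, sd = 1)            # between-organization differences
d <- data.frame(
  org = org,
  y = 5 + org_effect[as.integer(org)] + rnorm(n_org * n_emp, sd = 2)
)

# Intercept-only model with a random intercept for each organization
fit <- lme(y ~ 1, random = ~ 1 | org, data = d)

# Variance components (VarCorr returns text, so convert to numbers)
vc <- VarCorr(fit)
between <- as.numeric(vc["(Intercept)", "Variance"])
within <- as.numeric(vc["Residual", "Variance"])

# Intraclass correlation: share of variance that lies between organizations
icc <- between / (between + within)
print(icc)
```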
Latent Variable Modeling
First we have to set up a measurement model:
[Figure: measurement model diagram with latent variables LV1, LV2, and LV3 and indicators X1 to X4, Y1 to Y4, and Y5 to Y8]
Latent Variable Modeling
Then we have to set up the structural model:
[Figure: structural model diagram relating LV1, LV2, and LV3]