Computational Methods for Data Analysis – 2014/15
Lab 1: A Crash Intro to R
This lab closely follows the first three chapters of Baayen's Analyzing Linguistic
Data, available from Baayen's pages at
http://www.sfs.uni-tuebingen.de/~hbaayen/publications/baayenCUPstats.pdf
Getting started with R
- starting R
- the R console
- New documents
Basics (Baayen 1.1)
- R as a calculator:
> 2 + 3
- Assignment
> X <- 2+3
- Comments
# this is a comment
Data: Tables (Data Frames)
- library(languageR)
head(verbs, n=10)
- accessing elements of data frames (Baayen 1.3)
o cell
verbs[1, 5]
o column
verbs[, 5]
X <- verbs[, 5]
o row
verbs[1, ]
o column by column name
verbs$LengthOfTheme
- Changing a value
o verbs[1, 3] = "XXX"
- Creating contingency tables out of frames:
o xtabs( ~ RealizationOfRec + AnimacyOfRec, data = verbs)
- writing out
write.table(verbs, file = "dative3.txt")
- reading in
data = read.table("dative3.txt")
data = read.csv()
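The indexing, editing, and write/read steps above can also be tried without languageR. The following is a minimal sketch on a small made-up data frame; the name `toy`, its columns and values, and the temporary file are assumptions for illustration and are not part of the lab data.

```r
# Toy data frame (hypothetical values, not from languageR)
toy <- data.frame(
  Verb   = c("give", "send", "offer"),
  Length = c(2, 5, 3),
  stringsAsFactors = FALSE
)

cell      <- toy[1, 2]    # single cell: row 1, column 2
column    <- toy[, 2]     # whole column, returned as a vector
first_row <- toy[1, ]     # whole row, returned as a one-row data frame
by_name   <- toy$Length   # the same column, selected by name

toy[1, 1] <- "hand"       # change one value in place

# Contingency table of counts for one column
counts <- xtabs(~ Verb, data = toy)

# Write to a temporary file and read it back
path <- tempfile(fileext = ".txt")
write.table(toy, file = path)
toy2 <- read.table(path, header = TRUE, stringsAsFactors = FALSE)
```

Writing to `tempfile()` keeps the sketch from overwriting anything in the working directory; in the lab you would use a real file name such as the one above.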
Vectors
- creating a vector:
o rs = c(638, 799, 390, 569, 567)
- using the vector to select items from a data frame:
o verbs.rs = verbs[rs, ]
- creating a vector of integers in sequence:
o 1:5
- sorting a vector
o sort(rs)
- vectorization
o v1 * v2
- What vectors are for
o Basic statistics:
w <- rbinom(500, 4, .3)
mean(w)
sd(w)
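The outline uses v1 and v2 without defining them. The sketch below fills them with made-up values to show what element-wise (vectorized) arithmetic does, and fixes the random seed so that the rbinom example gives the same numbers on every run. The values of v1 and v2 and the choice of seed are assumptions for illustration.

```r
# Two small vectors (hypothetical values)
v1 <- c(1, 2, 3)
v2 <- c(10, 20, 30)

prod_v <- v1 * v2        # element-wise product: 10 40 90
idx    <- 1:3            # integers in sequence
sorted <- sort(c(638, 799, 390, 569, 567))  # ascending order

# rbinom draws random numbers, so fix the seed to make runs repeatable
set.seed(1)
w <- rbinom(500, 4, .3)  # 500 draws from Binomial(size = 4, prob = 0.3)
m <- mean(w)             # should be close to 4 * 0.3 = 1.2
s <- sd(w)               # should be close to sqrt(4 * 0.3 * 0.7)
```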
Other data types (R Cookbook, ch. 5)
- Factors
o verbs.rs$AnimacyOfRec
- Scalars
o Really just vectors with one element
o Fundamentally vectorized
- Matrices
o Vectors with dimension
A <- 1:6
dim(A)
print(A)
dim(A) <- c(2,3)
print(A)
o diagonal matrices:
diag(3)
- Lists
o Heterogeneous vectors
o Indexed by position and by name (like hash maps)
o Data frames as lists
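The data types listed above can be inspected directly in the console. The following is a short sketch using only base R; all variable names and values are made up for illustration.

```r
# A factor stores categorical values together with their levels
f <- factor(c("animate", "inanimate", "animate"))
levels(f)                 # "animate" "inanimate"

# A scalar is a vector of length one
x <- 5
length(x)                 # 1

# A matrix is a vector with a dim attribute; values fill column by column
A <- 1:6
dim(A) <- c(2, 3)
A[2, 3]                   # 6

# A list can mix types and be indexed by position or by name
item <- list(word = "give", length = 4, seen = TRUE)
item[[1]]                 # "give"
item$length               # 4

# A data frame is a list of equal-length columns
df <- data.frame(a = 1:2, b = c("x", "y"))
is.list(df)               # TRUE
df[["a"]]                 # 1 2
```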
Data exploration and plotting (Baayen, ch.2)
- library(MASS)
- mean(ratings$Length)
- median(ratings$Length)
- Histograms
truehist(ratings$Length, xlab="word length", col="grey")
- Saving plots
- Plotting
plot(ratings$Frequency, ratings$FamilySize)
- Boxplots
boxplot(lexdec$RT)
- Mosaic plots
verbs.xtabs = xtabs( ~ RealizationOfRec + AnimacyOfRec, data = verbs)
mosaicplot(verbs.xtabs, main="dative")
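The "Saving plots" item above has no code in the outline. One common approach, sketched here, is to open a file device, draw the plot, and close the device. The sketch uses base R's pdf() and hist() rather than truehist() so that it runs without MASS or languageR; the data values and the file name are assumptions for illustration.

```r
# Hypothetical values standing in for a column such as ratings$Length
len <- c(3, 4, 4, 5, 5, 5, 6, 6, 7, 9)

out <- file.path(tempdir(), "length_hist.pdf")  # hypothetical file name
pdf(out)                          # open a PDF file device
hist(len, xlab = "word length", col = "grey", main = "")
invisible(dev.off())              # close the device so the file is written
```

Any plotting call, including truehist(), plot(), boxplot(), or mosaicplot(), can go between opening and closing the device.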
Control and Functions
- defining a function:
foo = function (x) { …. }
- control structures
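As an illustration of the two items above, the sketch below defines a function that uses a for loop and an if/else branch. The function name `parity` and its task are made up for this example and are not taken from the lab.

```r
# Hypothetical function: label each number as "even" or "odd"
parity <- function(x) {
  out <- character(length(x))
  for (i in seq_along(x)) {        # for loop over positions
    if (x[i] %% 2 == 0) {          # if / else branch
      out[i] <- "even"
    } else {
      out[i] <- "odd"
    }
  }
  out                              # the last value is returned
}

parity(1:4)   # "odd" "even" "odd" "even"

# The same result without an explicit loop, using vectorized ifelse()
ifelse(1:4 %% 2 == 0, "even", "odd")
```

Because R is vectorized, the ifelse() form is usually preferred for simple element-wise decisions; the explicit loop is shown only to illustrate the control structures.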
Warmup exercise
- open a new document
- define a function that returns an identity matrix
- use getwd(), setwd() to set the working directory appropriately
- use source to load the document in the R console
PlotData exercise
- Download ex1data1.txt from the web pages
- Read the table into the variable data
data <- read.csv('ex1data1.txt', header=FALSE)
- Set variable X to the first column of data, y to the second column
o NB: in R the first column has index 1
- In a new document, define a function that plots two vectors x and y, and
source it
- Invoke your function on X and y