Introduction to R
Lecture 1: Getting Started
Andrew Jaffe
8/30/10
Lecture 1
• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
About the Course
• Series of 7 seminars
• Covers the usage of R
– Platform for beginning analyses
– NOT covering statistics
– Good programming etiquette
• Bring your laptop – there will be breaks to
allow you to practice the code
About the Course
• This seminar is 1 unit pass/fail
• To pass, attend 5 out of 7 seminars
• Very little outside work
About the Course
• Some learning objectives include:
– Importing/exporting data
– Data management
– Performing calculations
– Recoding variables
– Producing graphics
– Installing packages
– Writing functions
About the Course
• Course communication via E-mail
• Lectures and code will be hosted on my
webpage
– http://www.biostat.jhsph.edu/~ajaffe/rseminar.html
About the Instructor
• 3rd year PhD student in Genetic Epi
program, concurrent MHS in
Bioinformatics
• Learned R five years ago; have been using
it regularly for the last two
What is R?
• R is a language and environment for
statistical computing and graphics
• R is an open source implementation of
the S language, which was developed at
Bell Laboratories
• R is both open source and open
development
http://www.r-project.org/
What is R?
• Pros:
– Free
– Tons of packages, very flexible
– Multiple datasets at any given time
• Cons:
– Much more “programming” oriented
– Minimal interface
These are my personal opinions
What is R?
• Often a good first step for data
cleaning and manipulation
• Then, export data to STATA or SAS for Epi
analyses
What is R?
[Figure: the R console and a script window]
Installing R
• http://cran.r-project.org/
Installing R - Windows
• Windows: click “base” and download
Installing R - Windows
• Click the link to the latest build
Installing R - Mac
• Mac: click the latest package’s .pkg file
Installing R
• Double click the downloaded file
• Hit ‘next’ a few times
• Use default settings
• Finish installing
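
Once the installer finishes, one quick way to confirm that R works is to open it and type the two commands below into the console. This is only a suggested check, not part of the installer; the output differs from machine to machine.

```r
# Print the version of R that is currently running
R.version.string

# Show the platform and the packages loaded in this session
sessionInfo()
```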
Installing a Text Editor
• Windows: R’s built-in text editor is terrible
– It’s essentially Windows Notepad
– We will download a much better one
• Mac: R’s built-in text editor is sufficient
– Color coding, signals parenthesis closing, etc
– I suggest using this until you think you need a
better one
Installing a Text Editor
• I prefer Notepad++:
– http://notepad-plus-plus.org/
– Download the current version:
http://download.tuxfamily.org/notepadplus/5.7/npp.5.7.Installer.exe
– Install on your computer using defaults
Interfacing with R
• Scripts: documents that contain
reproducible R code and functions that
you can send to the console (and save)
– Files are designated with the “.R” extension
– You can “source” scripts (more later)
• Console: Type commands directly into the
console
– Good for looking at your data, trying things,
and plotting
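
A minimal sketch of what "sourcing" a script means. The file name and its two lines are made up for illustration; a temporary file stands in for a real saved ".R" script.

```r
# Create a small script in a temporary file (hypothetical contents)
script_path <- tempfile(fileext = ".R")
writeLines(c("# demo script", "y <- 2 + 3"), script_path)

# source() runs every line of the script in the current session
source(script_path)

# y was created by the script and is now available in the console
y
```

Typing the same two lines directly into the console gives the same result, but the script can be saved and re-run later.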
Interfacing with R - Mac
• Mac: File → New Script
• This opens the default text editor
• To send a line of code to the R console,
press Apple+Enter when the cursor is
anywhere on that line
• Highlight chunks of code and press
Apple+Enter to send
Interfacing with R - Windows
• Using the default text editor, pressing
Ctrl+R sends lines to the console
• However, we want to use Notepad++
• We need to download one more thing…
Interfacing with R - Windows
• “NppToR”: Notepad++ to R
• http://sourceforge.net/projects/npptor/
• It must be running when R and Notepad++
are open
• When properly configured, press F8 to
send lines of code, or highlighted chunks,
to the console
• I will help configure this after class today
Interfacing with R – Windows
• More detailed instructions for installing
NppToR
• http://sourceforge.net/apps/mediawiki/npptor/index.php?title=Installing
Writing Scripts
• The comment symbol is # (pound) in R
• Comment liberally - you should be able to
understand a script after not seeing it for 6
months
• Lines of #’s are useful to separate sections
• Useful for designating headers
Writing Scripts
#################
# Title: Demo R Script
# Author: Andrew Jaffe
# Date: 7/30/10
# Purpose: Demonstrate comments in R
###################
# this is a comment, nothing to the right of it gets read
# this # is still a comment – you can use as many #’s as you want
# sometimes you have a really long comment, like explaining what you
# are doing for a step in analysis. Take it to a second line
Writing Scripts
• Some common etiquette:
– You can use spaces (more generally “white
space”) within functions and commands
liberally as well
– Try to keep a reasonable number of
characters per column – many commands
can be broken into multiple lines
– More to come later…
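
A short sketch of the whitespace and line-length points above. The variable names a and b are only for illustration.

```r
# The same call, first cramped and then with spaces
a<-round(10/3,digits=2)
a <- round(10 / 3, digits = 2)

# A long command can be split over several lines;
# R keeps reading until the parentheses are closed
b <- round(10 / 3,
           digits = 2)

# Both versions give the same value
identical(a, b)
```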
R as a Calculator
• The R console functions as a full calculator
• Try to play around with it:
+, -, *, / are add, subtract, multiply, and divide
^ or ** is power
( and ) work with order of operations
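
A few lines to try in the console, one per operator listed above. The numbers are arbitrary.

```r
2 + 3        # add: 5
10 - 4       # subtract: 6
6 * 7        # multiply: 42
9 / 2        # divide: 4.5
2 ^ 3        # power: 8
2 ** 3       # same as ^: 8
2 + 3 * 4    # multiplication happens first: 14
(2 + 3) * 4  # parentheses change the order: 20
```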
Assignment
• The assignment operator: assigning a
value to a name
• R accepts two operators: "<-" and "="
– e.g., x=8 (remember whitespace: x = 8, x <- 8)
• Variable names are case-sensitive
– i.e., X and x are different
• Set x = 8, and try using calculator
functions on x
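
A sketch of the exercise above. The second variable X and its value are added only to show case sensitivity.

```r
x <- 8       # assign with <-
x = 8        # the same assignment with =
x + 2        # 10
x / 4        # 2
x ^ 2        # 64

X <- 100     # a separate variable: names are case-sensitive
x == X       # FALSE
```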
Assignment
• ‘Assignment’ literally puts whatever is on
the right side of the operator into your left-hand side variable
– Note that although you can name variables
anything, you might run into some issues
naming things the same as default R
functions. Notepad++ turns functions red/pink so
you know…
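
One illustration of the naming issue, using the built-in function c() as the example. The choice of c and the variable vec are assumptions for this sketch; any function name behaves similarly.

```r
c <- 10             # legal, but c is also R's built-in combine function
vec <- c(1, 2, 3)   # still works: R looks for a function when a name is called
c                   # the bare name now returns the variable, 10
rm(c)               # delete the variable; c is only the function again
```

R usually copes, but code like this is confusing to read, which is a good reason to pick names that are not already functions.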
Examples of assignment,
introducing R data
Enough to get R up and running if this
is the only class you attend. We will
see them in much more detail over the
next three sessions
Assignment
status <- c("case", "case", "case",
            "control", "control", "control")
status
class(status)
table(status)
factor(status)
# alternatively:
status <- c(rep("case", 3), rep("control", 3))
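
A sketch of what those commands return. The each argument of rep() and the name status_f are not on the slide; they are used here as one more way to write the same thing.

```r
# Same vector as above, built with rep() and its each argument
status <- rep(c("case", "control"), each = 3)

class(status)               # "character"
table(status)               # counts: 3 case, 3 control

# factor() stores the values as categories with levels
status_f <- factor(status)
class(status_f)             # "factor"
levels(status_f)            # "case" "control"
```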
Assignment
• web <- "http://www.biostat.jhsph.edu/~ajaffe/code/lec1_code.R"
– class(web)
– source(web)
• You also don’t have to save tables/data you find
online to your disk (note read.table works for
most things – below aren’t tables though)
– scan(web, what=character(0), sep = "\n")
– scan("http://www.google.com", what=character(0))
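
For data that really is a table, read.table can be used in the same spirit. The sketch below does not need a connection: the three text lines are made-up stand-ins for a file hosted online, passed through the text argument. With a real hosted table, the URL string would be given as the first argument instead.

```r
# Hypothetical table contents, one string per line of the "file"
lines <- c("id sex status",
           "1 male case",
           "2 female control")

# header = TRUE uses the first line as column names
tab <- read.table(text = lines, header = TRUE)
tab
```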
Assignment
mat <- matrix(c(1,2,3,4), nrow = 2, ncol = 2,
byrow = TRUE) # this is sourced in
class(mat)
mat
mat + mat
mat * mat
mat %*% mat
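
The difference between the last two lines is easy to miss, so here is the same matrix with the results written out. The names elementwise and matprod are only for this sketch.

```r
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)
# mat is:
#   1 2
#   3 4

elementwise <- mat * mat    # multiplies matching entries
#   1  4
#   9 16

matprod <- mat %*% mat      # matrix multiplication (rows times columns)
#    7 10
#   15 22
```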
Assignment
• class(dat) # dat is also sourced in
• head(dat)
• table(dat$sex, dat$status)
• …To be continued…
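
If the sourced script is not available, a small stand-in makes it possible to try the same commands. The data frame below is invented for illustration and is not the dat from the course script.

```r
# Hypothetical stand-in for dat: six people, two columns
dat <- data.frame(
  sex    = c("M", "F", "F", "M", "F", "M"),
  status = c("case", "case", "case", "control", "control", "control")
)

class(dat)                         # "data.frame"
head(dat)                          # first rows (all six here)
tab <- table(dat$sex, dat$status)  # sex by status counts
tab
```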
Questions?