Uploaded by fayvre7

IntroRStudio

Introduction to R and RStudio
Welcome to R
- The R programming language began in 1992 as an effort to create a special-purpose language for statistical applications.
- R gained traction because it is a free, open source language, available to everyone and developed by a community of committed developers.
- Package: open source code distributed through the Comprehensive R Archive Network (CRAN), a worldwide repository of popular R code.
- R is an interpreted language, and the code you write is stored as a script.
  - The script is executed by the R interpreter, which processes the code.
  - As an interpreted language, R also lets you execute commands directly and see the result immediately.
The R Language
- The R language is available as a free download from the R Project website at:
  https://www.r-project.org
RStudio
- RStudio: an integrated development environment (IDE) that offers a graphical interface to assist in creating R code.
  - Allows users to manage code, monitor progress, and troubleshoot issues.
- The RStudio IDE comes in different versions.
  - For the purposes of this course, the open source version will be more than sufficient.
RStudio Desktop
Download the most recent version at:
https://posit.co/download/rstudio-desktop/
RStudio Environment
- Console Pane: appears in the lower-left corner. It lets you interact directly with the R interpreter: type a command and R executes it immediately.
- Script Pane: where you write R commands in a script file that you can save. An R script is simply a text file containing R commands. RStudio color-codes different elements of your code to make it easier to read.
- Environment Pane: where you can see the values of variables, datasets, and other objects that are currently stored in memory.
- Plots Pane: appears in the lower-right corner and contains any graphics that you generate in your R code.
R Packages
- Packages are the secret sauce of R, consisting of collections of code created by the community and shared for public use.
- Installing packages:
  - Use the install.packages() command.
  - Ex: installing the RWeka package
      install.packages("RWeka")
- Loading packages:
  - Use the library() command to load a package into the current session.
  - Ex: loading the RWeka package
      library(RWeka)
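A common pattern, sketched here as an assumption rather than something the slides prescribe, is to install a package only when it is missing and then load it. The helper name ensure_installed is hypothetical; requireNamespace(), install.packages(), and library() are part of a standard R installation. The demo call uses the stats package, which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package from CRAN only if it is not already available.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can now be loaded.
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call does not trigger a download.
ok <- ensure_installed("stats")
print(ok)

# Attach the package for the current session.
library(stats)
```

Passing "RWeka" instead of "stats" would follow the same path, with a download from CRAN on the first run.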
Writing & Running R Script
- Write a script in the script pane.
- To execute it, click the "Run" button in the script pane's toolbar.
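Because an R script is simply a text file of R commands, it can also be run from code with source(). The following is a minimal sketch, assuming a throwaway file created with tempfile(); the variables radius and area are made up for illustration.

```r
# Create a small script file; the two commands inside are placeholders.
script_path <- tempfile(fileext = ".R")
writeLines(c("radius <- 2", "area <- pi * radius^2"), script_path)

# source() executes every command in the file, in order.
# local = TRUE creates the script's variables in the environment that called source().
source(script_path, local = TRUE)

# The variables defined by the script are now available.
print(area)
```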
Data Types in R
- Logical: a simple binary variable that may have only two values: TRUE or FALSE.
- Numeric: data type that stores decimal numbers.
- Integer: data type that stores integers.
- Character: data type that is used to store text strings. A single string can hold up to 2^31 - 1 bytes.
- Factor: data type that is used to store categorical values. Each possible value of a factor is known as a level.
- Ordered Factor: a special factor data type where the order of the levels is significant. Ex: Low, Medium, and High.
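The data types above can be illustrated with one object of each kind. This is a sketch with made-up variable names and values; class() and levels() are used only to inspect the results.

```r
flag  <- TRUE       # logical
price <- 19.99      # numeric (decimal number)
count <- 3L         # integer (the L suffix requests an integer)
city  <- "Chicago"  # character

# Factor: categorical values; the levels are the distinct values.
color <- factor(c("red", "blue", "red"))

# Ordered factor: the levels have a meaningful order.
rating <- factor(c("Low", "High", "Medium"),
                 levels = c("Low", "Medium", "High"),
                 ordered = TRUE)

class(flag)    # "logical"
class(price)   # "numeric"
class(count)   # "integer"
class(city)    # "character"
levels(color)  # "blue" "red"

# Ordered factors support comparisons between levels.
rating[1] < rating[2]  # TRUE, because Low comes before High
```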
Vectors
- Vectors: a way to collect elements of the same data type in R together in a sequence.
  - Use the c() function to create a new vector.
- Each data element in a vector is called a component of that vector.
    names <- c('Mike', 'Renee', 'Richard', 'Matthew', 'Christopher')
- Once data is stored as a vector, you can access individual components. R numbers components starting from 1.
  - names[1] would output 'Mike'
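Indexing goes beyond a single position. The sketch below assumes the five-name vector from the slide, in the order Mike, Renee, Richard, Matthew, Christopher, and shows a few other ways to select components.

```r
names <- c('Mike', 'Renee', 'Richard', 'Matthew', 'Christopher')

names[1]         # "Mike": the first component
names[c(2, 4)]   # "Renee" "Matthew": several positions at once
names[-1]        # every component except the first
length(names)    # 5
```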
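- Functions such as mean(), median(), min(), and max() work on entire vectors at once.
- All components of a vector share one data type. If you mix types, c() converts everything to a common type (for example, numbers become text), and numeric functions such as mean() no longer work as expected.
- Vectors are combined into data structures that resemble spreadsheets, known in R as data frames.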
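A short sketch of these three points, using made-up ages and names: summary functions applied to a whole vector, the type conversion that happens when types are mixed, and two vectors combined into a data frame.

```r
ages <- c(34, 29, 41, 38)   # hypothetical values

mean(ages)     # 35.5
median(ages)   # 36
min(ages)      # 29
max(ages)      # 41

# Mixing a number with text turns the whole vector into character.
mixed <- c(34, "29")
class(mixed)   # "character"

# Each vector becomes one column of the data frame.
people <- data.frame(name = c("Ann", "Bo", "Cy", "Di"), age = ages)
nrow(people)       # 4
mean(people$age)   # 35.5
```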
Testing Data Types
- Use the class() function to return the data type of an object. Example:
    > x <- TRUE
    > class(x)
    [1] "logical"
- Use the length() function to return the number of components in a vector. Example:
    > x <- TRUE
    > length(x)
    [1] 1
Converting Data Types
- Use the following functions to convert to the corresponding data type:
  - as.logical()
  - as.numeric()
  - as.integer()
  - as.character()
  - as.factor()
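The conversion functions can be tried on a few literal values. This sketch uses made-up inputs and also shows two cases that tend to surprise newcomers: text that cannot be read as a number becomes NA, and converting a factor directly to a number returns its internal level codes.

```r
as.numeric("3.14")     # 3.14
as.integer("42")       # 42
as.integer(3.9)        # 3, the fractional part is dropped
as.character(100)      # "100"
as.logical("TRUE")     # TRUE
as.logical(0)          # FALSE

f <- as.factor(c("10", "20", "10"))
levels(f)                      # "10" "20"
as.numeric(f)                  # 1 2 1, the level codes rather than the labels
as.numeric(as.character(f))    # 10 20 10

# Text that is not a number converts to NA (R also issues a warning).
bad <- suppressWarnings(as.numeric("abc"))
bad                            # NA
```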
Missing Values
- R uses the special constant value NA to represent missing values in a dataset.
  - These values are different from blank or zero values.
- Use the is.na() function to test for missing values. It returns TRUE or FALSE for each component of an object.
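A small sketch with made-up scores shows how NA differs from zero and how is.na() reports on each component. The anyNA() function and the na.rm argument of mean() are standard R features that the slide does not mention.

```r
scores <- c(88, NA, 92, 0)   # hypothetical values; one is missing, one is a true zero

is.na(scores)        # FALSE TRUE FALSE FALSE
anyNA(scores)        # TRUE: at least one value is missing
sum(is.na(scores))   # 1: count of missing values

mean(scores)                 # NA, because a missing value makes the mean unknown
mean(scores, na.rm = TRUE)   # 60: NA is dropped, the zero still counts
```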