R: A programming environment for Data Analysis and Graphics

Silvia Liverani
Department of Statistics
University of Warwick
CSC, 24th April 2008
Outline
1 R
2 Statistical tools
3 Graphical tools
4 Usage in CSC
5 Examples
6 References
What do you need?
• Performance
• Functionality
• Extensibility
• Simplicity
• Compatibility
• Graphical Interface
• Low-cost
R Project
Authors Ross Ihaka and Robert Gentleman, Department of
Statistics, University of Auckland, New Zealand
Licence R is available as Free Software under the Free Software
Foundation's GNU General Public Licence, in source code form
Platform UNIX (FreeBSD, Linux), Windows, MacOS
Contributions Product of an international collaboration involving top
computational statisticians and computer language designers
Web site http://www.r-project.org
Why? What is R?
• All source code is published, so it can be checked and corrected by
expert statisticians
• Comprehensive technical documentation and user
contributed tutorials
• It is fully programmable, with its own sophisticated
computer language
• Easy to write your own functions
• Easy to write whole packages
• Exchange data in MS-Excel, text, and fixed and delimited
formats
• Easy importing and exporting of datasets
• Integrated suite of software facilities for data
manipulation, calculation and graphical display
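The points about writing your own functions and exchanging data can be sketched in a few lines. This is only an illustration: the helper `standardise`, the data frame `heights` and its values are made up for the example, and a temporary file stands in for a real dataset.

```r
# Hypothetical helper: rescale a numeric vector to mean 0 and sd 1.
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Small illustrative dataset (made-up values).
heights <- data.frame(id = 1:5, height_cm = c(150, 160, 170, 180, 190))
heights$z <- standardise(heights$height_cm)

# Export to a delimited text file, then import it again.
csv_path <- tempfile(fileext = ".csv")
write.csv(heights, csv_path, row.names = FALSE)
heights_back <- read.csv(csv_path)

print(heights_back)
```

`write.csv` and `read.csv` cover the delimited text case; other formats usually go through additional packages.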
What R does...
• data handling and storage: numeric, textual
• operators for calculations on arrays and matrices
• tools for data analysis
• high-level data analytic and statistical functions
• graphics
• programming language: loops, branching, subroutines
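A minimal sketch of the capabilities listed above, using only base R. The variable names and the helper `count_large` are hypothetical and chosen for illustration.

```r
# Numeric and textual data.
counts <- c(4, 8, 15, 16, 23, 42)
labels <- c("a", "b", "c")

# Operators for calculations on arrays and matrices.
A <- matrix(1:4, nrow = 2)       # filled column by column
B <- diag(2)                     # 2 x 2 identity matrix
elementwise <- A * 2             # element-by-element arithmetic
product     <- A %*% B           # matrix product
A_inverse   <- solve(A)          # matrix inverse

# High-level statistical functions.
avg <- mean(counts)
summary(counts)

# Programming language: a subroutine with a loop and branching.
count_large <- function(x, threshold) {
  n <- 0
  for (value in x) {
    if (value > threshold) n <- n + 1
  }
  n
}
count_large(counts, 10)
```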
...and what R does not
• is not a database, but connects to DBMSs (database
management systems)
• has no graphical user interfaces, but connects to Java,
Tcl/Tk
• the language interpreter can be very slow, but R allows you to call
your own C/C++ code
• no spreadsheet view of data, but connects to
Excel/MsOffice
• no professional / commercial
Statistical tools
1387 packages
base The R Base Package
class Functions for classification
cluster Functions for clustering (by Rousseeuw et al.)
ctest Classical Tests
eda Exploratory Data Analysis
EMV Estimation of Missing Values for a Data Matrix
Packages
fields Tools for spatial data
ggm Graphical Gaussian Models
gllm Generalised log-linear model
gss General Smoothing Splines
KernSmooth Functions for kernel smoothing for Wand & Jones (1995)
linprog Linear Programming and Optimization
MCMCpack Markov chain Monte Carlo (MCMC) Package
mle Maximum likelihood estimation
mva Classical Multivariate Analysis
sem Structural Equation Models
shapes Statistical shape analysis
sma Statistical Microarray Analysis
spatial Functions for kriging and point pattern analysis
splines Regression Spline Functions and Classes
survey Analysis of complex survey samples
survival Survival analysis, including penalised likelihood
tapiR Tools for accessing UK parliamentary information in R
ts Time Series Functions
tseries Time series analysis and computational finance
wavethresh Software to perform wavelet statistics and transforms
xtable Export tables to LaTeX or HTML
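As a rough sketch of how a package from the list is used: `splines` ships with the standard R distribution, so it can be loaded directly, while contributed packages would first need installing from CRAN. The model below is an arbitrary illustration on the built-in `cars` dataset, not an analysis from the talk.

```r
library(splines)   # part of the standard R distribution

# Contributed packages are installed from CRAN first, for example:
# install.packages("xtable")

# Regression spline fit on the built-in 'cars' dataset.
fit <- lm(dist ~ bs(speed, df = 4), data = cars)
n_coef <- length(coef(fit))   # intercept plus four spline basis terms
summary(fit)
```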
Graphical tools
• Multiple plots in a single graphic window
• Adjusting graphical parameters
• Labels and title; axis limits
• Types for plots and lines
• Colors and characters
• Controlling axis line
• Controlling tick marks
• Legend
• Adding text to the plot; controlling the text size
• Adding symbols to plots
• Examples and tutorials online
• R Graph Gallery
http://addictedtor.free.fr/graphiques/
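Several of the features listed above can be combined in one short base-graphics sketch. The data and labels are invented for illustration, and the plot is written to a temporary PDF (an assumption made so the example also runs without a display); drop `pdf()` and `dev.off()` to draw on screen.

```r
# Draw into a temporary PDF so the sketch also runs without a display.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)

old_par <- par(mfrow = c(1, 2))          # two plots in one graphic window

x <- seq(0, 2 * pi, length.out = 50)

# Left panel: line types, colours, title, axis labels and axis limits.
plot(x, sin(x), type = "l", col = "blue", lty = 1,
     main = "Sine and cosine", xlab = "x", ylab = "value",
     ylim = c(-1.5, 1.5))
lines(x, cos(x), col = "red", lty = 2)
legend("topright", legend = c("sin", "cos"),
       col = c("blue", "red"), lty = c(1, 2))

# Right panel: plotting characters, custom tick marks and added text.
plot(x, sin(x)^2, type = "p", pch = 19, cex = 0.6,
     xaxt = "n", main = "Points", xlab = "x", ylab = "sin(x)^2")
axis(1, at = c(0, pi, 2 * pi), labels = c("0", "pi", "2 pi"))
text(pi, 0.5, "a label", cex = 0.8)

par(old_par)                             # restore graphical parameters
dev.off()
```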
Some plots
demo(graphics)
demo(image)
demo(persp)
Using R
UNIX In the Shell Console
$ mkdir work
$ cd work
$ R
> ...
> q()
If commands are saved in commands.R
$ R CMD BATCH commands.R
WINDOWS Click on the R icon
Some commands
Comments            #
Vector assignment   x <- c(11,12,13,14,15)
Arithmetic          sqrt(4) * 2
Missing Values      NA
Logical Operator    x[2] == 12
Loops               for (i in 1:100) print(i)
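The commands above can be tried together in one session. This is a small sketch; the extra names `y` and `total` are introduced only for illustration.

```r
# A comment starts with '#'.
x <- c(11, 12, 13, 14, 15)   # vector assignment
sqrt(4) * 2                  # arithmetic, gives 4

y <- c(x, NA)                # NA marks a missing value
is.na(y)                     # TRUE only for the last element
mean(y, na.rm = TRUE)        # ignore the missing value when averaging

x[2] == 12                   # logical comparison on the second element

total <- 0
for (i in 1:100) total <- total + i   # loop: 1 + 2 + ... + 100
total
```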
There are many types of objects in R:
• Vectors
• Matrices
• Dataframes
• Functions
• Lists
• Factors
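One way to see these object types side by side is to create one of each. The names and values below are hypothetical, chosen only to keep the sketch short.

```r
v <- c(1.5, 2.5, 3.5)                          # vector
m <- matrix(1:6, nrow = 2, ncol = 3)           # matrix
f <- factor(c("low", "high", "low"))           # factor
d <- data.frame(score = v, level = f)          # data frame
l <- list(numbers = v, table = d, note = "x")  # list
double_it <- function(x) 2 * x                 # function

# Inspect each object.
class(v)
dim(m)
levels(f)
names(d)
length(l)
double_it(v)
```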
Examples
See script.R for examples
References
R Development Core Team (2006). R: A language and
environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria. ISBN
3-900051-07-0, URL www.R-project.org.
W. N. Venables, D. M. Smith and the R Development Core
Team (2008). An Introduction to R. URL
cran.r-project.org/doc/manuals/R-intro.pdf
Marta Nogaj (2004). R: A programming environment for
Data Analysis and Graphics. Talk at Geosciences and
Statistics. URL
www.ipsl.jussieu.fr/CLIMSTAT/CARGESE/TALKS/MARTA/Rproject.ppt
S.Liverani@warwick.ac.uk