R: A programming environment for Data Analysis and Graphics

Silvia Liverani
Department of Statistics, University of Warwick
CSC, 24th April 2008

Outline
1 R
2 Statistical tools
3 Graphical tools
4 Usage in CSC
5 Examples
6 References

What do you need?
• Performance
• Functionality
• Extensibility
• Simplicity
• Compatibility
• Graphical Interface
• Low-cost

R Project
Authors: Ross Ihaka and Robert Gentleman, Department of Statistics, University of Auckland, New Zealand
Licence: R is available as Free Software, in source code form, under the Free Software Foundation's GNU General Public Licence
Platform: UNIX (FreeBSD, Linux), Windows, MacOS
Contributions: Product of an international collaboration between top computational statisticians and computer language designers
Web site: http://www.r-project.org

Why? What is R?
• All source code is published, so expert statisticians can check it for correctness
• Comprehensive technical documentation and user-contributed tutorials
• It is fully programmable, with its own sophisticated computer language
• Easy to write your own functions
• Easy to write whole packages
• Exchange data in MS-Excel, text, and fixed and delimited formats
• Easy importing and exporting of datasets
• Integrated suite of software facilities for data manipulation, calculation and graphical display

What R does...
• data handling and storage: numeric, textual
• operators for calculations on arrays and matrices
• tools for data analysis
• high-level data analytic and statistical functions
• graphics
• programming language: loops, branching, subroutines

... and what R does not
• is not a database, but connects to DBMSs (database management systems)
• has no graphical user interfaces, but connects to Java, Tcl/Tk
• language interpreter can be very slow, but allows calling your own C/C++ code
• no spreadsheet view of data, but connects to Excel/MS Office
• no professional / commercial support

Statistical tools
1387 packages
base         The R Base Package
class        Functions for classification
cluster      Functions for clustering (by Rousseeuw et al.)
ctest        Classical Tests
eda          Exploratory Data Analysis
EMV          Estimation of Missing Values for a Data Matrix

Packages
fields       Tools for spatial data
ggm          Graphical Gaussian Models
gllm         Generalised log-linear model
gss          General Smoothing Splines
KernSmooth   Functions for kernel smoothing for Wand & Jones (1995)
linprog      Linear Programming and Optimization
MCMCpack     Markov chain Monte Carlo (MCMC) Package
mle          Maximum likelihood estimation
mva          Classical Multivariate Analysis
sem          Structural Equation Models
shapes       Statistical shape analysis
sma          Statistical Microarray Analysis
spatial      Functions for kriging and point pattern analysis
splines      Regression Spline Functions and Classes
survey       Analysis of complex survey samples
survival     Survival analysis, including penalised likelihood
tapiR        Tools for accessing UK parliamentary information in R
ts           Time Series Functions
tseries      Time series analysis and computational finance
wavethresh   Software to perform wavelet statistics and transforms
xtable       Export tables to LaTeX or HTML

Graphical tools
• Multiple plots in a single graphic window
• Adjusting graphical parameters
• Labels and title; axis limits
• Types for plots and lines
• Colors and characters
• Controlling axis line
• Controlling tick marks
• Legend
• Putting text to the plot; controlling the text size
• Adding symbols to plots
• Examples and tutorials online
• R Graph Gallery http://addictedtor.free.fr/graphiques/

Some plots
demo(graphics)
demo(image)
demo(persp)

Using R
UNIX
In the shell console:
$ mkdir work
$ cd work
$ R
> ...
> q()
If commands are saved in commands.R:
$ R CMD BATCH commands.R
WINDOWS
Click on the R icon

Some commands
Comments            #
Vector assignment   x <- c(11,12,13,14,15)
Arithmetic          sqrt(4) * 2
Missing values      NA
Logical operator    x[2] == 12
Loops               for (i in 1:100) print(i)

There are many types of objects in R:
• Vectors
• Matrices
• Data frames
• Functions
• Lists
• Factors

Examples
See script.R for examples

References
R Development Core Team (2006). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
ISBN 3-900051-07-0, URL www.R-project.org.
W. N. Venables, D. M. Smith and the R Development Core Team (2008). An Introduction to R. URL cran.r-project.org/doc/manuals/R-intro.pdf
Marta Nogaj (2004). R: A programming environment for Data Analysis and Graphics. Talk at Geosciences and Statistics. URL www.ipsl.jussieu.fr/CLIMSTAT/CARGESE/TALKS/MARTA/Rproject.ppt

S.Liverani@warwick.ac.uk
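
Appendix: Some commands, sketched

The commands listed under "Some commands" and a few of the options listed under "Graphical tools" can be strung together into one short script. This is a minimal sketch, not the contents of script.R, which these slides do not reproduce. The variable names (a, y, squares, m, df, f, l, half), the sample values and the plot titles are assumptions made for illustration, and the loop is shortened from 1:100 to 1:5. Only base R functions are used, and the plots go to a null PDF device so that the sketch also runs without a display.

```r
# Comments start with a hash sign

# Vector assignment
x <- c(11, 12, 13, 14, 15)

# Arithmetic
a <- sqrt(4) * 2                 # 4

# Missing values: NA propagates unless it is explicitly removed
y <- c(x, NA)                    # hypothetical vector with one missing value
mean(y)                          # NA
mean(y, na.rm = TRUE)            # 13

# Logical operator: compare the second element of x with 12
x[2] == 12                       # TRUE

# Loops (shortened from 1:100 to keep things brief)
squares <- numeric(5)
for (i in 1:5) squares[i] <- i^2

# Other object types
m    <- matrix(1:6, nrow = 2)                # matrix with 2 rows, 3 columns
df   <- data.frame(id = 1:5, value = x)      # data frame
f    <- factor(c("low", "high", "low"))      # factor with 2 levels
l    <- list(vec = x, mat = m)               # list
half <- function(v) v / 2                    # function

# Graphical tools: two plots in one window, labels, axis limits, legend, text
pdf(NULL)                        # assumption: null device, nothing is written to disk
par(mfrow = c(1, 2))             # multiple plots in a single graphic window
plot(x, squares, type = "b", col = "blue", pch = 19,
     main = "Squares", xlab = "x", ylab = "i^2", ylim = c(0, 30))
legend("topleft", legend = "i^2", col = "blue", pch = 19, lty = 1)
text(13, 5, "a label", cex = 0.8)            # text on the plot, with its size
hist(x, main = "Histogram of x")
invisible(dev.off())
```

Saved as commands.R, the sketch can be run non-interactively with R CMD BATCH commands.R, as on the "Using R" slide; by default the output is written to commands.Rout.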