R_overview

R freeware statistics package
Tara Jenson
NCAR RAL JNT
Tom Hopson
What is R?
• A statistical programming language
• In part, developed from the S Programming
Language from Bell Labs (John Chambers)
• Created to allow rapid development of
methods for use with different types of data.
• Create new graphics. Many default
parameters are chosen, but users retain
complete control.
Why R?
• R has become the dominant language in the statistical
research community.
• R is Open Source and free.
• Runs on all operating systems
• Nearly 2,400 packages contributed.
• Packages and applications in nearly every field of
science, business and economics.
• See R Notes, R Journal and Journal of Statistical
Software www.jstatsoft.org
• More than 100 books with accompanying code
• Very large, active user base.
Why not R?
• NCL, IDL, Matlab, SAS, … are all viable
alternatives to R. If you are a part of an active
community of researchers using another
language, do likewise.
• If we were biostatisticians we would be using
SAS. Book Title: “Analyzing Receiver Operating Characteristic Curves with SAS”
• Consider building verification functions and
utilities as part of code development.
Verification need not be an external process to
forecasting.
The R Community
• Developers
– R Core Group (17 members), only 2 have left since
1997
– Major update in April/October (freeze dates, beta
versions, bug tracking, ...)
• Mailing lists
– Help list ~ 150 messages/day, archived,
searchable.
• 5 International Conferences, 2 US, 1 China
Everything about R is at www.r-project.org
• Source code
• Binary compilations (Windows, Mac OS, Linux)
• Documentation (Main documents, plus numerous
contributed. Some in foreign languages.)
• Newsletter (replaced by R Journal.)
• Mailing list (Several search engines)
• Packages on every topic imaginable
• Wiki with examples
• Reference list of books using R (more than 100)
• Task Views
Use R with scripts
• In Linux - Emacs Speaks Statistics
– Provides syntax-based highlighting
– Object name completion
– Keystroke shortcuts
– Command history
– Alt-x R to invoke R with Xemacs.
• In Windows, use editor
– Added GUI features
– Ctrl-R sends a line or highlighted section into R.
– Install packages with the GUI
– Save graphics by point and click.
• Mac OS
– Similar to Windows with advantages of system calls.
Packages in R
• Contributed by people worldwide.
• Allow scientists or statisticians to push their
ideas.
• Apply and extend R capabilities to meet the
needs of specific communities.
• Accompany many statistical textbooks
A sample of useful packages
• verification
• fields (spatial stats)
• radiosondes
• extRemes
• BMA (Bayesian Model Averaging)
• ensembleBMA
• circular
• RSQLite
• Rgis, spatstat (GIS)
• ncdf (support for netcdf files)
• RColorBrewer
• randomForest
Packages
• Packages must be installed before they can be loaded.
• Packages must be loaded (with library) before they can be used.
• Base packages are installed by default.
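The three rules above can be sketched as follows. This is a minimal illustration, not a required workflow: the package name is taken from the sample list, the install line is left commented out because it needs network access, and `tools` is used only as an example of a package that ships with R.

```r
# 1. A contributed package must be installed once before it can be loaded.
pkg <- "verification"   # example name from the sample list; any CRAN package works
have_pkg <- requireNamespace(pkg, quietly = TRUE)   # TRUE if already installed
if (!have_pkg) {
  # install.packages(pkg)   # uncomment to download and install from CRAN
  message(pkg, " is not installed yet")
}

# 2. An installed package must be loaded in each session before use.
#    For a contributed package this would be: library(verification)

# 3. Packages that ship with R need no installation, only loading.
library(tools)
tools_loaded <- "package:tools" %in% search()   # search() lists attached packages
```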
10 most useful functions in R
• aggregate - applies a function to groups of
data subset by categories.
• apply - incredibly efficient in avoiding loops.
Applies functions across dimensions of arrays.
• layout - creatively divide the plotting region.
• xyplot (in the lattice package) - slightly more advanced
graphics techniques
• %in% - returns a logical vector showing which elements
of A are in B (e.g. A %in% B)
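A short sketch of three of the functions above, using made-up toy data (the station names and error values are hypothetical, chosen only so the results are easy to check by hand):

```r
# Toy data: forecast errors at two stations (hypothetical values)
obs <- data.frame(station = c("A", "A", "B", "B"),
                  error   = c(1, 3, 2, 6))

# aggregate: apply mean() to the errors within each station
station_means <- aggregate(error ~ station, data = obs, FUN = mean)

# apply: work across the dimensions of an array without writing a loop
m <- matrix(1:6, nrow = 2)      # 2 rows, 3 columns, filled column by column
row_sums <- apply(m, 1, sum)    # MARGIN = 1: one result per row
col_sums <- apply(m, 2, sum)    # MARGIN = 2: one result per column

# %in%: which elements of A also occur in B?
A <- c(1, 5, 9)
B <- c(5, 9, 12)
in_B <- A %in% B
```

Here `station_means` has one row per station, and `in_B` has one logical value per element of `A`, so it can be used directly as an index, for example `A[A %in% B]`.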
More top 10
• table – create contingency table counts.
• boot – apply bootstrap function correctly
• read.fwf – read fixed width format data
• par – control everything in a graph
• system( ) – allows you to call system
commands from R
• pairs – the most underutilized plot – plots every pair of
columns in a matrix, e.g. 4 columns give a 4x4 plot layout
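As a rough illustration of two entries in this list, the sketch below builds a contingency table and reads fixed-width records. The forecast/observed values, the station codes and the column names are all invented for the example, and the fixed-width lines are supplied through a text connection rather than a real file.

```r
# table: contingency counts of forecast versus observed events (toy data)
forecast <- c("yes", "yes", "no", "no", "yes")
observed <- c("yes", "no",  "no", "no", "yes")
tab <- table(forecast = forecast, observed = observed)

# read.fwf: fixed-width records, 3 characters of station code followed by
# a 4-digit value (hypothetical layout)
con <- textConnection(c("DEN0125", "BOU0098"))
fw <- read.fwf(con, widths = c(3, 4), col.names = c("station", "value"))
close(con)
```

Printing `tab` shows forecasts in rows and observations in columns, which is the usual starting point for verification scores.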
Login, start your windowing system.
$ R
Start R as appropriate for your platform. The R program begins, with a
banner. (Within R, the prompt on the left hand side will not be shown
to avoid confusion.)
help.start()
Start the HTML interface to on-line help (using a web browser
available at your machine). You should briefly explore the features of
this facility with the mouse. In particular, work through 1.5, 2.1 – 2.3,
and appendix A (just the first one or two sections).
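Besides the HTML interface, help can also be queried from the prompt. The sketch below is one possible way to look things up, using `apply` purely as an example topic; in an interactive session, `?apply` or `help("apply")` opens the full help page.

```r
# Names of all visible objects whose name contains "apply"
hits <- apropos("apply")

# Argument names of a function, without opening its help page
arg_names <- names(formals(apply))
```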
R Exercises
• Choose groups of 3-4 – find a computer
• Log onto machines
• Bring up at least 2 xterms
• >cd /home/user/Desktop/longlead
• >vi intro2R.2013.R
And work through the commands given …