R Packages
Davor Cubranic
SCARL, Dept. of Statistics

Warmup questions
• Who here uses packages?
• Which ones?
• How do you know you’re using a package?
  – Had to install it
  – Had to load it
• What happens when you load it?
  – Functions
  – Data
  – Help pages
• So a package is a bundle of “stuff”
• But, (next slide)
• It’s a structured bundle of…

What is a package?
• Structured, standardized unit of:
  – R code
  – documentation
  – data
  – external code

Why use packages (talking points)
• Installation & administration are easy
  – Finding, installing, compiling, updating…
• Validation
  – Package checks
• Distribution mechanisms
  – CRAN, Bioconductor, GitHub
• Documentation
  – Bundle examples, demos, tutorials
• Organization
  – Especially useful for programmers:
  – Self-contained (names)
  – Declare and enforce dependencies on other packages

Why use packages
• Installation & administration
• Validation
• Distribution
• Documentation
• Organization

• Knowing how packages work and how to use them effectively will make you a more effective R analyst, even if you don’t develop new packages
• But you should consider developing packages even if you don’t write the next ggplot
• Packages for your own stuff:
  – Analyses you frequently repeat and/or share with others
  – Publication: create a package containing the publication as a vignette, and bundle the code and data with it

Handling packages
• Load with library(name)
• Package-level help:
  – library(help=name)
• Unload with detach(package:name)
  – You shouldn’t have to do this

Handling packages with RStudio
• See the Packages tab in RStudio
• Checkmark to load
• Some are already loaded!!
• Click on the name for help

What happens when you load a package?
• When you start R you have an empty workspace
• But there is also all this other “basic” R stuff (matrix, plot)
• So it’s more like we have two boxes: your workspace and “core” R
• Actually, it’s more like a whole bunch of boxes: see search()

So what happens when you load?
• The new package gets inserted near the front of the list
• It can pull in additional packages (dependencies)
• But notice that each package is its own bundle (box)
• We’ll talk about how you create these bundles next

Structure
• What makes a package a package is that it follows a prescribed structure of files and directories
• If you tell R to treat this as a package, it will
• You can create it by hand, but we’ll use a shortcut: package.skeleton()

package.skeleton()
• Convenient for turning a set of existing functions and scripts into a package
• Let’s do it with the anova.mlm code that we wrote earlier
• New project:
  – scdemoXX@vscarl1.stat.ubc.ca:~scdemo/pkg
• source('anova.mlm.R')
• package.skeleton("anovaMlm", ls())

DESCRIPTION
• The only required part of a package
• Name, author, license, etc.

R/
• Directory for R code
• package.skeleton creates one file per function
• This is not a rule: you can put as many functions as you like into a single file

man/
• Help files

NAMESPACE
• Defines which objects are visible to the user and other functions
• Public vs. private to the package
• The default (as generated by package.skeleton()) is to export every object whose name starts with a letter

Command-line tools
• Check
• Install
• Build

INSTALL
• Let’s install our package
• R CMD INSTALL anova.mlm
• Delete the “man” directory
  – (it’s optional and we’ll recreate it later)
• Redo INSTALL
• Restart RStudio
• library("anova.mlm")

check
• Really important!!!
• Finds common errors, non-standard parts
• CRAN requires no ERRORS or WARNINGS

Optional contents
• man/: documentation
• data/: datasets
• demo/: R code for demo purposes
• inst/: other files to include in the package (PDFs, vignettes)
• tests/: package test files
• src/: C, C++, or Fortran source code
• NEWS: history of changes to the package
• LICENSE or LICENCE: package license

DESCRIPTION
• Depends:
  – packages used by this one
  – and attached when it is loaded
  – i.e., visible to the user
• Imports:
  – packages used by this one
  – loaded but not attached
  – i.e., not visible to the user
• Suggests:
  – used in examples or vignettes
  – non-essential functionality

NAMESPACE
• exportPattern("^[[:alpha:]]")
• export(anova.mlm, est.beta)
• S3method(print, anova.mlm)
• S3method(plot, anova.mlm)
• import(MASS)
• importFrom(MASS, lda)

Documentation
• Let’s re-generate the documentation files
• promptPackage("anova.mlm")
• prompt(anova.mlm)

anova.mlm.Rd
• Description: Compute a (generalized) analysis of variance table for one or more multivariate linear models.
• Arguments:
  – object: an object of class "mlm"
  – ...: further objects of class "mlm"
  – force.int: force intercept
• Value: an object of class "anova" inheriting from class "matrix"

Help files for methods
• \usage{anova.mlm(…)}
• For S3 methods:
  – \usage{\method{print}{anova.mlm}(…)}

Vignettes
• .Rnw extension
• Written in Sweave
  – similar to knitr
• LaTeX + R code
• Produces a PDF available in the installed package
• vignette()
• vignette('Sweave')

Help on writing packages
• Lots of tutorials on the Web
  – many of them are not necessarily correct
  – NAMESPACES, Imports, etc.
• Authoritative guide: Writing R Extensions
• R-devel mailing list
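
Recap sketch: package.skeleton() on a toy function
• The Structure, DESCRIPTION and NAMESPACE slides can be tried end to end without touching the workshop project
• This is a minimal sketch, not the workshop session itself: the function est_mean, the package name demoPkg and the temporary directory are hypothetical stand-ins for the anova.mlm code and the ~scdemo/pkg project
• It assumes only base R and the utils package

```r
# Hypothetical toy function standing in for the workshop's anova.mlm code.
# It lives in its own environment so the sketch does not depend on the
# contents of the global workspace.
env <- new.env()
env$est_mean <- function(x) sum(x) / length(x)

# Work in a temporary directory so nothing is written to the current project.
pkg_parent <- tempfile("pkgdemo")
dir.create(pkg_parent)

# Generate the skeleton; "demoPkg" is an illustrative package name.
utils::package.skeleton(name = "demoPkg",
                        list = "est_mean",
                        environment = env,
                        path = pkg_parent)

# The prescribed structure: DESCRIPTION, NAMESPACE, R/, man/, ...
pkg_dir <- file.path(pkg_parent, "demoPkg")
print(list.files(pkg_dir))

# One R source file per function, as package.skeleton() does by default.
print(list.files(file.path(pkg_dir, "R")))

# DESCRIPTION is a DCF file; read.dcf() returns a character matrix
# with one column per field.
desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
print(desc[, "Package"])

# The generated NAMESPACE contains the default export rule.
ns_lines <- readLines(file.path(pkg_dir, "NAMESPACE"))
print(ns_lines)
```

• The generated man/ files are templates and typically need editing before R CMD check runs cleanly on the new directory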