DevelopR – formalising R Development
Andy Nicholls – Head of Consultancy
anicholls@mango-solutions.com

Agenda
• The R development challenge
• Mango’s solution
• The Mango development environment
• Benefits of doing this
• The future

THE R DEVELOPMENT CHALLENGE

Mango Customers
• Statisticians, data scientists, …
• Come from a variety of industries
• Are regulated by a number of different agencies
• Mango customers are using R to solve a variety of different problems

What the Customer Wants from Mango
• Help to analyse data
• To simplify a process
• Build a GUI
• Support
• To audit us on a regular basis!

What Challenge do we Face?
• Statisticians write code every day
• But we are not trained to write code
• We are not taught about:
  • Version control
  • Unit testing
  • Continuous integration

MANGO’S SOLUTION

The Solution
• Mango Consultants understand data science
• We have a development team who are trained to write code
• Let’s see what we can learn from them…

Mango Solutions
[Diagram labels: Training; Statistical Consultants; Software Developers & IT; Consulting; Analytic Integrators]

[Diagram labels: Analysts; Training; Consulting; Business; Statistical Consultants; Project Managers & Business Analysts; IT; Compliance; Software Developers & IT; Analytic Integrators; Quality & Test]

THE MANGO DEVELOPMENT ENVIRONMENT

The Mango Development Environment
Key requirements:
• Build R packages
• Maintain R packages
Other considerations:
• Collaboration
• Integration with business process
• Audit trail

Version Control
• The Mango model is driven by collaboration
• Multiple consultants working on the same code base
• Need to know what each person is doing and why
• Version control underpins everything we do

The Mango Model
Everything begins with version control

Integrated Development Environment
• Helps to use the same IDE
• Integration with version control is key
• StatET for formal development
• RStudio for prototyping / training

The Mango Model
IDEs connect with SVN

Documentation
• Help documentation facilitated via roxygen2
• Sensible function naming helps
• Quick-start guides via knitr vignettes

Testing
• Must be able to prove that the package does what it is supposed to!
• At Mango we write our tests before writing the code!
The R Journal Vol. 3/1, June 2011

Testing
• Test coverage level can be checked with testCoverage
Now publicly available: https://github.com/MangoTheCat/testCoverage

The Mango Model
[Diagram labels: R Packages; Sweave; knitr; testthat; testCoverage; devtools]
Building R Packages:
- Test Framework
- Documentation

Building / Continuous Integration
• Customers use:
  • Different environments / operating systems
  • Different versions of R
• Scripted build process
• Each package has its own build environment
• Packages pulled from internal CRAN

Other Benefits of Jenkins

The Mango Model
[Diagram labels: R Packages; Sweave; knitr; testthat; testCoverage; devtools; Continuous Integration]
Automated package builds via Jenkins

Task Management
• Software development requires planning!
• Requirements must be traceable
• Customers require regular progress updates
• Workflows

Bugs
• “All software has bugs”
• Important to be able to track them
• Important to be able to fix them
• Unit tests are written for all bugs
[Figure annotations: Ticket to fix bug; Commit adds detail to ticket]

The Mango Model
[Diagram labels: R Packages; Sweave; knitr; testthat; testCoverage; devtools; Process Management; Bug tracking; Continuous Integration]
Process coordinated via JIRA
SVN commit comments are automatically added to JIRA tickets

Code Review
• Any code used in an application should be reviewed
• Analytic code must be Quality Checked
• Integration with task management software provides useful audit trail

The Mango Model
[Diagram labels: R Packages; Sweave; knitr; testthat; testCoverage; devtools; Process Management; Bug tracking; Continuous Integration; Code Review]
Fisheye / Crucible facilitate code review and integrate with JIRA

Collaboration
• Coding Standards play an important role
• Useful code stored centrally and shared via ModSpace

ModSpace won’t search for Chuck Norris because it knows that you don’t find Chuck Norris, he finds you!

The Mango Model
[Diagram labels: R Packages; Sweave; knitr; testthat; testCoverage; devtools; Process Management; Bug tracking; Continuous Integration; Code Review; Knowledge Management]
ModSpace integrates with SVN and allows simple searching of code

Ensuring Quality
• Building systems is not enough
• Training
• SOPs guide the process
• Continuous review is essential

The Mango Model
[Diagram labels: R Packages; Sweave; knitr; testthat; testCoverage; devtools; Process Management; Bug tracking; Continuous Integration; Code Review; Knowledge Management]
The Development workflow is governed by a set of approved Quality Procedures

BENEFITS

Benefits
• System provides a complete picture of a project
• Unified cross-language development
• Provides a very simple mechanism for deployment
• Traceable - keeps auditors very happy

The Future
• Continuous review
• Learn more from other languages
• Review of Jenkins build / test process

Summary
• Mango have learned from the development team
• We have implemented a successful structure for managing code development
• We are continuously looking to adapt and improve
• Constant re-evaluation of processes

Questions?
• Never question Chuck Norris!