Statistical Training in the Workplace

Ian Westbrooke, Department of Conservation, Christchurch, New Zealand, iwestbrooke@doc.govt.nz
Maheswaran Rohan, Department of Conservation, Hamilton, New Zealand, mrohan@doc.govt.nz

1. Introduction

The workplace provides a distinctive context for statistical education, with focussed subject areas, ongoing relationships between participants and trainers, and typically an applied emphasis. This contrasts with other statistical training settings, such as educational institutions, short courses and online courses, where participants typically come from a wide range of backgrounds, timeframes are often limited, and applications are likely to be more wide-ranging. In this chapter, we focus on our experience in delivering statistical training to staff at our workplace, where we have developed courses for people who are not statistics graduates and who work in an operational rather than a research organisation. This chapter draws on the papers Rohan and Westbrooke (2012) and Westbrooke (2010), with additional material. After reviewing the literature in section 2, we describe our workplace and the statistical needs of its staff in section 3, and in section 4 we discuss the important but often ignored area of data handling. Our study design course is outlined in section 5, while section 6 describes our modelling courses. Section 7 explains our choice of R and R Commander software. In section 8, we review some differences between training in the workplace and in the education sector. Finally, we relate some experiences from outside our workplace in section 9 and make concluding comments in section 10.

2. Literature on Statistical Training in the Workplace

The statistics education literature focuses, understandably, on the formal education sector, particularly at school and tertiary levels.
The core of publications relating to the workplace comes from the proceedings of the 4-yearly International Conferences on Teaching Statistics (ICOTS). ICOTS conferences have included workplaces in their ambit, and since at least 1998 have had the workplace as one of about nine topic areas in their programmes. Relevant ICOTS papers deal primarily with how the education sector can, should or does relate to the workplace or industry. Reports from within workplaces, especially about training non-statisticians, are very limited. Hamilton (2010) provides a perspective on in-house training for non-statisticians in a national statistical office, while Forbes et al. (2010) look at training non-statisticians in the state sector from a combined workplace and university perspective. Most other relevant literature comes from a more general perspective. Barnett (1990) asks how different organisations can meet statistical needs. He suggests either employing professional statisticians as employees or consultants, or developing skills in staff not trained as statisticians. He then explains how consultancies or tertiary organisations provide either open or in-house statistics courses to increase the skills of non-statisticians in the workplace. In a keynote address to ICOTS 5, Scheaffer (1998) addressed “Bridging the gaps among school, college and the workplace”, looking to “expand the use of statistics in industry while producing a statistics curriculum in schools and colleges that can be defended and sustained”. A theme in a number of the ICOTS papers is the importance of context for workplace training, and the particular importance of including modern teaching approaches such as emphasising real data, statistical concepts rather than mathematical derivations, and using projects and hands-on computing (e.g. Stephenson (2002), Francis and Lipson (2011)).

3. Our Workplace Context: The Department of Conservation (DOC)

New Zealand’s Department of Conservation (DOC) is the central government organisation charged with promoting and implementing the conservation of the country’s natural and historic heritage. Thus, DOC is responsible for managing approximately one-third of New Zealand’s land area, along with a number of marine reserves; protecting and managing much of the country’s indigenous biodiversity, including many unique ecosystems and species; promoting recreation; and facilitating tourism. The 1800 staff members include several hundred science graduates undertaking science and technical work at national, regional and local levels. DOC needs evidence-based information to carry out effective management. Typical questions posed by managers include: What are the trends in abundance and health for native species and ecosystems, and how can management make a difference? How are visitors using parks, conservation lands and facilities, and what issues need to be managed? To answer these questions adequately, managers need to move beyond the broad, qualitative assessments that have often underpinned decision-making to an evidence-based approach, which demands quantitative assessments based on data. As in many environmental and social arenas, there is plenty of variability involved in conservation, so statistics becomes essential. Science and technical staff, and others involved in research and monitoring, are generally graduates in various fields, whose qualifications range from first degrees to PhDs. Increasingly, these staff members are expected to perform duties that require competence in study design and statistical analysis. In addition, a very wide range of staff require basic skills in effective data entry, management and exploration, including effective graphing.
A significant number of staff require training in statistical modelling skills, starting from the linear model and moving up through its extensions, including mixed models for repeated measures. In addition, smaller numbers have specialist subject requirements, such as estimating animal abundance or survival analysis, including mark-recapture models. Since only two DOC staff members are appointed primarily as statisticians, the provision of statistical training for other staff is of vital importance. We initially assessed statistical training needs through analysing the requests made to us, and by talking to staff and managers. Key areas we found for development were: data handling and exploration; modelling; and study design.

4. Effective Data Entry, Management and Exploration

Practical data handling and exploration are essential prerequisites for the successful application of statistics in the workplace, but are often insufficiently covered in training. In particular, data entry and preparation is an important but often neglected area of statistical practice, and is essential for the key tasks of data exploration and analysis. In fact, there is a key phase during which data preparation and exploration need to interact to ensure that the data are in a suitable state for analysis. Data errors, e.g. as a result of incorrect data entry, are very common and can lead to serious biases and incorrect inferences if left uncorrected. We have found that Microsoft Excel is a good general tool for data handling. Legitimising the use of Excel for data entry, storage and initial exploration has helped to facilitate moving data from pieces of paper into computers for analysis (see www.reading.ac.uk/SSC/publications/guides/topsde.html for additional information). At DOC, more than 300 staff members have taken part in a 1-day course named Data Handling in Excel—Entering, Managing and Exploring Data.
This course not only introduces tools that assist with data entry, such as freeze panes, protecting data and data validation, but also covers the exploration of data using tables and, if time permits, the production of graphs. One of the key emphases is on the importance of standard data formatting: each observation has its own row; each variable is entered into a column with a meaningful name; and only raw data are entered into data sheets, with no blank rows (analyses and summaries go elsewhere). Fortunately, this layout works not only with Excel’s excellent cross-tabulation tool, the Pivot Table, but also when data are transferred to dedicated statistical packages. When staff see the advantages of using this layout, particularly through the quick and easy creation of summary tables using Pivot Tables, they readily adopt this approach.

4.1 Creating Effective Graphs

To facilitate data exploration and improve the quality of presentation, we developed another 1-day course on graphs, which has an accompanying manual (Kelly, Jasperse & Westbrooke, 2005). This course draws heavily on Tufte (2001) and Cleveland (1994). We developed exercises for this course, including one (Figure 1) that allows participants to learn for themselves about Cleveland’s recommended order of visual perception (Cleveland & McGill, 1985). Other exercises include demonstrations of how easily default Excel graphs can be improved, and an example showing the inadequacies of pie graphs, based on an excellent book by Robbins (2005). We plan to add a module on graphing in R, using ggplot2.

[Figure 1: An exercise from our graphs course. Participants list seven graph types (position on a common scale, position on identical non-aligned scales, length, angle, slope, area and grey scale) by how easy it is to estimate the size of the number represented, where 1 is easiest and 7 hardest.]
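The standard layout described above also transfers directly to statistical software. As a small illustration (hypothetical data of our own invention, not course material), data entered one observation per row can be cross-tabulated in R much as a Pivot Table does in Excel:

```r
# Hypothetical species-count data in the recommended layout:
# one observation per row, one variable per column, meaningful names
birds <- data.frame(
  site  = c("A", "A", "B", "B", "C", "C"),
  year  = c(2010, 2011, 2010, 2011, 2010, 2011),
  count = c(4, 6, 2, 3, 5, 1)
)

# Cross-tabulate total counts by site and year (a Pivot Table analogue)
xtabs(count ~ site + year, data = birds)
```

The same data frame then feeds straight into plotting and modelling functions, which is one reason the layout matters.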
Participants are asked to order the seven graphs according to how easily the size of the number represented can be estimated. This exercise allows participants to learn for themselves about the options for presenting quantitative data in graphs, and leads into considering the accuracy of visual perception of the different approaches.

5. Study Design Course

Our initial decision to emphasise the handling and analysis of data in our courses was supported by the findings from a study of biodiversity monitoring projects we carried out in 2008, which showed that data analysis was the area that required the most strengthening (see Westbrooke, 2010: Figure 1). However, it also revealed that attention needed to be given to study design. Tertiary statistical courses for non-statisticians generally equip graduates almost exclusively for carrying out experiments and statistical significance testing. However, DOC staff members are mostly involved in observational studies rather than experimental research, and management decisions are generally much better informed by an emphasis on effect sizes rather than hypothesis testing. Therefore, we developed a 3-day course in practical study design for applied conservation ecology, with a focus on observational studies. A major challenge has been ensuring that staff understand the basics of randomisation and replication, and why they matter. Another important aspect has been clarifying the differences between experiments and observational studies, particularly in terms of strength of inference, and providing guidance on when and how to implement different types of study. We focus on two main areas:

• The four Ws—Why, What, Where and When—with particular emphasis on why, which is the key question for setting clear and realistic objectives. The other three Ws cover what measure is to be used, and where and when to sample, making effective use of replication in time and in space to achieve the objectives.
• The three Rs—Randomisation, Replication, and stRatification.
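To give a flavour of how the three Rs translate into practice, the following sketch (a hypothetical sampling frame and stratum names of our own invention) draws a stratified random sample of monitoring plots in R:

```r
# Hypothetical sampling frame of candidate monitoring plots in three habitat strata
plots <- data.frame(
  plot_id = 1:60,
  habitat = rep(c("forest", "wetland", "alpine"), each = 20)
)

set.seed(42)  # fix the random draw so the selection can be reproduced

# Randomisation with stratification: 5 replicate plots drawn at random per stratum
chosen <- do.call(rbind, lapply(split(plots, plots$habitat),
                                function(s) s[sample(nrow(s), 5), ]))

table(chosen$habitat)  # 5 plots in each habitat stratum
```

Stratifying first and then randomising within each stratum guarantees replication in every habitat, while the random draw protects against selection bias.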
Participants each bring along an example of a study they are currently involved in designing. They introduce their studies on the first morning, and we then use these as examples throughout, returning at the end with the group to evaluate what the topics covered during the course mean for the development of the design of these studies. The use of participants’ own examples, and other relevant examples from the workplace, makes the course more effective in gaining involvement and ensuring that participants can relate the lessons to their work, both during the course and when they return to their jobs.

6. Modelling Courses

Early on, we were often asked ‘How do I fit these data into an ANOVA?’, because that was often the only statistical model to which graduates in other subjects had been exposed. Another theme was ‘What test do I apply to these data?’, as statistics was equated with hypothesis testing. However, in our courses, we prefer to emphasise model building and the estimation of effect sizes, which are especially important in a management organisation, where there is likely to be much more interest in estimating an effect and a confidence interval than in whether or not the effect is different from zero. In our 3-day introductory statistical modelling course, we revise the linear model, with sessions on ANOVA and multiple linear regression, including ANCOVA. We then extend this to the generalised linear model (glm), with Poisson and binomial errors for count and binary response variables, respectively. When time has allowed, we have added tree-based models and/or generalised additive models. We teach participants to follow five steps when modelling:

Step 1: Investigate the data. Identify the response and explanatory variables, and their data types (continuous, nominal, ordinal), and construct graphs and tables to explore the distribution of the variables and the relationships between variables.
In class, this step provides the opening for teaching the use of statistical software for data exploration and visualisation.

Step 2: Fit the model. Choose an appropriate error structure based on the design and the information from Step 1, and use software to fit the model. This step provides the opportunity for us to explain error structure and the assumptions underlying the modelling.

Step 3: Analyse the model. Examine the model output from the software, and apply model selection criteria such as likelihood ratio approaches or the Akaike Information Criterion (AIC) for model comparison and variable selection. This step allows us to explain how to interpret model output, and to discuss issues around model and variable selection, and multicollinearity.

Step 4: Assess the model. Examine the assumptions defined in Step 2, graphically and numerically. If Step 4 fails, return to Steps 1 and 2 and try alternative models until a satisfactory model is obtained.

Step 5: Interpret the model. Interpret the results in the context of the overall problem, with estimates of relevant effects, including confidence intervals.

Many DOC staff members have limited mathematical skills, and participants struggle to write down expressions to predict values from statistical model outputs, or to back-transform onto the original scale. Therefore, we avoid teaching more abstract statistical concepts, aiming to explain the technical aspects that are needed using outputs from the computation of statistical models. Attendees do not learn how to compute parameter estimates and we do not expose them to more than very basic equations; for example, normal equations of the general form β̂ = (XᵗWX)⁻¹XᵗWy are too complex for almost all of those who attend. This means that:

(i) Participants are not exposed to the design matrix X. This makes it difficult to provide a full explanation of the need for a reference category for factor variables.
Instead, we provide an informal explanation of the need for a reference category.

(ii) We avoid talking about W and iteratively reweighted least squares (IRLS). When the glm output shows the number of iterations, we explain that the computation is repeated a number of times to converge on the estimated value for the parameter.

(iii) We only touch in passing on the relationship between maximum likelihood estimation and least squares estimation.

(iv) The concept of starting values for parameter computation is ignored.

Our other main modelling course involves the analysis of repeated measures, with either continuous or discrete responses. DOC carries out many large and small monitoring projects throughout the country, which typically involve collecting information from the same sampling units (subjects) over several years. We provide an introduction to analysing such data using mixed models. First, participants learn from a real example that the standard classical approach is not suitable for analysing this type of data, illustrating the violations of the classical assumptions using residual plots for each subject. We then explain the need to modify the model by allowing for random variation between the subjects; for example, we introduce the random intercept model using the graphical approach shown in Figure 2. We then explain that variation can be divided into two types: (i) stochastic variation within a subject, similar to our usual classical model errors; and (ii) variation between subjects around the overall mean, known as random effects. Thus, we end up with the response as a function of fixed effects and random effects, which we present as:

Response ~ Fixed effects + Random effects + Error

and illustrate using Figure 2. There is random variation between subjects, forming a distribution of the intercepts around the overall mean.
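Written a little more formally than we ever do in the course (the notation here is ours, added for this chapter, not something we show participants), the random intercept model sketched in Figure 2 is:

```latex
y_{ij} = (\beta_0 + b_i) + \beta_1 t_{ij} + \varepsilon_{ij},
\qquad b_i \sim N(0,\, \sigma_b^2), \qquad \varepsilon_{ij} \sim N(0,\, \sigma^2)
```

Here $y_{ij}$ is the response for subject $i$ at time $t_{ij}$; $\beta_0$ and $\beta_1$ are the fixed overall intercept and slope; the random intercepts $b_i$ capture the between-subject variation around the overall mean; and $\varepsilon_{ij}$ is the usual within-subject error.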
[Figure 2: Random intercept model, showing the overall mean intercept and slope (the fixed effects), the variation within each subject, and random effects that are assumed to be normally distributed, plotted against time.]

Although serial correlation may feature in some of our datasets, time constraints have so far prevented us from addressing it during the courses. Instead, we advise participants that input from a statistician is needed to analyse these types of data. As in the introductory course, a number of technical statistical issues are discussed in simple language during the course. This approach is well received by the participants, who are happy to accept our word on the more technical aspects and are generally more interested in understanding the application rather than the theory of statistics. Applied topics such as model selection, testing assumptions and interpreting the results are of particular interest to those attending these courses.

7. The Benefits of Using R and R Commander Software

We use R (R Development Core Team, 2012) as the software for our statistical training. With its free access, enormous flexibility and the availability of almost all statistical techniques, the use of R has grown rapidly worldwide. R comes with base libraries and recommended packages, as well as more than 2500 contributed packages. R was chosen as the statistical software for use in DOC because of its power and its free availability. This reduces the cost to the organisation, and ensures ready access to the software and portability of the skills learnt. Initially, we taught our introductory statistical modelling course using standard R, with participants typing and submitting R code.
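To give a flavour of what that involved, a typical session worked through the five modelling steps directly in code. The sketch below uses made-up data and variable names (not our actual course material) to fit a glm with Poisson errors:

```r
# Hypothetical seedling counts at monitored plots, with and without pest control
dat <- data.frame(
  seedlings = c(2, 4, 3, 7, 9, 8, 6, 11),
  treatment = factor(rep(c("none", "control"), each = 4)),
  altitude  = c(200, 350, 500, 250, 400, 550, 300, 450)
)

plot(seedlings ~ altitude, data = dat)        # Step 1: explore the data
fit <- glm(seedlings ~ treatment + altitude,  # Step 2: fit a model with Poisson errors
           family = poisson, data = dat)
summary(fit)                                  # Step 3: analyse the model output
AIC(fit)
plot(residuals(fit) ~ fitted(fit))            # Step 4: assess the model graphically
exp(coef(fit))                                # Step 5: interpret effects as rate ratios
```

Every one of these lines must be typed with exactly the right names, brackets and arguments, which is where the difficulties described below arise.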
However, in our experience, it is a challenge to teach both statistical methods and the R language to our biology-oriented group of participants. They often ended up with syntax errors, even though we provided R code on the screen and explained how to write it, and they found it hard to understand and correct these errors. Aside from syntax errors, other difficulties in using standard R included:

(i) A number of R functions and their options need to be understood. For example, to create good graphs in base R, users need to learn the available graphical options such as lwd, lty, cex, pch, type, legend, etc.

(ii) Users find it difficult to understand where to use the appropriate brackets: round brackets (), square brackets [] and curly brackets {}.

(iii) Users need to learn about the availability and location of each function. For example, in order to avoid mathematical details in the binomial model, we use the ilogit function in the faraway package to compute predictions of the binomial parameter p in a logistic regression for various values of the covariates. When we present such a function, participants ask how they can learn about the availability of such things in R. We explain that it is not always easy, and that we learn about them by browsing help files and books, searching the internet, talking to colleagues, and asking questions on appropriate forums or mailing lists.

(iv) R error messages are often far from friendly to casual or first-time users. For example, what is the meaning of the error message ‘object 'ilogit' not found’? We would suggest typing ??ilogit in the R console as the first step to solving this problem, and following it up with the options mentioned in (iii).

Thus, participants can feel daunted or overwhelmed by the computing aspects of R that are needed to complete their data analysis.
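For instance, the ilogit step mentioned in (iii) is, under the hood, just the inverse-logit transformation. The sketch below defines it directly, with made-up coefficient values (faraway's ilogit and base R's plogis compute the same function):

```r
# Inverse logit: maps a linear predictor onto the probability scale (0, 1).
# This is what faraway::ilogit computes; base R's plogis() is equivalent.
ilogit <- function(x) 1 / (1 + exp(-x))

# Hypothetical coefficients from a fitted logistic regression
b0 <- -2.0   # intercept
b1 <-  0.8   # slope for a single covariate

# Predicted probability p at covariate value x = 3
ilogit(b0 + b1 * 3)
```

Presenting the transformation as a single ready-made function lets participants obtain predictions on the probability scale without working through the algebra of the logit link.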
Instead, the majority of participants prefer to start with a menu-based approach, as this is an environment with which they are more familiar, and it reduces their learning load. We found three menu-based packages in R: R Commander, Deducer and RExcel. After some comparative evaluation, we found that R Commander was well suited to DOC’s needs.

7.1 R Commander

R Commander (Fox, 2005), which is available in the R package Rcmdr, provides a simple point-and-click interface to R, including linear and generalised linear models, and some graphing capacity. This free, menu-based statistical package within R creates an easy way to learn statistical modelling with R. R Commander comes with a menu at the top plus three windows (see Figure 3):

(i) Script window — R script is automatically written here when a menu item is clicked. The user can also create or modify code here.

(ii) Output window — This operates much like the R console, echoing script as it is submitted and showing outputs.

(iii) Messages window — Information is provided here about the active dataset, along with error messages.

[Figure 3: R Commander showing the three windows]

We converted most of our statistical modelling course material so that participants could use R Commander instead of typing R code. We found that this made the course more relaxed for both the trainers and the participants, allowing more time to learn about statistical methods rather than coding aspects. For example, instead of writing R code such as xyplot(fc ~ year.measured | tag, data = FBI, auto.key = TRUE), which participants often needed help with, we now just ask them to click Graphs >> XY conditioning plot and complete the dialogue. This graph allows visualisation of the response for a given condition, and often makes it easier to recognise data entry errors. Advantages of using R Commander include:

(i) New and infrequent users can start using R without confusion or panic.
(ii) R Commander provides standard R code for each operation, which can assist with learning R programming commands and allows us to modify the script when needed. We sometimes give instructions on how to modify the code; for example, to add a regression line to a plot created by xyplot() using the menus, we ask participants to add , type = c("p", "r") anywhere within the brackets of the xyplot() call, highlight the whole of this piece of code and click the Submit button. Code from R Commander can be saved and later run in R directly, with very minimal modification.

(iii) Participants start to learn about R packages, which are automatically loaded by R Commander. For example, the lattice package is used behind the scenes when XY conditioning plot is called.

Issues with R Commander include:

(i) Modelling capability in R Commander focuses on the linear and generalised linear model. It does not provide access to more complex statistical modelling and graphing techniques, such as generalised additive models, regression trees and mixed models. Thus, participants who require functionality that is not available in R Commander need to learn to use R code directly.

(ii) Some stability issues were occasionally encountered in our teaching experience, but these were largely overcome by ensuring that we used the SDI (single document interface) rather than the default MDI (multiple document interface) to R under Windows, and by using an up-to-date version of Rcmdr.

Four main R Commander menus—Data, Statistics, Graphs and Models—are used in our statistical modelling course. A number of data manipulation and statistical operations can be carried out under these menus. All possible operations can be viewed in the manual or by clicking the appropriate menus. Some examples are:

(i) Data menu: Data can be imported to R Commander from various formats, such as text, csv files, SPSS, Minitab, STATA, etc., when Import data is clicked.
The Manage variables in active data set submenu allows data manipulation such as transformations, converting numerical variables to factors, and recoding variables.

(ii) Statistics menu: Most statistical analyses that are required at undergraduate level can be carried out using this menu (Figure 4). ANOVAs and t-tests are available under the Means option, while the Fit models options allow for the linear model, generalised linear models, the multinomial logistic model and the ordinal regression model.

(iii) Graphs menu: The tools for graphs in R Commander are excellent, but not totally comprehensive. A number of graphs, ranging from histograms to 3D graphs, including conditional plots, can be generated. Figure 5 shows the range of graphs available.

[Figure 4: Statistics menu in R Commander]

[Figure 5: Graphs in R Commander]

(iv) Models menu: This menu (Figure 6) provides diagnostics for a model, both graphical and numerical, including tests for multicollinearity and autocorrelation.

[Figure 6: Assessing models in R Commander]

R Commander provides a limited range of statistical tools, but has plug-in packages that can be manually added using the Tools menu. Plug-in packages include one for survival analysis and one that provides an introduction to the sophisticated graphics package ggplot2. We carried out an informal survey about R Commander amongst a group of our course participants. More than 70% of the 21 participants surveyed agreed that R Commander is useful for their own work, while none disagreed. More than 70% also agreed that they did not need to memorise the R functions as much, while 10% disagreed. Experience at DOC has shown that R Commander significantly eases the steep learning curve for R. In our statistical modelling course, we have found that it allows both participants and trainers to concentrate more on the statistical content.
A number of participants have also found this approach to be a useful bridge to writing code in R, and are now programming in R independently. This ability to use code in R provides a good basis for those progressing to our repeated measures course, which involves mixed modelling, where we put R Commander aside and use R directly. Some staff members, especially those who only use statistical tools occasionally, have found that the R Commander environment is adequate and convenient for their needs, and feel no need to progress to writing R code. While software packages such as Minitab and SPSS might have some advantages over R Commander for some teaching purposes, we have found that R Commander works very well for our first modelling course, and functions very effectively as a bridge to our main statistical software, R, as its author (Fox, 2005) intended. We chose R over our previous software, SPSS, because it best meets our needs in a statistical package, and for its free availability. R can be used simultaneously by a large number of people, in the workplace or anywhere else, without any licence issues.

8. Differences Between Training in the Workplace and in the Education Sector

While the courses we deliver to DOC staff have similar content to those taught in educational institutions, the workplace context has led to some distinct features. First, we emphasise practical applications and examples using real data, with a basic outline of the theoretical background; formulae and mathematical notation are kept to a minimum, with no derivations or proofs. Second, we teach intensive block courses (typically 1 or 3 days long) rather than multiple sessions over a longer period such as a quarter or semester; we have found that it is much easier for staff members who are faced with many competing priorities to commit to attending short courses, particularly since participants are dispersed across New Zealand, as conservation management is often carried out in remote areas.
Third, we work with small classes, up to a maximum of about 12, and have a high trainer-to-participant ratio; one trainer can cope with up to five or six participants, so we usually aim for two trainers per course. Finally, we do not carry out formal assessment of participants, as it would take up precious classroom time, and there is less need for formal qualifications in the workplace context; instead, we ask the participants to assess the course and its applicability to their work. For the statistical modelling course, respondents have assessed six statements on a 5-point scale (from “strongly disagree” to “strongly agree”). The statements covered whether the overall content was relevant to their work; whether the explanations and practical computing increased understanding of statistical models and R; and whether the 3-day programme met overall expectations. For 30 respondents from 2009, 2011 and 2012, 177 (98%) of 180 responses were evenly split between “agree” and “strongly agree”. An open question that asked what worked well revealed two themes, with 11 respondents mentioning practical exercises or examples, and 1 of these plus 6 others mentioning the small class or the availability of 2 tutors. In response to what could be improved, there were no obvious common themes, except that 11 gave no response (as against 5 for what worked well) and 5 stated little, not much or nothing.

9. Statistical Training Beyond DOC

Each of the authors has had a very good response internationally when presenting workshops or seminars involving R Commander. Recently, one author (MR) carried out a 1-day workshop prior to an international conference in Sri Lanka, which used R Commander and briefly covered material similar to our 3-day statistical modelling course. This was well received by the 26 participants from seven countries.
One author (MR) was also asked on the spot to deliver an informal seminar for fisheries scientists at the Secretariat of the Pacific Community (SPC), New Caledonia, because there was a thirst for learning more about R and R Commander. When shown the graph in Figure 7, participants could not believe it could be so easy to create such a graph, either in R Commander or using a package such as gplots; previously, they had taken hours to make a similar graph by writing R code.

[Figure 7: Tuna count with standard errors]

Similarly, there was such wide interest in accessing R through R Commander that one author (IW) added an unscheduled seminar to his presentations at the 2011 international Conference on Health Statistics in the Pacific in Suva, Fiji. He also found that R Commander worked very well in a 2-day workshop on statistical modelling at the University of Queensland, which was aimed at non-statisticians. We have noticed that R Commander is generally becoming popular in the Pacific region and beyond, with workshops using R Commander appearing more commonly.

10. Conclusions

Our experiences from training DOC staff members and presenting seminars more widely have shown us that:

• Training that allows observational data to be distinguished from experimental data, and which provides modelling skills that are applicable to different types of data, is critical.
• There needs to be an emphasis on the estimation of effect sizes rather than hypothesis testing.
• Effective data management and exploration (especially graphing) skills are needed, to provide the basis for data analysis.
• R Commander works very well for introducing statistical modelling, especially for graduates in non-statistical disciplines. It allows trainees and workshop participants to concentrate on the concepts and application of models to the data, rather than the mechanics of the computations involved.
• The workplace context means that courses work best as intensive block courses, rather than as a series of shorter sessions, with a practical rather than theoretical emphasis. The use of real examples and datasets that attendees readily understand is also critical, with hands-on computing an integral part of all sessions.
• The evaluation of how well courses meet workplace objectives is more important than evaluating individuals.

The key to a statistician making a difference in a large workplace is to have a strong training and advocacy role. One or two statisticians can make a difference: in our case, helping to protect New Zealand’s unique biodiversity and protected areas. We receive great support from the wider statistical community through consultation, the receipt and provision of specialist training, and the availability of resources such as R and more specialist software. To make academic statistical training of biological, ecological and social science students more applicable to the workplace, our experience shows that there is a need for a stronger statistical modelling approach, with less emphasis on hypothesis testing.

Acknowledgements

We would like to thank the many statisticians and others who have helped with the development of training at DOC, especially Neil Cox, Jennifer Brown, Richard Duncan and Tim Robinson, who have played major roles in developing some of the courses. We also wish to acknowledge the colleagues and chapter referees who assisted us by providing feedback and comments on this chapter and on the ICOTS and OZCOTS papers that preceded it.

References

Barnett, V. (1990), Statistical trends in industry and in the social sector. http://iase-web.org/documents/papers/icots3/BOOK1/C8-3.pdf
Cleveland, W.S. (1994), The Elements of Graphing Data. Summit, NJ: Hobart Press.
Cleveland, W.S.; McGill, R. (1985), Graphical perception and graphical methods for analyzing scientific data. Science 229: 828–833.
Forbes, S.; Bucknall, P.; Pihama, N.
(2011), Helping make government policy analysts statistically literate. In C. Reading (Ed.), Data and context in statistics education: Towards an evidence-based society. Proceedings of the Eighth International Conference on Teaching Statistics (ICOTS8, July 2010), Ljubljana, Slovenia. Voorburg, The Netherlands: International Statistical Institute. http://iase-web.org/Conference_Proceedings.php?p=ICOTS_8_2010
Fox, J. (2005), The R Commander: A basic-statistics graphical user interface to R. Journal of Statistical Software 14(9): 1–42.
Francis, G.; Lipson, K. (2011), The importance of teaching statistics in a professional context. In C. Reading (Ed.), Data and context in statistics education: Towards an evidence-based society. Proceedings of the Eighth International Conference on Teaching Statistics (ICOTS8, July 2010), Ljubljana, Slovenia. Voorburg, The Netherlands: International Statistical Institute. http://iase-web.org/Conference_Proceedings.php?p=ICOTS_8_2010
Hamilton, G. (2010), Statistical training for non-statistical staff at the Office for National Statistics. In C. Reading (Ed.), Data and context in statistics education: Towards an evidence-based society. Proceedings of the Eighth International Conference on Teaching Statistics (ICOTS8, July 2010), Ljubljana, Slovenia. Voorburg, The Netherlands: International Statistical Institute. http://iase-web.org/Conference_Proceedings.php?p=ICOTS_8_2010
Kelly, D.; Jasperse, J.; Westbrooke, I. (2005), Designing science graphs for data analysis and presentation: The bad, the good and the better. Department of Conservation Technical Series 32. Wellington: Department of Conservation. www.doc.govt.nz/upload/documents/science-and-technical/docts32.pdf
R Development Core Team (2012), R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. www.R-project.org/
Robbins, N.B. (2005), Creating More Effective Graphs. Hoboken, NJ: Wiley.
Rohan, M.; Westbrooke, I. (2012), Using R Commander for statistical training in the workplace. http://opax.swin.edu.au/~3420701/OZCOTS2012/OZCOTS2012_RohanM_Final_paper.pdf
Scheaffer, R.L. (1998), Statistics education: Bridging the gaps among school, college and the workplace. http://iase-web.org/documents/papers/icots5/Keynote3.pdf
Stephenson, W.R. (2002), Experiencing statistics at a distance. http://iase-web.org/documents/papers/icots6/4d3_step.pdf
Tufte, E. (2001), The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
Westbrooke, I. (2010), Statistics education in a conservation organisation: Towards evidence-based management. In C. Reading (Ed.), Data and context in statistics education: Towards an evidence-based society. Proceedings of the Eighth International Conference on Teaching Statistics (ICOTS8, July 2010), Ljubljana, Slovenia. Voorburg, The Netherlands: International Statistical Institute. http://iase-web.org/Conference_Proceedings.php?p=ICOTS_8_2010