A basic graphical user interface for R: R-interactive

Angelo M. Mineo and Alfredo Pontillo

Dipartimento di Scienze Statistiche e Matematiche “S. Vianelli” di Palermo
elio.mineo@dssm.unipa.it; alf@dssm.unipa.it

Summary. In this paper we describe R-interactive, an open-source graphical user interface for R. R-interactive provides a deliberately minimal interface to R. In our opinion, the main feature of this software is the possibility of writing several lines of code that are executed all at once, which makes debugging that code easy. R-interactive can also be used via the web, so it serves as a web-oriented graphical user interface for R as well.

Key words: GUI, R, PHP

1 Introduction

The development of open-source software, i.e. software that can be read, redistributed and modified by the end user according to his or her own needs, keeps growing. Consider, for example, the ever wider spread of Linux as an open-source operating system, even on personal computers. In the statistical field, the statistical programming environment R [RDC05] is enjoying increasing success in both academia and business [Scw04]. One of the main features of R is that it is a powerful statistical programming language based on the S language; S is also the underlying language of the commercial statistical software S+. The S language was developed by John Chambers¹ at Bell Laboratories, which at that time belonged to the telephone company AT&T and nowadays belongs to Lucent Technologies; S was released in 1980, while R was written by Ross Ihaka and Robert Gentleman [IG96] and released under the General Public Licence (GPL). Since mid-1997 there has been a core group (the R Core Team) who can modify the R source code archive.
Nowadays, R is one of the most appreciated open-source projects in the world and is certainly a real competitor to the best-known commercial statistical software, such as SAS, SPSS, S+, STATISTICA and so on.

¹ In 1998 John Chambers received the ACM Software System Award for the development of the S language (see the URL: http://www.acm.org/awards/ssaward.html).

Moreover, R is also a real programming language, even though its graphical interface for end users is not very developed. In this paper we present the open-source software R-interactive, a very simple graphical user interface (GUI) for R developed in the PHP scripting language. Among other features, it allows a user of average skill to write even complex code while verifying its correctness step by step. R does provide some debugging functions, but we think that this tool can be very useful for R users, because it should speed up the writing of multi-line code that needs careful debugging.

This paper is organized as follows: after explaining why we propose this software, we describe the simple procedures for installing R-interactive on a computer running either Windows or Linux. Then the software is described, pointing out its ease of use, and a simple example of use is shown. For more information about this software, including how to download the source code or read the installation instructions, the interested reader can visit the web site http://dssm.unipa.it/R-interactive.

2 Why to use R-interactive

Nowadays, the statistical programming environment R is a very good alternative to the main commercial statistical software packages.
Nevertheless, R does not have a well-developed graphical user interface; commands and data are entered through a Command Line Interface (CLI). This seems to discourage some users, mainly those who are not expert in programming; expert users, on the contrary, seem to prefer the CLI because it gives direct control over computation and programming and is very flexible, even though it requires a good knowledge of the language. However, even though R does not have its own GUI, several graphical interfaces have been contributed and there are several projects that try to overcome this drawback (see the URL: http://www.r-project.org/GUI). All these projects are very interesting and worth mentioning, but in this paper we cite just two of them: the R Commander and JGR.

The R Commander [Fox05] is a basic-statistics graphical user interface provided by the Rcmdr package. According to the author, “the design objectives of the R Commander were as follows: to support, through an easy-to-use, extensible, cross-platform GUI, the statistical functionality required for a basic statistics course [...], to make it relatively difficult to do unreasonable things; and to render visible the relationship between choices made in the GUI and the R commands that they generate”. The R Commander is based on the tcltk package, which provides an interface from R to the Tcl/Tk GUI toolkit.

JGR [HUT04] is a Java GUI for R. According to the authors, “JGR features a build in editor with syntax highlighting and direct command transfer, a spreadsheet to inspect and modify data, an advanced help system and an intelligent object navigator, which can, among other things, handle data set hierarchies and model comparisons. JGR also allows to run parallel R sessions within one interface.
Since JGR is written in Java, it builds the first unified interface to R, running on all platforms R supports.”

However, there does not seem to be a GUI project aimed at facilitating the debugging of R code. One of the main characteristics of the CLI is that code is entered line by line, which makes entering code rather slow. When a user needs to write long portions of code, for example to define a new function, there are some workarounds, such as writing the code in a text editor and then pasting it into R, or saving the file containing the function and then loading it in R with the source() command, which executes all the instructions contained in the text file. Even though these procedures partially solve the problem, writing long code remains slow and cumbersome, especially when the code needs careful debugging. It is true that there are very good R functions designed for debugging, such as debug(), which allows a function to be debugged, or trace(), which allows interactive tracing and debugging of calls to a function or method, but we think that a graphical interface oriented to debugging could speed up these operations. This is the reason why we have designed R-interactive, which is a simpler graphical user interface than those cited above, but in our view more useful for processing several lines of R code at once and for debugging them.

R-interactive has been written in the PHP scripting language and can be used on computers with different operating systems. Very few GUIs are written in PHP and are cross-platform, i.e. usable on different operating systems. This sets R-interactive apart from other kinds of GUIs. Other differences emerge from the following considerations.
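As a minimal sketch of the base R workflow mentioned above (not of R-interactive itself), the following lines show source() loading a function from a file, and debug() and trace() being switched on and off. The function std_mean and the temporary file are hypothetical names introduced only for illustration.

```r
# Write a small multi-line function to a temporary file,
# as one might do in a text editor (hypothetical example function).
script <- tempfile(fileext = ".R")
writeLines(c(
  "std_mean <- function(x) {",
  "  centred <- x - mean(x)",
  "  centred / sd(x)",
  "}"
), script)

# source() evaluates every expression in the file in one go.
source(script)
z <- std_mean(c(2, 4, 6))
print(z)

# debug() flags the function so that, in an interactive session,
# the next call is stepped through line by line in the browser.
# Here the flag is only set and cleared; the function is not called
# while flagged, so the sketch also runs non-interactively.
debug(std_mean)
was_debugged <- isdebugged(std_mean)
undebug(std_mean)

# trace() inserts code at the start of the function without editing it.
trace("std_mean", tracer = quote(cat("entering std_mean\n")), print = FALSE)
invisible(std_mean(c(1, 2, 3)))
untrace("std_mean")
```

With this approach every change to the function means editing the file and calling source() again, which is the round trip that a multi-line input area is meant to shorten.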
Even though R-interactive is essentially intended for individual use, a system supporting multi-tasking has been implemented, which de facto allows it to be used as a server for running R. In this case there could be security problems for the hosting server, since R provides a set of commands that interact with the operating system of the computer where it is installed. To avoid this drawback, the use of these R commands has been forbidden. Of course, such commands can still be used when R-interactive is run as local software. To allow or forbid the use of these commands, the $controlCommand environment variable has been defined in the HOME/include/confInc.php configuration file (for more details, see Sect. 3).

R-interactive is designed throughout to give fast access to its functionality: keyboard shortcuts speed up code writing, output visualization and so on. Besides multi-tasking, R-interactive has other features that are very useful for the end user: for example, multi-project management, i.e. opening several tabs, each with a name that can be changed at any moment, so that several sessions can be managed simultaneously.

3 How to install R-interactive

R-interactive is web-oriented software and therefore needs a web server installed on the hosting computer in order to run; once installed, it can be reached and used from any client with any web browser (Mozilla, Firefox, Internet Explorer, Opera, and so on). On either Linux or Windows XP, it is advisable to install the Apache web server which, according to the Netcraft Web Server Survey², is nowadays the most used web server in the world (see Fig. 1).

Fig. 1. Market share for top servers across all domains from August 1995 to December 2005.

² See the URL: http://survey.netcraft.com/archive.html.

Moreover, it is necessary to install the PHP scripting language (in this paper we refer to the PHP4 version) and ImageMagick, which is needed to handle images properly, such as the graphs produced by R. The main role of ImageMagick in R-interactive is to convert the graphs produced by R, which are in .ps format, to .png format, which is more suitable for web-oriented applications. For this reason Ghostscript must also be installed, since it is needed to handle the .ps graphs (R-interactive does not invoke it directly, but only through ImageMagick).

In the rest of this paper we assume that the user has installed the programs mentioned above on his or her own computer, running either Linux or Windows. At this point installing R-interactive is very easy: it is enough to unpack, with super user or administrator privileges, the R-interactive-xxx.tgz file (or the corresponding Rinteractive-xxx.zip file) inside the server root. Then it is sufficient to point a web browser to the URL http://localhost/R-interactive-xxx/, or to the URL http://your-domain/R-interactive-xxx/, to run R-interactive.

Before using R-interactive, it may be necessary to modify the file containing some environment variables, i.e. the HOME/include/confInc.php file (although this is usually not necessary for R-interactive to run correctly). This file contains the following environment variables, which can be modified to configure the software correctly:
• $controlCommand: this boolean variable tells R-interactive how to treat portions of R code that are potentially dangerous for the hosting system. By default it is set to 1, i.e. the input code is checked and code containing one of the commands listed in the HOME/js/dangerous.js file is not executed. Each user can of course modify this list as needed. When $controlCommand is set to 0, no check is performed on the input commands, so all R commands can be used. Therefore, a user who wants to set up an internet site offering R-interactive via the web should leave this variable at its default value of 1; a user who wants to use R-interactive locally may find it useful to set it to 0.
• $maxNumberTab: the maximum number of tabs (and hence of sessions) that can be opened and used simultaneously; the recommended value is 12.
• $RWinCmd: the path of the R executable file (R.exe), if R-interactive is installed on a computer running Windows.
• $convertWinCmd: the path of the convert executable, if R-interactive is installed on a computer running Windows (convert is a command of the ImageMagick software, needed to convert graphs from .ps format to .png format).
• $RLinuxCmd: the path of the R executable file, if R-interactive is installed on a computer running Linux.
• $convertLinuxCmd: the path of the convert executable, if R-interactive is installed on a computer running Linux.

4 Description of R-interactive

In this section we give a short description of R-interactive, showing a simple example of use.
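Before turning to the interface, the kind of check controlled by $controlCommand (Sect. 3) can be sketched in R as follows. This is an illustration under stated assumptions, not the mechanism R-interactive uses: its list of commands lives in the HOME/js/dangerous.js file, whose contents are not reproduced here. The blocklist entries and the function names find_blocked and is_allowed are hypothetical.

```r
# Hypothetical blocklist of R functions that interact with the operating
# system; the actual list used by R-interactive is not reproduced here.
blocked <- c("system", "system2", "shell", "unlink", "file.remove")

# Return the blocked names that occur anywhere in a piece of R code.
# parse() only builds the expressions; nothing is evaluated.
find_blocked <- function(code, blocklist = blocked) {
  exprs <- parse(text = code, keep.source = FALSE)
  used <- all.names(exprs, functions = TRUE, unique = TRUE)
  intersect(used, blocklist)
}

# Mirror of the $controlCommand switch: when check is FALSE,
# every syntactically valid input is accepted.
is_allowed <- function(code, check = TRUE) {
  !check || length(find_blocked(code)) == 0
}

is_allowed("x <- rnorm(10)\nmean(x)")      # harmless code passes the check
is_allowed("system('ls')")                 # rejected while the check is on
is_allowed("system('ls')", check = FALSE)  # accepted when the check is off
```

A name-based filter of this kind is easy to understand but not a complete safeguard, since a blocked function can be reached indirectly (for example through get() or do.call() with the name given as a string); it is best seen as a first line of defence for a public installation.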
By pointing the web browser to http://localhost/R-interactive-xxx/, the R-interactive home page is opened (see Fig. 2). This page is divided into two main parts: the upper part, where the R code to be processed is entered, and the lower part, where the output of the processed code is displayed. It is possible to open several input windows (tabs), each of which acts independently of the others: within the same tab R-interactive remembers the commands entered since the beginning of the current session, while opening a new tab starts a new R working session.

Fig. 2. R-interactive home page.

The use of R-interactive can be sped up by means of keyboard shortcuts, which are briefly described in a window located to the right of the input tab; this window can be shown or hidden by clicking on the keyboard shortcut link. The keyboard shortcuts do the following:

• alt+c: moves the focus to the area where the user can enter R code;
• alt+t: changes the tab name;
• alt+k: opens or closes the window where the keyboard shortcuts are briefly described;
• alt+o: executes the R code entered in the active tab;
• alt+l: deletes the commands contained in the active tab;
• alt+r: allows the name of the active tab to be changed;
• alt+n: opens a new tab;
• alt+d: deletes the active tab (at least one tab must always remain open);
• alt+s: saves the contents of the active tab in a text file (by default the file is named after the tab, with the .txt extension);
• alt+u: allows a data set text file to be loaded from the user's computer.

Each of these operations can also be performed with the mouse.
In particular, in the central part of the home page, between the part where the code is entered and the part where the output is displayed, there are some buttons for loading a text file containing a data set from the user's computer. Using these buttons changes the working directory to the one containing the text file, and the following command is entered in the active tab if “yes” is selected in the “header” window:

    dati <- read.table("text file name", header=TRUE)

Fig. 3 shows a very simple example of use: as input, we have entered a command that generates 10 pseudo-random numbers from a standard normal distribution (x <- rnorm(10)) and one that displays them (x); the output, consisting of the 10 generated numbers, is shown in the lower part of the figure (output area).

Fig. 3. R-interactive page with code in the input tab and the related output in the output part of the page.

5 Conclusions

In this paper we have described R-interactive, an open-source graphical user interface for R. R-interactive provides a deliberately minimal interface to R. In our opinion, the main feature of this software is the possibility of writing several lines of code that are executed all at once, which makes debugging that code easy. R-interactive can also be used via the web, so it serves as a web-oriented graphical user interface for R, too. The authors intend to develop this tool further by correcting possible bugs and by implementing new functionality, such as menus to facilitate the execution of frequently used R commands, or a system that highlights portions of the written code in different colors.

Acknowledgement

Thanks to the University of Palermo for supporting this research.
References

[Fox05] Fox, J.: The R Commander: A Basic-Statistics Graphical User Interface to R. Journal of Statistical Software, 14(9), 1–42 (2005)
[HUT04] Helbig, M., Urbanek, S., Theus, M.: JGR: a unified interface to R. In: useR! 2004 - The R User Conference, Vienna, 20/5/2004-22/5/2004, http://www.ci.tuwien.ac.at/Conferences/useR-2004/ (2004)
[IG96] Ihaka, R., Gentleman, R.: R: A language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics, 5, 299–314 (1996)
[RDC05] R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2005)
[Scw04] Schwartz, M.: The Decision To Use R: a consulting business perspective. R News, 4(1), 2–5 (2004)