A basic graphical user interface for R: R-interactive

Angelo M. Mineo and Alfredo Pontillo
Dipartimento di Scienze Statistiche e Matematiche “S. Vianelli” di Palermo
elio.mineo@dssm.unipa.it; alf@dssm.unipa.it
Summary. In this paper we describe R-interactive, an open-source graphical user interface for R. R-interactive provides a minimal interface to R. In our opinion, its main feature is that several lines of code can be written and then executed all at once, which makes the code easy to debug. R-interactive can also be used via the web, so it also serves as a web-oriented graphical user interface for R.
Key words: GUI, R, PHP
1 Introduction
The development of open-source software, i.e. software that the end user can read, redistribute and modify according to his or her own needs, keeps growing. Consider, for example, the increasing spread of Linux as an open-source operating system, even on personal computers. In the statistical field, the statistical programming environment R [RDC05] is increasingly successful in both academia and business [Scw04]. One of the main features of R is that it is a powerful statistical programming language based on the S language; S is also the base language of the commercial statistical software S+. The S language was developed by John Chambers¹ at Bell Laboratories, which at that time belonged to the telephone company AT&T and now belongs to Lucent Technologies. S was released in 1980, while R was written by Ross Ihaka and Robert Gentleman [IG96] and released under the GNU General Public License (GPL). Since mid-1997 a core group (the R Core Team) has had write access to the R source code archive. Today R is one of the most appreciated open-source projects in the world and a real competitor to the best-known commercial statistical packages, such as SAS, SPSS, S+ and STATISTICA.
1 In 1998 John Chambers received the ACM Software System Award for the development of the S language (see http://www.acm.org/awards/ssaward.html).
Moreover, R is a full programming language, although its graphical interface for end users is not very developed.
In this paper we present the open-source software R-interactive, a very simple graphical user interface (GUI) for R developed in the PHP scripting language. Among other features, it lets a moderately skilled user write even complex code while checking its correctness step by step. R does provide some debugging functions, but we think this tool can be of interest to R users because it should speed up the writing of multi-line code that needs careful debugging.
This paper is organized as follows: after explaining why we propose this software, we describe the simple procedure for installing R-interactive on a computer running either Windows or Linux. We then describe the software, pointing out its ease of use, and show a simple example. For more information, including the source code download and the installation instructions, the interested reader can visit the web site http://dssm.unipa.it/R-interactive.
2 Why to use R-interactive
Nowadays the statistical programming environment R is a very good alternative to the main commercial statistical packages. Nevertheless, R does not have a developed graphical user interface: input is given through a Command Line Interface (CLI). This seems to discourage some users, mainly those with little programming experience; expert users, on the contrary, seem to prefer the CLI because it is very flexible and gives direct control over computation and programming, even though it requires a good knowledge of the language. Although R does not have a GUI of its own, several graphical interfaces have been contributed and several projects try to overcome this drawback (see http://www.r-project.org/GUI). All these projects are interesting and worth mentioning, but in this paper we cite just two of them: the R Commander and JGR.
The R Commander [Fox05] is a basic-statistics graphical user interface provided by
the Rcmdr package. According to the author, “the design objectives of the R Commander were as follows: to support, through an easy-to-use, extensible, cross-platform
GUI, the statistical functionality required for a basic statistics course [...], to make it
relatively difficult to do unreasonable things; and to render visible the relationship
between choices made in the GUI and the R commands that they generate”. The R
Commander is based on the tcltk package, which provides an interface from R to the
Tcl/Tk GUI toolkit.
JGR [HUT04] is a Java GUI for R. According to the authors, “JGR features a
build in editor with syntax highlighting and direct command transfer, a spreadsheet
to inspect and modify data, an advanced help system and an intelligent object
navigator, which can, among other things, handle data set hierarchies and model
comparisons. JGR also allows to run parallel R sessions within one interface. Since JGR
is written in Java, it builds the first unified interface to R, running on all platforms
R supports.”
However, there does not seem to be a GUI project aimed at facilitating the debugging of R code. One of the main features of the CLI is that code is entered line by line, which makes entering code rather slow. When a user needs to write a long portion of code, for example to define a new function, there are some workarounds: writing the code in a text editor and then pasting it into R, or saving the function in a file and loading it into R with the source() command, which executes all the instructions contained in the text file. Even though these procedures partially solve the problem, writing long code remains slow and cumbersome, especially when the code needs careful debugging. R does have very good functions designed for debugging, such as debug(), which flags a function for debugging, or trace(), which allows interactive tracing and debugging of calls to a function or method, but we think that a graphical interface oriented to debugging could speed up these operations. This is why we have designed R-interactive: it is a simpler graphical user interface than the ones cited above, but in our view more useful for processing several lines of R code at once and for debugging that code.
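The workarounds mentioned above can be sketched in plain R, independently of R-interactive. The function name col_range and the temporary file are hypothetical and serve only as an illustration; the tracer expression is chosen to be non-interactive so that the sketch also runs in batch mode.

```r
# Write a small multi-line function to a file, as one would in a text editor.
script <- tempfile(fileext = ".R")
writeLines(c(
  "col_range <- function(x) {",
  "  lo <- min(x)",
  "  hi <- max(x)",
  "  hi - lo",
  "}"
), script)

# source() evaluates every expression in the file in one step.
source(script)
result <- col_range(c(3, 9, 4))

# trace() inserts an expression that runs on entry to the function;
# here it only prints the argument (debug(col_range) would instead
# open the interactive browser on the next call).
trace(col_range, tracer = quote(cat("input:", x, "\n")), print = FALSE)
traced <- col_range(c(3, 9, 4))
untrace(col_range)
```

Each change to the function requires editing the file and calling source() again, which is the repeated round trip that a multi-line input area is meant to shorten.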
R-interactive is written in the PHP scripting language and can be used on computers with different operating systems. Very few GUIs are written in PHP and are cross-platform, i.e. usable on different operating systems; this distinguishes R-interactive from other GUIs. Other differences are the following.
Even though R-interactive is essentially meant for individual use, it includes a system to support multi-tasking, which de facto allows it to be used as a server for R. In this case there could be security problems for the hosting server, since R provides a set of commands that interact with the operating system of the computer on which it is installed. To avoid this drawback, the use of these R commands is forbidden by default. Such commands can of course still be used when R-interactive runs as local software. To allow or forbid these commands, the $controlCommand environment variable is defined in the HOME/include/confInc.php configuration file (for more details, see Sect. 3).
R-interactive is designed throughout for fast access to its functionality: keyboard shortcuts speed up code writing, output display and so on.
Besides multi-tasking, R-interactive has other features that are very useful for the end user, for example multi-project management: several tabs can be opened, each with a name that can be changed at any time, so that several sessions can be managed simultaneously.
3 How to install R-interactive
R-interactive is web-oriented software and therefore needs a web server installed on the hosting computer; once installed, R-interactive can be reached and used from any client with any web browser (Mozilla, Firefox, Internet Explorer, Opera, and so on).
With either a Linux or a Windows XP operating system, we recommend installing the Apache web server, which is currently the most widely used web server in the world according to the Netcraft Web Server Survey² (see Fig. 1).
Fig. 1. Market share for top servers across all domains from August 1995 to December 2005.
Moreover, it is necessary to install the PHP scripting language (in this paper we refer to version PHP4) and ImageMagick, which is needed to handle images such as the graphs produced by R. The main role of ImageMagick in R-interactive is to convert to .png format the graphs that R produces in .ps format, since .png is more suitable for web-oriented applications. For this reason ghostscript must also be installed: R-interactive does not invoke it directly, but ImageMagick needs it to handle the .ps graphs.
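The conversion pipeline described above can be illustrated with a minimal sketch. The file names are hypothetical, and the way R-interactive itself invokes convert is not documented in this paper, so the conversion call is left commented out and shown only as an assumption.

```r
# Draw a graph on R's PostScript device, producing a .ps file.
ps_file <- tempfile(fileext = ".ps")
postscript(ps_file)
plot(1:10, (1:10)^2, main = "Example graph")
invisible(dev.off())

# Name of the .png file that a web page could display.
png_file <- sub("\\.ps$", ".png", ps_file)

# Assumed conversion step: requires ImageMagick (and ghostscript, which
# ImageMagick uses to read PostScript) to be installed and on the PATH.
# system2("convert", c(ps_file, png_file))
```

Keeping the conversion outside R means that R only needs its standard PostScript device, at the cost of two extra external dependencies on the server.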
In the rest of this paper we assume that the user has installed the programs listed above, on either Linux or Windows. At this point installing R-interactive is very easy: it is enough to unpack, with superuser or administrator privileges, the R-interactive-xxx.tgz file (or the corresponding R-interactive-xxx.zip file) inside the server root. To run R-interactive, point a web browser to the URL http://localhost/R-interactive-xxx/ or http://your-domain/R-interactive-xxx/.
Before using R-interactive it may be necessary to modify the file containing some environment variables, i.e. the HOME/include/confInc.php file (although this is usually not needed for R-interactive to run correctly). This file contains the following environment variables, which can be modified to configure the software:
2 See the URL: http://survey.netcraft.com/archive.html.
• $controlCommand: this boolean variable tells R-interactive how to treat portions of R code that are potentially dangerous for the hosting system. By default it is set to 1: the input code is checked, and code containing any of the commands listed in the HOME/js/dangerous.js file is not executed. Each user can of course modify this list as needed. When $controlCommand is set to 0, no check is performed on the input, so all R commands can be used. Therefore, a user who sets up an internet site offering R-interactive via the web should leave this variable at its default value of 1; a user who runs R-interactive locally may find it useful to set it to 0.
• $maxNumberTab: the maximum number of tabs (and hence sessions) that can be open and in use simultaneously; the recommended value is 12.
• $RWinCmd: the path of the R executable file (R.exe), if R-interactive is installed on a computer running Windows.
• $convertWinCmd: the path of the convert executable, if R-interactive is installed on a computer running Windows (convert is a command of the ImageMagick software, needed to convert graphs from .ps to .png format).
• $RLinuxCmd: the path of the R executable file, if R-interactive is installed on a computer running Linux.
• $convertLinuxCmd: the path of the convert executable, if R-interactive is installed on a computer running Linux.
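The effect of $controlCommand can be sketched as a simple blacklist check. This is only an illustration written in R: in R-interactive the list lives in HOME/js/dangerous.js and its contents are not reproduced in this paper, so the blacklist below and the helper is_allowed are hypothetical.

```r
# Hypothetical blacklist of commands that interact with the operating system.
dangerous <- c("system", "system2", "shell", "unlink", "file.remove")

# Returns TRUE if the code may be executed.
# control_command mirrors $controlCommand: TRUE (1) checks, FALSE (0) does not.
is_allowed <- function(code, control_command = TRUE, blacklist = dangerous) {
  if (!control_command) return(TRUE)
  # all.names() lists every function and variable name used in the parsed code.
  symbols <- all.names(parse(text = code))
  !any(symbols %in% blacklist)
}

ok_plain   <- is_allowed("x <- rnorm(10); mean(x)")
ok_system  <- is_allowed("system('ls')")
ok_local   <- is_allowed("system('ls')", control_command = FALSE)
```

A name-based check of this kind is easy to circumvent (for instance by passing the command name as a string to do.call()), so it should be seen as a first barrier rather than a complete sandbox.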
4 Description of R-interactive
In this section we give a short description of R-interactive through a simple example of use.
Pointing the web browser to http://localhost/R-interactive-xxx/ opens the R-interactive home page (see Fig. 2). The page is divided into two main parts: the upper part, where the R code to be processed is entered, and the lower part, where the output of the processed code is displayed. Several input windows (tabs) can be opened, and each acts independently of the others: within a tab, R-interactive remembers the commands entered since the beginning of the current session, while opening a new tab starts a new R working session.
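The independence of the tabs can be modelled, purely as an illustration, with one R environment per tab. This is an assumption made for the sketch and not a description of how R-interactive manages its sessions internally; new_tab and run_in_tab are hypothetical helpers.

```r
# Each tab gets its own environment, so objects do not leak between tabs.
new_tab <- function() new.env(parent = globalenv())

# Evaluate a block of several lines at once inside a given tab;
# the value of the last expression is returned.
run_in_tab <- function(tab, code) {
  eval(parse(text = code), envir = tab)
}

tab1 <- new_tab()
tab2 <- new_tab()

run_in_tab(tab1, "x <- 1:5\ny <- x^2")    # first submission in tab 1
total <- run_in_tab(tab1, "sum(y)")       # later submission still sees y
seen_in_tab2 <- exists("x", envir = tab2, inherits = FALSE)
```

Here the second submission in tab1 can use y because the tab keeps its state, while tab2 starts empty, which matches the behaviour described above.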
The use of R-interactive can be sped up with keyboard shortcuts. A short description of them is shown in a window to the right of the input tab; this window can be shown or hidden by clicking on the keyboard shortcut link. The keyboard shortcuts do the following:
• alt+c: moves the focus to the area where the user enters the R code;
• alt+t: changes the tab name;
• alt+k: shows or hides the window where the keyboard shortcuts are briefly described;
• alt+o: executes the R code entered in the active tab;
• alt+l: deletes the commands contained in the active tab;
• alt+r: allows the name of the active tab to be changed;
• alt+n: opens a new tab;
• alt+d: deletes the active tab (at least one tab must always remain open);
• alt+s: saves the contents of the active tab to a text file (by default, the file is named after the tab, with the .txt extension);
• alt+u: loads a data set text file from the user's computer.
Fig. 2. R-interactive home page.
Each of these operations can also be performed with the mouse. In particular, in the central part of the home page, between the code input area and the output area, there are some buttons for loading a text file containing a data set from the user's computer. Using these buttons changes the working directory to the one that contains the text file, and the following command is entered in the active tab:
dati <- read.table("text file name", header=TRUE)
if the “yes” option is selected in the “header” window.
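What the load buttons do can be reproduced by hand in R. The following is a minimal sketch in which the file name example.txt and its contents are made up for illustration.

```r
# Create a small whitespace-separated data file with a header line.
data_file <- file.path(tempdir(), "example.txt")
writeLines(c("height weight", "170 65", "182 80"), data_file)

# Make the folder containing the file the working directory,
# then read the file by name, treating the first line as column names.
old_dir <- setwd(dirname(data_file))
dati <- read.table(basename(data_file), header = TRUE)

# Restore the previous working directory.
setwd(old_dir)
```

With header = FALSE the first line would be read as data and the columns would receive the default names V1 and V2.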
Fig. 3 shows a very simple example: as input, we entered a command that generates 10 pseudo-random numbers from a standard normal distribution (x <- rnorm(10)) and a command that displays them (x); the output, i.e. the 10 generated numbers, is shown in the lower part of the figure (output area).
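The same example can be written as a short multi-line block of the kind meant to be submitted in one step. The call to set.seed() and the summary line are additions made here only so that the run is reproducible and slightly longer than one command.

```r
set.seed(1)                  # assumption: fixed seed, for reproducibility only
x <- rnorm(10)               # 10 pseudo-random draws from N(0, 1)
x                            # at top level this prints the generated values
stats <- c(mean = mean(x), sd = sd(x))   # simple summary of the sample
```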
Fig. 3. R-interactive page with code in the input tab and the related output in the output part of the page.
5 Conclusions
In this paper we have described R-interactive, an open-source graphical user interface for R. R-interactive provides a minimal interface to R. In our opinion, its main feature is that several lines of code can be written and then executed all at once, which makes the code easy to debug. R-interactive can also be used via the web, so it also serves as a web-oriented graphical user interface for R. The authors intend to develop this tool further by fixing possible bugs and implementing new functionality, such as menus for frequently used R commands or syntax highlighting that marks portions of the written code with different colors.
Acknowledgement
Thanks to the University of Palermo for supporting this research.
References
[Fox05] Fox, J.: The R Commander: A Basic-Statistics Graphical User Interface to R. Journal of Statistical Software, 14(9), 1–42 (2005)
[HUT04] Helbig, M., Urbanek, Theus, M.: JGR: a unified interface to R. In: useR! 2004 - The R User Conference, Vienna, 20/5/2004-22/5/2004, http://www.ci.tuwien.ac.at/Conferences/useR-2004/ (2004)
[IG96] Ihaka, R., Gentleman, R.: R: A language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics, 5, 299–314 (1996)
[RDC05] R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2005)
[Scw04] Schwartz, M.: The Decision To Use R: a consulting business perspective. R News, 4(1), 2–5 (2004)