Saving files in R.
I received a question or two about working with different types of files in R, so I thought
I’d read the manuals a bit and put together some notes.
Script files
Script files are written in a text editor (R comes with one, but any text editor will do).
These files contain one or more commands or functions that you want to execute in R. A script file is a text version of the commands that you could just type directly at the R prompt. The value of using script files is that they allow you to keep a record of what you are doing, work line by line to look for errors, and replicate what you are doing again sometime down the road.
Fixing mistakes in script files is much easier than having to retype a full command or operation at the R prompt. There is no substitute for script files, and you should use them all the time.
R assumes that script files are saved with the default extension of .R. I have noticed that if I do not type this extension when saving the file, the R text editor does NOT add it automatically. The next time you go looking for your script files, R will by default list only files with the .R extension, so your file will not show up. The safe habit is to type the .R extension yourself every time you save a script.
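Once a script is saved with the .R extension, you can also run the whole file from the R prompt with source(). Here is a minimal sketch. The file name analysis.R and the objects inside it are made up for illustration, and the script is written to a temporary folder only so that the example is self-contained.

```r
# Hypothetical script file, created here only so the example can run by itself.
script_file <- file.path(tempdir(), "analysis.R")
writeLines(c(
  "x <- c(4, 8, 15, 16, 23, 42)",
  "x_mean <- mean(x)"
), script_file)

# Run every command in the script, in order, as if typed at the prompt.
source(script_file)

# Objects created by the script are now available in the session.
print(x_mean)

# List the .R files in that folder; a script saved without the extension
# would not be matched by this pattern.
r_scripts <- list.files(tempdir(), pattern = "\\.R$")
```

In everyday use you would skip the writeLines() step and just call source() on a script you wrote in the editor.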
Some folks recommend that you start every script file with the following command: rm(list=ls(all=TRUE))
This command removes all objects currently stored in the active memory of your R session. It is something like the “clear all” command in STATA. The same operation is also available in R under the “Misc” pull-down menu. Clearing memory gets rid of objects left over from previous work that might cause confusion for you. For example, suppose an object named X1 is still in memory from a previous session. Then you run a new script file in which you planned to define a new object called X1 and use it as an independent variable in a regression model. If you made a typo in your script file and called the new object x1 instead of X1 (R is case sensitive), the regression will not fail. R will just use the previously defined X1 (if it can), and you may never notice. Starting the script with the above command would prevent this sort of error, because the regression would stop with a message that X1 cannot be found.
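Here is a small sketch of that typo scenario. To keep the example from deleting your own objects, I use a scratch environment (called ws, my own name) as a stand-in for the workspace. The data values are made up. In a real script you would just put rm(list=ls(all=TRUE)) on the first line.

```r
# Scratch environment standing in for the R workspace (an assumption made
# for this example so that nothing in your real session is removed).
ws <- new.env()

# Leftover from earlier work: an object named X1 already exists.
assign("X1", c(1, 2, 3, 4, 5), envir = ws)

# The new script defines Y, but mistypes the predictor as x1 instead of X1.
new_script <- function(env) {
  assign("Y",  c(2.1, 3.9, 6.2, 7.8, 10.1), envir = env)
  assign("x1", c(5, 3, 8, 1, 9), envir = env)
  # The regression asks for X1, which the script itself never created.
  try(local(lm(Y ~ X1), envir = env), silent = TRUE)
}

# Without clearing first, the model runs and quietly uses the old X1.
stale_fit <- new_script(ws)

# Clear everything (the same idea as rm(list=ls(all=TRUE)) at the top of a
# script), then run the script again.
rm(list = ls(ws, all.names = TRUE), envir = ws)
clean_fit <- new_script(ws)

# After clearing, the typo shows up as an error instead of a wrong model.
print(class(stale_fit))
print(class(clean_fit))
```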
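The exception to this rule would be if you plan to build a sequence of script files that need to be run in order, where the second builds on the first, the third on the second, and so forth. In that case, clearing memory at the top of the second script would wipe out the objects that the first script created.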
If it were me, I might work on parts of a project in separate script files, but in the end I would put them all together into one large file.
RData files
We have learned how to read in data files from other formats, but suppose you create a data set in R, or do some data cleaning in R, and you want to save the result as a data file? You can do that in R with the save() function, or with save.image() if you want to save everything at once. The save-everything version can be invoked from the pull down menus under the “File” option by selecting “Save Workspace”. Using the pull down menu, R appears to assign the default file extension of .RData. You can then read these files back into R using the load() function, which is also available in the “File” pull down menu as “Load Workspace”. When you save the workspace, either from the menu or with save.image(), R saves all of the objects in the current R session. With save(), you instead provide the names of the objects that you would like to save. Remember that objects in R can be items other than just the data set (or data.frame in R). For example, if you assign the results of a linear model to a name like this:
Model1 <- lm(Y~X1 + X2)
The object ‘Model1’ is created, and it will be saved as part of the .RData file whenever you save the whole workspace.
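As a rough sketch, saving and reloading a data set together with a model might look like the following. The data are simulated, and the names dat and example.RData are my own choices for illustration. The file goes to a temporary folder so the example does not clutter your working directory.

```r
# Simulated data, for illustration only.
set.seed(1)
dat <- data.frame(X1 = rnorm(20), X2 = rnorm(20))
dat$Y <- 1 + 2 * dat$X1 - dat$X2 + rnorm(20)

# Fit the model and keep the result as an object.
Model1 <- lm(Y ~ X1 + X2, data = dat)

# Save only the objects named here. (save.image(file = ...) would instead
# save every object in the workspace.)
rdata_file <- file.path(tempdir(), "example.RData")
save(dat, Model1, file = rdata_file)

# Remove the objects, then bring them back from the file.
rm(dat, Model1)
restored <- load(rdata_file)

# load() returns the names of the objects it restored.
print(restored)
print(coef(Model1))
```

Note that load() puts the objects back under their original names, so it will overwrite anything in memory that happens to have the same name.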
The popular transfer program Stat/Transfer (a third-party product often used alongside STATA) does not list .RData files or just R files as an option. It does list S-Plus files. S-Plus is a commercial implementation of the S language, and R is an open-source implementation of that same language, so the two are closely related but not identical. I don’t know what would happen if you saved an .RData file that included objects like ‘Model1’ defined above and then tried to transfer that file into STATA.
Certainly you could transfer a STATA file into an S-Plus file and read it into R that way. But since the foreign package, which you load with library(foreign), provides functions that let R read STATA, SPSS, or SAS files (among others), there is no need. So, the thing to watch out for is creating data in R that you want to export to STATA. For that, you will need to read the R help more carefully. You can look online at http://cran.r-project.org/ : click the “Manuals” link, and then the “R Data Import/Export” manual link. This manual was also installed by default with my version of R, and can be found under the “Help” pull down menu using the “Manuals (in pdf)” option.
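For what it is worth, the foreign package also has a write.dta() function for going in the other direction. Below is a minimal sketch with a made-up data frame and file name. I have not tested the result in STATA itself, and as far as I know read.dta() only understands files from STATA version 12 or earlier, so treat this as a starting point rather than a recipe.

```r
library(foreign)

# Made-up data frame, for illustration only.
dat <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Write the data frame to a STATA-format file in a temporary folder.
dta_file <- file.path(tempdir(), "example.dta")
write.dta(dat, dta_file)

# Read the file back into R to check that it survived the round trip.
back <- read.dta(dta_file)
print(back)
```

Only data frames can be exported this way. Other objects, such as the results of lm(), have no STATA equivalent.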
History files
You can also save (and load) a file that records a history of the commands that you executed during an R session. These options are available under the “File” pull down menu as
“Save History” and “Load History”. These files are text files of the commands executed by R.
The expected file extension is .Rhistory. While potentially useful, if you followed my advice and used a script file, you already have a record of all of your commands. Thus, history files are only useful if you work interactively with R and you want a record of what you have done.
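The same thing can be done from the prompt with savehistory() and loadhistory(). A short sketch follows. The file name is made up, and since a command history only exists when you work interactively (and not every R front end supports it), the calls are wrapped so that they are skipped or fail quietly elsewhere.

```r
# Hypothetical file name; .Rhistory is the extension R expects.
hist_file <- file.path(tempdir(), "session.Rhistory")

# Only attempt this in an interactive session, where a history is kept.
if (interactive()) {
  try(savehistory(file = hist_file))  # write out the commands typed so far
  try(loadhistory(file = hist_file))  # read them back into the history
}
```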
“Console” files
Another type of “Save” command on the “File” pull down menu is given as “Save to file.” This will save everything in the R console window to a text file, with the default extension of .txt. This is just an ASCII text file and can be opened in any text editor. This would be useful if you want a printout of what appears on the screen. Of course, you can also just highlight what you want and just copy and paste the result into a text editor or word processor.
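If you would rather do this from a script, the sink() function sends printed output to a text file. This sketch uses a made-up file name and made-up numbers. One difference from the menu option, as I understand it, is that sink() captures only the output, not the commands you typed.

```r
# Hypothetical output file in a temporary folder.
log_file <- file.path(tempdir(), "console_output.txt")

# From here on, printed output goes to the file instead of the screen.
sink(log_file)
print(summary(c(1, 2, 3, 4, 10)))
cat("Finished run\n")

# Calling sink() with no arguments sends output back to the console.
sink()

# The file is plain text, so it can be read back or opened in any editor.
saved <- readLines(log_file)
print(saved)
```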
Graphics files
When you construct a graphic or figure in R, it normally opens a separate window in which it then produces that graphic. That window will include a “File” pull down menu that has options for saving the file in several different formats or copying the graphic to the clipboard in a couple of different formats. My advice here is just to experiment a bit with the different formats as they relate to the word processor you plan to use.
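You can also write a graphic straight to a file from a script, which makes the figure reproducible. A minimal sketch using the pdf() device is below. The data, the file name, and the figure size are all made up for illustration.

```r
# Simulated data, for illustration only.
set.seed(2)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)

# Open a PDF file as the graphics device (width and height are in inches).
plot_file <- file.path(tempdir(), "scatter.pdf")
pdf(plot_file, width = 6, height = 4)

# Anything plotted now is drawn into the file rather than a window.
plot(x, y, main = "Example scatterplot", xlab = "x", ylab = "y")

# Close the device; the file is not complete until this is called.
invisible(dev.off())
```

Other devices such as png() work the same way: open the device, draw the plot, then call dev.off().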