GAUSS STARTER HINTS
(ORIGINAL AUTHORS: PAULA DESPINS AND RAGAN PETRIE)

Commands for General Use of GAUSS:

We are going to be using the Windows version of GAUSS because the interface is better and much easier to use. A DOS version also exists, and all the program commands that apply to DOS apply to Windows and vice versa. Some commands for maneuvering around the program are unique to the DOS version, but we will not cover those here.

To start GAUSS from the desktop of any computer in the HECC, double-click on the GAUSS icon. If the icon is not on your desktop, click on the Start button, then Programs, then Statistics Packages, then GAUSS. This puts you in the GAUSS-Command window, where a [gauss] prompt lets you type commands directly into GAUSS (more on these later). This is not the window in which you write and edit your own GAUSS code; that is done in the GAUSS-Edit window. To open it, click on the Window menu at the top of the screen and then click on GAUSS-Edit. The editing capabilities of this window are very similar to those of a simple text editor such as Microsoft Notepad.

Other functions available through GAUSS pull-down menus:

file menu
    In this menu, you can save your program files, find a saved file to edit, debug programs (unfortunately, this isn't a magic debug function! In fact, I've never used it.), run a GAUSS program, and exit GAUSS. Note that some functions are listed twice (edit, run, save). The ones listed at the top refer to actions on saved files, and the ones at the bottom refer to actions on currently active (open) files.

edit menu
    In this menu, you can cut, copy, and paste text and also undo previous steps.

search menu
    In this menu, you can find text and replace text. This can be particularly useful if you need to replace text that occurs several times in your program.

windows menu
    In this menu, you can switch between edit and command mode. This can also be done by clicking on the buttons at the bottom of the screen (just as you would switch between Word and Excel, for example).

help menu
    In this menu, you can find the Windows version of the help manual.

Below the menus are four large buttons, two text boxes, and four smaller arrow buttons. The top box is the run box and displays a selected file from the run list (files, typically ones you have written, that may be run in GAUSS). Clicking on the arrow to the right of the run box displays the entire run list. Similarly, the bottom box is the edit box.

run button
    Clicking on the run button will run the entire file currently displayed in the run box. You can add files to the run list by using the first "run" option in the file menu. If you have output from a program run, the output will be printed in the command window (as well as in an output file that you specify; see below), and you can scroll up and down the window to view your output. If you have graphs, a window for each graph will be displayed. Before running a GAUSS program, GAUSS automatically saves the current version of the program. So if you are experimenting with new code, you might want to save your work under a new file name before overwriting your old code. You can do this by pulling down the file menu in the GAUSS-Edit window and clicking on Save-As.

save button
    Clicking on the save button will save the file currently displayed under its existing file name, thus overwriting the previous version.

edit button
    This functions analogously to the run button. You can also choose files to edit by pulling down the File menu in the GAUSS-Edit window and selecting Edit.

stop button
    This button will stop your program while it is running. This is particularly useful if you appear to be stuck in an infinite loop. Be patient when clicking the stop button: sometimes there is a significant delay before GAUSS recognizes the command.

debug button
    Again, nothing to get excited about here.
    The debug button just compiles your program without running it, to check for syntax errors. But GAUSS does this before running a program anyway.

running only part of your program
    If you want to run only a portion of your program, highlight that section and click on the run button. This is useful for debugging your code, but it won't work if there is an error somewhere else in the program.

GAUSS is actually a very simple language for writing programs, but it takes practice. Before getting into the mathematical commands allowed in GAUSS, note the following basic operating commands that are needed in many if not all GAUSS programs. One thing to keep in mind is that blank lines and extra spaces in your code typically are ignored by GAUSS.

;
    The semi-colon must end every command in GAUSS (many hours have been spent debugging code because of missing semi-colons). The command itself can be written over several lines, but every command must have a semi-colon between it and the next command regardless of where line breaks occur.

new;
    This command should start all of your programs.

end;
    This command should end all of your programs. There are other choices such as “closeall;” that may be more appropriate, but “end;” will always work.

output file=<path>\filename.out reset;
    This command creates an ASCII output file for your program, and the “reset” portion allows that file to be overwritten at every run. If you use “on” instead of “reset”, all printed output will be appended to the specified file. You can also have the output file saved as a text file by replacing “.out” with “.txt” if you prefer.

outwidth
    This command sets the width of your output file. To avoid lines wrapping around on an 8-1/2 by 11 inch page, use “outwidth 256;”.

format
    Controls the format of numerical output. Adding “/rd” returns a right-justified signed decimal number of the form [-]####.####, where #### is one or more digits. The field width and the number of decimal places are specified by “#,#”. For example, to format output with a field width of seven and two decimal places, the command is “format /rd 7,2;”. This leaves four spaces to the left of the decimal point, one space for the decimal point, and two spaces to the right.

screen off
    This prevents output from being printed to the screen (but it still is saved in your output file). This is often useful when your program has a lot of output, because it takes much less time to calculate something than it does to calculate and display it simultaneously. For programs with little output this is no problem and you won't need it. “screen on” will turn the screen back on in the middle of a program; the screen comes back on automatically when the program finishes executing.

/*…*/ or @…@
    These are comment markers. GAUSS will ignore everything you type between them. They are equivalent, except that “/*” opens a comment and “*/” closes a comment, whereas “@” both opens and closes a comment. Comments are very useful for organizing your code (especially when the number of variables gets large) and for testing sections of your code (to isolate a problem, you can temporarily prevent certain lines from running without deleting text).

print "whatever"
    Use this to print text to the screen or to an output file. Here, “whatever” will be printed. The quotation marks are necessary, but the print command is not: print is the default command in GAUSS. So the commands “print "whatever";” and “"whatever";” are equivalent. To print a GAUSS object, such as a matrix, just type its name with no quotation marks. So “print "whatever " name;” will print the text “whatever ” followed by the contents of name. Notice that the space after “whatever” is inside the quotation marks; that is how you must insert spaces. You also need a space between the closing quotation mark and “name”, because the two are treated separately. Simply typing “name;” will print the object.
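The operating commands above can be combined into a minimal program skeleton. This is only a sketch: the output file name starter.out and the variable x are made up for illustration, and the file is written wherever GAUSS is currently working because no path is given.

```gauss
new;                                  /* start with a clean workspace */

/* hypothetical file name; reset overwrites the file on every run */
output file = starter.out reset;
outwidth 256;                         /* avoid wrapped output lines */
format /rd 7,2;                       /* field width 7, two decimal places */

x = 3.14159;                          /* illustrative value */

print "The value of x is " x;         /* text, then the contents of x */
"This line is printed as well";       /* print is implied */
x;                                    /* prints the contents of x */

end;
```

Because of the format statement, x is displayed rounded to two decimal places. Changing reset to on would append each run's output to the file instead of overwriting it.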
GAUSS supposedly has a nice graphics program (pgraph), but I've never used it; I always import output files into Excel to create graphs and tables. It's up to you.

? or ""
    This is used to add a blank line when printing output.

timestr(0)
    This prints the current time.

datestr(0)
    This prints the current date.

Mathematical, Matrix and Logical Operators:

Almost all common functions are included in GAUSS, like the absolute value function, the log function, the sine, square root, gradient, integral, determinant, eigenvalue, rank, etc. A list is given in the front of Volume 2 of the manual (the command reference, ch. 21). It might be worth a quick read. Commands also are listed in the pull-down help file. Note that many common statistical operations, such as the mean, median, and correlation matrices, are supported as well.

Below, “x” signifies a mathematical object, typically a scalar, vector or matrix, but almost any text string (without spaces) could be substituted in its place. The exceptions are the reserved words (such as proc, col, row, etc.) listed in Appendix G of the manual. “x” must be initialized before it may be included in any operation (more on this later). Note that typing “y=ln(x);” assigns the value given by ln(x) to the variable y and stores it in memory, but typing “ln(x);” only prints the value given by ln(x) to the screen and/or output file. Virtually all of these commands may be used both in your code and at the [gauss] prompt in the command window. Also note that “(m x n)” should be read as “m by n”.

*            multiplication
/            division
+            addition
-            subtraction
^            raised to the power of (e.g., “x^2” is x-squared)
=            equal to; typically used to assign values (e.g., y = ln(x);)
eq           equal to; typically used to check equality (e.g., if x eq 0; y = 1; endif;)
ne           not equal to
ge           greater than or equal to
gt           greater than
le           less than or equal to
lt           less than
.            combining the period with any of the above operators causes the operation to be done element-by-element. If you are dealing with matrices, the default is to use matrix operations, not element-by-element operations, so be sure which you want.
x'           transpose of x
x~y          horizontally concatenates x and y. That is, it takes these two matrices or vectors and combines them into one by placing them side by side.
x|y          vertically concatenates x and y. This works just like “~” except that it stacks them on top of one another.
rndu(m,n)    generates an (m x n) matrix of random numbers drawn from the uniform distribution

cdfn(x)
    Computes the cdf for the normal distribution. If “x” is (m x n), then “cdfn(x)” generates an (m x n) matrix of cdf values. “cdfnc” computes “1-F(x)”. The same process is supported for the chi-square, F, t, beta, exponential, and gamma distributions. These commands all start with “cdf” as well, and you can look them up in the command reference. We won't have many opportunities to use other distributions, so these should be enough for you. Similarly, “pdfn” computes the pdf for the normal distribution.

ceil(x)      rounds x up
floor(x)     rounds x down
round(x)     rounds x to the nearest integer

exp(x)       computes the exponential function of x
ln(x)        computes the natural log of x
log(x)       computes the log (base 10) of x
sqrt(x)      computes the square root of x
abs(x)       computes the absolute value of x

gradp(&f,x0) computes the gradient vector or matrix (Jacobian) of a function that has been defined by a procedure.
hessp(&f,x0) computes the matrix of second partial derivatives (Hessian) of a function defined in a procedure.

det(x)       computes the determinant of a matrix (x)
inv(x)       computes the inverse of a matrix
invpd(x)     computes the inverse of a symmetric positive definite matrix. (What is the difference? None, mathematically, but a lot in terms of computation time, especially for large matrices. For symmetric, positive definite matrices (such as moment matrices), “invpd” is about twice as fast as “inv”.)
rank(x)      computes the rank of a matrix
eig          computes the eigenvalues of a matrix
eigv         computes both the eigenvalues and the eigenvectors of a matrix. (As with calculating an inverse, there are faster computation methods available under certain conditions. Refer to the manual for these.)

Most of the following commands work on either a vector or a matrix. When “x” is a column vector, each command returns a scalar; when “x” is a matrix, each command returns a vector.

sumc(x)      computes the sum of the elements in each column of a matrix
cumsumc(x)   computes the cumulative sum of the elements in each column of a matrix
prodc(x)     computes the product of the elements in each column of a matrix
cumprodc(x)  computes the cumulative product of the elements in each column of a matrix
maxc(x)      returns the maximum element of each column
minc(x)      returns the minimum element of each column
rows(x)      returns the number of rows of a matrix
cols(x)      returns the number of columns of a matrix
meanc(x)     computes the mean of each column
medianc(x)   computes the median of each column
corrx(x)     computes the correlation matrix of x
corrvc(x)    computes the correlation matrix from a variance-covariance matrix
crossprd     computes a cross product
sortc(x,1)   sorts the rows of a matrix according to the numeric order of the first column (the second argument is the number of the column to sort on). There are other “sort...” commands which get a little fancier.

eye(n)       creates an (n x n) identity matrix
zeros(m,n)   creates an (m x n) null matrix. This is commonly used to initialize matrices.
For example, if we want to record the results from each run of a loop in a matrix, we need to initialize the matrix with zeros before starting the loop.

ones(m,n)
    Creates an (m x n) summer matrix of ones. Useful for defining a constant term in a regression.

if logical expression; what you want to happen; endif;
    This is how you set up a conditional expression. Note that GAUSS has no “then” keyword: the semi-colon after the logical expression separates the condition from the commands that follow. Notice the “endif;”: you always need to close these expressions with “endif;”. The logical expressions supported here include the operators shown above plus “and” and “or”. You can include as many “and” / “or” expressions as you like, but it's often useful to use parentheses to keep track of how they relate to one another. To express alternatives, add a line directly after the commands of the “if” block of the form “elseif logical expression; what you want to happen;”. You can use as many “elseif” expressions as you like.

x[m,n]
    This is not a command by itself but is an integral part of many commands. In an expression, it refers to the element in row m and column n of the matrix x; in “let” and “load” statements, it declares x to be an (m x n) matrix.

y=x[2:5,1:3]
    This assigns to the variable y a submatrix of the matrix x. Here, y is now a (4 x 3) matrix with values given by the corresponding elements in rows 2 to 5 and columns 1 to 3 of the x matrix. The operation also supports scalar expressions in place of numerical values, such as y = x[1:rows(x)-1,1:cols(x)-1], which assigns all but the last row and column of the matrix x to the variable y. Or y = x[a:b,c:d], where a, b, c and d have been defined previously in the program. If you write y=x[.,c:d], then all rows of x are preserved; if you write y=x[a:b,.], then all columns of x are preserved (notice the “.” inside the square brackets).

y=selif(x, [logical expression])
    This selects the rows of the x matrix that satisfy the logical expression and assigns them to y. See the command reference for an example.
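The indexing, conditional, and selif syntax described above can be sketched as follows. The matrix x and the variable names y, lastcol, and big are invented for illustration; the element-by-element comparison “.gt” is used so that selif receives one 0/1 entry per row.

```gauss
new;

/* illustrative (4 x 3) matrix */
x = { 1 10 100,
      2 20 200,
      3 30 300,
      4 40 400 };

/* submatrix: rows 2 to 3, columns 1 to 2 */
y = x[2:3,1:2];

/* all rows, last column only */
lastcol = x[.,cols(x)];

/* conditional with elseif and else; note there is no "then" */
if rows(x) gt cols(x);
    print "x has more rows than columns";
elseif rows(x) eq cols(x);
    print "x is square";
else;
    print "x has more columns than rows";
endif;

/* keep only the rows whose first-column entry exceeds 2 */
big = selif(x, x[.,1] .gt 2);
print big;

end;
```

Replacing selif with delif in the last step would instead drop the rows that satisfy the condition and keep the rest.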
For very large data sets, selif requires a lot of memory because it loads the entire x matrix each time it checks the logical expression. In such cases, it may be better to write out the code explicitly to accomplish this task.

y=delif(x, [logical expression])
    This deletes the rows of the x matrix that satisfy the logical expression and assigns the resulting matrix to y. See the command reference for an example.

Data Issues:

Several of the commands shown below use “let” to begin the command line. In many cases, this is not necessary; in fact, GAUSS may sometimes produce an error message if you use “let”. For example, you cannot use “let” to concatenate matrices you have already made (e.g., “let x=a~b” is not supported where “a” and “b” are matrices, but “x=a~b” is supported). However, sometimes GAUSS seems to produce an error message if you don't use “let”. This inconsistency is due to the fact that “let” has been phased out of the GAUSS source code, but not entirely. The old command reference shows a lot of “let” commands, but you should ignore most of these.

let x = { 1 2, 3 4 }
    This returns a (2 x 2) matrix with 1 and 2 in the first row and 3 and 4 in the second row. The comma is necessary to demarcate the rows. “let” isn't necessary.

let x = { 1 2,
          3 4 }
    GAUSS does not distinguish between this command and the one shown above. Use this one if you find it convenient to type in data in matrix form. Note that you have to manually stack the matrix by hitting return and inserting spaces. “let” isn't necessary.

let x[2,2] = 1 2 3 4
    This returns the same matrix as above. “let” is necessary.

let x = 1 2 3 4
    This returns a column vector of “1 2 3 4”. Notice that the column vector is the default data format for GAUSS: unlike the previous commands, GAUSS doesn't know what dimensions to use here, so it uses a column vector. “let” is necessary.

let x[2,2]=1
    This returns a (2 x 2) summer matrix (i.e., a matrix of ones). You could also use “ones(m,n)”; either is supported. “let” is necessary.

let x[2,2]
    This returns a (2 x 2) null matrix (i.e., a matrix of zeros). You could also use “zeros(m,n)”; either is supported. “let” is necessary.

y = reshape(x,r,c)
    This reshapes a matrix x of arbitrary dimensions into an (r x c) matrix y. The first c elements are put into the first row of y, the second c elements into the second row of y, and so on. If there are more elements in x than in y, the remaining elements are discarded. If there are not enough elements in x to fill y, then when reshape runs out of elements, it goes back to the first element of x and repeats.

seqa(starting point, increment, number of elements in sequence)
    This is a quick way to generate a vector of regularly spaced values (e.g., the integers 1 to 10). “seqm” works the same way but multiplies rather than adds to get the successive elements.

GAUSS runs in your computer's RAM, but data and output are stored on disk. Therefore, data must be loaded into RAM before GAUSS can use it, and output must be stored on disk if you intend to use it after quitting GAUSS. (Note: screen output is stored in the specified output file as discussed previously. The commands presented here are for storing data not printed to the screen.) There are several different ways to load data into a GAUSS program and to save objects in GAUSS to disk. Various “load” and “save” commands are the most common, but others such as “import” also exist.

Before you load or save data you must specify a path, one for loading and one for saving. The easiest way to do this is just to type “load path = drive:\folders;” and “save path = drive:\folders;”. These will remain the defaults unless you overwrite them later in the program.

load x=name of matrix; or load x=name of text file.txt
    This will load a previously saved GAUSS matrix (with the default extension “fmt”) or text file and assign it to the matrix “x”. “loadm” will also work for matrices.
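The data-creation and save/load commands above can be tried together in a short sketch. All variable names here are illustrative, and no load or save path is set, so the assumption is that the .fmt file is written to and read from whatever directory GAUSS is currently using.

```gauss
new;

x = { 1 2, 3 4 };          /* (2 x 2) matrix; let is optional here */
let v = 1 2 3 4;           /* column vector; let is required here */

s = seqa(1, 1, 6);         /* column vector 1, 2, ..., 6 */
m = reshape(s, 2, 3);      /* rows are 1 2 3 and 4 5 6 */
w = reshape(s, 2, 4);      /* needs 8 elements, so it wraps around to the start of s */

print m;
print w;

save m;                    /* writes m to disk with the default fmt extension */
clear m;                   /* frees the memory; the copy on disk is untouched */
load m;                    /* reads the saved matrix back into m */
print m;

end;
```

In a real program, the “load path” and “save path” statements would normally come first so that the files land in a known folder.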
load x[m,n]=name of ASCII file.asc
    This will load an ASCII file and assign it to the matrix “x”. Note that if you mess up the dimensions, GAUSS will not notice and will just reshape the data to fit the “m” and “n” you specified (e.g., if you have a file with 8 entries and you mean it to be a 4 by 2 matrix but you type “load x[2,4]=name of file”, you will get a 2 by 4 instead of a 4 by 2).

x=loadd("name of file")
    This will load a previously saved GAUSS data set or small ASCII file and assign it to the matrix “x”. Note that the quotation marks are required even though you will NOT see them in the manual. Read the section on the “Atog” utility for instructions on converting an ASCII file into a GAUSS data set prior to loading; this will be required with any substantial ASCII file. The program DBMSCOPY can be used to convert ASCII files to GAUSS data sets as well.

save x;
    This saves the object x as a GAUSS object named “x”. Here, “x” can be a procedure, function, matrix, string, etc. This is much more convenient than using the output file (which essentially is just a printout) because “save” allows you to easily reload the object into another GAUSS program. DO NOT put an extension on the object; GAUSS will assign the file a default extension. Leave this alone; it will make your life easier when you reload the object. If you want to save an object without specifying a default path, you can write “save x=drive:\folders\file;”.

y=saved(x,name of file,variable names string);
    This saves the matrix x as a GAUSS data set. This may be convenient if you want to transform some variables and then save them for later use in a regression, particularly when your data set is large. “save” also allows you to do this.

clear x
    Saving an object to disk does not clear the RAM associated with that object. This can be a problem when you are working with large data sets. Use “clear” to free up RAM after you have saved an object. If you clear x before saving x, you will permanently lose x.

x=missrv(x[.,1],y[.,1])
    This replaces any missing values in the first column of x with the corresponding values in the first column of y. “miss” reverses the process and substitutes GAUSS' missing value code wherever it finds the number you specify. “code”, “recode” and “substute” also do roughly the same thing, except that you define exactly what to look for (e.g., some number) and what to substitute (e.g., another number).

Graphics:

GAUSS has a pretty good graphics program, and it may be worth learning. If you have more than one series to graph, you can graph them together in one graph or in several different windows on the same page. GAUSS also supports 3-D graphs. Personally, I prefer to load data into Excel to generate graphs, so I am fairly unfamiliar with GAUSS' graphics capabilities. But here is a brief intro.

library pgraph
    This lets GAUSS know that you will be using the graphics module. There are several separate modules for use with GAUSS (e.g., one for maximum likelihood estimation), and you must always “open the door” to them at the beginning of the program if you will need to use them.

title("title of the graph")
    This puts a title on your graph. Note that the quotation marks are required.

ylabel("label for the y axis of your graph")
    This puts a label for the vertical axis on the graph. Note that the quotation marks are required.

xlabel("label for the x axis of your graph")
    This puts a label for the horizontal axis on the graph. Note that the quotation marks are required.

xy(x,y)
    This makes a 2-D graph of your x and y vectors (x and y must be column vectors, so transpose them if necessary). You can also concatenate two series in order to plot them on the same graph. For example, say you have the distribution of income for two groups (denote by x and y the distributions for groups A and B) which you want to graph together. If z=x~y and w is income, then “xy(w,z)” graphs both series together.
_plegstr="graph1\000graph2"
    This gives legend text in a box on your graph (here, “graph1” and “graph2”). If you have several curves, separate the text for each with “\000”.
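Putting the graphics commands together, a two-series plot might look like the sketch below. The data are made up (two normal densities standing in for the two income distributions), and two items go beyond what the text covers and should be treated as assumptions to check against the pgraph documentation: “graphset”, used here to reset the graphics settings, and “_plegctl = 1”, used here to switch the legend on.

```gauss
new;
library pgraph;
graphset;                          /* assumption: resets pgraph settings to defaults */

w = seqa(0, 0.1, 101);             /* hypothetical income grid: 0, 0.1, ..., 10 */
x = pdfn(w - 4);                   /* made-up distribution for group A */
y = pdfn(w - 6);                   /* made-up distribution for group B */
z = x ~ y;                         /* one column per series */

title("Income distributions");
xlabel("income");
ylabel("density");

_plegctl = 1;                      /* assumption: turns the legend on */
_plegstr = "group A\000group B";   /* one legend entry per column of z */

xy(w, z);

end;
```

Each column of z is drawn as its own curve against w, which is why the legend string has two entries separated by “\000”.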