numerical/statistical - Agricultural & Applied Economics

Agricultural and Applied Economics 637
Lab Section 1: Introduction to GAUSS
January 24, 2007

As noted in class, we believe the best way to really learn econometrics is to go “inside the black box”. As such, throughout the semester you will be asked not only to undertake empirically based assignments but also to develop your own parameter estimation method(s). We have found that the GAUSS software package is a very useful tool for understanding applied econometrics, as it allows you to translate what is presented in class directly into an operational software program. This software has been used extensively by econometricians since the early 1980s. I started using this system in 1983 (before hard drives were available on the typical desktop computer).

The best way to start learning GAUSS is to use it. Today’s lab exercise will take you through a basic introduction to the software and familiarize you with the programming language. You’ll want to save your code when you are finished, as we’ll build on it next week.

Be aware: the steps outlined here are just one way of doing things. There are many ways to edit, run, and write code, and coding is a very personal thing. As we go through the class, you will see more methods and you will inevitably develop your own style. Over the years, I have found a number of strategies that help with the development and debugging of my own computer code. I will offer these insights to you over the course of the semester, and I hope you will find them useful as well.

The GAUSS software is a very extensive system that can be used for a variety of numerical/statistical operations. You will not use all of the features and commands associated with this program during this course. You will, however, be able to undertake a variety of basic as well as fairly advanced tasks by the end of the course.
I recommend that you become familiar with Chapter 4 (Introduction to the Windows Interface) and Chapter 5 (Using the Windows Interface) of the GAUSS Users Guide, which you can download from the class website. We also encourage you to learn to use the on-line help system, available from the main menu once GAUSS is started (Reference and Command Reference Sections) and from the GAUSS Users Guide.

The goal today is to make sure everyone knows how to get started developing GAUSS code, accessing GAUSS data sets and outputting the results in a readable format. So let’s get started.

1. LOG INTO THE AAE NETWORK

If you have an AAE account, log in as usual. If you do not have an AAE account you can log in using a guest account:

    Username=heccquest
    Pwd=Ag&AppEc

2. OPEN GAUSS

Go to the Start menu → Programs → Statistics → GAUSS 7.0. (Note: If you are trying this on your own machine, you will probably have an icon on your Desktop. The GAUSS program will also probably be in a different location than the one above.)

This opens the GAUSS program. Close the Tip of the Day box if it appears. The “Command Input-Output” window is open, and the cursor is blinking at the GAUSS prompt. Here you can enter commands line by line and see your output in the same window. (Note: you can also put the following commands in an ASCII file, with each command followed by a “;”, and tell GAUSS to run this file.) For example, type

    cls        <hit the return key>   (This clears the screen of any characters)
    y=16+15    <hit the return key>   (This adds 16 and 15)
    y          <hit the return key>   (This displays the current value of y, which happens to be a scalar)

and see what happens.

3. WRITING A PROGRAM

You can undertake various tasks using GAUSS one command at a time (Command mode) or by running a bunch of commands at once (Batch mode). To run a file in batch mode you first create an ASCII file containing these commands.
This file can be created using your own favorite ASCII file editor (e.g., Notepad, Tedit, etc.) or you can use the native GAUSS program editor. (Note: I use a specialized editor designed for programmers because of some of its features.) Let’s assume that you will be using the GAUSS editor. Once GAUSS is started, click on the first button on the toolbar (or File→New). This opens an editor window where you can write an entire program and run it all at once. It helps to go to the Window menu and choose “Tile Horizontal”. Now you have your edit window in the top half and the command I-O window on the bottom half of the right side of your screen.

Open the file example_1.gas, which is located on the network temporary drive (T:\), via the GAUSS program editor. You should see something like the following:

    new;                     /* Use this to start every program. This clears
                                out previous information from past programs */
    cls;                     /* Clears the screen of previous GAUSS output */
    let a={1 2 3, 4 5 6};    /* One of several ways to create a (2x3) matrix
                                with these particular elements; see Felix
                                Ritchie’s “Basic operations”. NOTE the let
                                command is necessary */
    b=ones(2,3);             /* Creates a matrix with 2 rows and 3 columns,
                                full of 1’s */
    c=a + b;                 /* Defines the variable c */
    "a" a;                   /* Prints the letter “a” and then the matrix
                                called a */
    "b" b;
    "c" c;
    end;

FYI: a few general comments about the code.

- There must be a semicolon to end every command. You can have many commands on the same line, but they must be separated by semicolons. Much time has been wasted trying to debug code that doesn’t run because of a forgotten semicolon!!
- You can spread a long command over separate lines of your code so long as a “;” is not used until the command ends.
- You should not have lines longer than 80 characters (for printing reasons). If your line is more than 80 characters, continue your code on a 2nd line, indent, and keep typing.
- Every program should start with new; and finish with end;
- Any variable to the left of an equals sign is a name I made up, i.e., a name that I want to define for GAUSS.
- Anything to the right of an equals sign is a variable name I already defined or a GAUSS command/operation/function/reserved word.
- Comments, which don’t affect the operation of the program, are located between /* and */

The GAUSS editor has the normal Windows-based file management system, so you can edit and save this file as in any other program you have used before. I recommend that you save this file to your H drive. Go to File→Save, and save it on your H drive or a removable disk. (Hint: When creating GAUSS program files, give them a unique file extension different from “txt”. Personally, I use “gas” as the file extension for GAUSS program files.)

With your cursor located in the GAUSS editor window, click on the Run Active File button on the toolbar (the 12th from the left) or go to Run→Run Active File. What happened? Is the matrix c what you expected it to be?

Now, in your edit window, add the following to the end of your program JUST BEFORE THE END STATEMENT. What do you expect it to do?

    "d"; d=a*b; d;

Run the program again and notice what happens: it doesn’t work. That is, you receive an error message in the bottom window that looks similar to the following:

    M:\aae637\spring_07\example_1.gas(9): error G0036: Matrices are not conformable

The number in parentheses indicates the line number that generated the error. (Note: For more complicated code, and depending on the error, the error will be in the neighborhood of this line number.) This new command doesn’t make sense, since the “a” and “b” matrices are not dimension compatible with respect to matrix multiplication. This error was found while GAUSS was compiling the program file. As such, given the severity of the error, GAUSS did not run any of the program. There is a useful message in the command window, though.
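Since the message points at conformability, one quick check is to print the dimensions of each matrix from code. The following is only a minimal sketch, assuming the same a and b as in example_1.gas; rows() and cols() are native GAUSS functions, and the wording of the printed labels is made up for illustration.

```gauss
new;
let a={1 2 3, 4 5 6};    /* (2x3) matrix, as in example_1.gas */
b=ones(2,3);             /* (2x3) matrix of ones, as in example_1.gas */

/* Print the number of rows and columns of each matrix */
print "a: rows, cols" rows(a) cols(a);
print "b: rows, cols" rows(b) cols(b);

/* a*b requires cols(a) to equal rows(b) */
if cols(a) ne rows(b);
    print "a*b is not conformable: cols(a) differs from rows(b)";
endif;
end;
```

The comparison is made on scalars, so the ordinary ne operator is used here rather than an element-by-element (dot) operator.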
When you get this message, it is often helpful to verify the dimensions of your matrices. You will find the Source View window very useful for debugging your code. From within the Source View window, if you click on the Symbols tab and then click on matrices, you see a list of the matrices you created and their sizes. If you click on a particular matrix, a matrix viewer window opens and shows you the values of that matrix.

The problem here is that the GAUSS command * performs matrix multiplication, and here you are telling it to multiply a (2x3) matrix by a (2x3) matrix (which, as you should remember, is not possible). Change the last line to read

    d=a .* b; "d"; d;

Also add the line

    e=a*b'; "e"; e;    /* Note the transpose */

Run the code again. Does it work? What is different about these? What do these commands do?

4. READING IN A DATA SET

For most of the assignments and in-class discussion, we will work with data sets that have already been created. So, we need to open and read in this data file. I will always provide the data as a GAUSS data set, although you can also read ASCII files and even Excel spreadsheets directly into GAUSS. GAUSS data sets (like STATA and SAS data sets) are binary files that cannot be opened and examined using an ASCII (text) editor.

DBMSCOPY is a useful program that can transform any ASCII or binary file of a particular type (e.g., SAS, STATA, EXCEL, GAUSS) into any other file type. Besides being able to transform data from one form to another, this software also has the nice characteristic of allowing you to view any type of data, undertake data management activities and obtain descriptive statistics of the data. The DBMSCOPY manual can be downloaded from the class website. (It can be found in the GAUSS Programming & Programming Hints Section.)

For now, I have put a GAUSS data set on the temp drive, T, in the folder aae637.
The data set is called china_00. (Warning: This file has a “dat” extension. Do not try to open this file from your file manager: the file is binary, and the file manager will try to open it with Notepad.) Copy this file to your H drive or removable disk. Normally you will have to download and unzip the data set from the course webpage to your H drive, temp drive, disk, etc.

A GAUSS data set (in V92/V96 format) consists of the actual data and information as to variable names, variable type and column location in the data set (in binary form). One complaint I had when first learning GAUSS is that I couldn’t start by “seeing” my data. DBMSCOPY helps me here too. You can open a GAUSS data set in DBMSCOPY and look at the variable names and locations, create new variables and a lot of other stuff. You can try this now or later; ask me if you want to know more. (Note: One caveat when using DBMSCOPY is that if you want to convert, for example, a SAS data set to a GAUSS data set, make sure you save the file in the GAUSS V92/V96 format.)

Start DBMSCOPY and look at the GAUSS data set china_00.

4.1 Accessing DBMSCOPY Remotely

Note that you can do the above at home or at some other remote location, even if you do not have access to DBMSCOPY, so long as you have internet access, XP Professional is your computer’s OS and you have a valid AAE account. If you meet these requirements, you can use the Remote Desktop feature of the XP OS to connect to the AAE Remote Server.

If you are at a remote location, click on START, then PROGRAMS, then ACCESSORIES, then COMMUNICATIONS, then REMOTE DESKTOP CONNECTION. The computer you want to access is our remote server, which has the computer name (address) of: remote.aae.wisc.edu . Once you enter this address, you will see what looks like another Windows session. What is actually occurring is that your local machine is starting a session on another Windows machine (i.e., the remote server).
You can then log in to the remote computer using your AAE username and password. The remote machine is set up exactly like (or very similar to) the regular AAE lab machines. This means that you have access to all the programs in the AAE lab from your remote location. Since you are logging in using your AAE account, you have the same drive mappings as you have when you log into a lab machine. This means that so long as your data, programs, etc. are stored on a network resource, you have access to them from your remote location.

4.2 GAUSS Code for Reading in a Dataset

Now, I would like you to create a new GAUSS command file. Use File→Close to close the above GAUSS command file. Then use File→Open to open the file example_2.gas. Use the Window→Tile Horizontal command sequence to place your various windows in an orderly manner. Type the following lines into the GAUSS edit window. (Note: Where I’ve written t:\\aae637\\, type the location of the folder into which you’ve copied the data, making sure you use the double backslash (e.g., h:\\private\\). Also note the use of quotation marks.)

The following is a listing of the commands contained in the above GAUSS command file.
    new;                                 /* Command to clear memory */
    cls;                                 /* Clears screen from previous output */
    format /rd 8,2;                      /* Formats screen output, 8 places, 2 decimal pts */
    outwidth 256;                        /* Output width of your printer/screen */
    basepath="t:\\aae637\\";             /* Create path acronym */
    outpath="t:\\aae637\\output\\";      /* Create path to where output file is to be placed */
    datafile=basepath $+ "china_00";     /* Identifies GAUSS data set */
    outfile=outpath $+ "china_ex.out";   /* Complete path for output ASCII file */
    output file=^outfile reset;          /* Identifies output file */
    open fp=^datafile varindxi;          /* Open data file for reading */
    numobs=rowsf(fp);                    /* Determine number of observations */
    mydata=readr(fp,numobs);             /* Read in data all at once */
    vvv=close(fp);                       /* Close data file */
    print "I have successfully run the program";
    end;

This block of text, or something very similar to it, will be needed at the beginning of virtually all of your GAUSS programs. What do the above commands do?

format controls the format of numerical output. The values “/rd 8,2” give a right-justified decimal number allowing 8 total spaces, 5 to the left and 2 to the right of the decimal point (e.g., 62534.78). THIS IS IMPORTANT: ON ASSIGNMENTS I WANT THE NUMBERS FORMATTED TO A REASONABLE NUMBER OF DECIMALS. Multiple format commands can be used in a single GAUSS run to control the look of specific output. You can look up the format command in the on-line help system. Give it a try.

outwidth sets the width of your output file. You’ll almost always use 256.

basepath is a user-supplied name I made up and defined as the words in quotes. This saves typing below. It contains a path that will be prepended to a file name later on to completely identify the location and name of a particular file. Make sure the directory exists.

outpath is similar to basepath but defines the location (directory) where your output is to be placed. Make sure that this directory exists before you run your program.
datafile tells GAUSS which GAUSS data set to open and where it is located. In this case, I want to open the file containing the GAUSS data set china_00, located in the folder t:/aae637. I could type

    datafile = "t:\\aae637\\china_00";    /* Note the double back-slashes */

Since I defined a variable called basepath, I can save typing by using the syntax above. This is useful if your folder path is very long, for example if your data is stored in h:/private/school/spring2007/aae637. Note that you do not need to include the “.dat” extension in the GAUSS data set name. For some reason people don’t seem to remember this. It just saves you typing.

outfile is a user-supplied name I made up to identify the location and name of an ASCII output file I want GAUSS to create. After I run my program, I can go to my computer, look in my t:\aae637\output folder and see a file called china_ex.out.

output file actually creates an ASCII output file for your program, and the “reset” option overwrites the contents of this file with every run. You can specify that the output file be saved as a text file by replacing “.out” with “.txt” if you prefer. The “^outfile” tells GAUSS to place the value of “outfile” here.

open fp opens the data file identified by datafile. The varindxi option allows you to identify variables by their name instead of their column location. varindxi creates one new variable in memory for each variable in the GAUSS data set; each is named by adding the letter “i” to the front of the data set variable’s name (using its first 7 letters). The values of these scalar variables are the column numbers of the variables in the GAUSS data set being accessed. This information lets you access specific variables in a data set without knowing their column locations. You only need to know the variable name. Use the Source View window to verify the contents of these varindxi variables.

rowsf(fp) identifies the number of rows in the file fp.
rowsf is a native GAUSS command. Here we call this number numobs. This name is arbitrary and you can call it whatever you want. Don’t confuse the rowsf command with rows(x), which returns the number of rows in a matrix x. Use the online help system for more information about these commands.

readr(fp,numobs) reads numobs rows from the file fp and assigns this data to a matrix referred to using the arbitrary name mydata. The value of numobs was obtained from the previous command.

close(fp) closes the file fp.

FYI: You must first open and then read the data. It is important that you read the correct number of rows. Too many rows and things don’t work; too few and you lose data.

5. RUN THE CODE, AND CHECK TO MAKE SURE YOU HAVE SUCCESSFULLY LOADED THE DATA

Hit the run button. Go to the Source View window, click on the Symbols tab and then click on the matrices section. You should see that the matrix mydata has a dimension of (2050 x 4). This means there are 4 variables in the data set with 2050 observations.

6. IDENTIFY YOUR VARIABLES

The variables in this data set are:

    Variable name    Description
    percinc          Per capita income
    fah              At-home food expenditures
    fafh             Away-from-home food expenditures
    region           Region (either 1, 2, or 3)

After the “readr” command you have a (2050 x 4) matrix of data. The matrix name is “mydata”. To make things easier, we might like to identify the columns of the data matrix by name. The varindxi option we used allows us to do this easily if we know the variable names, even if we don’t know which column is which variable. Go to the Source View window, click on the Symbols tab and then click on the matrix ipercinc. You should see the value “4”, indicating that the per capita income variable is in column “4” of the GAUSS data set.
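The same checks can also be made from code instead of the Source View window. The lines below are only a sketch of statements that could be placed after vvv=close(fp); in example_2.gas; they assume that mydata, numobs and the varindxi index variable ipercinc already exist from that program, and the printed labels are made up for illustration.

```gauss
/* Sketch: place after vvv=close(fp); in example_2.gas.
   Assumes mydata, numobs and ipercinc were created by that program. */
print "rows and columns of mydata" rows(mydata) cols(mydata);
print "numobs reported by rowsf" numobs;
print "column holding percinc" ipercinc;
```

Because the output file has already been opened at this point, these lines are written to china_ex.out as well as to the screen.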
In your command window, type each of the following:

    inc=mydata[.,ipercinc]    <Hit the return key>
    fafh=mydata[.,ifafh]      <Hit the return key>
    reg=mydata[.,iregion]     <Hit the return key>

This tells GAUSS that the user-defined variable inc is all observations in the 4th column of the user-defined mydata matrix (i.e., the column of information associated with per capita income), etc. The use of the “.” in the above commands tells GAUSS to grab all the rows. If you wanted to grab only the 120th through 220th observations for some reason, you would type something like:

    inc_2=mydata[120:220,ipercinc]

Go to the Source View window and then click on the Symbols tab to look at the income (INC) and food-away-from-home (FAFH) matrices. They should both be (2050 x 1) in size.

If you want to create a (2050 x 2) matrix containing per capita income and food-away-from-home expenditures, you could enter the following either in your command file or interactively:

    income_region=mydata[.,ipercinc ifafh];

Note there is no comma or other delimiter between ipercinc and ifafh. This grabs the two columns associated with PERCINC and FAFH from the mydata matrix using all the observations (which is what the “.” means). I could have created this new variable via the following:

    inc_reg_2=inc~fafh

The “~” operator represents horizontal concatenation of two matrices (vectors). In the command I/O window type the following commands:

    begind=numobs-10;
    fafh[begind:numobs]~reg[begind:numobs]

and see what comes up.

FYI: The square brackets attached to the end of a matrix name allow you to specify part of a matrix. For example,

    mydata[2,4]      is the single element in the 2nd row and 4th column of mydata
    mydata[2,1:3]    is the (1x3) vector of elements in the 2nd row and 1st through 3rd columns
    mydata[.,4]      is the vector of elements in the entire 4th column

FYI: If we did not know the variable names, we could get them with the following commands in the command window.
    varnames=getname(datafile);    /* varnames is an arbitrary name */
    $varnames;                     /* the $ tells GAUSS that varnames is a character matrix */

You could also look at this character vector using the Source View window, Symbols tab. If you click on the matrix varnames, all you see are zeros. That’s because GAUSS does not know that varnames is a character vector. If you click on the column header for (1), then click on format, then choose character, the window will show the elements of the varnames vector. In this example, the order of the variable names also reflects the order of the variables in the mydata matrix.

7. NOW LET’S DO SOME MATHEMATICAL MANIPULATIONS AND DATA INTERPRETATION

Refer to the GAUSS reference material, help manual, and the list of commands below to answer the following questions:

- What is the average income and food expenditures?
- What is the maximum level of income?
- What percentage of households live in region 1?
- What is the min, max, and mean fafh in region 1?

/********************************************************************/;

ADDITIONAL COMMANDS YOU SHOULD BE FAMILIAR WITH

    sumc(x)       returns the sum of each column of matrix x
    cumsumc(x)    returns the cumulative sum of the columns of matrix x
    meanc(x)      returns the mean of every column of matrix x
    stdc(x)       returns the standard deviation of the elements in each column of matrix x
    minc(x)       returns a column vector containing the smallest element in each column of matrix x
    maxc(x)       returns a column vector containing the largest element in each column of matrix x
    vcx(x)        computes the variance-covariance matrix
    zeros(r,c)    creates a matrix with r rows and c columns full of zeros
    ones(r,c)     creates a matrix with r rows and c columns full of ones
    eye(N)        creates an NxN identity matrix (you must either use a number in place of N or define N)
    rndn(r,c)     creates a matrix with r rows and c columns of independent std. normal random numbers
    sortc(x,c)    sorts matrix x in increasing order according to column c
    rows(x)       returns the number of rows in matrix x
    cdfn(x)       returns the Prob(z<=x) where z is a N(0,1) random variable

    y=selif(examdata[.,2:5], data_matrix2[.,4] .ge 10)

This creates a matrix y that contains selected parts of the previously defined matrix examdata. Here y is all rows of the 2nd through 5th columns of the examdata matrix for which the 4th column of the matrix data_matrix2 is greater than or equal to 10. (Note: This assumes that the number of rows of data_matrix2 is the same as the number of rows of the examdata matrix.)

Some logical expressions (Note: these are all element-by-element comparisons). Each comparison below creates a vector vvv of 0’s and 1’s of dimension rows(x), depending on whether the comparison is true or false:

    .gt  → greater than               e.g. vvv=(x .gt y)   =1 if xi > yi, 0 otherwise
    .lt  → less than                  e.g. vvv=(x .lt y)   =1 if xi < yi, 0 otherwise
    .ge  → greater than or equal to   e.g. vvv=(x .ge y)   =1 if xi ≥ yi, 0 otherwise
    .le  → less than or equal to      e.g. vvv=(x .le y)   =1 if xi ≤ yi, 0 otherwise
    .eq  → equal to                   e.g. vvv=(x .eq y)   =1 if xi = yi, 0 otherwise
    .and → allows you to combine logical expressions
           e.g. vvv=(x .gt 0) .and (y .le 0)   =1 if xi > 0 and yi ≤ 0, 0 otherwise
    .or  → allows you to examine the union of two logical expressions
           e.g. vvv=(x .gt 0) .or (y .le 0)    =1 if xi > 0 or yi ≤ 0 or both, 0 otherwise

The following illustrates the use of a “For Loop”.
You should refer to your reference material for instructions on its use. The For or Do loops can make your life much easier if you are doing repetitive tasks.

    x=zeros(20,1);       /* This initializes the x matrix so there are place
                            holders for later use. In general this only needs
                            to be done when using for or do loops */
    for i (1,20,2);      /* i is a temporary variable used to control the
                            loop. In this example the loop goes from 1 to 20
                            in increments of 2, i.e., 10 steps. That is, every
                            other element of x will be non-zero */
        x[i]=i;          /* the ith element of the vector x is assigned the
                            current value of i */
    endfor;              /* end of for loop */

A related approach builds the vector up one element at a time, with no need to initialize x first (here i runs over every integer from 1 to 20):

    for i (1,20,1);
        if i .eq 1;
            x=i;
        else;
            x=x|i;       /* This vertically concatenates the previous x
                            vector with another element equal to “i” */
        endif;
    endfor;

/* comments */ Anything in the comments area will not be read by GAUSS but can help you organize your program.

Matrix concatenation can be achieved via the following, assuming conformability:

~ → horizontally concatenates two matrices

In GAUSS type the following:

    let a1={1,2,3}    <Hit return key>
    let a2={4,5,6}    <Hit return key>
    a3=a1~a2          <Hit return key>
    "a1" a1           <Hit return key>
    "a2" a2           <Hit return key>
    "a3" a3           <Hit return key>

| → vertically concatenates two matrices

In GAUSS type the following:

    a4=a1|a2          <Hit return key>
    "a1" a1           <Hit return key>
    "a2" a2           <Hit return key>
    "a4" a4           <Hit return key>

8. RUN THE DEMO_GAUSS PROGRAM

Now you are ready to run a longer GAUSS program. From the Lab section of the class website, download the DEMO_GAUSS_07.GAS command file and store it locally. This file undertakes some basic matrix manipulations. After examining this file, run it in GAUSS and make sure you understand the various operations.

A FINAL NOTE: Try to develop a habit of writing clean and neat code (e.g., use of indentation for loops, lines of code that don’t go on forever, etc.).
This will facilitate your debugging of code and your ability to understand what is actually being undertaken. Refer to the GAUSS users manuals for guidance.
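As a closing illustration of how the summary commands and logical operators listed in Section 7 fit together, here is a small self-contained sketch. It deliberately uses a made-up (5x2) matrix called toy rather than the china_00 data, so the numbers and the names toy, inc, reg and inc1 are all hypothetical.

```gauss
new;
/* Hypothetical data: column 1 = income, column 2 = region (1, 2 or 3) */
let toy={100 1, 250 2, 175 1, 300 3, 125 1};
inc=toy[.,1];
reg=toy[.,2];

print "mean income" meanc(inc);
print "max income" maxc(inc);

/* (reg .eq 1) is a vector of 0's and 1's, so its mean is the
   share of rows that fall in region 1 */
print "share in region 1" meanc(reg .eq 1);

/* selif keeps only the rows for which the condition equals 1 */
inc1=selif(inc, reg .eq 1);
print "region 1 min, max, mean" minc(inc1) maxc(inc1) meanc(inc1);
end;
```

The same pattern, a 0/1 vector from a logical operator passed to selif or meanc, should carry over to the lab data once the columns have been extracted by name.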
