Lab 2 - Stat 5511 - Fall 2011

Please read the handout all the way through!

1) Our textbook offers a complete package of the data sets listed in its appendix. I have downloaded and formatted them for your use. Since we're on the Macs, we are unable to take advantage of the proc import command. Because of this, you will need to enter the variable names by hand instead of having them pulled in from the files (still, this is much nicer than typing in all the data...).

To "import" the files, follow these instructions:

    Open Terminal (again, use ssh -Y username@ub to log in).
    Move into your 5511 directory, then make a new folder called "txtfiles". (You can name it whatever you like; just stick with that name.) That is, type:
        cd 5511 <return>
        mkdir txtfiles <return>
    Open an Internet browser and go to http://www.d.umn.edu/~tang0333/stat5511.html
    Click on the link "Data files from the textbook".
    Download files.zip (save it to the desktop; it's easier this way).
    Log in to My files (this is in the bottom toolbar on your computer; there is an icon "Connect to My files directory").
    Drag files.zip into My files.
    Double-click files.zip to unzip it.
    Go back to Terminal.
    Make sure you are in the 5511 directory, then type (there is a space between * and txtfiles):
        mv ~/windir/files/data* txtfiles

These will be your txtfiles. To use them, you will need an infile statement as follows:

    data temp;
    infile '~/5511/txtfiles/data-table-B01.txt';  /* here, B01 is for table B.1 of the textbook */
    input var1 var2 var3 …;  /* var1, var2, … are column names, so you can also use y, x1, x2, x3, … */

In the input line, if a variable is alphanumeric (i.e. it is a character variable and not a number), you need to put a dollar sign ( $ ) right after that variable's name. This tells SAS to read the variable as alphanumeric instead of numeric. If you don't do this, your program will have an error.

If you don't like this method, you can also import each table individually, or you can always type the data in by hand.
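As a minimal sketch of the dollar-sign rule for the input statement: the data set name (demo), the variable names (region, y, x1) and the three data rows below are all made up for illustration and are not taken from the textbook tables. The datalines statement stands in for infile only so that the example runs without any files.

```sas
/* Hypothetical example: demo, region, y and x1 are made-up names, */
/* and the rows after datalines are invented values.               */
data demo;
  input region $ y x1;   /* "$" right after region reads it as character */
  datalines;
north 10 2.5
south 11 3.1
east 7 1.8
;
run;

proc print data=demo;    /* region prints as text, y and x1 as numbers */
run;
```

To read one of the downloaded tables instead, the datalines block would be replaced by an infile statement pointing at the file, with one name in the input line for each column of that table.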
(Personally, I think the infile method described in part 1 is the easiest.)

2) To get the SAS windows, type in Terminal:

    sas -dms

You may want to set options for the editing window. At the top of the Program Editor window, choose Tools/Options/Program Editor.

On the General tab, select "Prompt to save on window close" so that its square is darkened. If the square is already darkened, leave it darkened. This reminds you to save your commands in a .sas file.

On the Editing tab, choosing "Split lines on a carriage return" starts a new line of commands when you press the Enter key. Next, make sure "Clear text on submit" is unchecked. This keeps the commands in the editor window after a submit, which makes it easier to change them after they are run.

In Tools/Options/Preferences/Editing, choose Insert for the Cursor option if you don't want text overwritten. Leave "Automatically store selection" unchecked so that text stays highlighted. To copy and paste text, you need to use Edit/Copy and Edit/Paste.

In the editor window, type in the following commands:

    options ls=80;
    data one;                 /* create a SAS data set called "one" */
    infile 'weightandbp.dat';
    input weight bp;          /* weight and bp are the column names of 'weightandbp.dat' */
    proc univariate;
    var weight;               /* look at the descriptive statistics for the variable "weight" */
    proc print data=one;      /* print data set "one" */
    proc reg data=one;        /* do a regression with data set "one" */
    model bp=weight;
    proc plot data=one;       /* plot "one"; bp is the y value and weight is the x value */
    plot bp*weight;
    run;

The proc univariate command lets you look at several descriptive statistics for any given variable. The above is the program from Lab 1 with the proc univariate command inserted. The univariate output will be displayed first (because this is the first procedure we typed into our SAS file).

3) The run command executes the previous statements.
Run statements allow one to create different titles for different sections of the program. Run statements should also be placed at the end of each data step.

In options ls=80; the ls stands for "Line Size", and ls=80 sets your output to 80 characters per line. The default line size (i.e. if you leave options ls=80; out of your program) is 132 characters. SAS also centers the output, so it is generally a good thing to limit line size.
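A minimal sketch of how run statements can separate sections so that each one carries its own title. The title text is made up for illustration; the data set and variables are the ones from the program in part 2, so 'weightandbp.dat' is assumed to be in the current directory.

```sas
options ls=80;               /* limit output to 80 characters per line */

data one;
  infile 'weightandbp.dat';  /* assumed to exist, as in part 2 */
  input weight bp;
run;                         /* ends the data step */

title 'Descriptive statistics for weight';  /* hypothetical title text */
proc univariate data=one;
  var weight;
run;                         /* proc univariate runs under the first title */

title 'Regression of bp on weight';         /* hypothetical title text */
proc reg data=one;
  model bp=weight;
run;                         /* proc reg runs under the second title */
quit;                        /* proc reg stays active until quit or the next step */
```

Without the run after proc univariate, SAS would most likely read the second title statement before that step has executed, so the univariate output would appear under the regression title.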