Stat 407 Lab 3 (Plotting Multivariate Data using S-Plus) Fall 2001

This lab is an introduction to using graphics in S-Plus, with data collected on crabs from Australia. The data is taken from “Modern Applied Statistics with S-Plus (3rd ed)” by Venables and Ripley. (Chapters 3, 4 and 5 of the S-Plus user's manual give detailed information on generating the plots for this lab.)

The data contains measurements on 2 species of crabs (blue: sp = 1, orange: sp = 2), with males (sex = 1) and females (sex = 2) from both species. The variables in the data set are:

Index = Obs number within group
Sp = Species
Sex = Gender
FL = Frontal Lobe
RW = Rear Width
CL = Carapace Length
CW = Carapace Width
BD = Body Depth

1. You should already have a copy of this data on your zip disk. Locate it and load it into S-Plus.

2. Generate a scatterplot matrix of the 5 physical measurements. To do this, highlight the last 5 columns in the spreadsheet. Open the Graph menu, select 2D plot, select Matrix, click OK. Describe the relationships between the 5 variables. (Note also that double left-clicking on the plot brings up some plot options so that you can edit colors, backgrounds, ... The top panel also has a selection of plot editing tools.)

3. Generate a coplot (using the trellis plot display) of RW against CL, with Sex as the conditioning variable and a smoothed loess curve overlaid. Make sure no column of data is highlighted in the Data view. In the Graph menu, select 2D plot, Linear, Smoothing - Loess, OK. Select the X and Y variables accordingly. Now here is the tricky conditioning part. Make sure both the Data window and the Plot window are visible. Highlight the Sex column in the Data window. Click half-way down the column and drag it to the Plot window. A dashed rectangle will be highlighted in the Plot window; release the mouse button when the cursor is inside this box. What can you see from this plot? (Note that sex = 1 is male, sex = 2 is female.)

4. Generate a scatterplot of RW against CL, and use symbol or color to represent the two sexes. Make sure that no column is highlighted in the Data view. In the Graph menu, select 2D plot, Linear, Scatterplot, OK. Select CL and RW as the x and y variables respectively, then select Sex as the w variable. In the Symbol/Color panel, set the color of the symbol to be determined by the w variable. What do you notice from this plot? Is it easier to see differences between the sexes from this plot or from the coplot (previous question)?

5. Using graphics, find which of the physical measurements are important for separating the two species. (You can also use the summary statistics that you calculated in the last lab.)

6. Explain the differences in the joint distribution between the 4 groups (2 species, 2 sexes). Are there some values of the variables for which the 4 groups would be difficult to separate?
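
The menu steps in questions 2 to 4 can also be approximated from the command line. The following is only a sketch, not the lab's required method, and it rests on some assumptions: it uses the copy of the crabs data distributed with the MASS library (the Venables and Ripley library), where the species and sex columns are named sp and sex in lower case and are coded with letters rather than the 1/2 coding described above. It is written so that it also runs in R, which is why the lattice library is loaded; in S-Plus the Trellis functions are built in and that line should not be needed. The names measures and sex.code are made up for this illustration.

```r
# Assumption: crabs as shipped with the MASS library
# (columns sp, sex, index, FL, RW, CL, CW, BD).
# The lab's own file may use different names and codes Sp and Sex as 1/2.
library(MASS)
library(lattice)   # Trellis graphics in R; built in to S-Plus

# The five physical measurements (hypothetical helper name)
measures <- c("FL", "RW", "CL", "CW", "BD")

# Question 2: scatterplot matrix of the five measurements
pairs(crabs[, measures])

# Question 3: RW against CL, one panel per sex,
# with a loess curve drawn over the points in each panel
print(xyplot(RW ~ CL | sex, data = crabs,
             panel = function(x, y) {
               panel.xyplot(x, y)
               panel.loess(x, y)
             }))

# Question 4: a single scatterplot, with color and symbol set by sex
sex.code <- as.numeric(factor(crabs$sex))   # recode the two sexes as 1 and 2
plot(crabs$CL, crabs$RW, xlab = "CL", ylab = "RW",
     col = sex.code, pch = sex.code)
```

Replacing sex with sp in the last two plots gives a starting point for question 5, where the species rather than the sexes are to be separated.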