Statistics 479
Assignment #5 Answer Key
Fall 2013

Problem #1

a) SAS Program

    libname mylib "U:\Documents\Stat479\";
    title "Scatterplot of Fuel Consumption vs Miles of Highways";
    proc sgplot data=mylib.fueldat;
       /* label each point with the state abbreviation */
       scatter x=Roads y=Fuel / datalabel=St
               markerattrs=(color=darkcyan size=5px symbol=circlefilled);
    run;

Output

b) SAS Program

    libname mylib "U:\Documents\Stat479\";
    title "Histogram of Income (in thousands of dollars)";
    proc sgplot data=mylib.fueldat;
       histogram Income / binstart=3 binwidth=.5 scale=count
                 fillattrs=(color=lightsalmon);
       /* overlay a fitted normal density curve */
       density Income / type=normal;
    run;

Output

c) SAS Program

    libname mylib "U:\Documents\Stat479\";
    title "Regression Fit of Fuel on Income (in thousands of dollars)";
    proc sgplot data=mylib.fueldat;
       /* CLM: confidence limits for the mean; CLI: prediction limits */
       reg x=Income y=Fuel / CLM CLI;
    run;

Output

d) SAS Program

    libname mylib "U:\Documents\Stat479\";
    title "Scatter Plot Matrix of Variables Related to Fuel Use";
    proc sgscatter data=mylib.fueldat;
       matrix Fuel Roads Income Numlic / group=TaxGrp;
    run;

Output

e) SAS Program

    libname mylib "U:\Documents\Stat479\";
    title "Dot Plot of Miles of Roads by State";
    proc sgplot data=mylib.fueldat;
       /* order the states by ascending response value */
       dot State / response=Roads categoryorder=respasc;
    run;

Output

Problem #2

SAS Program

    data as5;
       infile "U:\Documents\Classwork\stat479\F13\iron.txt";
       array feed{*} Fe3High Fe3Med Fe3Low Fe2High Fe2Med Fe2Low;
       input Fe3High Fe3Med Fe3Low Fe2High Fe2Med Fe2Low;
       /* write one observation per feed type (1-6, in the array order above) */
       do i = 1 to 6;
          FeedType = i;
          Iron = feed{i};
          output;
       end;
    run;

    title "Box Plots of % Iron Retention by Feed Type";
    proc sgplot data=as5;
       vbox Iron / category=FeedType datalabel;
       label Iron = "% Iron Retention";
    run;

Output

Discussion

a) The six boxplots indicate that the iron retention distributions for mice differ in location, spread and shape. In general, all appear to be right-skewed, with long right tails and/or outside values in the right tail. The mean lies to the right of the median for all six distributions.
b) The median iron retention (and likewise the mean) increases as the dosage level decreases within each type of iron ingested (Fe2+, Fe3+). At each dosage level the median (and the mean) also increases from Fe3+ to Fe2+, although this difference does not appear to be significant.

c) Most of the distributions appear to be skewed to varying degrees and may not satisfy the normality assumption. The spreads (as measured by the IQR) also differ across the six distributions, which indicates non-homogeneous variance. The IQR appears to increase consistently with the median level of iron retention, suggesting that the variance may be a function of the mean. A transformation (e.g., square root or logarithmic) may be necessary to normalize the data before using standard analysis of variance methods.

Problem #3

SAS Program

    libname mylib "U:\Documents\Stat479\";
    data fuelnew;
       set mylib.fueldat;
       /* group states by percent of licensed drivers */
       if Percent <= 54 then LicGrp = 1;
       else if 54 < Percent <= 58 then LicGrp = 2;
       else LicGrp = 3;
       label LicGrp = "No. of Drivers";
    run;

    proc format;
       value ing 1 = 'Low Income'
                 2 = 'Middle Income'
                 3 = 'High Income';
       value lg  1 = 'below 54%'
                 2 = '54 to 58%'
                 3 = 'above 58%';
    run;

    title "Horizontal Barchart of Fuel Use by % of Licensed Drivers";
    proc sgplot data=fuelnew;
       hbar LicGrp / response=Fuel stat=mean group=IncomGrp;
       keylegend / title='Per capita Income' location=inside position=topright;
       yaxis offsetmin=.2;
       format IncomGrp ing. LicGrp lg.;
    run;

Output

Problem #4

SAS Program

    libname mylib "U:\Documents\Stat479\";
    data fuelnew;
       set mylib.fueldat;
       /* group states by percent of licensed drivers */
       if Percent <= 54 then LicGrp = 1;
       else if 54 < Percent <= 58 then LicGrp = 2;
       else LicGrp = 3;
       label LicGrp = "No. of Drivers";
    run;

    proc format;
       value ing 1 = 'Low Income'
                 2 = 'Middle Income'
                 3 = 'High Income';
       value lg  1 = 'below 54%'
                 2 = '54 to 58%'
                 3 = 'above 58%';
    run;

    title "Vertical Barcharts of Miles of Roads by % of Licensed Drivers";
    proc sgpanel data=fuelnew;
       /* one panel per income group, arranged in a single row */
       panelby IncomGrp / rows=1;
       vbar LicGrp / response=Roads stat=mean;
       format IncomGrp ing. LicGrp lg.;
    run;

Output
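
Supplementary sketch for Problem #2, Discussion c)

The discussion for Problem #2 suggests that a logarithmic (or square root) transformation may be needed before an analysis of variance. The following is a minimal sketch of how one might check this, not part of the original answer. It assumes the data set as5 from Problem #2 (variables FeedType and Iron) already exists in the WORK library; the names as5log and LogIron are introduced here only for illustration.

```sas
/* Sketch only: as5log and LogIron are hypothetical names, not from the key. */
data as5log;
   set as5;
   /* log() is undefined for non-positive values; those stay missing */
   if Iron > 0 then LogIron = log(Iron);
   label LogIron = "log(% Iron Retention)";
run;

/* Redraw the box plots on the log scale to inspect skewness and spread */
title "Box Plots of log(% Iron Retention) by Feed Type";
proc sgplot data=as5log;
   vbox LogIron / category=FeedType datalabel;
run;

/* Compare spread (std. dev. and IQR) by group on both scales */
title "Spread of Iron Retention by Feed Type, Raw and Log Scale";
proc means data=as5log n mean median std qrange maxdec=3;
   class FeedType;
   var Iron LogIron;
run;
```

If the transformation is helping, the interquartile ranges (QRANGE) of LogIron should be more nearly equal across the six feed types than those of Iron, and the box plots should look more symmetric. Replacing log(Iron) with sqrt(Iron) gives the square root alternative.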