Statistics 479
Assignment #3
Answer Key
Fall 2013
1.
a)
State  Nrooms  Income  Nfamily  Id    Type  Age  Sex  Educ  _N_  _ERROR_
CA     3       .       .        0020  h     .         .     1    0
b)
State  Nrooms  Income  Nfamily  Id    Type  Age  Sex  Educ  _N_  _ERROR_
CA     3       42000   3        0020  f     .         .     2    0
c)
State  Nrooms  Income  Nfamily  Id    Type  Age  Sex  Educ  _N_  _ERROR_
CA     3       42000   3        0020  p     34   M    3     3    0
d)
State  Nrooms  Income  Nfamily  Age  Sex  Educ
CA     3       42000   3        34   M    3
e) If the retain statement is omitted, the variables State, Nrooms,
Income, and Nfamily are reset to missing values at the start of every
iteration of the data step, that is, each time SAS begins processing a
new line of data. As a result, when an observation is written to the
data set, no values are output for those variables: the values read
into them in previous iterations of the data step have already been
reset to missing by the time the observation is written.
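The role of retain in parts a) to e) can be illustrated with a minimal sketch. The assignment's actual program and raw data layout are not reproduced in this key, so the space-delimited layout, the data set name hier, the reading of the type codes (h = household, f = family, p = person), and the three data lines below are assumptions chosen only to mimic the first household; the variable names are those shown in a) to d).

```sas
/* Hypothetical layout: Id and Type start every line; the remaining
   fields depend on the record type (assumed: h = household,
   f = family, p = person). */
data hier;
   retain State Nrooms Income Nfamily;  /* keep h and f values across iterations */
   input Id $ Type $ @;                 /* trailing @ holds the line for a second input */
   if Type = 'h' then input State $ Nrooms;
   else if Type = 'f' then input Income Nfamily;
   else if Type = 'p' then do;
      input Age Sex $ Educ;
      output;                           /* one observation per person line */
   end;
   drop Id Type;
   datalines;
0020 h CA 3
0020 f 42000 3
0020 p 34 M 3
;
run;

proc print data=hier;
run;
```

Removing the retain statement from this sketch should leave State, Nrooms, Income, and Nfamily missing in the single output observation, which is the behavior described in e).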
f)
The SAS System                          18:40 Monday, October 5, 2009

Obs  State  Nrooms  Income  Nfamily  Age  Sex  Educ
1    CA     3       42000   3        34   M    3
2    CA     3       42000   3        31   F    2
3    CA     3       42000   3         9   F    1
4    CA     3       52000   2        34   F    4
5    CA     3       52000   2        31   F    2
6    NB     3       38000   4        35   M    4
7    NB     3       38000   4        30   F    3
8    NB     3       38000   4        11   F    1
9    NB     3       38000   4         5   M    1

Obs  State  Nrooms  _TYPE_  _FREQ_  Age      Educ     s_Age    s_Educ
1           .       0       9       24.4444  2.33333  12.2893  1.22474
2           3       1       9       24.4444  2.33333  12.2893  1.22474
3    CA     .       2       5       27.8000  2.40000  10.6160  1.14018
4    NB     .       2       4       20.2500  2.25000  14.5000  1.50000
5    CA     3       3       5       27.8000  2.40000  10.6160  1.14018
6    NB     3       3       4       20.2500  2.25000  14.5000  1.50000
g)
Means and standard deviations are computed for the variables Age and Educ
for groups of observations formed by combinations of values of State
and Nrooms, as described below:
Obs=1 : for all 9 persons in the 3 families in the 2 states (_TYPE_=0)
Obs=2 : for the 9 persons in houses with 3 rooms in the 2 states (_TYPE_=1)
Obs=3 : for the 5 persons in CA (_TYPE_=2)
Obs=4 : for the 4 persons in NB (_TYPE_=2)
Obs=5 : for the 5 persons in CA living in houses with 3 rooms (_TYPE_=3)
Obs=6 : for the 4 persons in NB living in houses with 3 rooms (_TYPE_=3)
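Summary rows of this kind are what proc means produces when a class statement is combined with an output statement. The key does not show the program used, so the following is only a sketch: the data set names persons and stats are hypothetical, and the nine observations are copied from the first listing in f).

```sas
data persons;
   input State $ Nrooms Income Nfamily Age Sex $ Educ;
   datalines;
CA 3 42000 3 34 M 3
CA 3 42000 3 31 F 2
CA 3 42000 3  9 F 1
CA 3 52000 2 34 F 4
CA 3 52000 2 31 F 2
NB 3 38000 4 35 M 4
NB 3 38000 4 30 F 3
NB 3 38000 4 11 F 1
NB 3 38000 4  5 M 1
;
run;

/* With class State Nrooms, _TYPE_ = 0 is the overall summary,
   1 groups by Nrooms only, 2 by State only, and 3 by State and
   Nrooms together. */
proc means data=persons noprint;
   class State Nrooms;
   var Age Educ;
   output out=stats mean=Age Educ std=s_Age s_Educ;
run;

proc print data=stats;
run;
```

Because every house in this small data set has 3 rooms, the _TYPE_=1 row repeats the overall statistics and the _TYPE_=3 rows repeat the per-state statistics.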
2.
a) The SAS program is as given in b6.sas, with the following libname,
infile, and ods statements in the appropriate lines:
libname mylib "U:\Documents\Stat479\";
infile "U:\Documents\Stat479\fuel.txt";
ods rtf file="U:\Documents\Stat479\prob2a_output.rtf";
ods rtf close;
See attached output from proc print.
b)
libname mylib "U:\Documents\Stat479\";
proc means data=mylib.fueldat noprint;
class Incomgrp TaxGrp;
var Income Fuel Numlic;
ways 2;
output out=stats_1 mean=Av_Inc Av_Fuel Av_Lic
stderr=SD_Inc SD_Fuel SD_Lic;
run;
ods rtf file="U:\Documents\Stat479\prob2b_output.rtf";
proc print data=stats_1 label;
title 'Statistics from Proc Means';
run;
ods rtf close;
c)
libname mylib "U:\Documents\Stat479\";
ods rtf file="U:\Documents\Stat479\prob2c_output.rtf";
ods select BasicIntervals TestsForLocation TestsForNormality;
proc univariate data=mylib.fueldat cibasic normal mu0=4 50 5;
var Income Percent Roads;
id State;
title 'Use of Proc Univariate to Examine Distributions:1';
run;
ods rtf close;
Variable Income:
Basic Confidence Limits Assuming Normality
Parameter      Estimate   95% Confidence Limits
Mean           4.24183    4.07527    4.40840
Std Deviation  0.57362    0.47752    0.71851
Variance       0.32904    0.22803    0.51626

Tests for Location: Mu0=4
Test          Statistic        p Value
Student's t   t   2.920853     Pr > |t|     0.0053
Sign          M   7            Pr >= |M|    0.0595
Signed Rank   S   248.5        Pr >= |S|    0.0093
The p-value given (.0053) is for the two-sided test as shown above.
Since the p-value is smaller than .05 we reject the null hypothesis
at alpha=0.05.
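As a check on the Student's t statistic reported above, the usual one-sample formula can be evaluated by hand. The sample size is not stated in this key; n = 48 below is an assumption (it is consistent with the reported 95% confidence limits), while the mean and standard deviation are taken from the Basic Confidence Limits output.

```latex
t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
  = \frac{4.24183 - 4}{0.57362/\sqrt{48}}
  \approx \frac{0.24183}{0.08280}
  \approx 2.92
```

This agrees with the printed value 2.920853 to the precision carried here.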
Tests for Normality
Test                 Statistic           p Value
Shapiro-Wilk         W      0.975229     Pr < W      0.3988
Kolmogorov-Smirnov   D      0.080296     Pr > D      >0.1500
Cramer-von Mises     W-Sq   0.058391     Pr > W-Sq   >0.2500
Anderson-Darling     A-Sq   0.375093     Pr > A-Sq   >0.2500
The Shapiro-Wilk test results in a large p-value (0.3988); thus the null
hypothesis that the distribution is normal is not rejected.
Variable Percent:
Basic Confidence Limits Assuming Normality
Parameter      Estimate   95% Confidence Limits
Mean           57.02844   55.41821   58.63867
Std Deviation   5.54545    4.61641    6.94612
Variance       30.75196   21.31124   48.24853

Tests for Location: Mu0=50
Test          Statistic        p Value
Student's t   t   8.780982     Pr > |t|     <.0001
Sign          M   21           Pr >= |M|    <.0001
Signed Rank   S   564          Pr >= |S|    <.0001
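We need to calculate the p-value for the right-tailed test. Since the
t statistic is positive, the right-tailed p-value is half the two-sided
p-value, i.e. less than .0001/2 = .00005. This is smaller than .05, so
we reject the null hypothesis at alpha=0.05.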
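The halving step relies on the symmetry of the t distribution. Writing p_2 for the two-sided p-value reported by proc univariate and t for the observed statistic (notation introduced here for illustration, not taken from the assignment), the conversion can be sketched as:

```latex
p_{\text{right}} =
\begin{cases}
p_2/2,     & t > 0 \\
1 - p_2/2, & t \le 0
\end{cases}
\qquad
p_{\text{left}} = 1 - p_{\text{right}}
```

Here t = 8.780982 > 0 and p_2 < .0001, so the right-tailed p-value is below .00005.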
Tests for Normality
Test                 Statistic           p Value
Shapiro-Wilk         W      0.961084     Pr < W      0.1117
Kolmogorov-Smirnov   D      0.118653     Pr > D      0.0892
Cramer-von Mises     W-Sq   0.122071     Pr > W-Sq   0.0560
Anderson-Darling     A-Sq   0.745349     Pr > A-Sq   0.0488
The Shapiro-Wilk test results in a p-value of 0.1117, which is larger
than .05; thus the null hypothesis that the distribution is normal is
not rejected.
Variable Roads:
Basic Confidence Limits Assuming Normality
Parameter      Estimate   95% Confidence Limits
Mean            5.56125    4.54681    6.57569
Std Deviation   3.49360    2.90832    4.37602
Variance       12.20527    8.45831   19.14955

Tests for Location: Mu0=5
Test          Statistic        p Value
Student's t   t   1.113021     Pr > |t|     0.2714
Sign          M   -1           Pr >= |M|    0.8854
Signed Rank   S   60           Pr >= |S|    0.5439
We need to calculate the p-value for the left-tailed test. Because the
t statistic is positive (1.113021), the left-tailed p-value is
1 - 0.2714/2 = 0.8643, which is larger than .05, so we fail to reject
the null hypothesis at alpha=0.05.
Tests for Normality
Test                 Statistic           p Value
Shapiro-Wilk         W      0.924902     Pr < W      0.0044
Kolmogorov-Smirnov   D      0.11309      Pr > D      0.1258
Cramer-von Mises     W-Sq   0.10228      Pr > W-Sq   0.1025
Anderson-Darling     A-Sq   0.744132     Pr > A-Sq   0.0491
The Shapiro-Wilk test results in a very small p-value (0.0044); thus the
null hypothesis that the distribution is normal is rejected.