Chu Hai College - City University of Hong Kong

MA3518: Applied Statistics
Page 1
Department of Mathematics
Faculty of Science and Engineering
City University of Hong Kong
MA 3518: Applied Statistics
Tutorial 3 (Suggested Solutions)
Question 1:
(a) Let μ denote the average saving rate
The null hypothesis is H0: μ = 0.2 and the alternative hypothesis is H1: μ < 0.2
This is a one-sided test
(b) Since the sample size N = 100 is greater than 30, the sampling distribution of
the sample average X̄ can be accurately approximated, by the Central Limit Theorem,
by a normal distribution with unknown mean μ and unknown variance σ²/N, where
σ² is the unknown population variance.
An unbiased estimate of σ² is the sample variance s²
Test statistic under H0:
T = (X̄ - 0.2) / (s / √N) ~ t(N - 1)
where t(N - 1) denotes a Student's t-distribution with N - 1 degrees of freedom
Since N is large, t(N - 1) is close to a standard normal distribution
Hence, T follows a standard normal distribution approximately
The observed value Tobs of T is given by:
Tobs = (0.16 - 0.2) / (0.08 / √100) = -5
The p-value of the test is given by:
P(T < Tobs | H0) = P(T < -5 | H0) = Φ(-5) ≈ 0
where Φ denotes the standard normal distribution function
The p-value can be interpreted as the probability that the test statistic is less than its
observed value when H0 is true
(c) Since the p-value is less than 0.05, we reject H0 and conclude that there is enough
evidence to refute the economist’s claim at 5% significance level
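The arithmetic in parts (b) and (c) can be checked numerically. The following is a minimal Python sketch and not part of the original solution: the helper norm_cdf and the variable names are illustrative, and the inputs are the figures used in the calculation above (sample mean 0.16, sample standard deviation 0.08, N = 100).

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with erfc so small tail values keep precision."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Figures used in part (b); the names are illustrative.
x_bar, mu0, s, n = 0.16, 0.20, 0.08, 100

t_obs = (x_bar - mu0) / (s / math.sqrt(n))  # observed test statistic
p_value = norm_cdf(t_obs)                   # lower-tail p-value P(T < t_obs | H0)
reject_h0 = p_value < 0.05                  # decision at the 5% significance level

print(f"T_obs = {t_obs:.2f}, p-value = {p_value:.2e}, reject H0: {reject_h0}")
```

The normal approximation is used here, as in the solution; with 99 degrees of freedom the exact t-distribution would lead to the same decision.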
Question 2:
(a) Let p denote the unknown proportion of first time buyers in the whole population of
home buyers over the past three years
The null hypothesis is H0: p = 0.4 and the alternative hypothesis is H1: p ≠ 0.4
This is a two-sided test
(b) Let X denote the number of first time buyers in the sample and let pe = X/500 denote
the sample proportion, an estimator for the unknown population proportion p
Clearly, X ~ Bin (500, p)
Then, E( pe) = E(X/500) = p and Var( pe) = [p(1-p)] / 500
Since the sample size is greater than 30, the sampling distribution of pe can be
accurately approximated by a normal distribution with unknown mean p and
unknown variance [p(1-p)] / 500
Test statistic under H0:
Z = (pe - 0.4) / [0.4(1 - 0.4) / 500]^(1/2) ~ N(0, 1) approximately
The observed value Zobs of Z is given by:
Zobs = (100/500 - 0.4) / [0.4(1 - 0.4) / 500]^(1/2) = -9.1287
For the two-sided test, the p-value is given by:
P(|Z| > |Zobs| | H0) = 2 P(Z > |Zobs| | H0) = 2 P(Z > 9.1287 | H0) ≈ 0
Since the p-value is less than 0.05, we reject H0 and conclude that the percentage of
home sales to first time buyers has changed from what it was three years ago
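The two-sided proportion test in part (b) can be reproduced with a short Python sketch. This is an illustrative check and not part of the original solution: the helper norm_cdf and the variable names are assumptions, while the counts (100 first time buyers out of 500) and the hypothesised proportion 0.4 come from the calculation above.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with erfc so small tail values keep precision."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

x, n, p0 = 100, 500, 0.4                  # figures used in part (b)

p_hat = x / n                             # sample proportion
se_h0 = math.sqrt(p0 * (1 - p0) / n)      # standard error computed under H0
z_obs = (p_hat - p0) / se_h0              # observed test statistic
p_value = 2 * norm_cdf(-abs(z_obs))       # two-sided p-value
reject_h0 = p_value < 0.05                # decision at the 5% significance level

print(f"Z_obs = {z_obs:.4f}, p-value = {p_value:.2e}, reject H0: {reject_h0}")
```

The lower tail is doubled rather than computing 1 minus the upper CDF, because the latter rounds to exactly zero in floating point for a statistic this extreme.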
Question 3:
(a) Let σA and σB denote the daily volatility for Stock A and Stock B, respectively
The null hypothesis is H0: σA = σB and the alternative hypothesis is H1: σA > σB
This is a one-sided test
(b) Test statistic under H0:
F = s1² / s2² ~ F(287, 287)
where s1² and s2² are the sample variances for Stock A and Stock B, respectively, and
F(287, 287) is an F-distribution with degrees of freedom 287 and 287
The observed value Fobs for F is given by:
Fobs = (0.5882)² / (0.3256)² = 3.2635
The p-value is given by:
P(F > Fobs | H0) = P(F > 3.2635 | H0) ≈ 0
Since the p-value is less than 0.05, we reject the null hypothesis at the 5% significance
level and conclude that there is not enough evidence to refute the claim made by the
investment advisor
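The variance ratio and its upper-tail p-value can be checked without a statistics package. The sketch below is illustrative and not part of the original solution: f_upper_tail is a hypothetical helper that evaluates P(F > f) through the regularised incomplete beta function using Simpson's rule, which is adequate here only because both degrees of freedom are large. The sample volatilities and degrees of freedom are those used above.

```python
import math

def f_upper_tail(f_obs, d1, d2, steps=20000):
    """P(F > f_obs) for F ~ F(d1, d2), via I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f_obs).

    Simpson's rule on [0, x]; assumes d2 > 2 so the integrand is 0 at t = 0,
    and an even number of steps.
    """
    a, b = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f_obs)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

s_a, s_b = 0.5882, 0.3256                   # sample volatilities used in part (b)
f_obs = s_a ** 2 / s_b ** 2                 # observed variance ratio
p_value = f_upper_tail(f_obs, 287, 287)     # one-sided p-value P(F > f_obs | H0)
p_at_one = f_upper_tail(1.0, 287, 287)      # sanity check: equal df gives median 1

print(f"F_obs = {f_obs:.4f}, p-value = {p_value:.2e}, P(F > 1) = {p_at_one:.4f}")
```

The check at F = 1 uses the fact that an F-distribution with equal degrees of freedom has median 1, so the upper tail there should be one half.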
Question 4:
(a) The data were obtained from Yahoo Finance. They are available on the course
website under the name ‘T3Q4.csv’
We use the ‘Import Data’ option to import the data directly to the SAS Work Library
and create a dataset with name ‘T3Q4’
The SAS procedure is shown as follows:
PROC UNIVARIATE Data = T3Q4;
RUN;
The SAS Output is given by:
The SAS System                         15:55 Wednesday, February 11, 2004   1

The UNIVARIATE Procedure
Variable: Close

Moments
N                       3276       Sum Weights              3276
Mean                784.4712       Sum Observations   2569927.65
Std Deviation     375.291338       Variance           140843.588
Skewness          0.39966166       Kurtosis           -1.3135314
Uncorrected SS    2477296978       Corrected SS        461262751
Coeff Variation   47.8400402       Std Error Mean     6.55687031

Basic Statistical Measures
Location                     Variability
Mean      784.4712           Std Deviation          375.29134
Median    667.4800           Variance                  140844
Mode      375.3500           Range                       1232
                             Interquartile Range    676.66000
NOTE: The mode displayed is the smallest of 2 modes with a count of 3.

Tests for Location: Mu0=0
Test            -Statistic-        -----p Value-----
Student's t     t   119.6411       Pr > |t|    <.0001
Sign            M       1638       Pr >= |M|   <.0001
Signed Rank     S    2683863       Pr >= |S|   <.0001

Quantiles (Definition 5)
Quantile        Estimate
100% Max         1527.46
99%              1493.74
95%              1419.89
90%              1347.35
75% Q3           1122.71
50% Median        667.48
25% Q1            446.05
10%               377.75
5%                340.08
1%                312.49
0% Min            295.46
The SAS System                         15:55 Wednesday, February 11, 2004   2

The UNIVARIATE Procedure
Variable: Close

Extreme Observations
-----Lowest-----        -----Highest-----
Value       Obs         Value        Obs
295.46     3108         1517.68      609
298.76     3104         1520.77      608
298.92     3105         1523.86      719
300.03     3107         1527.35      721
300.40     3109         1527.46      720
(b) From the SAS output, the skewness is 0.39966166. Hence, the distribution of the data
is slightly positively skewed
(c) From the SAS output, the excess kurtosis is -1.3135314. Hence, the distribution of the
data has a lighter tail than a normal distribution.
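To make the two summary measures in parts (b) and (c) concrete, the sketch below computes a bias-adjusted sample skewness and excess kurtosis in Python. It is illustrative only: the formulas are the usual adjusted estimators, assumed here to correspond to what PROC UNIVARIATE reports by default, and the two small data sets are hypothetical, not the T3Q4 data.

```python
import math

def _mean_and_sd(xs):
    """Sample mean and sample standard deviation (divisor n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return mean, sd

def sample_skewness(xs):
    """Bias-adjusted skewness; positive values indicate a longer right tail."""
    n = len(xs)
    mean, sd = _mean_and_sd(xs)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in xs)

def excess_kurtosis(xs):
    """Bias-adjusted excess kurtosis; negative values indicate lighter tails than normal."""
    n = len(xs)
    mean, sd = _mean_and_sd(xs)
    fourth = sum(((x - mean) / sd) ** 4 for x in xs)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

symmetric = [1, 2, 3, 4, 5]        # hypothetical, evenly spread data
right_skewed = [1, 1, 1, 2, 10]    # hypothetical data with one large value

skew_sym = sample_skewness(symmetric)
kurt_sym = excess_kurtosis(symmetric)
skew_right = sample_skewness(right_skewed)

print(skew_sym, kurt_sym, skew_right)
```

The evenly spread data give zero skewness and negative excess kurtosis, the same qualitative pattern of flat, light-tailed data as the kurtosis reading in part (c).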
Question 5:
(a) The data were obtained from Yahoo Finance. They are available on the course
website under the name ‘T3Q5.csv’
We use the ‘Import Data’ option to import the data directly to the SAS Work Library
and create a dataset with name ‘T3Q5’
The SAS procedure is shown as follows:
PROC MEANS Data = T3Q5 alpha = 0.05 CLM;
RUN;
From the SAS output, a 95% confidence interval for the mean of the daily close
values for NASDAQ is (2181.08, 2270.02)
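A confidence interval for a mean has the form "estimate ± critical value × standard error", so the reported interval can be decomposed into its centre and margin of error. The Python sketch below is illustrative: PROC MEANS uses a t quantile, but since the sample size of T3Q5 is not stated here, the standard normal quantile is used as a large-sample assumption, and the implied standard error is therefore only approximate.

```python
from statistics import NormalDist

lower, upper = 2181.08, 2270.02       # 95% interval reported above

center = (lower + upper) / 2          # sample mean implied by the interval
half_width = (upper - lower) / 2      # margin of error
z = NormalDist().inv_cdf(0.975)       # normal quantile, a large-sample assumption
implied_se = half_width / z           # approximate standard error of the mean

print(f"mean = {center:.2f}, margin = {half_width:.2f}, approx. SE = {implied_se:.2f}")
```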
(b) The SAS procedure is shown as follows:
PROC UNIVARIATE Data = T3Q5;
RUN;
The part of SAS output for the t test is shown as follows:
Tests for Location: Mu0=0
Test            -Statistic-        -----p Value-----
Student's t     t   119.6411       Pr > |t|    <.0001
Sign            M       1638       Pr >= |M|   <.0001
Signed Rank     S    2683863       Pr >= |S|   <.0001
From the SAS output, the p-value for the two-sided t test is less than 0.0001. Hence, we
reject the null hypothesis that the mean is zero. Since the t statistic is positive, we
conclude that the mean of the daily close values for NASDAQ is greater than zero at the
5% significance level
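The Student's t statistic for location is the sample mean divided by its standard error. As an illustrative check in Python (not part of the original solution), the figures printed in the Moments table of Question 4 (mean 784.4712, standard deviation 375.291338, N = 3276) reproduce the value 119.6411 shown in the output above; the variable names are assumptions.

```python
import math

# Figures copied from the Moments table in Question 4.
mean, sd, n = 784.4712, 375.291338, 3276
mu0 = 0.0                              # hypothesised mean (Mu0=0 in the output)

se = sd / math.sqrt(n)                 # standard error of the mean
t_stat = (mean - mu0) / se             # Student's t statistic with n - 1 df

print(f"Std Error Mean = {se:.6f}, t = {t_stat:.4f}")
```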
(c) The result in part (b) is only approximate, since we do not know whether the data
come from a normal distribution
Question 6:
(a) The SAS procedure is shown as follows:
Data Prices;
/* Read repeated (Company, Close) pairs; the trailing @@ holds the input line */
INPUT Company $ Close @@;
Datalines;
A 21.1 A 28.3 A 17.1 A 16.6 A 28.5 A 25.1 A 13.5 A 30.2 A 20.8 A 16.6 A 12.2
B 20.2 B 36.6 B 29.8 B 28.8 B 38.8 B 36.8 B 38.8 B 37.8 B 35.8 B 38.2 B 28.7
;
RUN;
(b) The SAS procedure to perform the t test is shown as follows:
PROC TTEST COCHRAN;
CLASS Company;
VAR Close;
RUN;
The SAS output is shown as follows:
The SAS System                         15:55 Wednesday, February 11, 2004  14

The TTEST Procedure

Statistics
                            Lower CL            Upper CL   Lower CL             Upper CL
Variable  Company       N       Mean     Mean       Mean    Std Dev   Std Dev    Std Dev   Std Err
Close     A            11     16.668   20.909      25.15     4.4112    6.3132     11.079    1.9035
Close     B            11     29.644   33.664     37.683     4.1804     5.983       10.5    1.8039
Close     Diff (1-2)          -18.23   -12.75     -7.284     4.7054    6.1503     8.8815    2.6225

T-Tests
Variable  Method          Variances     DF    t Value    Pr > |t|
Close     Pooled          Equal         20      -4.86      <.0001
Close     Satterthwaite   Unequal     19.9      -4.86      <.0001
Close     Cochran         Unequal       10      -4.86      0.0007

Equality of Variances
Variable  Method      Num DF    Den DF    F Value    Pr > F
Close     Folded F        10        10       1.11    0.8684
From the SAS output, the p-value for the variance ratio test is 0.8684 > 0.05. Hence, we
do not reject the null hypothesis of equal population variances at the 5% significance
level; there is no evidence that the variances of the two stocks differ
Treating the population variances as equal but unknown, the t test with the pooled
estimate of the population variance is appropriate
From the SAS output, the p-value is less than 0.0001. Hence, we reject the null
hypothesis and conclude that the means of the daily close prices of the two stocks are
different at 5% significance level
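The pooled t statistic and the folded F statistic in the SAS output can be recomputed directly from the data entered in part (a). The Python sketch below is an illustrative cross-check, not part of the original solution; the variable names are assumptions, and p-values are left to the SAS output because the standard library has no t or F distribution.

```python
import math
from statistics import mean, variance

# Daily close prices as entered in the SAS data step of part (a).
a = [21.1, 28.3, 17.1, 16.6, 28.5, 25.1, 13.5, 30.2, 20.8, 16.6, 12.2]
b = [20.2, 36.6, 29.8, 28.8, 38.8, 36.8, 38.8, 37.8, 35.8, 38.2, 28.7]

n_a, n_b = len(a), len(b)
var_a, var_b = variance(a), variance(b)            # sample variances (divisor n - 1)

# Folded F: larger sample variance over the smaller one.
f_stat = max(var_a, var_b) / min(var_a, var_b)

# Pooled two-sample t statistic under the equal-variance assumption.
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean(a) - mean(b)) / se_diff

print(f"F = {f_stat:.2f}, df = {df}, pooled SD = {math.sqrt(pooled_var):.4f}, t = {t_stat:.2f}")
```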
~ End of the Solutions~