Exam in Multivariate Statistical Methods, 2011-03-26

Linköpings Universitet
IDA/Statistik
LH
732A37 Multivariate Statistical Methods, 6hp
Time allowed: 8-12
Allowed aids: Calculator. The book: Johnson, Wichern: Applied Multivariate Statistical Analysis. Notes in the book and a copy of the book are allowed.
Assisting teacher: Lotta Hallberg
Grades:
A=19-20 points, B=16-18p, C=12-15p, D=9-11p, E=6-8p
Provide detailed solutions that motivate your results.
_________________________________________________________________________________________
1
You are given the random vector $X' = [X_1, X_2, X_3]$ with mean vector $\mu_X' = [3, 2, -2]$ and covariance matrix
$$\Sigma_X = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
Let
$$A = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 1 & -2 \end{pmatrix}.$$
Let further Y = AX.
a) Find E[Y] and Var[Y]. (2p)
b) Calculate the total variance and the generalized variance of X and of Y. (1p)
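A minimal sketch of how the quantities in question 1 could be checked numerically, assuming the standard rules E[AX] = A mu and Var[AX] = A Sigma A', with total variance taken as the trace and generalized variance as the determinant. The helper names (`matmul`, `transpose`, `total_variance`, `det`) are illustrative and not part of the exam.

```python
# Sketch: mean vector and covariance matrix of Y = A X, plus total and
# generalized variance. Matrices are plain lists of rows (no libraries).

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def transpose(P):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*P)]

def total_variance(S):
    """Total variance = trace of the covariance matrix."""
    return sum(S[i][i] for i in range(len(S)))

def det(S):
    """Determinant by cofactor expansion along the first row (small matrices)."""
    if len(S) == 1:
        return S[0][0]
    return sum((-1) ** j * S[0][j] * det([row[:j] + row[j + 1:] for row in S[1:]])
               for j in range(len(S)))

mu_x = [[3], [2], [-2]]                       # mean of X as a column vector
sigma_x = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]   # covariance matrix of X
A = [[1, -1, 0], [1, 1, -2]]

mean_y = matmul(A, mu_x)                            # E[Y] = A mu
cov_y = matmul(matmul(A, sigma_x), transpose(A))    # Var[Y] = A Sigma A'

print("E[Y] =", mean_y)
print("Var[Y] =", cov_y)
print("X: total variance", total_variance(sigma_x), "generalized variance", det(sigma_x))
print("Y: total variance", total_variance(cov_y), "generalized variance", det(cov_y))
```

The generalized variance is computed with a generic determinant so the same helper works for both the 3x3 matrix of X and the 2x2 matrix of Y.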
2
Let X be $N_3(\mu, \Sigma)$ where $\mu' = [1, -1, 2]$ and
$$\Sigma = \begin{pmatrix} 4 & 0 & -1 \\ 0 & 5 & 0 \\ -1 & 0 & 2 \end{pmatrix}.$$
a) Find out if the following variables are independent. Explain. (2p)
i) (X1, X3) and X2
ii) X1 and X1 + 3X2 - 2X3
b) Find the distribution of X1 + 3X2 - 2X3. (2p)
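For jointly normal variables, zero covariance is equivalent to independence, and a linear combination a'X is normal with mean a'mu and variance a'Sigma a. The sketch below computes those quantities for question 2 under these assumptions. The helper `quad` and the variable names are illustrative only.

```python
# Sketch: covariances and linear-combination moments for X ~ N3(mu, Sigma).

mu = [1, -1, 2]
sigma = [[4, 0, -1], [0, 5, 0], [-1, 0, 2]]

def quad(a, S, b):
    """Return a' S b, i.e. the covariance between a'X and b'X."""
    return sum(a[i] * S[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

a = [1, 3, -2]    # coefficients of X1 + 3X2 - 2X3
e1 = [1, 0, 0]    # picks out X1

# i) covariances between (X1, X3) and X2: entries (1,2) and (3,2) of Sigma
cov_13_with_2 = [sigma[0][1], sigma[2][1]]

# ii) covariance between X1 and the linear combination
cov_x1_lin = quad(e1, sigma, a)

# b) mean and variance of the linear combination
mean_lin = sum(ai * mi for ai, mi in zip(a, mu))
var_lin = quad(a, sigma, a)

print("Cov((X1, X3), X2) =", cov_13_with_2)
print("Cov(X1, X1 + 3X2 - 2X3) =", cov_x1_lin)
print("mean =", mean_lin, "variance =", var_lin)
```

A nonzero covariance rules out independence; an all-zero covariance block implies independence only because X is assumed multivariate normal.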
3
Measurements of x1 = stiffness and x2 = bending strength for a sample of n = 30 pieces of a particular grade of lumber are to be analyzed. Below you find some statistics and graphs:
Variable     Mean     StDev    Minimum   Maximum
Stiffness    1860,5   352,2    1115,0    2540,0
Bending      8354     1867     4175      12090

Estimated covariance matrix S
124055    361620
361620    3486333

Inverse of S
 0,0000116   -0,0000012
-0,0000012    0,0000004

Eigenvalues of S
3524786    85601

Eigenvectors of S in columns of matrix P
0,105740    0,994394
0,994394   -0,105740
[Figure: Scatterplot of Stiffness vs Bending (Stiffness on the vertical axis, Bending on the horizontal axis)]
[Figure: Histogram of Stiffness with fitted normal curve (Mean 1861, StDev 352,2, N 30)]
[Figure: Histogram of Bending with fitted normal curve (Mean 8354, StDev 1867, N 30)]
a) Use the three graphs to determine whether it is reasonable to assume normality. Explain. (1p)
b) Test, using Hotelling's $T^2$, whether $\mu_1 = 2000$ and $\mu_2 = 10\,000$ are plausible values for the means of stiffness and bending. Assume normality. Significance level 5%. (3p)
c) Calculate the two Bonferroni confidence intervals with a simultaneous confidence level of 95%. (2p)
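The sketch below shows one way to set up the computations for 3 b) and c). It assumes the usual forms T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0), the critical value (n-1)p/(n-p) times the upper F quantile with (p, n-p) degrees of freedom, and Bonferroni intervals xbar_i ± t * sqrt(s_ii / n) with t the upper alpha/(2p) quantile of t with n-1 degrees of freedom. The F and t quantiles are left as arguments to be read from the tables in the book; all function names are illustrative. The inverse is recomputed from S rather than taken from the rounded printout.

```python
import math

# Summary statistics from the output in question 3
n, p = 30, 2
xbar = [1860.5, 8354.0]
mu0 = [2000.0, 10000.0]
S = [[124055.0, 361620.0], [361620.0, 3486333.0]]

def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def hotelling_t2(n, xbar, mu0, S):
    """T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0) for a bivariate sample."""
    diff = [x - m for x, m in zip(xbar, mu0)]
    S_inv = inverse_2x2(S)
    return n * sum(diff[i] * S_inv[i][j] * diff[j]
                   for i in range(2) for j in range(2))

def t2_critical(n, p, f_quantile):
    """Scale a tabulated upper F(p, n-p) quantile to the T^2 critical value."""
    return (n - 1) * p / (n - p) * f_quantile

def bonferroni_interval(mean, variance, n, t_quantile):
    """Interval mean ± t * sqrt(variance / n); t_quantile comes from a t table."""
    half_width = t_quantile * math.sqrt(variance / n)
    return (mean - half_width, mean + half_width)

t2 = hotelling_t2(n, xbar, mu0, S)
print("T^2 =", round(t2, 2))
```

To finish the test, compare `t2` with `t2_critical(n, p, f_quantile)` using the tabulated F quantile; for the intervals, call `bonferroni_interval` once per variable with the diagonal elements of S.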
4
Let the random vector $X = (X_1, X_2)'$ have covariance matrix
$$\Sigma = \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix}.$$
Determine the principal components and find the proportion of the total variance of X explained by the first component. (3p)
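As a hedged illustration of question 4, the principal components of a 2x2 covariance matrix can be obtained from the closed-form eigen-decomposition of a symmetric matrix [[a, b], [b, c]]. The function name and return layout below are assumptions made for this sketch, which requires b to be nonzero.

```python
import math

def principal_components_2x2(a, b, c):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, c]], assuming b != 0.

    Returns a list of (eigenvalue, eigenvector) pairs, largest eigenvalue first.
    """
    centre = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    result = []
    for lam in (centre + radius, centre - radius):
        v = (b, lam - a)                 # solves (a - lam) v1 + b v2 = 0
        norm = math.hypot(v[0], v[1])
        result.append((lam, (v[0] / norm, v[1] / norm)))
    return result

pcs = principal_components_2x2(5, 2, 2)
total = sum(lam for lam, _ in pcs)       # total variance = trace
share_first = pcs[0][0] / total          # proportion explained by PC1

for k, (lam, vec) in enumerate(pcs, start=1):
    print(f"PC{k}: variance {lam:.4f}, coefficients {vec}")
print("share of first component:", round(share_first, 4))
```

Each principal component is the linear combination of X1 and X2 with the printed coefficients; the sign of an eigenvector is arbitrary.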
5
You have five variables from 14 different counties:
- Total population (thousands)
- Median school years
- Total employment (thousands)
- Health services employment (hundreds)
- Median value homes ($10 000s) (income)
Descriptive Statistics: tot pop; school; employ; helth service; income

Variable        Mean     StDev    Minimum   Maximum
tot pop         4,323    2,075    1,523     8,044
school          14,014   1,329    12,200    17,000
employ          1,952    0,895    0,597     3,641
helth service   2,171    1,403    0,750     5,520
income          2,454    0,710    1,720     4,250
[Figure: Histograms of tot pop, school, employ, helth service and income, each with a fitted normal curve (N = 14 for every variable)]
Factor Analysis: tot pop; school; employ; helth service; income
Maximum Likelihood Factor Analysis of the Correlation Matrix
* NOTE * Heywood case

Unrotated Factor Loadings and Communalities

Variable        Factor1   Factor2   Communality
tot pop          0,971     0,160    0,968
school           0,494     0,833    0,938
employ           1,000     0,000    1,000
helth service    0,848    -0,395    0,875
income          -0,249     0,375    0,202

Variance         2,9678    1,0159   3,9837
% Var            0,594     0,203    0,797
Rotated Factor Loadings and Communalities
Varimax Rotation

Variable        Factor1   Factor2   Communality
tot pop          0,718     0,673    0,968
school          -0,052     0,967    0,938
employ           0,831     0,556    1,000
helth service    0,924     0,143    0,875
income          -0,415     0,173    0,202

Variance         2,2354    1,7483   3,9837
% Var            0,447     0,350    0,797
Factor Score Coefficients

Variable        Factor1   Factor2
tot pop         -0,165     0,246
school          -0,528     0,789
employ           1,150     0,080
helth service    0,116    -0,173
income          -0,018     0,027
a) What assumptions have to be fulfilled to do the analyses above? (1p)
b) What does the communality measure? (1p)
c) Suggest names for the two factors. (1p)
d) One observation is (5,935; 14,2; 2,265; 2,27; 2,91) and its standardized value is (0,777; 0,140; 0,350; 0,070; 0,642). Calculate the two factor scores. (1p)
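A small sketch for 5 d), assuming that each factor score is the sum of the standardized values multiplied by the corresponding factor score coefficients from the output above. Whether the software computes its scores exactly this way is an assumption here. The dictionary layout and the helper name are illustrative.

```python
# Standardized observation from 5 d), in the variable order of the output
z = [0.777, 0.140, 0.350, 0.070, 0.642]

# Factor score coefficients, one list per factor, same variable order:
# tot pop, school, employ, helth service, income
coefficients = {
    "Factor1": [-0.165, -0.528, 1.150, 0.116, -0.018],
    "Factor2": [0.246, 0.789, 0.080, -0.173, 0.027],
}

def factor_score(z, coef):
    """Weighted sum of standardized values (assumed scoring rule)."""
    return sum(zi * ci for zi, ci in zip(z, coef))

scores = {name: factor_score(z, coef) for name, coef in coefficients.items()}

for name, value in scores.items():
    print(name, round(value, 3))
```

The standardized values are used rather than the raw observation because the factor analysis was run on the correlation matrix.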