Assignment 1

STAT 415 – Multivariate Statistics – Assignment #1
(77 points)
1.) Let 𝐴 = [  9  −2 ]
            [ −2   6 ]
a) Is 𝐴 symmetric? (1 pt.)
b) Determine the eigenvalues and eigenvectors of 𝐴. (4 pts.)
c) Write out the spectral decomposition 𝐴 = 𝑃Λ𝑃′, i.e. find the matrices
𝑃 and Λ. (4 pts.)
d) Verify that 𝑃𝑃′ = 𝑃′𝑃 = 𝐼 using 𝑃 from part (c). (2 pts.)
e) Find 𝐴⁻¹. (2 pts.)
f) Find the eigenvalues and eigenvectors of 𝐴⁻¹. (4 pts.)
g) Write out the spectral decomposition 𝐴⁻¹ = 𝑃Λ⁻¹𝑃′, i.e. find the matrices
𝑃 and Λ⁻¹. (4 pts.)
h) Find the matrices 𝐴^(1/2) and 𝐴^(−1/2). Verify that 𝐴^(1/2)𝐴^(1/2) = 𝐴 and
𝐴^(−1/2)𝐴^(−1/2) = 𝐴⁻¹. (5 pts.)
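Hand calculations for Problem 1 can be checked numerically. The following is a minimal Python sketch, not a prescribed method: it assumes the closed-form eigen-decomposition of a symmetric 2×2 matrix and then rebuilds 𝐴, 𝐴⁻¹, 𝐴^(1/2) and 𝐴^(−1/2) as 𝑃 f(Λ) 𝑃′. The helper names (eig_sym_2x2, matmul, transpose, spectral_function) are illustrative and do not come from the assignment.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigen-decomposition of the symmetric matrix [[a, b], [b, d]].

    Returns (lams, P): eigenvalues in descending order and a matrix P
    whose columns are the corresponding unit-length eigenvectors.
    """
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lams = [mean + radius, mean - radius]
    if b == 0:
        # Already diagonal: the eigenvectors are the coordinate axes.
        cols = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        # For b != 0, v = (b, lam - a) solves (A - lam*I) v = 0.
        cols = []
        for lam in lams:
            norm = math.hypot(b, lam - a)
            cols.append((b / norm, (lam - a) / norm))
    P = [[cols[0][0], cols[1][0]],
         [cols[0][1], cols[1][1]]]
    return lams, P

def matmul(X, Y):
    """Plain-Python product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def spectral_function(lams, P, f):
    """Return P * diag(f(lam_1), f(lam_2)) * P'."""
    D = [[f(lams[0]), 0.0], [0.0, f(lams[1])]]
    return matmul(matmul(P, D), transpose(P))

# Matrix from Problem 1.
lams, P = eig_sym_2x2(9.0, -2.0, 6.0)

A_rebuilt = spectral_function(lams, P, lambda t: t)          # P Lambda P'
A_inv = spectral_function(lams, P, lambda t: 1.0 / t)        # P Lambda^(-1) P'
A_half = spectral_function(lams, P, math.sqrt)               # A^(1/2)
A_neg_half = spectral_function(lams, P, lambda t: 1.0 / math.sqrt(t))

print("eigenvalues:", lams)
print("P'P (should be close to I):", matmul(transpose(P), P))
print("A^(1/2) A^(1/2) (should be close to A):", matmul(A_half, A_half))
```

Eigenvectors are only determined up to sign, so a hand-computed 𝑃 may differ from this one by a factor of −1 in either column and still be correct.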
2.) Let 𝐴 = [ 4  8   8 ]
            [ 3  6  −9 ]
a) Calculate 𝐴𝐴′ and obtain its eigenvalues and eigenvectors. (4 pts.)
b) Calculate 𝐴′𝐴 and obtain its eigenvalues and eigenvectors. Confirm that the
nonzero eigenvalues are the same as in part (a). (4 pts.)
c) Obtain the singular-value decomposition (SVD) of 𝐴, i.e. determine the
matrices 𝑈, 𝐷, and 𝑉′, then verify that 𝐴 = 𝑈𝐷𝑉′. (6 pts.)
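One way to check an SVD of the 2×3 matrix in Problem 2 is to build it from the eigen-decomposition of 𝐴𝐴′. The Python sketch below does this under two assumptions that are not stated in the assignment: the matrix has rank 2, and the "thin" form is wanted, where 𝑈 and 𝐷 are 2×2 and 𝑉′ is 2×3. The function name thin_svd_two_rows and the helpers are illustrative only.

```python
import math

def matmul(X, Y):
    """Plain-Python product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def thin_svd_two_rows(A):
    """Thin SVD A = U D V' of a 2 x n matrix of rank 2, built from AA'."""
    S = matmul(A, transpose(A))               # AA', symmetric 2 x 2
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lams = [mean + radius, mean - radius]     # eigenvalues of AA', descending
    if lams[1] <= 0:
        raise ValueError("this sketch assumes A has rank 2")
    if b == 0:
        # AA' already diagonal: eigenvectors are the coordinate axes.
        cols = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        # For b != 0, u = (b, lam - a) solves (AA' - lam*I) u = 0.
        cols = []
        for lam in lams:
            norm = math.hypot(b, lam - a)
            cols.append((b / norm, (lam - a) / norm))
    U = [[cols[0][0], cols[1][0]],
         [cols[0][1], cols[1][1]]]
    sing = [math.sqrt(lam) for lam in lams]   # singular values
    D = [[sing[0], 0.0], [0.0, sing[1]]]
    # Row i of V' is u_i' A / sigma_i, which makes the rows orthonormal.
    UtA = matmul(transpose(U), A)
    Vt = [[x / sing[i] for x in UtA[i]] for i in range(2)]
    return U, D, Vt

# Matrix from Problem 2.
A = [[4.0, 8.0, 8.0],
     [3.0, 6.0, -9.0]]
U, D, Vt = thin_svd_two_rows(A)
A_rebuilt = matmul(matmul(U, D), Vt)

print("singular values:", D[0][0], D[1][1])
print("U D V' (should be close to A):", A_rebuilt)
```

The squared singular values on the diagonal of 𝐷 are the eigenvalues of 𝐴𝐴′, which ties part (c) back to part (a). As with any SVD, the columns of 𝑈 and the matching rows of 𝑉′ can be flipped in sign together without changing the product.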
3.) Let 𝐴 = [ 1   1 ]
            [ 2  −2 ]
            [ 2   2 ]
a) Calculate 𝐴𝐴′ and obtain its eigenvalues and eigenvectors. (4 pts.)
b) Calculate 𝐴′𝐴 and obtain its eigenvalues and eigenvectors. Confirm that the
nonzero eigenvalues are the same as in part (a). (4 pts.)
c) Obtain the singular-value decomposition (SVD) of 𝐴, i.e. determine the
matrices 𝑈, 𝐷, and 𝑉′, then verify that 𝐴 = 𝑈𝐷𝑉′. (6 pts.)
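The claim in part (b), that 𝐴𝐴′ and 𝐴′𝐴 share their nonzero eigenvalues, can be checked numerically for the 3×2 matrix in Problem 3. The sketch below is one possible check, not the expected solution: it takes the eigenvalues of the 2×2 matrix 𝐴′𝐴 from the quadratic formula and confirms that each one is a root of the characteristic polynomial det(𝐴𝐴′ − λ𝐼) of the 3×3 matrix 𝐴𝐴′, along with λ = 0. All function names are illustrative.

```python
import math

def matmul(X, Y):
    """Plain-Python product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def eigvals_sym_2x2(S):
    """Eigenvalues (descending) of a symmetric 2 x 2 matrix."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_3x3(M, lam):
    """Evaluate det(M - lam*I) for a 3 x 3 matrix M."""
    shifted = [[M[r][c] - (lam if r == c else 0.0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)

# Matrix from Problem 3.
A = [[1.0, 1.0],
     [2.0, -2.0],
     [2.0, 2.0]]
AtA = matmul(transpose(A), A)   # 2 x 2
AAt = matmul(A, transpose(A))   # 3 x 3

small_eigs = eigvals_sym_2x2(AtA)
# Each eigenvalue of A'A should make det(AA' - lam*I) vanish.
residuals = [char_poly_3x3(AAt, lam) for lam in small_eigs]
# The remaining eigenvalue of AA' should be zero, i.e. det(AA') = 0.
zero_residual = char_poly_3x3(AAt, 0.0)

print("eigenvalues of A'A:", small_eigs)
print("det(AA' - lam*I) at those values:", residuals)
print("det(AA'):", zero_residual)
```

Because 𝐴𝐴′ is 3×3 but 𝐴 has only two columns, 𝐴𝐴′ has rank at most 2, which is why the extra eigenvalue is zero.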
4.) Fatty Acid Analysis of Italian Olive Oils
Researchers are interested in characterizing differences in the fatty acid content of olive
oils made from olives grown in different regions of Italy. There are two geographic
classifications in these data. The first classification is nine individual growing areas in
Italy (Area Name) – East Liguria, West Liguria, Umbria, North-Apulia, South-Apulia,
Sicily, Coastal Sardinia, Inland-Sardinia, and Calabria. A broader classification is the
growing region in Italy (Region Name) – Northern, Southern, and Sardinia. The map
below should help in your understanding of where these areas/regions are located in
Italy.
[Map of Italy. Regional names on the map: Puglia = Apulia, Sardegna = Sardinia, Sicilia = Sicily.]
[Bar graph: the number of olive oils in these data from each area.]
The fatty acids measured are as follows:
Palmitic      C₁₆H₃₂O₂
Palmitoleic   C₁₆H₃₀O₂
Stearic       C₁₈H₃₆O₂
Oleic         C₁₈H₃₄O₂
Linoleic      C₁₈H₃₂O₂
Linolenic     C₁₈H₃₀O₂
Arachidic     C₂₀H₄₀O₂
Eicosenoic    C₂₀H₃₈O₂
Molecular formulae taken from Wikipedia, so if these are wrong it is not my fault. I don’t understand how small
differences in the number of carbon and hydrogen atoms make distinct fatty acids. Chemistry is weird!
a) Use visualization methods to identify the fatty acids that would be most
useful in discriminating between olive oils grown in the nine different growing
areas represented in these data. Include at least one 1-D plot, one 2-D plot, and
one pseudo 3-D plot that you found useful in justifying your choice of fatty acids
that are good discriminators. (10 pts.)
b) There are several outliers in these data, but one olive oil (within a specific
growing area) in particular stands out from the rest. What area is this olive
oil from? What combination of characteristics makes this olive oil unique? (4 pts.)
c) Which of the nine growing areas would you say produces the most homogeneous
olive oils in terms of their fatty acid composition? Include an appropriate plot or
collection of plots to justify your answer. (4 pts.)
d) Would we categorize the fatty acid compositions of the olive oils from each of the
growing areas as having a multivariate normal distribution? Why or why not?
Provide graphical evidence to support your answer. (5 pts.)
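One common graphical check for multivariate normality, though not necessarily the one expected for part (d), is a chi-square Q-Q plot of squared Mahalanobis distances. The Python sketch below computes the plotting coordinates for the two-variable case only, because with two variables both the inverse covariance matrix and the chi-square quantiles have closed forms. The function name chi2_qq_pairs is illustrative, and the data points are made-up placeholder numbers, not olive oil measurements.

```python
import math

def chi2_qq_pairs(points):
    """Coordinates for a chi-square Q-Q plot of bivariate data.

    points: list of (x, y) tuples.
    Returns (chi-square quantile, squared Mahalanobis distance) pairs,
    sorted so that both coordinates are in ascending order.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix S (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    d2 = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^(-1) (dx, dy)' using the closed-form 2 x 2 inverse.
        d2.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    d2.sort()
    # Chi-square with 2 df has CDF 1 - exp(-q/2), so the quantile is -2 ln(1 - p).
    quantiles = [-2.0 * math.log(1.0 - (j - 0.5) / n) for j in range(1, n + 1)]
    return list(zip(quantiles, d2))

# Made-up placeholder data (NOT olive oil measurements).
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8),
          (5.0, 10.1), (6.0, 11.7), (2.5, 9.0)]
pairs = chi2_qq_pairs(points)
for q, d in pairs:
    print(f"quantile = {q:6.3f}   d^2 = {d:6.3f}")
```

Plotting the pairs and finding them close to a straight line through the origin with slope 1 is consistent with bivariate normality, while points far above the line at the upper end are candidate outliers, which also bears on part (b). Applying the idea to all eight fatty acids within one growing area would need a general matrix inverse and chi-square quantiles with 8 degrees of freedom, typically taken from a statistics package.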