
USING PRINCIPAL COMPONENTS ANALYSIS AS A DATA REDUCTION AND
TRANSFORMATION TECHNIQUE
Below are 2 project descriptions involving the use of principal components analysis to explore
morphological variation in lizards and big cats. The main goals are to gain experience
interpreting the results of PCA and to generate output data sets that can be used in subsequent
univariate and bivariate statistical tests.
Project 1
Brown anoles (Anolis sagrei) are lizards native to Cuba and the Bahamas that have been
introduced into the United States over the past several decades. They are highly invasive and
their populations have exploded in density in recent years. Some students at a university in Ohio
wanted to know whether brown anoles introduced into the Florida Everglades have changed
over time. So, they measured the body dimensions of female specimens collected from the
mid-1900s (when the species was first noticed in the wild in large numbers) until the 1980s, when the
populations declined (reason unknown), and compared them with individuals collected during the
current population explosion (1990s – present).
Your task is to explore morphological variation across the collected individuals using a
variance-covariance matrix-based PCA. You need to perform a natural-log + 1 transformation, ln(x + 1),
on the raw data prior to performing the PCA. The transformation is needed for 2 reasons: 1) using the
var.-cov. matrix and the natural-log transformation allows you to interpret allometric patterns that
may show up on the first PC axis and 2) it is necessary to add 1 to the raw data prior to the
natural-log transformation because some of the raw data values equal 1.0 (nat.-log of 1 is zero).
Adding 1 won't change the relative variation across the specimens (i.e. it won't mess up your
PCA interpretation). Provide an interpretation of the eigenvalues and eigenvectors for one or
more PC axes (as appropriate). Use the output PCA scores from one or more PC axes (as
appropriate) in a comparison of means (be sure to test assumptions). Your null hypothesis is
(roughly): there is no significant morphological difference between female brown anoles collected
before 1990 and those collected after 1990. Be sure to provide appropriate graphical output to
help illustrate your results. All measurements are reported in mm. SVL refers to snout-vent
length and is the ventral-side distance measured from the tip of the nose to the cloaca under the
tail. The identity of the other characters should be obvious.
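The transformation and PCA steps described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not a required workflow: the function name covariance_pca and the simulated measurements are hypothetical and do not come from FemaleBrownAnole.xls. Any statistics package that offers a covariance-based PCA should give equivalent eigenvalues, with eigenvectors that may differ in sign.

```python
import numpy as np

def covariance_pca(raw):
    """PCA on the variance-covariance matrix of ln(x + 1)-transformed data.

    raw: 2-D array, rows = specimens, columns = measurements in mm.
    Returns eigenvalues (descending), eigenvectors (one column per PC),
    specimen scores, and the proportion of variance on each axis.
    """
    logged = np.log1p(np.asarray(raw, dtype=float))  # ln(x + 1)
    centered = logged - logged.mean(axis=0)
    cov = np.cov(centered, rowvar=False)             # var.-cov. matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; flip each so its loadings sum to >= 0.
    signs = np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)
    eigvecs = eigvecs * signs
    scores = centered @ eigvecs
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, scores, explained


# Hypothetical data: 40 specimens, 4 measurements that all scale with body size.
rng = np.random.default_rng(0)
size = rng.lognormal(mean=0.0, sigma=0.2, size=40)
base = np.array([45.0, 12.0, 8.0, 20.0])      # made-up mean dimensions (mm)
noise = rng.lognormal(mean=0.0, sigma=0.03, size=(40, 4))
raw = base * size[:, None] * noise

eigvals, eigvecs, scores, explained = covariance_pca(raw)
print("Proportion of variance:", np.round(explained, 3))
print("PC1 loadings:", np.round(eigvecs[:, 0], 3))
```

When all PC1 loadings share the same sign, the first axis is usually read as a general size axis. Unequal loadings on log-transformed data suggest that some characters change faster than others as size increases, which is the allometric pattern the project asks you to look for.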
The raw data can be found in an Excel file on my webpage: http://cstl-csm.semo.edu/jhrobins/
Go to Other, Go to XDesign, Select FemaleBrownAnole.xls
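For the comparison of means in Project 1, one possible workflow is sketched below using SciPy. The PC1 scores are simulated and hypothetical, the function name compare_groups is made up for illustration, and the 0.05 cut-off for the assumption checks is an assumption rather than a rule from this handout. The sketch checks normality (Shapiro-Wilk) and equality of variances (Levene) before choosing a test.

```python
import numpy as np
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Check assumptions, then compare two groups of PC scores."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Normality within each group (Shapiro-Wilk).
    p_norm = min(stats.shapiro(a).pvalue, stats.shapiro(b).pvalue)
    # Homogeneity of variances (Levene).
    p_var = stats.levene(a, b).pvalue
    if p_norm < alpha:
        # Non-normal scores: rank-based alternative (compares distributions,
        # not means in the strict sense).
        name = "Mann-Whitney U"
        res = stats.mannwhitneyu(a, b, alternative="two-sided")
    elif p_var < alpha:
        name = "Welch t"                      # unequal variances
        res = stats.ttest_ind(a, b, equal_var=False)
    else:
        name = "Student t"                    # equal variances
        res = stats.ttest_ind(a, b, equal_var=True)
    return name, res.statistic, res.pvalue


# Hypothetical PC1 scores for specimens collected before and after 1990.
rng = np.random.default_rng(1)
pre_1990 = rng.normal(loc=-0.3, scale=0.1, size=25)
post_1990 = rng.normal(loc=0.3, scale=0.1, size=30)

test_name, statistic, p_value = compare_groups(pre_1990, post_1990)
print(test_name, round(float(statistic), 3), p_value)
```

A histogram or box plot of the scores for each group is a useful companion to these checks and doubles as graphical output for the write-up.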
Project 2
Some researchers at an Illinois university wanted to understand better the way saber-toothed cats
may have utilized their over-sized canines when killing their prey. They realized they needed a
baseline for comparison, so they measured some skull and canine dimensions of modern large
felids in which the prey-taking behavior was fairly well known. Additionally, they included data
on a well-known extinct saber-tooth for comparison.
Again, your task is to explore the morphological variation across the measured specimens using
a variance-covariance matrix-based PCA on natural-log transformed skull data (you don't need to
add 1 this time). Provide a meaningful interpretation of the first two axes, then use the output
scores from the first axis as the independent variable in a regression with the canine variable,
named UPPER, as the dependent variable. UPPER is an indicator of canine dimension.
Hint: check the distribution of UPPER; you may need to perform a
transformation to bring the values closer to a normal distribution. Make a graph with PC1 on the
x-axis and UPPER on the y-axis. Use the average value of each variable for each species and
label each species on the graph. That way, you can compare the relative position of each species in the
morphospace you have created.
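The regression and the species averages described above might be computed along these lines. This is a minimal NumPy sketch: the helper names species_means and simple_regression are hypothetical, and the PC1 and UPPER values are invented for illustration, not taken from K9data.xls.

```python
import numpy as np

def species_means(labels, values):
    """Average a variable within each species label."""
    labels = np.asarray(labels)
    values = np.asarray(values, dtype=float)
    names = np.unique(labels)
    means = np.array([values[labels == name].mean() for name in names])
    return names, means

def simple_regression(x, y):
    """Least-squares line y = slope * x + intercept, plus R-squared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical specimen-level values for three species codes.
species = np.array(["A", "A", "B", "B", "C", "C"])
pc1 = np.array([0.9, 1.1, -0.4, -0.6, 0.1, -0.1])
upper = np.array([1.5, 1.7, 1.2, 1.0, 1.4, 1.2])

slope, intercept, r_squared = simple_regression(pc1, upper)
names, mean_pc1 = species_means(species, pc1)
_, mean_upper = species_means(species, upper)

# These species means are the points to plot and label on the PC1 vs UPPER graph.
for name, x_bar, y_bar in zip(names, mean_pc1, mean_upper):
    print(name, round(float(x_bar), 2), round(float(y_bar), 2))
```

The regression here is fitted to individual specimens, while the species means are used only for the labeled graph. Whether to fit the line to specimens or to species means is a design choice that should be stated in your write-up.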
Species:
A) Panthera tigris – tiger, largest living felid, generalist predator on large-bodied prey.
B) Neofelis nebulosa – clouded leopard, medium-sized, highly arboreal ambush specialist.
C) Acinonyx jubatus – cheetah, medium-sized, pursuit specialist on small-medium ungulates.
D) Felis concolor – mountain lion, medium-sized, generalist predator on small-medium prey.
E) Felis rufus – bobcat, small-sized, generalist predator on small-bodied prey.
F) Smilodon californicus – extinct saber-toothed tiger, lion/tiger-sized predator.
Morphological characters (in mm) for use in the PCA:
1) Greatest Length of Skull: Distance from the most anterior part of the rostrum (excluding
teeth) to the most posterior point of the skull.
2) Greatest Zygomatic Breadth: Greatest distance between the outer margins of the zygomatic
arches.
3) Mastoid Breadth: Greatest width of skull including the mastoid.
4) Maximum Width of Maxilla: Distance between the anterior edges of the upper canine alveoli.
5) Maximum Depth of Skull: Distance from dorsal "peak" of frontal to whatever position on the
roof of the mouth that is directly ventral to that point.
Upper canine variable to be used in the regression:
UPPER – Measure the anterior-posterior length of the upper canine at the alveolus and divide by
the medial-lateral length of the upper canine at the alveolus. Large values indicate canines that
are narrow (shearing, slicing prey), whereas small values indicate canines that are wide
(puncturing, holding prey).
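As a small illustration of the UPPER definition and of the distribution check suggested in the hint, the sketch below computes the ratio and compares the skewness of raw and natural-log transformed values. The function names and the canine dimensions are hypothetical. The natural log is only one candidate transformation, chosen here because ratios are often right-skewed.

```python
import numpy as np

def canine_ratio(ap_length, ml_length):
    """UPPER: anterior-posterior length divided by medial-lateral length."""
    return np.asarray(ap_length, dtype=float) / np.asarray(ml_length, dtype=float)

def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    values = np.asarray(values, dtype=float)
    dev = values - values.mean()
    return np.mean(dev ** 3) / np.mean(dev ** 2) ** 1.5


# Hypothetical alveolar dimensions (mm); the last specimen has a narrow canine.
ap = np.array([10.0, 12.0, 14.0, 30.0])
ml = np.array([10.0, 10.0, 10.0, 10.0])

upper = canine_ratio(ap, ml)
skew_raw = skewness(upper)
skew_log = skewness(np.log(upper))
print("Skewness raw:", round(float(skew_raw), 3))
print("Skewness after ln:", round(float(skew_log), 3))
```

Skewness closer to zero after transformation is one sign that the transformed values are closer to normal. A histogram or a formal normality test on the real data is still worth checking.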
The raw data can be found in an Excel file on my webpage: http://cstl-csm.semo.edu/jhrobins/
Go to Other, Go to XDesign, Select K9data.xls