Hw4 Key - Pelagicos

Biometry (Biol4090) - Fall 2015
Homework #4
Student name: ______KEY__________
NOTE: Do not Turn This In. Homeworks will not be graded
Homework in preparation for quiz #6 – on October 15th
Use SPSS exercises to start developing your computer skills.
1) Explain the following terms using the words provided in parenthesis (+0.25, each):
- Parametric statistics (normality) = Built upon the assumption that the observations (data) come from a
normal distribution; they make inferences about the specific parameters of that distribution.
- Non-parametric statistics (distribution-free) = Distribution-free techniques that do not rely on
any assumptions concerning the distribution of the observations (data). Two examples of distribution-free
statistics are randomization tests, which compare observed patterns against those expected for shuffled data,
and non-parametric statistical tests, which use ranked data to compare medians rather than means.
- P-P plot (proportions) = Two-dimensional plots of the cumulative probabilities of two different
distributions. Usually applied to compare the observed distributions against a theoretical expectation
(e.g., one-sample K-S test), or two observed distributions (e.g., two-sample K-S test). P-P plots are used
to visually assess the similarity between the two distributions. When the theoretical distribution is
normal and all the data points fall on the diagonal of the plot, the variable is normally distributed.
Any points that do not follow the diagonal show deviations from normality.
- Shapiro-Wilk test (normality) = A statistical test developed to determine whether a given frequency
distribution originated from a normally distributed population with the same parameters as the
sample (mean, S.D.). The null hypothesis is that the sample came from a normal distribution. A
significant result suggests that the sample is not normally distributed.
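The P-P plot defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure: the function name pp_points and the sample values are made up, the observed proportion is taken as rank / n, and SPSS may use a different plotting position.

```python
from statistics import NormalDist, mean, stdev

def pp_points(data):
    """Return (expected, observed) cumulative proportions for a normal P-P plot."""
    xs = sorted(data)
    n = len(xs)
    # Theoretical normal distribution with the same mean and S.D. as the sample
    fitted = NormalDist(mean(xs), stdev(xs))
    points = []
    for rank, x in enumerate(xs, start=1):
        observed = rank / n        # observed cumulative proportion (assumed convention)
        expected = fitted.cdf(x)   # cumulative proportion expected under normality
        points.append((expected, observed))
    return points

# Made-up sample, for illustration only
sample = [6.1, 7.4, 8.2, 9.0, 9.5, 10.1, 10.8, 11.6, 12.3, 14.0]
points = pp_points(sample)
for expected, observed in points:
    print(f"expected={expected:.3f}  observed={observed:.3f}  "
          f"deviation={observed - expected:+.3f}")
```

Points close to the diagonal (deviation near 0) are consistent with normality.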
2) Highlight the difference between a 1-sample and a 2-sample Kolmogorov-Smirnov test by filling in
the blanks below. With either: “1-sample K-S test” , “2-sample K-S test” or “both” (each is 0.10 points):
- compares observed data distribution against theoretical distribution with same mean and S.D.:
1-sample K-S test
- compares two observed data distributions against each other: 2-sample K-S test
- quantifies the location (central tendency) and spread (shape) of continuous distributions: both
- null hypothesis for this test states that the two data distributions are the same (all data come from the
same biological population): 2-sample K-S test
- null hypothesis for this test states that the observed data and the theoretical distribution are the same
(the observations come from the same theoretical biological population): 1-sample K-S test
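To make the contrast between the two tests concrete, here is a minimal sketch of the K-S statistic D (the largest vertical gap between two cumulative distributions) for both cases. The function names and data are hypothetical. Only D is computed; the p-value is omitted on purpose, since estimating the mean and S.D. from the same sample changes the reference distribution of D (the Lilliefors correction).

```python
from statistics import NormalDist, mean, stdev

def ks_one_sample(data):
    """D for observed data vs. a normal with the sample's own mean and S.D."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for rank, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps at x, so check the gap just before and at x
        d = max(d, rank / n - f, f - (rank - 1) / n)
    return d

def ks_two_sample(a, b):
    """D for two observed samples compared against each other."""
    d = 0.0
    for x in list(a) + list(b):
        fa = sum(v <= x for v in a) / len(a)   # empirical CDF of a at x
        fb = sum(v <= x for v in b) / len(b)   # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d

# Made-up samples, for illustration only
first = [6.1, 7.4, 8.2, 9.0, 9.5, 10.1, 10.8, 11.6, 12.3, 14.0]
second = [5.0, 6.6, 7.9, 9.2, 9.9, 10.4, 11.1, 12.8, 13.5, 16.2]
print("1-sample D:", round(ks_one_sample(first), 3))
print("2-sample D:", round(ks_two_sample(first, second), 3))
```

In both cases a larger D means a larger difference between the two distributions being compared.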
3) Open the file BIOL4090_Hw4_data.xls with SPSS and use these observations, drawn from three
random variable distributions (all derived from theoretical distributions with a mean = 10 and a variance
=10) for the following exercise. Make sure the variables are “numeric” and the measure is “scale”.
Create a frequency table for these three datasets of 100 data points each and use this information to fill
in the table below (+0.05 each entry). Note – because these are random samples from theoretical
distributions, the parameter estimates (x-bar, S.D.) will vary from the real parameters (µ, σ).
DATASET | MEAN | STDEV | MEDIAN | 5th PERCENTILE | 95th PERCENTILE
Distribution_1 | 9.724 | 2.943 | 9.644 | 5.208 | 15.799
Distribution_2 | 9.720 | 3.000 | 10.000 | 5.000 | 14.950
Distribution_3 | 9.705 | 3.033 | 9.762 | 4.937 | 15.167
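For students who want to double-check the frequency-table entries outside SPSS, the same summary statistics can be sketched with Python's standard library. The function name describe and the placeholder data are assumptions; the "exclusive" percentile method is assumed to be the closest match to the SPSS default, so small differences in the percentiles are possible.

```python
from statistics import mean, median, quantiles, stdev

def describe(data):
    """Mean, sample S.D., median, and 5th / 95th percentiles of a dataset."""
    # quantiles(..., n=100) returns 99 cut points; index k-1 is the k-th percentile
    pct = quantiles(data, n=100, method="exclusive")
    return {
        "mean": mean(data),
        "stdev": stdev(data),      # sample S.D. (n - 1 in the denominator)
        "median": median(data),
        "p5": pct[4],
        "p95": pct[94],
    }

# Placeholder data (1..100), NOT the homework dataset
values = list(range(1, 101))
summary = describe(values)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```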
Use SPSS to calculate the skewness and kurtosis of each distribution and to create a histogram
with a superimposed normal distribution. Note: you can make multiple calculations at once, by dragging
several distributions into the statistics “box”. Paste the information below (+0.25 for each distribution):
NOTE: this is the SPSS output
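The skewness and kurtosis values reported here can be reproduced by hand. The sketch below uses the bias-adjusted sample formulas (the same ones used by Excel's SKEW and KURT functions); that SPSS reports exactly these versions is an assumption, and the function names and example data are made up.

```python
from statistics import mean, stdev

def skewness(data):
    """Bias-adjusted sample skewness; 0 for a symmetrical sample."""
    n = len(data)
    m, s = mean(data), stdev(data)
    z3 = sum(((x - m) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * z3

def excess_kurtosis(data):
    """Bias-adjusted sample excess kurtosis; about 0 for normal data."""
    n = len(data)
    m, s = mean(data), stdev(data)
    z4 = sum(((x - m) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction

# Made-up examples: one symmetrical sample, one with a long right tail
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]
print("symmetric:   ", skewness(symmetric), excess_kurtosis(symmetric))
print("right-tailed:", skewness(right_tailed), excess_kurtosis(right_tailed))
```

A positive skewness indicates a longer right tail and a negative skewness a longer left tail, matching the interpretations given for each distribution.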
Distribution_1:
Skewness: 0.625 (S.E of skewness = 0.241)
Interpret this skewness: Looks very symmetrical, but it has positive skewness, suggesting that
the right tail is longer and the mass of the distribution is concentrated on the left of the figure.
Kurtosis: 0.843 (S.E of kurtosis = 0.478).
Interpret this kurtosis: Distribution is leptokurtic (positive kurtosis), meaning that the distribution
has a taller and skinnier peak around the mean and heavier tails (more observations in the center and
in the tails, fewer in the “shoulders”) than expected for a normal distribution. Note: however, the
kurtosis is not significant (the 95% C.I. overlaps “0”).
Paste Histogram – with superimposed normal curve below:
Briefly explain: Discuss whether this distribution looks like a normal distribution, on the basis
of the skewness / kurtosis and the shape of the histogram. Explain why / why not:
Looks like a normal distribution. While the skewness is less than the “rule of thumb” threshold (1), it does
seem to be significant, since skewness +/- 2 S.E.s ranges from 0.143 to 1.107 (does not overlap 0). The
kurtosis is not very pronounced either: it is less than the “rule of thumb” threshold (1), and it does not
seem to be significant, since kurtosis +/- 2 S.E.s ranges from -0.113 to 1.799 (overlaps 0).
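The "+/- 2 S.E.s" check used in these answers is simple arithmetic, sketched below. The standard-error formulas are the usual ones for the bias-adjusted skewness and kurtosis; for n = 100 they give the 0.241 and 0.478 values shown in the SPSS output. The function names are made up for illustration.

```python
from math import sqrt

def se_skewness(n):
    """Standard error of the sample skewness for a sample of size n."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of the sample excess kurtosis for a sample of size n."""
    return 2 * se_skewness(n) * sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def two_se_interval(estimate, se):
    """Return (lower, upper, significant); significant means 0 is outside the interval."""
    lower, upper = estimate - 2 * se, estimate + 2 * se
    return lower, upper, not (lower <= 0 <= upper)

n = 100
ses, sek = round(se_skewness(n), 3), round(se_kurtosis(n), 3)
print("S.E. skewness:", ses, " S.E. kurtosis:", sek)

# Skewness and kurtosis values reported in this key for the three distributions
reported = {
    "Distribution_1": (0.625, 0.843),
    "Distribution_2": (0.180, 0.362),
    "Distribution_3": (-0.082, -0.095),
}
for name, (skew, kurt) in reported.items():
    lo_s, hi_s, sig_s = two_se_interval(skew, ses)
    lo_k, hi_k, sig_k = two_se_interval(kurt, sek)
    print(f"{name}: skewness {lo_s:.3f} to {hi_s:.3f} (significant: {sig_s}); "
          f"kurtosis {lo_k:.3f} to {hi_k:.3f} (significant: {sig_k})")
```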

Distribution_2:
Skewness: 0.180 (S.E of skewness = 0.241).
Interpret this skewness: Looks very symmetrical, but it has positive skewness, suggesting that
the right tail is longer and the mass of the distribution is concentrated on the left of the figure.
Kurtosis: 0.362 (S.E of kurtosis = 0.478)
Interpret this kurtosis: Distribution is leptokurtic (positive kurtosis), meaning that the distribution
has a taller and skinnier peak right on the mean and heavier tails (more observations in the center and
in the tails, fewer in the “shoulders”) than expected for a normal distribution. Note: however, the
kurtosis is not significant (the 95% C.I. overlaps “0”).
Paste Histogram – with superimposed normal curve below:
Briefly explain: Discuss whether this distribution looks like a normal distribution, on
the basis of the skewness / kurtosis and the shape of the histogram. Explain why / why not:
______________________________________________________________________________
The skewness is less than the “rule of thumb” threshold (1), and it does not seem to be
significant, since skewness +/- 2 S.E.s ranges from -0.302 to 0.662 (overlaps 0). The kurtosis is
not very pronounced: it is less than the “rule of thumb” threshold (1), and it does not seem to be
significant, since kurtosis +/- 2 S.E.s ranges from -0.594 to 1.318 (overlaps 0).

Distribution_3:
Skewness: -0.082 (S.E of skewness = 0.241).
Interpret this skewness: Looks very symmetrical, but it has negative skewness, suggesting that
the left tail is longer and the mass of the distribution is concentrated on the right of the figure.
Kurtosis: -0.095 (S.E of kurtosis = 0.478)
Interpret this kurtosis: Distribution is platykurtic (negative kurtosis), meaning it has a “wider”,
flatter peak around the mean (fewer observations right on the mean) and thinner tails (fewer extreme
observations) than expected for a normal distribution. Note: however, the kurtosis is not significant
(the 95% C.I. overlaps “0”).
Paste Histogram – with superimposed normal curve below:
Briefly explain: Discuss whether this distribution looks like a normal distribution, on
the basis of the skewness / kurtosis and the shape of the histogram. Explain why / why not:
It looks like a normal distribution. The skewness is less than the “rule of thumb” threshold (1),
and it does not seem to be significant, since skewness +/- 2 S.E.s ranges from -0.564 to 0.400
(overlaps 0). Likewise, the kurtosis is not very pronounced: its absolute value is less than the
“rule of thumb” threshold (1), and it does not seem to be significant, since kurtosis +/- 2 S.E.s
ranges from -1.051 to 0.861 (overlaps 0).
4) Compare each distribution to a normal distribution with the same mean / S.D. (use the parameters
estimated in question #3, above). For each test, use the Shapiro-Wilk test and paste the table of results
and the Q-Q plot (+0.50 for each distribution):
This is the output from SPSS, showing the three test results. Focus on the Shapiro-Wilk results:
Distribution_1:
Paste results table here, and interpret the S-W test result:
- was this result significant: (Y / N), why? Yes, p = 0.022 is < 0.05
- is distribution 1 normally distributed: (Y / N), why? No – because the null was rejected
- using the Q-Q plot, are there more observations than expected in the tails or on the center
of mass of the observed distribution? Does this agree with the sign (+/-) of the kurtosis of
this distribution you calculated in question #3? Why / why not?
Paste the detrended normal Q-Q plot here – for reference:
This plot agrees with the answers in question 3. The Q-Q plot shows an excess of observations
(positive values) from the two tails of the distribution (the smaller and larger values), and a
deficit of values (negative values) around the mean (from ~ 7 to ~ 12). This suggests that the
observed distribution has more observations in the tails (leptokurtic). The asymmetry of the
deviations around the mean, with smaller left deviations (smaller values), suggests that the
distribution has a positive skew, with a longer right tail and more observations to the left.
NOTE: deviations are calculated as “observed proportion – expected proportion”
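The deviations plotted in a detrended normal Q-Q plot can be approximated by hand, as sketched below. This is an illustration under stated assumptions rather than the exact SPSS computation: it uses the Blom plotting position (rank - 3/8) / (n + 1/4), works in standardized (z-score) units instead of proportions, and the function name and data are made up.

```python
from statistics import NormalDist, mean, stdev

def detrended_qq(data):
    """Return (observed value, deviation) pairs for a detrended normal Q-Q plot."""
    xs = sorted(data)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    standard_normal = NormalDist()  # mean 0, S.D. 1
    rows = []
    for rank, x in enumerate(xs, start=1):
        p = (rank - 0.375) / (n + 0.25)            # Blom plotting position (assumed)
        expected_z = standard_normal.inv_cdf(p)    # quantile expected under normality
        observed_z = (x - m) / s                   # observed standardized value
        rows.append((x, observed_z - expected_z))  # observed - expected
    return rows

# Made-up sample with a long right tail, for illustration only
right_tailed = [1, 1, 2, 2, 3, 10]
rows = detrended_qq(right_tailed)
for value, deviation in rows:
    print(f"value={value:>4}  deviation={deviation:+.3f}")
```

With this convention, a positive deviation at the largest values means the right tail extends further than a normal distribution would predict.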
Distribution_2:
Paste results table here, and interpret the S-W test result:
- was this result significant: (Y / N), why? No, p = 0.154 is > 0.05
- is distribution 2 normally distributed: (Y / N), why? Yes - because the null was not rejected
- using the Q-Q plot, are there more observations than expected in the tails or on the center
of mass of the observed distribution? Does this agree with the sign (+/-) of the kurtosis of
this distribution you calculated in question #3? Why / why not?
Paste the detrended normal Q-Q plot here – for reference:
This plot agrees with the answers in question 3. The Q-Q plot shows an excess of observations
(positive values) from the right tail of the distribution (the larger values), and a deficit of
values (negative values) in the left tail. The asymmetry of the deviations about the mean, with
smaller positive deviations to the left (smaller values), suggests that the distribution has a
right skew (positive skew) with a longer right tail.
NOTE: deviations are calculated as “observed proportion – expected proportion”
Distribution_3:
Paste results table here, and interpret the S-W test result:
- was this result significant: (Y / N), why? No, p = 0.154 is > 0.05
- using the Q-Q plot, are there more observations than expected in the tails or on the center
of mass of the observed distribution? Does this agree with the sign (+/-) of the kurtosis of
this distribution you calculated in question #3? Why / why not?
Paste the Normal Q-Q plot here (not the detrended plot) – for reference:
This plot agrees with the answers in question 3. The skewness is less clear, because the Q-Q plot
shows a scatter of positive and negative points, with an excess of observations (positive values)
along the right tail of the distribution (the larger values), and a deficit of values (negative
values) in the left tail. This suggests that the skewness is very small, since large asymmetries
in the distribution are not clearly visible. Similarly, the kurtosis is also hard to evaluate
visually, since the positive and negative deviations are spread throughout the range of observed
values. These graphs reinforce the notion that the skew and kurtosis quantified in question 3 are
very small.
NOTE: deviations are calculated as “observed proportion – expected proportion”
5) Given that distribution_1 was log normal, distribution_2 was Poisson, and distribution_3 was normal,
answer the following questions (+0.25 each). Note: Re-read the notes from lecture 5:
- What is the only parameter of the Poisson distribution (Lambda) and what does it quantify?
The Poisson distribution has only one parameter (Lambda), which quantifies both its mean and its variance.
In the Poisson distribution, the mean = the variance.
- Briefly describe how the shape of the Poisson distribution changes as the parameter Lambda increases
from 0.1 to 10 (range of values from lecture and this example)? Specifically, describe what happens to
the skewness and the kurtosis of the distribution.
The Poisson distribution changes shape as Lambda increases from a value of 0.1 to a value of 10 (the
current value in this example). Please consult the figure from your class notes (taken from the
Gotelli book; see the slide from lecture 6), showing that the distribution starts as being highly
asymmetrical, and then becomes increasingly symmetrical. Thus, the skewness ranges from a large positive
value (when Lambda = 0.1) to a very small positive value (when Lambda = 10). The kurtosis follows the
same pattern: it is large and positive when Lambda = 0.1 and approaches 0 (the normal value) when
Lambda = 10.
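The change in shape described above can be checked numerically. The sketch below sums the Poisson probabilities directly and compares the results with the known closed forms for a Poisson distribution (mean = variance = Lambda, skewness = 1 / sqrt(Lambda), excess kurtosis = 1 / Lambda). The function name and the 200-term cutoff are choices made for this illustration.

```python
from math import exp, sqrt

def poisson_shape(lam, terms=200):
    """Mean, variance, skewness and excess kurtosis summed from the Poisson probabilities."""
    probs = []
    prob = exp(-lam)             # P(X = 0)
    for k in range(terms):
        probs.append(prob)
        prob *= lam / (k + 1)    # P(X = k + 1) obtained from P(X = k)
    mu = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mu) ** 2 * p for k, p in enumerate(probs))
    skew = sum((k - mu) ** 3 * p for k, p in enumerate(probs)) / var ** 1.5
    kurt = sum((k - mu) ** 4 * p for k, p in enumerate(probs)) / var ** 2 - 3
    return mu, var, skew, kurt

for lam in (0.1, 1, 10):
    mu, var, skew, kurt = poisson_shape(lam)
    print(f"Lambda={lam}: mean={mu:.3f} variance={var:.3f} "
          f"skewness={skew:.3f} (1/sqrt(Lambda)={1 / sqrt(lam):.3f}) "
          f"kurtosis={kurt:.3f} (1/Lambda={1 / lam:.3f})")
```

Both the skewness and the kurtosis shrink toward 0 as Lambda grows, which is why a Poisson distribution with Lambda = 10 is hard to tell apart from a normal distribution in a sample of 100 points.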