Uploaded by Majd Abody

Stat project

1) Find the sample mean and the sample standard deviation of the men’s annual salaries.
Answer:
Summary statistics:
Column            Mean       Std. dev.
Men's Salaries    66404.9    15955.566
2) Find the sample mean and the sample standard deviation of the women’s annual
salaries.
Answer:
Summary statistics:
Column              Mean         Std. dev.
Women's Salaries    58268.831    12240.336
3) Based on your answers in parts 1 & 2, which data set, the men’s salaries or the women’s
salaries, has more variability in the salaries than the other set?
Answer:
Variability refers to how spread out the scores in a distribution are; that is, it refers to the
amount of spread of the scores around the mean.
The men’s salaries have the larger sample standard deviation (15955.566 versus 12240.336),
so they are more spread out around their mean, whilst the women’s salaries are more tightly
clustered. That means the men’s salaries set has more variability in the salaries than the
women’s set.
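A quick check of this comparison can be sketched in Python, using only the summary statistics reported in parts 1 and 2. The coefficient of variation (standard deviation divided by the mean) is an extra measure that the project does not ask for; it is included here as an assumption about what a fair comparison might also look at, since the two groups have different means.

```python
# Summary statistics copied from parts 1 and 2.
men_mean, men_sd = 66404.9, 15955.566
women_mean, women_sd = 58268.831, 12240.336


def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


men_cv = coefficient_of_variation(men_mean, men_sd)
women_cv = coefficient_of_variation(women_mean, women_sd)

# The data set with the larger standard deviation is the more variable one.
more_variable = "men" if men_sd > women_sd else "women"

print(f"Std. dev.: men {men_sd:.3f}, women {women_sd:.3f}")
print(f"Coefficient of variation: men {men_cv:.1%}, women {women_cv:.1%}")
print(f"More variable data set: {more_variable}")
```

Both measures point the same way for these numbers, so the conclusion does not depend on which one is used.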
4) Create a box-and-whisker plot for the men’s annual salaries and the women’s annual
salaries.
Answer:
5) Create a QQ plot for each of the men’s and women’s annual salaries. Do you notice that
the dots are very close to the line? If yes, you should write a conclusion that the data set
comes from a population that is approximately normal.
Answer:

This data (Men’s Salaries) shows that the dots are very close to the line, which means
the data set comes from a population that is approximately normal.

This data (Women’s Salaries) shows that the dots are very close to the line, which means
the data set comes from a population that is approximately normal.
6) Create a histogram of the men’s annual salaries by starting your bin at 20000 and
selecting a bin width of 5000. What can you state about the shape of the histogram?
Answer:
The histogram is approximately bell-shaped, which is consistent with a normal distribution.
7) Create a 95% confidence interval for the population mean annual salary for women.
Based on your interval, would you support the claim that the average annual salary for a
woman is more than $60000?
Answer:
95% confidence interval results:
Variable            Sample Mean    Std. Err.    DF     L. Limit     U. Limit
Women's Salaries    58268.831      1073.5484    129    56144.789    60392.873
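I would not support the claim that the average salary for a woman is more than $60,000.
The interval runs from $56144.789 to $60392.873, so it contains $60,000 and many values
below it. Only the upper limit exceeds $60,000, and the sample mean ($58268.831) is itself
below $60,000.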
8) Create a 95% confidence interval for the population mean annual salary for men.
Interpret your results in words. Based on the interval you created, would you agree with a
statement that the average annual salary of men is more than $60000?
Answer:
95% confidence interval results:
Variable          Sample Mean    Std. Err.    DF     L. Limit     U. Limit
Men's Salaries    66404.9        1261.3983    159    63913.643    68896.157
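We are 95% confident that the population mean annual salary for men is between
$63913.643 and $68896.157. I would agree with a statement that the average annual salary
of men is more than $60000 because the lower limit is $63913.643, so the entire interval
lies above $60000.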
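The two confidence intervals can be reproduced approximately from the summary statistics alone. The sketch below is not the StatCrunch procedure itself; it assumes sample sizes of n = DF + 1 (130 women, 160 men) and uses rounded critical values of the t distribution (about 1.9785 for 129 degrees of freedom and 1.9750 for 159), which are assumptions taken from a t table rather than values stated in the output.

```python
import math


def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper, std_err) for a t confidence interval for a mean."""
    std_err = sd / math.sqrt(n)   # standard error of the sample mean
    margin = t_crit * std_err     # margin of error
    return mean - margin, mean + margin, std_err


# Women: n = 130 (DF = 129), assumed critical value t ~ 1.9785.
women_lo, women_hi, women_se = t_interval(58268.831, 12240.336, 130, 1.9785)

# Men: n = 160 (DF = 159), assumed critical value t ~ 1.9750.
men_lo, men_hi, men_se = t_interval(66404.9, 15955.566, 160, 1.9750)

print(f"Women: SE = {women_se:.4f}, 95% CI = ({women_lo:.3f}, {women_hi:.3f})")
print(f"Men:   SE = {men_se:.4f}, 95% CI = ({men_lo:.3f}, {men_hi:.3f})")

# A claim "mean > 60000" is supported only if the whole interval is above 60000.
print("Women's interval entirely above 60000:", women_lo > 60000)
print("Men's interval entirely above 60000:", men_lo > 60000)
```

Because the critical values are rounded, the limits agree with the reported ones only to within a few cents.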
9) Perform a hypothesis test (t-test) to test the claim that the mean annual salary for men is
more than $62000? Use alpha = 5%.
Answer:
One sample T hypothesis test:
μ: Mean of variable
H0: μ = 62000
HA: μ > 62000
Hypothesis test results:
Variable          Sample Mean    Std. Err.    DF     T-Stat       P-value
Men's Salaries    66404.9        1261.3983    159    3.4920771    0.0003
Reject H0. Since the P-value (0.0003) is less than 0.05, there is enough evidence at the
0.05 level of significance to support the claim that the mean annual salary for men is
more than $62000.
10) Perform a hypothesis test (t-test), to test the claim that the true mean annual salary for
a woman is less than $62000. Use alpha = 5%.
Answer:
One sample T hypothesis test:
μ: Mean of variable
H0: μ = 62000
HA: μ < 62000
Hypothesis test results:
Variable            Sample Mean    Std. Err.    DF     T-Stat        P-value
Women's Salaries    58268.831      1073.5484    129    -3.4755481    0.0003
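Reject H0. Since the P-value (0.0003) is less than 0.05, there is enough evidence at the
0.05 level of significance to support the claim that the true mean annual salary for a
woman is less than $62000.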
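The test statistics in the two one-sample tests follow directly from the reported sample means and standard errors. The sketch below recomputes them and applies the usual decision rule; the P-values are copied from the output rather than computed, since the Python standard library has no t distribution.

```python
def one_sample_t(sample_mean, std_err, mu0):
    """t statistic for H0: mu = mu0."""
    return (sample_mean - mu0) / std_err


alpha = 0.05
mu0 = 62000

# (label, sample mean, standard error, reported P-value)
tests = [
    ("Men (HA: mu > 62000)", 66404.9, 1261.3983, 0.0003),
    ("Women (HA: mu < 62000)", 58268.831, 1073.5484, 0.0003),
]

results = {}
for label, mean, se, p_value in tests:
    t_stat = one_sample_t(mean, se, mu0)
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    results[label] = (t_stat, decision)
    print(f"{label}: t = {t_stat:.4f}, P = {p_value}, {decision}")

men_t = results["Men (HA: mu > 62000)"][0]
women_t = results["Women (HA: mu < 62000)"][0]
```

In both tests the claim is the alternative hypothesis, so rejecting H0 means the data support the claim.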
11) Use alpha = 0.05 to test the claim that the mean annual salary for men is higher than
the mean annual salary for women. Would you support this claim?
Answer:
Two sample T hypothesis test:
μ1: Mean of Men's Salaries
μ2: Mean of Women's Salaries
μ1 - μ2: Difference between two means
H0: μ1 - μ2 = 0
HA: μ1 - μ2 > 0
(without pooled variances)
Hypothesis test results:
Difference    Sample Diff.    Std. Err.    DF           T-Stat       P-value
μ1 - μ2       8136.0692       1656.3912    287.09946    4.9119249    <0.0001
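Reject H0. Since the P-value (<0.0001) is less than 0.05, there is enough evidence at the
0.05 level of significance to support the claim that the mean annual salary for men is
higher than the mean annual salary for women. Yes, I would support this claim.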
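The "without pooled variances" output corresponds to the Welch two-sample t-test. As a rough check, the standard error, the degrees of freedom, and the t statistic can be recomputed from the summary statistics. The sample sizes 160 and 130 are an assumption, inferred from the degrees of freedom (159 and 129) in the one-sample outputs.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch two-sample t statistic, Welch-Satterthwaite DF, and standard error."""
    v1 = sd1 ** 2 / n1            # variance of the first sample mean
    v2 = sd2 ** 2 / n2            # variance of the second sample mean
    std_err = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t_stat = (mean1 - mean2) / std_err
    return t_stat, df, std_err


t_stat, df, std_err = welch_t(
    66404.9, 15955.566, 160,      # men: mean, std. dev., assumed n
    58268.831, 12240.336, 130,    # women: mean, std. dev., assumed n
)

print(f"Difference = {66404.9 - 58268.831:.4f}")
print(f"Std. Err.  = {std_err:.4f}")
print(f"DF         = {df:.3f}")
print(f"T-Stat     = {t_stat:.4f}")
```

The non-integer degrees of freedom in the output (287.09946) come from this Welch-Satterthwaite formula.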
12) Is there a correlation between shoe size and height?
a. Create a scatter plot of the shoe size and height data.
b. Find the value of the correlation r and the coefficient of determination r^2, and interpret
the meaning of each in words.
Answer:
Equation: y= 51.134754 + 1.8853541x
R (correlation coefficient) = 0.95143845
R^2 = 0.90523513


The correlation coefficient takes on values ranging between -1 and +1.
R (correlation coefficient) in this data is 0.95143845, which is close to +1, so there is a
strong positive linear relationship between the two variables: people with larger shoe
sizes tend to be taller.
R^2 = 0.90523513 means that about 90.5% of the variation in height is explained by the
linear relationship with shoe size.
c. Test the claim that there is a significant positive correlation. Use alpha = 5%
Parameter estimates:
Parameter    Estimate     Std. Err.     Alternative    DF    T-Stat       P-value
Intercept    51.134754    1.4810951     >0             18    34.524964    <0.0001
Slope        1.8853541    0.14378039    ≠0             18    13.112735    <0.0001
Analysis of variance table for regression model:
Source    DF    SS           MS            F-stat       P-value
Model     1     111.03558    111.03558     171.94381    <0.0001
Error     18    11.6238      0.64576664
Total     19    122.65938
The P-value is less than 0.05, so we reject H0; that means there is a significant positive
correlation between shoe size and height.
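The regression output is internally consistent, which can be checked with a few lines of arithmetic. The sketch below assumes n = 20 observations (error DF of 18 plus 2) and uses the standard identities for simple linear regression: the t statistic for the slope equals r·sqrt(n − 2)/sqrt(1 − r²), r² equals SS(Model)/SS(Total), and F equals t² for the slope.

```python
import math

r = 0.95143845
n = 20  # assumed from the error DF: n - 2 = 18

# t statistic for testing the correlation, computed from r alone.
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# t statistic for the slope: estimate divided by its standard error.
slope_t = 1.8853541 / 0.14378039

# Coefficient of determination from the ANOVA sums of squares.
r_squared = 111.03558 / 122.65938

# F statistic: mean square of the model over mean square of the error.
f_stat = 111.03558 / 0.64576664

print(f"t from r      = {t_from_r:.4f}")
print(f"t for slope   = {slope_t:.4f}")
print(f"r^2 from SS   = {r_squared:.6f}")
print(f"F-stat        = {f_stat:.4f}")
print(f"slope t ^ 2   = {slope_t ** 2:.4f}")
```

The two t values agree, which is why testing the slope and testing the correlation lead to the same conclusion here.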
d. Find the equation of the regression line and if possible, use it to estimate the height of a
person who wears shoe size 10.
Answer:
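Equation: y = 51.134754 + 1.8853541x, where x is the shoe size and y is the height.
Because the correlation is significant, the line can be used for prediction.
x = 10
y = 51.134754 + 1.8853541(10)
= 69.99
The estimated height of a person who wears shoe size 10 is about 69.99.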
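The prediction is a single substitution into the fitted line, shown here as a small Python sketch. The function name `predict_height` is hypothetical; the coefficients are the ones reported in the regression output.

```python
def predict_height(shoe_size, intercept=51.134754, slope=1.8853541):
    """Estimated height from the fitted regression line y = intercept + slope * x."""
    return intercept + slope * shoe_size


estimate = predict_height(10)
print(f"Estimated height for shoe size 10: {estimate:.2f}")
```

The line should only be trusted for shoe sizes within the range of the original data, which is not shown in this write-up.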