Construction Engineering 221 Sampling and Mean Comparison

Sampling and Mean Comparison
• We have talked about several sampling
distributions:
– z (normal) distribution used to estimate
probability of a measurement occurrence
– n (binomial) distribution used to estimate
probability of a counting occurrence
– r (correlation) distribution used to estimate
relatedness between two variables within a
population (extension is regression)
Sampling and Mean Comparison
• Another important distribution is the t-distribution
used to estimate population means
• If you draw a sample from one population
(engineers, men, heavy drinkers, truck buyers)
and compare it to a sample from a different population
(accountants, women, non-drinkers, van buyers)
on some randomly distributed variable, how will
you know whether the differences are “real” or merely a
fluke of random measurement error (spurious)?
Sampling and Mean Comparison
• You use the t-statistic for population mean
comparisons.
• Estimating the population mean:
– The sample mean is an unbiased estimator
of the population mean
– For large samples, the sample mean is
approximately normally distributed
– Normally distributed sample means can be
used with confidence intervals and margin of
error to make judgements about mean
comparisons
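As a rough illustration of the confidence-interval idea above, the sketch below computes an interval of about two standard errors around a sample mean. The function name `mean_confidence_interval` and the data values are assumptions made up for this example; they do not come from the slides.

```python
import math
import statistics


def mean_confidence_interval(values, z=2.0):
    """Return (low, high) bounds of sample mean +/- z standard errors."""
    n = len(values)
    sample_mean = statistics.mean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical measurements, invented purely for illustration.
strengths = [2410, 2525, 2380, 2600, 2475, 2490, 2555, 2430, 2515, 2460]
low, high = mean_confidence_interval(strengths)
print(f"Approximate 95% interval for the mean: {low:.1f} to {high:.1f}")
```

With only a handful of values, as in this toy data, the multiplier 2 is optimistic; a t critical value for n - 1 degrees of freedom would normally replace it.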
Sampling and Mean Comparison
• How big should a sample be:
– For a 95% confidence interval (about 2 standard errors):
n = 1/e^2, where e is the margin of error (1%, 2%, etc.)
written as a decimal
– If you want to be 95% sure that the sample estimate
is within 1% of the population value, you must
sample 10,000 people. If you can live with a 5%
margin of error, you need only sample 400
people
– Usually pick confidence interval and margin of error
ahead of time based on criticality and other factors
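The n = 1/e^2 rule of thumb can be checked with a few lines of Python. This is a minimal sketch: the function name `sample_size` is hypothetical. The rule itself is the worst-case formula for a percentage near 50% with a multiplier of about 2, which is why it only applies to roughly 95% confidence.

```python
import math


def sample_size(margin_of_error):
    """Rule-of-thumb sample size n = 1/e^2 for roughly 95% confidence.

    margin_of_error is a decimal fraction, e.g. 0.05 for 5%.
    """
    raw = 1.0 / margin_of_error ** 2
    # Round away tiny floating-point noise before rounding up to a whole unit.
    return math.ceil(round(raw, 6))


for e in (0.01, 0.02, 0.05):
    print(f"margin of error {e:.0%}: sample about {sample_size(e)}")
```

Halving the margin of error quadruples the required sample, which is why the margin is usually fixed before any data are collected.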
Sampling and Mean Comparison
• If you are comparing 2 means, use the t-statistic
• If you are comparing two percentages, use
the z-statistic
• If you are comparing 3 or more means,
use the F-statistic
• If you are comparing 3 or more
percentages, use the Chi Square statistic
Sampling and Mean Comparison
• The hypothesis that assumes the
populations are alike (no differences in the
means) is the null hypothesis
• You test the null hypothesis to determine
how likely the observed difference would be
if it were true
[Figure: sampling distributions with the means of sample 1 and sample 2 marked. Annotation: unlikely (95%) that the samples came from the same populations]
Sampling and Mean Comparison
• Assume you are testing an admixture to
make concrete more “pumpable”, but don’t
want to diminish early strength
• You test 25 cylinders of regular concrete
(control group) and 25 cylinders of
concrete with the admixture.
                                Variable 1   Variable 2
Mean                            2496.6       2438.6
Variance                        16791.08     15851.08
Observations                    25           25
Pearson Correlation             -0.10432
Hypothesized Mean Difference    0
df                              24
t Stat                          1.527462
P(T<=t) one-tail                0.06986
t Critical one-tail             1.710882
P(T<=t) two-tail                0.139719
t Critical two-tail             2.063898
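The t Stat reported above can be reproduced from the summary values alone. The sketch below assumes the output comes from a paired two-sample t-test, which is what the reported Pearson correlation and df = 24 (n - 1) suggest; the slides do not say so explicitly. The function name `paired_t_from_summary` is hypothetical.

```python
import math


def paired_t_from_summary(mean1, mean2, var1, var2, r, n):
    """Paired t statistic from means, variances, correlation and pair count."""
    # Variance of the pairwise differences: var1 + var2 - 2 * covariance.
    var_diff = var1 + var2 - 2.0 * r * math.sqrt(var1 * var2)
    # Standard error of the mean difference.
    standard_error = math.sqrt(var_diff / n)
    return (mean1 - mean2) / standard_error


# Summary values copied from the output above.
t_stat = paired_t_from_summary(2496.6, 2438.6, 16791.08, 15851.08, -0.10432, 25)

# Critical values for df = 24, also copied from the output above.
T_CRITICAL_ONE_TAIL = 1.710882
T_CRITICAL_TWO_TAIL = 2.063898

reject_one_tail = t_stat > T_CRITICAL_ONE_TAIL
reject_two_tail = abs(t_stat) > T_CRITICAL_TWO_TAIL

print(f"t = {t_stat:.4f}")
print(f"reject null (one-tail): {reject_one_tail}")
print(f"reject null (two-tail): {reject_two_tail}")
```

Because the t Stat is below both critical values (equivalently, both P values exceed 0.05), the null hypothesis of no difference in means is not rejected at the 5% level for these data.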