Statistics for Dummies

Statistics: Fun and Exciting Workshop
What’s going to be covered
• Diagrams
• Data Summary and Presentation
• Binomial distribution
• Engineering/Statistics Toolbox
• Z-test
• Type II Error
• T-test
• χ² Test
Dot Diagram
Box Plot
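[Figure: box plot marking the quartiles Q1, Q2, and Q3, the IQR, whiskers of length 1.5 IQR on each side, and a point labeled x = 1550; axis ticks run from 1030 to 1080]
IQR = Interquartile Range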
Histogram
[Figure: two histograms over bins from 20 to 80, one showing Frequency (vertical axis 0 to 25) and one showing Cumulative Frequency (vertical axis 0 to 70)]
Data Summary
Stem and Leaf Diagram

Stem   Leaf      Freq.
1      3         1
2      245       3
3      36814     5
4      4624563   7
5      5252      4
7                0
8      4         1

Correlation Coefficient

$$R = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\left(\sum_{i=1}^{n} (x_i - \bar{x})^2\right)\left(\sum_{i=1}^{n} (y_i - \bar{y})^2\right)}}$$
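The correlation coefficient can be sketched in a few lines of Python. This is a minimal illustration, assuming paired lists of equal length; the function name `correlation_coefficient` and the data are made up for the example and are not from the workshop.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample correlation coefficient R for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Numerator: sum of cross products of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Note: raises ZeroDivisionError if either variable is constant
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data lying exactly on a straight line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r = correlation_coefficient(xs, ys)
print(r)
```

Points that fall exactly on an increasing line give R = 1, and points on a decreasing line give R = -1.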
Quartile/Percentile Calculation
Quartile     Position
1st          (n + 1)/4
2nd          2(n + 1)/4
3rd          3(n + 1)/4

Percentile   Position
5th          0.05(n + 1)
95th         0.95(n + 1)

The value gives the position of the ordered observation. Interpolate as needed.
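As a rough sketch of the position rule with interpolation, the Python below sorts the data, finds the (possibly fractional) position, and interpolates linearly between the two neighboring ordered observations. The helper names and the sample data are hypothetical, chosen only for illustration.

```python
def ordered_observation(data, position):
    """Value at a 1-based, possibly fractional, position in the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    position = round(position, 9)  # guard against tiny floating-point noise
    if not 1 <= position <= n:
        raise ValueError("position falls outside the ordered observations")
    lower = int(position)        # ordered observation just below the position
    fraction = position - lower  # how far to move toward the next observation
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

def quartile(data, k):
    """k-th quartile (k = 1, 2, 3) using position k(n + 1)/4."""
    return ordered_observation(data, k * (len(data) + 1) / 4)

def percentile(data, p):
    """Percentile for a fraction p (e.g. 0.95) using position p(n + 1)."""
    return ordered_observation(data, p * (len(data) + 1))

# Hypothetical sample with n = 9, so the positions are 2.5, 5, and 7.5
sample = [12, 15, 11, 19, 14, 18, 16, 13, 17]
q1 = quartile(sample, 1)
q2 = quartile(sample, 2)
q3 = quartile(sample, 3)
print(q1, q2, q3)
```

For small samples an extreme percentile position, such as 0.05(n + 1) with n = 9, falls below the first observation; this sketch raises an error in that case rather than guessing a value.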
Binomial Distribution
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$
We use Binomial Distribution when:
1. Trials are independent
2. Each trial results in one of two possible
outcomes, success or failure
3. The probability, p, remains constant
Example 3-27
• Samples of water have a 10% chance of containing high levels of organic
solids. Assume the samples are independent with regard to the presence of
the solids. Determine the probability that, in the next 18 samples, exactly 2
contain high solids.
Solution
• X = the number of samples that contain high solids
• p = 0.1
• n = 18
• $P(X = 2) = \binom{18}{2} (0.1)^2 (0.9)^{16}$
• P(X = 2) = 0.284
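The calculation for this example can be checked with a short Python sketch. It assumes only the standard library; the function name `binomial_pmf` is made up for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Values from the water-sample example: n = 18, p = 0.1, x = 2
prob = binomial_pmf(2, 18, 0.1)
print(round(prob, 3))  # 0.284
```

Summing `binomial_pmf(x, 18, 0.1)` over x = 0, ..., 18 gives 1, which is a quick sanity check on the formula.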
Engineering/Statistics Toolbox
• Known as the procedure for hypothesis testing
Steps for Generic Hypothesis Testing
1. Identify Parameter of Interest:
• For instance, determine the saltiness of potato chips
2. State the Null Hypothesis (H0):
• The standard that you are testing against, such as a given average of students' test scores
3. Alternative Hypothesis (H1):
• Specify an appropriate alternative hypothesis
4. Test Statistic
• The equation you are going to use for each test, e.g. $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
6. Computations
• Plug and chug
7. Conclusion
• Decide whether the null hypothesis should be rejected and report that in the problem context.
Z-Test
• When do you use it?
• Known mean and known variance
• The Z statistic is compared against the standard normal distribution to find how likely the observed result is if the null hypothesis is true
• Most of the time an alpha value will be given to you
• If not, assume 0.05
Example
• Tom likes candy; his favorite is peanut butter cups. He’s been eating peanut
butter cups every day, and Tom thinks the peanut butter cup company is
filling the bags with fewer peanut butter cups than they claim. He takes a
sample of 8 bags and finds that the average number of peanut butter cups per
bag is 32, while the company claims it is 35. The standard deviation is 2.4. Are they
underfilling the bags? Let α = 0.05.
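Solution
• $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• Z = (32 − 35)/(2.4/√8) = −3.54
• Reject the null hypothesis, since Z = −3.54 is below the one-sided critical value of −1.645 for α = 0.05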
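The peanut butter cup example can be reproduced with a short Python sketch. It assumes a one-sided (lower-tail) test, since Tom suspects the bags hold fewer cups than claimed, and it uses `statistics.NormalDist` from the standard library; the function and variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Values from the example: sample mean 32, claimed mean 35, sigma 2.4, n = 8
z = z_statistic(32, 35, 2.4, 8)

alpha = 0.05
standard_normal = NormalDist()                # mean 0, standard deviation 1
critical = standard_normal.inv_cdf(alpha)     # lower-tail critical value, about -1.645
p_value = standard_normal.cdf(z)              # probability of a Z this low under H0
reject_null = z < critical

print(round(z, 2), reject_null)
```

Comparing `z` with the critical value and comparing `p_value` with `alpha` are two views of the same decision; both lead to rejecting the null hypothesis here.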
Type II Error
• When you fail to reject the null hypothesis when it is wrong, you have
committed a Type II error
• $\beta = \Phi(Z_0)$
• Power = 1 − β
• For instance:
Say you have a population of 50 beads with an actual average diameter of 10 mm.
However, your sample of 10 beads has an average of 15 mm. You want to
confirm that a null hypothesis of 15 mm is correct. If you fail to reject the null, you
messed up.
T-Test
• Unknown variance and known mean
• You need to determine the sample variance
• You need to know degrees of freedom
• That will be n − 1 (n is the sample size)
• The same as the Z-test except with degrees of freedom and sample variance
Example 4-7
• An experiment was performed in which 15 golf club drivers produced by a particular
club maker were selected at random and their coefficients of restitution measured.
It is of interest to determine if there is evidence (with α = 0.05) to support a claim
that the mean coefficient of restitution exceeds 0.82. n = 15.
Observations
• x̄ = 0.83725
• s = 0.02456
Solution
• $T = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
• T = 2.72
• 14 degrees of freedom
• P < 0.05
• Reject the null hypothesis; the mean coefficient of restitution exceeds 0.82.
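The golf club example can be sketched in Python as follows. To stay within the standard library, the sketch compares T with a critical value instead of computing a p-value; the critical value 1.761 for α = 0.05 and 14 degrees of freedom is taken from a standard t table and is an assumption not stated in the slides. The names are made up for illustration.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    """T = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n))

# Values from the example: sample mean 0.83725, s = 0.02456, n = 15, mu0 = 0.82
n = 15
t = t_statistic(0.83725, 0.82, 0.02456, n)
degrees_of_freedom = n - 1

# Assumption: upper-tail critical value t(0.05, 14) from a standard t table
T_CRITICAL = 1.761
reject_null = t > T_CRITICAL  # one-sided test: does the mean exceed 0.82?

print(round(t, 2), degrees_of_freedom, reject_null)
```

The only differences from the Z-test sketch are the use of the sample standard deviation s and a critical value that depends on the degrees of freedom.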
χ²-Test
• This is a test on the sample variance
• Much the same as T-test
• Must know the sample variance, as well as the actual variance
• This tests variance, NOT standard deviation
Example 4-10
• A random sample of 20 liquid detergent bottles results in a sample variance
of fill volume of s² = 0.0153. If the variance of fill volume exceeds 0.01, an
unacceptable proportion of bottles will be underfilled and overfilled. Is
there evidence in the sample data to suggest that the manufacturer has a
problem with underfilled and overfilled bottles? α = 0.05
Solution
• $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
• χ² = 29.07
• Significance of 0.05 and DOF = 19: critical χ² = 30.14
• Fail to reject the null; the evidence is not strong enough to show the variance of fill
volume exceeds 0.01.
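The detergent example can be checked with a small Python sketch. It uses the critical value 30.14 quoted in the solution for α = 0.05 and 19 degrees of freedom; the function and variable names are made up for illustration.

```python
def chi_square_statistic(n, sample_variance, sigma0_squared):
    """Chi-square test statistic: (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * sample_variance / sigma0_squared

# Values from the example: n = 20, s^2 = 0.0153, hypothesized variance 0.01
chi2 = chi_square_statistic(20, 0.0153, 0.01)

# Critical value for alpha = 0.05 and 19 degrees of freedom, as quoted in the solution
CHI2_CRITICAL = 30.14
reject_null = chi2 > CHI2_CRITICAL  # upper-tail test: does the variance exceed 0.01?

print(round(chi2, 2), reject_null)
```

Because 29.07 is below 30.14, the sketch reaches the same conclusion as the solution: fail to reject the null hypothesis.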