Z-scores notes

5, 8, 13, 17, 22, 24, 25, 27, 29, 30
8, 10, 22, 24, 25, 25, 26, 27, 45, 72

Graph & Describe
22, 27, 33, 39, 57, 88, 110
Modified Boxplot
Mild outliers are represented by shaded circles.
Extreme outliers are represented by open circles.
Whiskers are only extended to the smallest and largest values that are not outliers.

Create a Modified Boxplot
56, 54, 75, 64, 76, 22, 81, 78, 66, 87, 62, 80, 68, 72, 59, 45
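The slides do not state the outlier cutoffs, so the sketch below assumes the common textbook convention: a value is a mild outlier if it lies more than 1.5 × IQR beyond a quartile and an extreme outlier if it lies more than 3 × IQR beyond. It also assumes quartiles are the medians of the lower and upper halves of the sorted data; other quartile conventions can shift the fences slightly. The function names are illustrative only.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return median(lower), median(upper)

def classify_outliers(values):
    """Find mild/extreme outliers and whisker ends (assumed 1.5 and 3 IQR rules)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    mild_lo, mild_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    ext_lo, ext_hi = q1 - 3 * iqr, q3 + 3 * iqr
    extreme = [x for x in values if x < ext_lo or x > ext_hi]
    mild = [x for x in values
            if (x < mild_lo or x > mild_hi) and x not in extreme]
    inside = [x for x in values if mild_lo <= x <= mild_hi]
    return {
        "q1": q1, "q3": q3, "iqr": iqr,
        "mild": mild, "extreme": extreme,
        "whiskers": (min(inside), max(inside)),  # most extreme non-outliers
    }

scores = [56, 54, 75, 64, 76, 22, 81, 78, 66, 87, 62, 80, 68, 72, 59, 45]
result = classify_outliers(scores)
print(result)
```

Under these assumptions the score 22 falls below the lower mild fence, so it would be drawn as a shaded circle, and the whiskers would stop at 45 and 87.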
An article on peanut butter reported the following scores (quality
ratings on a scale of 0 to 100) for various brands. Construct a
comparative stem-and-leaf plot and compare the graphs.

Creamy: 56, 65, 40, 44, 45, 50, 62, 40, 56, 36, 56, 30, 39, 68, 22, 53, 41, 50, 30

Crunchy: 62, 62, 80, 53, 52, 47, 75, 50, 56, 42, 34, 62, 47, 42, 40, 36, 34, 75
20, 22, 23, 24, 24, 25, 25, 27, 35

Are there any outliers?

Draw a skeleton boxplot.

Draw a modified boxplot.
Chebyshev’s & The Empirical Rule
Describing Data in terms of the Standard Deviation
Test Mean = 80
St. Dev. = 5
Chebyshev’s Rule
The percent of observations that are within k standard deviations of the mean is at least
$100\left(1 - \frac{1}{k^2}\right)\%$
Facts about Chebyshev

Applicable to any data set – whether it is
symmetric or skewed.

The actual percentage is often higher than the guaranteed minimum (for example, more than 75% within 2 standard deviations), so this is a very conservative estimate.
Complete the table using $100\left(1 - \frac{1}{k^2}\right)$:

# St. Dev. (k)    % w/in k st. dev. of mean
2
3
4
4.472
5
10
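As a way to check the table entries, here is a minimal sketch that evaluates Chebyshev's bound for each listed k. The function name `chebyshev_percent` is just an illustrative choice.

```python
def chebyshev_percent(k):
    """Minimum percent of observations within k standard deviations of the mean."""
    return 100 * (1 - 1 / k**2)

# The k values listed in the table
for k in [2, 3, 4, 4.472, 5, 10]:
    print(f"k = {k}: at least {chebyshev_percent(k):.2f}%")
```

Note that k does not have to be a whole number: k = 4.472 is roughly the square root of 20, which gives a bound very close to 95%.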
Interpret using Chebyshev
Test Mean = 80
St. Dev. = 5
1. What percent are between 75 and 85?
2. What percent are between 60 and 100?
Collect wrist measurements (in)
Create distribution
 Find st. dev. & mean.
 What percent is within 1 st. dev. of the mean?

Practice Problems
1. Using Chebyshev, solve the following problem for a
distribution with a mean of 80 and a st. dev. of 10.
a. At least what percentage of values will fall between 60
and 100?
b. At least what percentage of values will fall between 65
and 95?
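Interval questions like these first need the interval turned into a number of standard deviations k. The sketch below shows one way to do that, assuming the interval is symmetric about the mean (Chebyshev's rule as stated here only covers that case). The helper name is hypothetical.

```python
def chebyshev_for_interval(mean, sd, low, high):
    """Chebyshev lower bound (in percent) for a symmetric interval around the mean."""
    k = (high - mean) / sd
    if abs((mean - low) / sd - k) > 1e-9:
        raise ValueError("interval must be symmetric about the mean")
    # For k <= 1 the formula is zero or negative, so the rule guarantees nothing.
    return max(0.0, 100 * (1 - 1 / k**2))

# Practice problem 1: mean 80, st. dev. 10
print(chebyshev_for_interval(80, 10, 60, 100))   # k = 2
print(chebyshev_for_interval(80, 10, 65, 95))    # k = 1.5

# Earlier test example: mean 80, st. dev. 5
print(chebyshev_for_interval(80, 5, 75, 85))     # k = 1, no guarantee
print(chebyshev_for_interval(80, 5, 60, 100))    # k = 4
```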
Normal Distributions

These are special density curves.

They have the same overall shape
 Symmetric
 Single-Peaked
 Bell-Shaped

Each one is completely described by giving its
mean (μ) and its standard deviation (σ).

We abbreviate it N(μ, σ)
Normal Curves….
•Changing the mean without changing the standard
deviation simply moves the curve horizontally.
•The Standard deviation controls the spread of a Normal
Curve.
Standard Deviation

It’s the natural measure of spread for Normal
distributions.

It can be located by eye on a Normal curve.
 It’s the distance from the mean to the point at which the curve changes from concave down to concave up.
Why is the Normal Curve Important?

They are good descriptions for some real data such as
 Test scores like SAT, IQ
 Repeated careful measurements of the same quantity
 Characteristics of biological populations (height)

They are good approximations to the results of
many kinds of chance outcomes

They are used in many statistical inference
procedures.
Empirical Rule
Can only be used if the data can be reasonably described by a normal curve.
Approximately
 68% of the data is within 1 st. dev. of mean
 95% of the data is within 2 st. dev. of mean
 99.7% of data is within 3 st. dev. of mean
Empirical Rule


What percent do you think……
www.whfreeman.com/tps4e
Empirical Rule (68-95-99.7 Rule)

In the Normal distribution with mean (μ)
and standard deviation (σ):
 Within 1σ of μ ≈ 68% of the observations
 Within 2σ of μ ≈ 95% of the observations
 Within 3σ of μ ≈ 99.7% of the observations
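The 68-95-99.7 figures are rounded values. As a quick check, the proportion of a Normal distribution within k standard deviations of the mean can be computed from the error function in Python's standard library; `normal_within` is an illustrative name.

```python
from math import erf, sqrt

def normal_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} st. dev.: {100 * normal_within(k):.2f}%")
```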
The distribution of batting average (proportion of hits) for the 432
Major League Baseball players with at least 100 plate appearances
in the 2009 season is approximately Normal, described by N(0.261, 0.034).

Sketch a Normal density curve for this distribution of batting
averages. Label the points that are 1, 2, and 3 standard
deviations from the mean.

What percent of the batting averages are above 0.329?

What percent are between 0.227 and 0.295?
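Both questions land on whole numbers of standard deviations (0.329 is 2 above the mean, and 0.227 to 0.295 is 1 on either side), so the Empirical Rule gives about 2.5% and about 68%. The sketch below compares those with values from `statistics.NormalDist` (Python 3.8+), treating the N(0.261, 0.034) model as exact, which is an assumption.

```python
from statistics import NormalDist

batting = NormalDist(mu=0.261, sigma=0.034)

# Points to label on the sketch: 1, 2, and 3 standard deviations from the mean
for k in (1, 2, 3):
    low = batting.mean - k * batting.stdev
    high = batting.mean + k * batting.stdev
    print(f"{k} st. dev.: {low:.3f} to {high:.3f}")

above_329 = 1 - batting.cdf(0.329)                    # Empirical Rule: about 2.5%
between = batting.cdf(0.295) - batting.cdf(0.227)     # Empirical Rule: about 68%
print(f"above 0.329: {100 * above_329:.1f}%")
print(f"between 0.227 and 0.295: {100 * between:.1f}%")
```

The small gap between 2.5% and the computed tail area comes from 95% being a rounded figure.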
Scores on the Wechsler Adult Intelligence Scale (a
standard IQ test) for the 20 to 34 age group are
approximately Normally distributed: N(110, 25).

What percent are between 85 and 135?

What percent are below 185?

What percent are below 60?
2. A sample of the hourly wages of employees who work in
restaurants in a large city has a mean of $5.02 and a st.
dev. of $0.09.
a. Using Chebyshev’s, find the range in which at least
75% of the data will fall.
b. Using the Empirical rule, find the range in which
about 68% of the data will fall.
The mean of a distribution is 50 and the standard deviation is
6. Using the empirical rule, find the percentage that will fall
between 38 and 62.
A sample of the labor costs per hour to assemble a certain
product has a mean of $2.60 and a standard deviation of
$0.15. Using Chebyshev’s, find the values between which at least
88.89% of the data will lie.
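Problems that give the percentage and ask for the range run Chebyshev's rule backwards: solving 1 - 1/k² = p for k gives k = 1/√(1 - p). A minimal sketch, with a hypothetical helper name:

```python
from math import sqrt

def chebyshev_range(mean, sd, percent):
    """Interval around the mean that holds at least `percent` of the data."""
    k = 1 / sqrt(1 - percent / 100)   # solve 1 - 1/k^2 = percent/100 for k
    return mean - k * sd, mean + k * sd

# Labor costs: mean $2.60, st. dev. $0.15, at least 88.89% (k is about 3)
low, high = chebyshev_range(2.60, 0.15, 88.89)
print(f"${low:.2f} to ${high:.2f}")

# Hourly wages: mean $5.02, st. dev. $0.09, at least 75% (k = 2)
wage_low, wage_high = chebyshev_range(5.02, 0.09, 75)
print(f"${wage_low:.2f} to ${wage_high:.2f}")
```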
Measures of Position
Percentiles
Z-scores
The following represents my results when playing an online sudoku game…at www.websudoku.com.
[Graph: times on an axis from 0 min to 30 min]
Introduction


A student gets a test back with a score of 78 on it.
A 10th-grader scores 46 on the PSAT Writing test.
Isolated numbers don’t always provide enough information…what we want to know is where we stand.
Where Do I Stand?
Let’s make a dotplot of our heights from 58 to 78 inches.
 How many people in the class have heights less than you?
 What percent of the students in the class have heights less than yours?
 This is your percentile in the distribution of heights.
Finishing….

Calculate the mean and standard deviation.

Where does your height fall in relation to the
mean: above or below?

How many standard deviations above or below the mean is it?
 This is the z-score for your height.
Let’s discuss

What would happen to the class’s height distribution if you converted each data value from inches to centimeters? (2.54 cm = 1 in)

How would this change of units affect the measures of center, spread, and location (percentile & z-score) that you calculated?
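One way to explore the discussion question is to try it on numbers. The heights below are made up for illustration (they are not class data); the sketch converts them to centimeters and compares mean, standard deviation, z-scores, and percentiles before and after.

```python
from statistics import mean, stdev

heights_in = [60, 62, 64, 65, 66, 68, 70, 72]      # hypothetical heights (inches)
heights_cm = [h * 2.54 for h in heights_in]        # same heights in centimeters

def z_scores(values):
    """z-score of every value relative to its own data set."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def percentile_of(values, x):
    """Percent of values strictly less than x."""
    return 100 * sum(v < x for v in values) / len(values)

print(mean(heights_in), mean(heights_cm))          # center is multiplied by 2.54
print(stdev(heights_in), stdev(heights_cm))        # spread is multiplied by 2.54
print(z_scores(heights_in)[4], z_scores(heights_cm)[4])   # z-score unchanged
print(percentile_of(heights_in, 66), percentile_of(heights_cm, 66 * 2.54))
```

With this sample, the measures of center and spread scale by 2.54 while the z-scores and percentiles stay the same, because multiplying by a positive constant does not change a value's position relative to the rest of the data.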
National Center for Health
Statistics

Look at Clinical Growth Charts at
www.cdc.gov/nchs
Percentiles

The rth percentile is the value such that r% of the observations in
the data set fall at or below that value.

If you are at the 75th percentile, then 75%
of the students had heights less than
yours.
Test scores on last AP Test. Jenny made
an 86. How did she perform relative to her
classmates?
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Key: 6 | 7 = 67
Her score was greater than
21 of the 25 observations.
Since 21 of the 25, or 84%,
of the scores are below
hers, Jenny is at the 84th
percentile in the class’s test
score distribution.
Find the percentiles for the following students….

Mary, who earned a 74.

Two students who earned scores of 80.

6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
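The counting used for Jenny's percentile can be sketched in a few lines. The list below is read off the stemplot of the 25 test scores, and the function follows the "percent of scores below" approach used in the Jenny example; the function name is illustrative.

```python
# The 25 test scores from the stemplot
scores = [67, 72, 73, 73, 74, 75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84, 85, 86, 89, 90, 93]

def percentile_of(score, data):
    """Percent of observations strictly below the given score."""
    below = sum(x < score for x in data)
    return 100 * below / len(data)

print(percentile_of(86, scores))   # Jenny
print(percentile_of(74, scores))   # Mary
print(percentile_of(80, scores))   # the two students who scored 80
```

Because the count uses scores strictly below, both students with an 80 land at the same percentile.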
Cumulative Relative Frequency Table:
Age of First 44 Presidents When They Were Inaugurated

Age     Frequency   Relative frequency   Cumulative frequency   Cumulative relative frequency
40-44   2           2/44 = 4.5%          2                      2/44 = 4.5%
45-49   7           7/44 = 15.9%         9                      9/44 = 20.5%
50-54   13          13/44 = 29.5%        22                     22/44 = 50.0%
55-59   12          12/44 = 27.3%        34                     34/44 = 77.3%
60-64   7           7/44 = 15.9%         41                     41/44 = 93.2%
65-69   3           3/44 = 6.8%          44                     44/44 = 100%
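Each column of the table comes from the frequency column: relative frequency divides by the total, cumulative frequency is a running total, and cumulative relative frequency divides the running total by the total. A short sketch of that bookkeeping, using the frequencies above:

```python
from itertools import accumulate

ages = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69"]
freq = [2, 7, 13, 12, 7, 3]

total = sum(freq)                                  # 44 presidents
cum = list(accumulate(freq))                       # running totals
rel = [100 * f / total for f in freq]              # relative frequency (%)
cum_rel = [100 * c / total for c in cum]           # cumulative relative frequency (%)

for row in zip(ages, freq, rel, cum, cum_rel):
    print("{:6} {:3d} {:6.1f}% {:3d} {:6.1f}%".format(*row))
```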
Cumulative Relative Frequency Graph:
[Graph: cumulative relative frequency (%), 0 to 100, versus age at inauguration, 40 to 70]
Interpreting…
Why does it get very steep beginning at age 50?
When does it slow down? Why?
What percent were inaugurated before age 70?
What’s the IQR?
[Graph: cumulative relative frequency (%) versus age at inauguration]
Obama was 47….

Interpreting Cumulative Relative Frequency Graphs
[Graph annotations: age 47 lines up with a cumulative relative frequency of about 11%; a cumulative relative frequency of 65% lines up with about age 58.]
Describing Location in a Distribution
Use the graph from page 88 to answer the following questions.
Was Barack Obama, who was inaugurated at age 47, unusually young?
Estimate and interpret the 65th percentile of the distribution.
What is the relationship between percentiles and quartiles?
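Reading values off a cumulative relative frequency graph amounts to linear interpolation between plotted points. The sketch below assumes the graph plots each cumulative percentage at the upper boundary of its age class (0% at 40, 4.5% at 45, and so on up to 100% at 70) and joins the points with straight lines; that plotting convention is an assumption, and the helper name is illustrative.

```python
from itertools import accumulate

freq = [2, 7, 13, 12, 7, 3]                      # presidents per 5-year age class
boundaries = [40, 45, 50, 55, 60, 65, 70]        # class boundaries (assumed plot points)
cum_pct = [0.0] + [100 * c / sum(freq) for c in accumulate(freq)]

def interpolate(x, xs, ys):
    """Linear interpolation of y at x along the points (xs, ys); xs must be increasing."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the graph")

pct_at_47 = interpolate(47, boundaries, cum_pct)     # percentile for age 47
age_at_65 = interpolate(65, cum_pct, boundaries)     # 65th percentile (an age)
q1 = interpolate(25, cum_pct, boundaries)
q3 = interpolate(75, cum_pct, boundaries)

print(f"Age 47 is at about the {pct_at_47:.0f}th percentile")
print(f"65th percentile is about age {age_at_65:.0f}")
print(f"IQR is about {q3 - q1:.1f} years")
```

Swapping the roles of the two lists turns the same function from "age to percentile" into "percentile to age", which also shows how quartiles are just the 25th and 75th percentiles.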
Z-Score – (standardized score)
It represents the number of standard deviations a value is from the mean.
 If it’s positive, then it’s above the mean.
 If it’s negative, then it’s below the mean.
 It standardizes measurements since it’s in terms of st. deviation.

Discovery:
Mean = 90
St. dev = 10
Find z score for
80
95
73
Z-Score Formula
$z = \dfrac{x - \text{mean}}{\text{standard deviation}}$
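A minimal sketch of the formula as a function, applied to the Discovery values (mean = 90, st. dev = 10); the function name is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

for x in (80, 95, 73):
    print(f"x = {x}: z = {z_score(x, 90, 10):+.1f}")
```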
Compare…using z-score.
              History Test   Math Test
Mean          92             80
St. Dev       3              5
My Score      95             90
Compare
Math: mean = 70, x = 62, s = 6
English: mean = 80, x = 72, s = 3
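Scores from different tests can be compared by putting each on the z-score scale. The sketch below does that for both comparisons on these slides; it assumes a higher score is better, which is why the larger z-score is reported as the stronger result. The dictionary layout is just one illustrative way to hold the numbers.

```python
def z_score(x, mean, sd):
    """Standardized score: standard deviations above (+) or below (-) the mean."""
    return (x - mean) / sd

# Each entry is (score, mean, st. dev.)
comparisons = {
    "History vs Math": {"History": (95, 92, 3), "Math": (90, 80, 5)},
    "Math vs English": {"Math": (62, 70, 6), "English": (72, 80, 3)},
}

results = {}
for label, tests in comparisons.items():
    zs = {name: z_score(*values) for name, values in tests.items()}
    results[label] = zs
    best = max(zs, key=zs.get)        # assumes higher is better
    print(label, {k: round(v, 2) for k, v in zs.items()}, "-> better:", best)
```

For something like race times, where a lower value is better, the comparison would flip and the smaller z-score would be the stronger result.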
Be Careful!
Being better is relative to the situation.
What if I wanted to compare race times?
Find the following percentiles.
X     Rel. Freq   C.F.
3     0.05        0.05
4     0.12        0.17
5     0.23        0.40
6     0.08        0.48
7     0.02        0.50
8     0.18        0.68
9     0.24        0.92
10    0.08        1.00

1. 40th percentile?
2. 17th percentile?
3. 70th percentile?
4. 25th percentile?
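One way to read a percentile from a cumulative frequency column is to take the smallest X whose cumulative frequency reaches the target. The sketch below assumes that convention (some textbooks instead average two neighboring X values when the cumulative frequency equals the target exactly) and rebuilds the cumulative column as a running total of the relative frequencies. Frequencies are stored as whole percents to avoid floating-point rounding.

```python
from itertools import accumulate

xs = [3, 4, 5, 6, 7, 8, 9, 10]
rel_pct = [5, 12, 23, 8, 2, 18, 24, 8]        # relative frequencies, in percent
cum_pct = list(accumulate(rel_pct))           # running total, in percent

def percentile_value(p):
    """Smallest X whose cumulative frequency is at least p percent (assumed convention)."""
    for x, c in zip(xs, cum_pct):
        if c >= p:
            return x
    raise ValueError("p must be between 0 and 100")

for p in (40, 17, 70, 25):
    print(f"{p}th percentile: X = {percentile_value(p)}")
```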
Homework

Worksheet