Standard Scores and the Normal Distribution

Basic Statistics
Standard Scores and the
Normal Distribution
Agenda
Standard Scores: Where does an observation fall in a distribution of scores?
Normal Distribution: What is “normal” and what does it mean?
All data can be arrayed in a
“distribution” of data. We
describe a set of data by
describing the characteristics of
its distribution. We do this with
tables, graphs, and/or numerical
summary measures.
Consider the Following Distribution:
Score (X)    Frequency (f)
1            1
2            2
3            4
4            2
5            1
We have described it here with a table. However, we
could also describe it with a graph. What type of
graph would be appropriate?
Let’s use a Histogram
[Figure: histogram of frequency f (0 to 4) against score X (1 to 5), drawn from the frequency table above]
It is analogous to stacking 10 boxes (one for each score) on
top of one another to form the histogram.
We could also use
Numerical Descriptive Measures
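Mode = 3.0
Median = 3.0
Mean: $\bar{X} = \frac{\sum X}{n} = \frac{1 + 2 + 2 + 3 + 3 + 3 + 3 + 4 + 4 + 5}{10} = \frac{30}{10} = 3.0$
Variance: $S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{12}{9} = 1.333$
Standard Deviation: $S = \sqrt{\text{Variance}} = \sqrt{1.333} = 1.15$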
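As a quick check of the numerical measures above, here is a minimal sketch using Python's standard `statistics` module. The list `scores` is simply the ten quiz scores from the frequency table written out one by one; the variable name is ours, not part of the original slides.

```python
from statistics import mean, median, mode, variance, stdev

# The ten quiz scores from the frequency table, listed individually.
scores = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

print("Mode:", mode(scores))        # most frequent score
print("Median:", median(scores))    # middle of the sorted scores
print("Mean:", mean(scores))        # sum of scores divided by n
# variance() and stdev() use the n - 1 (sample) denominator,
# which matches the formula on the slide.
print("Variance:", round(variance(scores), 3))
print("Standard deviation:", round(stdev(scores), 2))
```

Note that `statistics.pvariance` and `statistics.pstdev` divide by n instead of n - 1 and would give slightly smaller values.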
These measures describe the full distribution of scores. We
will turn now to describing a single score in the distribution.
Suppose Robert makes a 4 on the quiz and
we want to describe his performance.
Robert’s 4 would be one of the two 4s in the distribution above.
[Figure: histogram of frequency f against score X (1 to 5), with the two scores of 4 drawn as separate rectangles]
Robert’s score is represented by either the
red or green rectangle in the graph above.
To better understand Robert’s score, we
can add “relative frequency” and
“cumulative frequency” columns.
Score   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Percentage
X       f           rf                   cf                     c%
1       1           1/10 = .1            .1                     10
2       2           2/10 = .2            .3                     30
3       4           4/10 = .4            .7                     70
4       2           2/10 = .2            .9                     90
5       1           1/10 = .1            1.0                    100
Sum     10
Relative Frequency = f / (sum of f)
Example (rf for 4): 2/10 = .2
This table enables us to better interpret Robert’s score. In
fact, we know that he scored as well as or better than 90% of
the students who took the quiz.
Since this is the definition of a “Percentile Rank” (the
percentage of scores equal to or below a particular score), we
not only can better interpret the score, we can compute it.
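The percentile-rank definition above can be sketched in a few lines of Python. The dictionary `freq` and the function name `percentile_rank` are illustrative names chosen here, not something from the slides; the data are the quiz frequencies from the table.

```python
# Frequency table from the quiz: score -> number of students.
freq = {1: 1, 2: 2, 3: 4, 4: 2, 5: 1}

def percentile_rank(score, freq):
    """Percentage of scores equal to or below `score`."""
    n = sum(freq.values())
    at_or_below = sum(f for x, f in freq.items() if x <= score)
    return 100 * at_or_below / n

# Robert scored a 4.
print(percentile_rank(4, freq))
```

Calling the function for each score in turn reproduces the cumulative percentage column.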
We can see this graphically using the histogram.
[Figure: relative-frequency histogram, rf (.00 to .40) against score X (1 to 5); the area for scores 1 through 4 is marked “Represents 90% of Total Distribution”]
Continuous Distributions
Placing a score within a continuous
distribution is analogous to placing one
within the discrete distribution that we
just reviewed.
However, the computation is not the
same. It is quite easy to identify
percentile ranks for scores that fall an
exact number of standard deviations
above or below (or on) the mean, but we
do not yet have the tools for others.
RECALL THE EMPIRICAL
RULE
• For any symmetrical, bell-shaped
distribution:
• approximately 68% of the
observations will lie within ±1σ of
the mean (μ);
• approximately 95% within ±2σ of
the mean (μ); and
• approximately 99.7% within ±3σ of
the mean (μ).
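For a distribution that is exactly normal, the three percentages in the empirical rule can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is our own.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.1f}%")
```

The printed values are slightly more precise than the rounded 68%, 95%, and 99.7% figures, which is why the rule says "approximately".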
Recall that a normal distribution has the
following percentages of scores within 1, 2,
or 3 standard deviations.
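[Figure: normal curve with Mean = 50 and SD = 10 on an axis from 20 to 80; about 68% of scores lie between 40 and 60, about 95% between 30 and 70, and about 99.7% between 20 and 80]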
Normal Distribution
[Figure: normal curve with the axis marked at μ-3σ, μ-2σ, μ-1σ, μ, μ+1σ, μ+2σ, and μ+3σ; labeled percentiles: 2nd (μ-2σ), 16th (μ-1σ), 50th (μ), 84th (μ+1σ), 98th (μ+2σ), 99th (μ+3σ)]
Ordinal vs. Interval Standard Scores
A percentile rank is
based on an area
transformation and
results in an ordinal
score.
We can perform a linear
transformation (math term)
and maintain the same
type of measurement
scale.
Consider a distribution with a mean of 50 and a standard deviation of 10.
X:   30   40   50   60   70
Suppose we transform to a scale with a mean of 0 and SD of 1.
z:   -2   -1    0    1    2
This type of scale change is referred to as a linear
transformation. That means we maintain all of the
relationships in the data but change the scale by
changing the mean and standard deviation.
Suppose X has a distribution with a mean
of 80 and a standard deviation of 20.
$z = \frac{X - 80}{20}$
First, we subtract the mean.
Then, we divide by the standard deviation.
The resulting scores are “standard scores”
and are denoted by a “z”.
If we did that to every score in the
distribution, the distribution of new scores
would have a mean of 0 and a SD of 1.
This is the most common type of standard
score and is often called a z-score. Every
other standard score can be derived from a
z-score.
$z = \frac{X - \bar{X}}{S}$
To summarize, we can convert any
distribution to a standard (or z)
distribution by subtracting the mean of
the distribution from every score and
dividing the results by the standard
deviation. This will not alter the shape
of the distribution; it will look just like
the original distribution.
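To see the summary above in action, the sketch below standardizes a whole set of scores and checks that the result has a mean of 0 and an SD of 1. As an assumption for illustration, it reuses the ten quiz scores from earlier in the deck; any list of numbers with some spread would do.

```python
from statistics import mean, stdev

# Illustrative data: the ten quiz scores used earlier.
scores = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

m = mean(scores)
s = stdev(scores)  # sample SD (n - 1 denominator)

# Subtract the mean from every score, then divide by the SD.
z_scores = [(x - m) / s for x in scores]

print([round(z, 2) for z in z_scores])
print("mean of z:", round(mean(z_scores), 10))
print("SD of z:", round(stdev(z_scores), 10))
```

Because every score is shifted and scaled by the same amounts, the order of the scores and their relative spacing are unchanged, which is why the shape of the distribution is preserved.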
An Example
Suppose the distribution of the number of minutes it
takes people to complete a task has a mean of 32
and a standard deviation of 5. It took Bob 38
minutes to complete the task. What is his z-score?
$z_{Bob} = \frac{X_{Bob} - \bar{X}}{S} = \frac{38 - 32}{5} = \frac{6}{5} = 1.2$
It took Mary only 29 minutes. What was her z-score?
$z_{Mary} = \frac{X_{Mary} - \bar{X}}{S} = \frac{29 - 32}{5} = \frac{-3}{5} = -0.6$
How to determine the raw score when given the
z score and the mean and SD of the distribution.
1. Begin with a z-score.
2. Multiply by the SD.
3. Add the mean.
In the previous example, Bob got a
z-score of 1.2. Confirm his raw (X) score.
X = 5 x 1.2 + 32 = 6 + 32 = 38
Thus, Bob’s raw score was 38.
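The two directions of the conversion (raw score to z-score, and z-score back to raw score) can be written as a pair of small functions. The names `z_score` and `raw_score` are ours; the numbers are the task-time example with mean 32 and SD 5.

```python
def z_score(x, mean, sd):
    """Subtract the mean, then divide by the SD."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """Multiply by the SD, then add the mean."""
    return sd * z + mean

MEAN, SD = 32, 5  # task-time distribution from the example

z_bob = z_score(38, MEAN, SD)
z_mary = z_score(29, MEAN, SD)
print(z_bob, z_mary)

# Round trip: Bob's z-score converts back to his raw score.
print(raw_score(z_bob, MEAN, SD))
```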
Just like we can transform any
distribution to a new distribution
with a mean of 0 and a standard
deviation of 1 (a z-distribution), we
can transform any z-distribution to
a new distribution with any mean
and standard deviation we desire.
In general, we can create a distribution
with any mean and SD with the
following formula:
W = (new SD) · z + (new Mean)
or
$W = S_W \cdot z + \bar{W}$
Some Common Score Scales
GRE: Mean = 500, SD = 100; GRE = 100 · z + 500
MAT: Mean = 50, SD = 10; MAT = 10 · z + 50
IQ: Mean = 100, SD = 15; IQ = 15 · z + 100
Stanine: Mean = 5, SD = 2; Stanine = 2 · z + 5*
*Rounded to whole number with minimum of 1 and maximum of 9.
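The score scales above all apply the same formula with different constants, so one small function covers them. This is a sketch only: the names `SCALES`, `rescale`, and `stanine` are illustrative, and the stanine rounding rule is an assumption, since the slide says only "rounded to whole number". Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the convention a particular test publisher uses.

```python
# (mean, SD) for each scale listed on the slide.
SCALES = {"GRE": (500, 100), "MAT": (50, 10), "IQ": (100, 15)}

def rescale(z, new_mean, new_sd):
    """W = (new SD) * z + (new mean)."""
    return new_sd * z + new_mean

def stanine(z):
    """2 * z + 5, rounded, then limited to the range 1 through 9.
    Assumption: halves are rounded with Python's round()."""
    return max(1, min(9, round(2 * z + 5)))

z = 1.2  # Bob's z-score from the earlier example
for name, (scale_mean, scale_sd) in SCALES.items():
    print(name, rescale(z, scale_mean, scale_sd))
print("Stanine", stanine(z))
```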
Area Transformations for
Normal Distributions
If the area under a curve is adjusted so that the total area is 1.00, then the
proportion of the area under the curve represents the proportion of scores in
that interval. We can see that for the previous example:
[Figure: relative-frequency histogram, rf (.0 to .4) against score X (1 to 5), drawn as ten boxes of area .1 each]
For example, the proportion
of students who made a 2 on
the quiz is equal to the area
in the boxes representing a
2, or 2/10 = .2.
Or, the proportion of students
who scored 4 or less is .90 or
90%. This also indicates that
a student who scored 4 had a
percentile rank of 90.
For a Normal Distribution
The principle is exactly the same for a
normal distribution as for any continuous
distribution.
However, finding the area under a
continuous curve requires a level of
mathematics not required for this course.
Not to worry! Someone has computed all
possible areas for the standard normal
distribution (z-scores) and placed them in a
table.
Assume the Total Area Under the
Normal Curve is 1.00
[Figure: normal curve over the z axis, with the total area under the curve labeled 1.00]
How do we determine the area under the curve for a
z-score of 1.00? The proportion of the area under the
curve represents the proportion of scores in that interval.
[Figure: normal curve with z = 0 and z = 1.0 marked on the z axis]
What is the area under the curve below 1.0?
In order to determine this value, we must learn
how to use Table A, Areas under the Normal
Curve (Page 525).
Table A has three columns. Column A is the z score;
Column B is the area between the mean (z = 0) and the
selected z score; and Column C is the area beyond the
selected z score. Note that for z = 1.00, the area in
Column B is .3413 and in Column C it is .1587.
[Figure: normal curve with the area below z = 1.0 shaded and labeled .84, and the area beyond z = 1.0 labeled .1587]
Since the total area under the curve is 1.00 and the
area you do not want is .1587, the red shaded area
(area below 1.00) is 1.00 - .1587 = .8413 ≈ .84.
We could convert this to a percentage by multiplying by
100. Thus, a z-score of 1.00 corresponds to a percentile
rank of 84.
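If a printed table is not at hand, the Column B and Column C areas described above can be reproduced with the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist`; the function name `table_a` is a hypothetical stand-in for a lookup in the textbook's Table A, not an actual library call.

```python
from statistics import NormalDist

def table_a(z):
    """For a non-negative z, return (B, C):
    B = area between the mean (z = 0) and z,
    C = area beyond z."""
    below = NormalDist().cdf(z)  # total area below z
    return below - 0.5, 1.0 - below

col_b, col_c = table_a(1.00)
print("Column B:", round(col_b, 4))
print("Column C:", round(col_c, 4))

# Area below z = 1.00 is the total area minus the tail.
area_below = 1.0 - col_c
print("Percentile rank:", round(100 * area_below))
```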
Find the percentile rank of a student whose
z-score is -1.22
[Figure: normal curve with z = -1.22 marked to the left of the mean (z = 0)]
First, we must use the Areas Under the Normal
Curve Table to determine the area under the curve
between the z-score and the tail of the distribution.
How can we use the table
when it does not include
negative z-values?
Since the normal distribution
is symmetrical, the upper end
(+ end) and the lower end
(- end) are exactly the same!
Thus, the area below z = -1.22 equals the Column C
area for z = 1.22, which is .1112 or .11 rounded off.
Multiplying by 100, we find that the percentile rank is 11.
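The symmetry argument can be confirmed numerically: the area beyond z = +1.22 in the upper tail should equal the area below z = -1.22 in the lower tail. This is a minimal sketch with `statistics.NormalDist`; the variable names are ours.

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

upper_tail = 1.0 - nd.cdf(1.22)  # area beyond z = +1.22 (Column C)
lower_tail = nd.cdf(-1.22)       # area below z = -1.22

print(round(upper_tail, 4), round(lower_tail, 4))
print("Percentile rank:", round(100 * lower_tail))
```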
The Normal Curve Area Table can also be
used to find a z-score when the percentile
rank of a student is known.
To do this, the area under the curve below
the z-score is noted. If the percentile rank
is less than or equal to 50, you can go
directly to Column C in Table A and
determine the corresponding z-score.
What is the z-score for a student who
had a percentile rank of 33?
First, look in Table A and find .3300 or as
close as possible in Column C. Note that
the z score is 0.44.
Since the percentile is less than 50, the z
score must be negative.
Therefore, the answer is -0.44.
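Going from a percentile rank back to a z-score is the inverse of the table lookup. As a sketch, `NormalDist.inv_cdf` in Python's standard library takes the area below a point and returns the z-score, sign included, so no separate sign step is needed.

```python
from statistics import NormalDist

percentile_rank = 33
area_below = percentile_rank / 100

# inv_cdf returns the z-score with the given area below it.
z = NormalDist().inv_cdf(area_below)
print(round(z, 2))
```

The result is negative because the area below the score is less than .50.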
VERY IMPORTANT!!!!!
Note that the z-score is negative. All
z-scores for percentiles less than 50
will be negative as they will be to the
left of the mean on the graph.
[Figure: z axis with negative (-) values to the left of 0 (the 50th percentile) and positive (+) values to the right]
What if a percentile is greater
than 50? How do we find its
equivalent z-score?
The simple answer is that we
subtract the percentile from
100 and proceed just as for a
percentile of less than 50,
except that the z-score would
be positive.
However, it is easier to see when we use
a graph. What is the z-score for a
percentile of 78?
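[Figure: normal curve with 78% of the area below the unknown z-score (?) and 22% above it; the z axis shows 0 and ?, and the percentile axis shows 50 and 78]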
The 78th percentile has to be to the right of the 50th percentile. Thus, the area
to the right of the 78th percentile is 100 – 78 = 22 or .2200 in Column C.
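The "subtract from 100" route for percentiles above 50 can be compared with a direct inverse lookup. The sketch below computes the z-score both ways with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

nd = NormalDist()
percentile_rank = 78

# Table route: area in the upper tail, looked up as if it were a
# percentile below 50, then given a positive sign.
tail_area = (100 - percentile_rank) / 100  # .22
z_from_tail = -nd.inv_cdf(tail_area)

# Direct route: the z-score with 78% of the area below it.
z_direct = nd.inv_cdf(percentile_rank / 100)

print(round(z_from_tail, 2), round(z_direct, 2))
```

Both routes give the same positive z-score, about 0.77.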
Summary
• To find a z-score from a distribution with
mean $\bar{X}$ and standard deviation $S$:
$z = \frac{X - \bar{X}}{S}$
• To rescale a z-score to a new distribution
with a new mean $\bar{W}$ and a new standard
deviation $S_W$:
$W = S_W \cdot z + \bar{W}$
Summary Continued
• To convert score to percentile rank in a normal
distribution.
1. Convert score to z-score
2. Draw picture of a normal distribution
3. Place z-score on picture
4. Determine area in tail of the distribution
5. If z-score is positive, subtract area from 1.00
and convert to percent.
6. If z-score is negative, convert area to percent.
Summary Continued
• To convert a percentile rank to another scale.
1. Draw picture of a normal distribution
2. Place percentile rank in approximate position
3. Determine area in tail.
4. Find z-score in Normal Curve Area Table
5. Determine sign of z-score (- if PR<50, + if PR>50)
6. Convert z-score to proper scale,
i.e., use $X = S \cdot z + \bar{X}$
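The six steps above can be collapsed into one short function when software is available. This is a sketch under the assumption that scores are normally distributed; `percentile_to_scale` is a name made up for this example, and the percentile ranks used are chosen only for illustration.

```python
from statistics import NormalDist

def percentile_to_scale(pr, new_mean, new_sd):
    """Convert a percentile rank (strictly between 0 and 100) to a
    score on a scale with the given mean and SD, assuming normality."""
    z = NormalDist().inv_cdf(pr / 100)  # steps 1 to 5: signed z-score
    return new_sd * z + new_mean        # step 6: rescale

# Illustrative values: the 84th percentile on the MAT scale
# (mean 50, SD 10) and the 50th percentile on the GRE scale.
print(round(percentile_to_scale(84, 50, 10), 1))
print(round(percentile_to_scale(50, 500, 100), 1))
```

`inv_cdf` raises an error for areas of exactly 0 or 1, so percentile ranks of 0 and 100 are not valid inputs.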
A Comparison of Standard Scores
z-scores      -3.0   -2.0   -1.0    0.0    1.0    2.0    3.0
Percentiles      1      2     16     50     84     98     99
MAT Scores      20     30     40     50     60     70     80
GRE Scores     200    300    400    500    600    700    800