Statistics: A Gentle Introduction By Frederick L. Coolidge, Ph.D. Chapter 2

Statistics:
A Gentle Introduction
By Frederick L. Coolidge, Ph.D.
Sage Publications
Chapter 2
Descriptive Statistics:
Understanding Distributions of
Numbers
0730 Q1 Results N=20



1|5
2|1124456679
3|001124779
0900 Q1 Results N=32




1|249
2|0335567799
3|2224444445566889
4|001
Overview

Graphs and tables
  What’s the point?
  The nasty tricks of the trade
Types of distributions
Grouping data
Cumulative frequency distributions
Stem-and-leaf plot
Graphs and Tables
What’s the point?

Document the sources of statistical data and their characteristics.
  Where did you get it?
  What is it measuring?
Graphs and Tables
What’s the point?

Make appropriate comparisons.
  Compare similar data.
  Make the point more clearly.
  Make data more understandable.
  Eliminate doubt.
Frequency Distributions

Frequency distribution: a table reporting the number of observations falling into each category of the variable.
Frequency count: the number of times a data value occurs in the data set.
Ungrouped frequency distribution: lists each data value with the frequency count with which it occurs.
Relative frequency (RF): the frequency for a class divided by the total number of observations.

Cumulative Frequency (CF) and Cumulative Relative Frequency (CRF)

CF: for a specific value in a frequency table, the sum of the frequencies for all values at or below that value.
CRF: the sum of the relative frequencies for all values at or below the given value, expressed as a proportion.
Grouped frequency distribution: obtained by constructing intervals for the data and listing the frequency count in each interval.
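Math Anxiety Scores

Score  Freq  Relative Freq  Cumulative Freq  Cumulative Relative Freq
1      1     0.05           1                0.05
2      2     0.09           3                0.14
3      3     0.14           6                0.28
4      4     0.18           10               0.46
5      5     0.23           15               0.69
6      0     0              15               0.69
7      2     0.09           17               0.78
8      3     0.14           20               0.92
9      1     0.05           21               0.97
10     1     0.05           22               1.02

Note: the Cumulative Relative Freq column is a running sum of the rounded relative frequencies, which is why it ends at 1.02 rather than 1.00.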
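The definitions of F, RF, CF, and CRF can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `frequency_table` is made up for illustration, and the frequencies are copied from the Math Anxiety table (N = 22). Here CRF is computed as CF / N rather than by summing rounded relative frequencies, so the last value comes out as exactly 1.00 instead of 1.02.

```python
from itertools import accumulate

# Frequencies for scores 1..10, copied from the Math Anxiety table (N = 22).
scores = list(range(1, 11))
freqs = [1, 2, 3, 4, 5, 0, 2, 3, 1, 1]

def frequency_table(values, counts):
    """Return rows of (value, F, RF, CF, CRF) for an ungrouped distribution.

    Hypothetical helper for illustration only.
    """
    n = sum(counts)                      # total number of observations
    cum = list(accumulate(counts))       # running total of the frequencies
    # CRF is taken as CF / N, which avoids accumulating rounding error.
    return [(v, f, f / n, cf, cf / n) for v, f, cf in zip(values, counts, cum)]

table = frequency_table(scores, freqs)
for v, f, rf, cf, crf in table:
    print(f"{v:>5} {f:>4} {rf:>6.2f} {cf:>4} {crf:>6.2f}")
```

Rounding only at print time keeps the displayed proportions consistent with the counts.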
Math Anxiety Scores, 7:30 Class (Grouped Frequency Distribution)

Class Interval  F   CF   RF     CRF
.5-2.5          3   3    0.136  0.1364
2.5-4.5         7   10   0.318  0.4546
4.5-6.5         5   15   0.227  0.6819
6.5-8.5         5   20   0.227  0.9092
8.5-10.5        2   22   0.091  1.0002
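Grouping can be sketched in code as well. The example below assumes the grouped table was built from the ungrouped Math Anxiety frequencies (the counts agree), and it treats each class interval as including its lower limit and excluding its upper limit. Both the helper name `group_frequencies` and that boundary convention are assumptions made for illustration.

```python
# Ungrouped frequencies (score -> count) from the Math Anxiety table.
freq_by_score = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 0, 7: 2, 8: 3, 9: 1, 10: 1}

def group_frequencies(freq_by_score, lower, width, n_intervals):
    """Count observations in consecutive intervals [lo, lo + width).

    Hypothetical helper; the half-open interval is an assumed convention.
    """
    rows = []
    for i in range(n_intervals):
        lo = lower + i * width
        hi = lo + width
        count = sum(f for s, f in freq_by_score.items() if lo <= s < hi)
        rows.append((lo, hi, count))
    return rows

grouped = group_frequencies(freq_by_score, lower=0.5, width=2, n_intervals=5)
for lo, hi, count in grouped:
    print(f"{lo}-{hi}: {count}")
```

The counts per interval come out as 3, 7, 5, 5, 2, matching the F column.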
Histogram: Math Anxiety Scores
[Figure: relative frequency (vertical axis, .05 to .30) for each class interval (horizontal axis, limits .5, 2.5, 4.5, 6.5, 8.5, 10.5)]
“Blacks More Pessimistic Than Whites About Economic Opportunities”
Government’s role in improving the economic position of minorities

            Non-Hispanic Whites (%)  Blacks (%)  Hispanics (%)
Major Role  32                       68          67
Minor Role  51                       22          21
No Role     16                       9           8
Laws Covering Sales of Firearms: Increase Restrictions (2000)?

            Men (N=493)  Women (N=538)
More        256          387
Less        39           11
Same        193          129
No opinion  5            11
Men and Firearm Restrictions: Frequency Distribution (N=493)

            F    CF   RF   CRF
More        256  256  .52  .52
Less        39   295  .08  .60
Same        193  488  .39  .99
No opinion  5    493  .01  1.00
Women and Firearm Restrictions: Frequency Distribution (N=538)

            F    CF   RF    CRF
More        387  387  .719  .719
Less        11   398  .020  .739
Same        129  527  .239  .978
No opinion  11   538  .020  .998
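Because the two samples differ in size (493 men, 538 women), raw counts are hard to compare directly, and relative frequencies put both groups on the same scale. A minimal sketch follows, using the counts from the two tables; the helper name `relative_frequencies` is hypothetical.

```python
# Counts copied from the firearm-restriction tables.
men = {"More": 256, "Less": 39, "Same": 193, "No opinion": 5}
women = {"More": 387, "Less": 11, "Same": 129, "No opinion": 11}

def relative_frequencies(counts):
    """Divide each category count by the group total (hypothetical helper)."""
    n = sum(counts.values())
    return {category: f / n for category, f in counts.items()}

rf_men = relative_frequencies(men)
rf_women = relative_frequencies(women)
for category in men:
    print(f"{category:<11} men {rf_men[category]:.3f}   women {rf_women[category]:.3f}")
```

Read side by side, the proportions show that a larger share of women than men favored more restrictions.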
Graphs and Tables
What’s the point?

Demonstrate the mechanisms of cause and effect and express the mechanisms quantitatively.
  If you vary the cause and the results change in a predictable and uniform manner, then you make a stronger case for cause and effect.
Graphs and Tables
What’s the point?

Recognize the inherent multivariate (more than one cause) nature of the problem.
  Is there anything with just one cause?
  Temperature of boiling water:
    Altitude of water
    What is in the water (salt)?
Graphs and Tables
What’s the point?

Inspect and evaluate alternative hypotheses.
  Cigarette smoking is related to a lower incidence of Alzheimer’s disease.
    Is it the cigarettes?
    Or is it that smokers die at an earlier age, before Alzheimer’s is diagnosable?
Graphs and Tables
The nasty tricks of the trade

Adjust the scale to make the point
Show only part of the scale
Omit the units of measure
Change the scale along the graph
Include too much junk
Not enough to bother graphing
Graphs and Tables
The nasty tricks of the trade

Is Brand One really any better than the others?
Stem-and-leaf plot

Presents the frequency of data points without losing important information.
Data set: 25, 27, 29
2|579   (stem: 2; leaves: 5, 7, 9)
Stem-and-leaf plot

The first digit is the stem.
The second digit is each leaf.
Data set: 25, 27, 29
2|579   (stem: 2; leaves: 5, 7, 9)
Stem-and-leaf plot
Let’s try it
Data set: 30, 32, 32, 34, 37, 37, 39
Data set: 5, 9, 10, 11, 11, 23, 25, 27
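The stem-and-leaf rule (first digit is the stem, second digit is each leaf) can be sketched in Python. This is an illustration under stated assumptions: the function name `stem_and_leaf` is made up, the data are non-negative whole numbers, and single-digit values are given a stem of 0.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return lines such as '2|579': tens digit(s) as stem, units digit as leaf.

    Hypothetical helper; assumes non-negative integers.
    """
    leaves = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)       # e.g. 27 -> stem 2, leaf 7
        leaves[stem].append(leaf)
    # Stems with no data are still listed, so gaps in the data stay visible.
    return [f"{stem}|{''.join(str(leaf) for leaf in leaves[stem])}"
            for stem in range(min(leaves), max(leaves) + 1)]

for line in stem_and_leaf([5, 9, 10, 11, 11, 23, 25, 27]):
    print(line)
# 0|59
# 1|011
# 2|357
```

Applied to the first practice data set, the same function returns the single line `3|0224779`.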

Types of Distributions
Frequency Distribution

Frequency distribution
  Showing what you have
  A way to illustrate how many of each thing.
Types of Distributions
Normal Distribution

Normal distribution
  Also known as the bell-shaped curve
  An illustration of the expectation of what most types of data will look like
    A few data points at each extreme
    Most data points in the middle area
Types of Distributions
Positively Skewed Distribution

Not all data are created equal
Positive skew
  Many data points near the origin of the graph
Types of Distributions
Negatively Skewed Distribution

Negative skew
  Many data points away from the origin of the graph
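The direction of skew can also be expressed as a number. The sketch below uses the moment-based skewness coefficient, which the slides do not present; it is offered only as one common way to summarize the idea. The function name and the three small data sets are hypothetical.

```python
def skewness(data):
    """Moment-based skewness: third central moment / (second central moment)^1.5.

    Positive when the long tail is on the right, negative when on the left.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical data sets for illustration.
positive = [1, 1, 1, 2, 2, 3, 9]      # most points near the origin, long right tail
negative = [9, 9, 9, 8, 8, 7, 1]      # most points away from the origin, long left tail
symmetric = [1, 2, 3, 4, 5]

print(skewness(positive) > 0, skewness(negative) < 0, skewness(symmetric))
```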
Types of Distributions
Bimodal Distribution

Bimodal
  Two areas under the curve with many data points
Types of Distributions
Non-normal Distributions

Nonnormal distributions
  But not abnormal
  Platykurtic: flat like a plate
Bimodal Distribution: Spring 2010 Quiz Scores

Interval  F   CF   RF    CRF
10-16     5   5    .227  .227
17-23     3   8    .136  .363
24-30     2   10   .090  .453
31-37     8   18   .363  .816
38-44     4   22   .181  .997
Types of Distributions
Non-normal Distributions

Leptokurtic: up & down (like leaping)
Bimodal: lumpy
Grouping data

A way of organizing data so that they are manageable.
Which is easier to understand?
3, 1, 7, 4, 1, 2, 3, 5, 4, 9
or
1, 1, 2, 3, 3, 4, 4, 5, 7, 9
Grouping data
Tips for grouping data

Tips for grouping lots of data
  Choose interval widths that reduce your data to 5 to 10 intervals.

5   10   15   20   25   30   35
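The "5 to 10 intervals" tip can be turned into a quick check. The sketch below tries a short list of round widths and keeps those that give between 5 and 10 intervals. The function name and the list of candidate widths are assumptions for illustration, and the example range (ages 20 to 89) is borrowed from the age tables in this chapter.

```python
def candidate_widths(low, high, widths=(1, 2, 5, 10, 20, 25, 50, 100)):
    """Return (width, number_of_intervals) pairs that yield 5 to 10 intervals.

    Hypothetical helper. Intervals are aligned to multiples of the width,
    e.g. width 10 gives 20-29, 30-39, and so on. Assumes whole-number data.
    """
    results = []
    for w in widths:
        n_intervals = high // w - low // w + 1
        if 5 <= n_intervals <= 10:
            results.append((w, n_intervals))
    return results

print(candidate_widths(20, 89))   # [(10, 7)]
```

For ages from 20 to 89, only a width of 10 qualifies, giving 7 intervals.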
Grouping data
Tips for grouping data

Choose meaningful intervals.
Which is easier to understand at a glance?

5   10   15   20   25   30   35
or
4   7   10   13   16   19   22
Grouping data
Tips for grouping data

Interval widths must be the same.

5   10   15   20   25   30   35
NOT
5   10   20   22   30   33   35
Grouping data
Tips for grouping data

Intervals cannot overlap.

5-10   11-15   16-20   21-25   26-30   31-35   36-40
NOT
5-10   10-15   14-20   20-26   25-30   30-35   35
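The equal-width and no-overlap rules are easy to check mechanically. A minimal sketch follows, with hypothetical function names: `equal_widths` takes a list of interval boundaries, and `overlaps` takes (low, high) stated limits for whole-number data. The incomplete last entry of the overlapping example ("35") is left out.

```python
def equal_widths(boundaries):
    """True if consecutive boundaries are all the same distance apart."""
    steps = {b - a for a, b in zip(boundaries, boundaries[1:])}
    return len(steps) == 1

def overlaps(intervals):
    """Return adjacent pairs of (low, high) intervals that share a value."""
    return [(first, second)
            for first, second in zip(intervals, intervals[1:])
            if second[0] <= first[1]]

print(equal_widths([5, 10, 15, 20, 25, 30, 35]))    # True
print(equal_widths([5, 10, 20, 22, 30, 33, 35]))    # False

good = [(5, 10), (11, 15), (16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]
bad = [(5, 10), (10, 15), (14, 20), (20, 26), (25, 30), (30, 35)]
print(len(overlaps(good)), len(overlaps(bad)))      # 0 5
```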
Grouping data
An example

The data are displayed using
  A frequency table of individual data points
  A frequency table by intervals
  Graph of data by intervals
Frequency Distribution Using Stated Limits

Age Category  Freq  CF
20-29         7     7
30-39         7     14
40-49         12    26
50-59         3     29
60-69         3     32
70-79         6     38
80-89         2     40
Total         40
Problem with Stated Limits

There is a gap of one between adjacent intervals.
This is a problem for scores with fractional values: where would you classify a woman who is 49.25 years old? Her age falls between the intervals 40-49 and 50-59.
Real limits extend each interval by .5 in both directions (lower limit minus .5, upper limit plus .5).
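Converting stated limits to real limits is a one-line transformation, sketched below. The function names are hypothetical. Treating each real-limit interval as including its lower limit and excluding its upper limit is an assumption made here to keep the lookup simple; for the boundaries in this age table it gives the same placement as rounding to the nearest even number.

```python
def real_limits(stated, unit=1.0):
    """Extend each stated (low, high) interval by half a unit on both sides."""
    half = unit / 2
    return [(lo - half, hi + half) for lo, hi in stated]

def classify(x, intervals):
    """Return the interval containing x, using [low, high) (assumed convention)."""
    for lo, hi in intervals:
        if lo <= x < hi:
            return (lo, hi)
    return None

stated = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]
real = real_limits(stated)

print(real[0])                  # (19.5, 29.5)
print(classify(49.25, real))    # (39.5, 49.5)
```

With real limits there is no gap, so an age of 49.25 has a definite home in 39.5-49.5.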
Frequency Distribution Using Real Upper and Lower Limits

Age Category  Freq  CF
19.5-29.5     7     7
29.5-39.5     7     14
39.5-49.5     12    26
49.5-59.5     3     29
59.5-69.5     3     32
69.5-79.5     6     38
79.5-89.5     2     40
Total         40
Upper/Lower Limits & Fractional Values

Scores falling exactly at an upper or lower real limit are rounded to the closest even number. Example: 59.5 is rounded to 60 and included in the interval 59.5-69.5.
Where would you classify a respondent aged 49.25 years? How about 59.4?
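Python's built-in `round()` happens to follow the same "ties go to the even number" rule, so it can be used to try out the questions above. In this sketch the helper name `classify_rounded` is hypothetical, and the intervals are the stated age categories from this chapter.

```python
def classify_rounded(age, stated):
    """Round to a whole number (ties go to the even number), then return
    the stated-limit interval that contains the rounded value."""
    whole = round(age)   # built-in round() uses round-half-to-even
    for lo, hi in stated:
        if lo <= whole <= hi:
            return (lo, hi)
    return None

stated = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]

print(round(59.5), round(60.5))          # 60 60
print(classify_rounded(59.5, stated))    # (60, 69)
print(classify_rounded(49.25, stated))   # (40, 49)
print(classify_rounded(59.4, stated))    # (50, 59)
```

Both 59.5 and 60.5 round to 60, the nearest even number. Ages of 49.25 and 59.4 are not ties, so they simply round to 49 and 59.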
Cumulative Frequency Distribution

Cumulative frequency distribution
  Shows how many cases (data points) have been accounted for out of the total number of cases (data points).
Cumulative Frequency Distribution

Shows how many data points have been accounted for as each group is displayed.
Cumulative Frequency Distribution

Cumulative frequencies can also be illustrated using percentages.
Cumulative Frequency Distribution

Cumulative distributions can help give a reference point for an individual score.
  Percentile
    What percentage scored above or below the score of interest
  Quartile
    Divides the scores into four groups
    25% each: 1st, 2nd, 3rd, 4th
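Percentiles and quartiles can be sketched with the standard library. The example reuses the Math Anxiety frequencies from earlier in the chapter (N = 22). The function name `percentile_rank` and its "at or below" definition are assumptions, since textbooks differ on the exact convention. The quartile cut points come from `statistics.quantiles` with its default method.

```python
import statistics

# Expand the Math Anxiety frequency table (score -> count) into raw scores.
freq_by_score = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 0, 7: 2, 8: 3, 9: 1, 10: 1}
scores = [s for s, f in freq_by_score.items() for _ in range(f)]

def percentile_rank(data, x):
    """Percentage of scores at or below x (one common convention, assumed here)."""
    return 100 * sum(value <= x for value in data) / len(data)

print(f"{percentile_rank(scores, 5):.1f}")   # 68.2, i.e. 15 of 22 scores

# Three cut points that divide the sorted scores into four groups.
q1, q2, q3 = statistics.quantiles(scores, n=4)
print(q1, q2, q3)
```

The percentile rank of a score is its cumulative relative frequency expressed as a percentage.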