Frequency Distributions and Graphs

Basic Statistics
Frequency Distributions & Graphs
Structure of Research
(The Scientific Method)
A Systematic Approach:
• Identify the Problem
• Reviewing Information
• Collecting Data
• Analyzing Data
• Drawing Conclusions
STRUCTURE OF STATISTICS
STATISTICS
• DESCRIPTIVE: TABULAR, GRAPHICAL, NUMERICAL
• INFERENTIAL: CONFIDENCE INTERVALS, TESTS OF HYPOTHESIS
Now, we will look at the tabular and graphical approaches of descriptive statistics.
Step 1
QUESTIONNAIRE
A Self-Concept Scale
Scale:
1=Strongly Disagree
4=Neither Agree nor Disagree
7=Strongly Agree
ITEMS:
1. I usually achieve what I want when I work hard for it.
2. Once I make a plan, I am almost certain to make it
work.
.
.
.
10. Almost anything is possible for me if I really want it.
Step 2
Scores of 100 college students on the self-concept questionnaire.
Step 3
A possible first step in organizing data for interpretation is to
arrange the scores by size, usually from highest to lowest.
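As a minimal sketch of this step, the snippet below sorts a short list of scores from highest to lowest and tallies how often each score occurs. The scores are invented for illustration; they are not the 100 students' data, which is not reproduced here.

```python
from collections import Counter

# Hypothetical scores, invented for illustration only.
scores = [52, 47, 61, 47, 39, 52, 52, 44]

# Arrange the scores by size, from highest to lowest.
ordered = sorted(scores, reverse=True)

# Tally how often each distinct score occurs.
frequency = Counter(ordered)

for score in sorted(frequency, reverse=True):
    print(score, frequency[score])
```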
RELATIVE FREQUENCY DISTRIBUTION
 The relative frequency of a class is
obtained by dividing the class frequency
by the total frequency.
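The definition above can be sketched in a few lines of Python. The class labels and frequencies are hypothetical, and the function name `relative_frequencies` is chosen here only for illustration.

```python
def relative_frequencies(class_frequencies):
    """Divide each class frequency by the total frequency."""
    total = sum(class_frequencies.values())
    return {label: count / total for label, count in class_frequencies.items()}

# Hypothetical class frequencies (total = 50).
class_frequencies = {"10-19": 5, "20-29": 20, "30-39": 15, "40-49": 10}
rel = relative_frequencies(class_frequencies)

for label, value in rel.items():
    print(label, value)
```

The relative frequencies always sum to 1, which is a quick check on the arithmetic.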
Grouped Frequency Distribution
Used to present the data as a graph or as a table.
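One way to build a grouped frequency distribution is sketched below, assuming integer scores and equal-width class intervals. The scores, the starting limit, and the interval width are all hypothetical choices made for this example.

```python
def grouped_frequency(scores, low, width, n_classes):
    """Count integer scores in each class interval [start, start + width - 1]."""
    counts = [0] * n_classes
    for s in scores:
        index = (s - low) // width
        if 0 <= index < n_classes:  # scores outside the table are ignored
            counts[index] += 1
    return [
        (low + i * width, low + (i + 1) * width - 1, counts[i])
        for i in range(n_classes)
    ]

# Hypothetical scores, grouped into classes 10-19, 20-29, 30-39, 40-49.
scores = [12, 15, 21, 24, 27, 29, 33, 35, 38, 41]
table = grouped_frequency(scores, low=10, width=10, n_classes=4)

for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```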
Grouping and Loss of Information
Grouping involves a tradeoff:
• Gained: more usable/comprehensible information, ease of communication
• Lost: precise information, accuracy
GRAPHIC PRESENTATION OF A
FREQUENCY DISTRIBUTION
•Histogram vs. Bar Graph
•Polygons (Line Graphs)
•Frequency/Relative Freq
•Cumulative Distributions
•Percentiles
•Stem-and-Leaf Displays
HISTOGRAM
The histogram is a series of columns, each having as its
base one class interval and as its height the number of cases,
or frequency, in that class.
[Figure: histogram for water usage (1,000 gallons). Ordinate (vertical axis): percent frequency, 5% to 25%. Abscissa (horizontal axis): score.]
A histogram is a graphing technique that is appropriate for
quantitative data.
To avoid having the figure appear too flat or too steep, it is
usually best to arrange the scales so that the height of the
histogram is 2/3 to 3/4 of its width.
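The proportion guideline above is simple arithmetic. As a small sketch, the hypothetical helper `histogram_height_range` returns the suggested range of heights for a given width, in whatever unit the width is measured.

```python
def histogram_height_range(width):
    """Return the (minimum, maximum) suggested height: 2/3 to 3/4 of the width."""
    return (2 * width / 3, 3 * width / 4)

# For a figure 6 units wide, the height should fall between these two values.
low, high = histogram_height_range(6)
print(low, high)
```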
[Figure: percent (5% to 25%) by region (South, North, West), shown separately for male and female.]
When one is comparing two distributions that are
based on unequal numbers of observations,
percentages are preferable.
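To see why percentages help, consider the sketch below. The counts are invented for illustration (150 observations in one group, 50 in the other); only the region labels are borrowed from the figure. The raw counts look very different, but the percentages show that the two groups are distributed identically.

```python
def to_percentages(counts):
    """Convert category counts to percentages of the group total."""
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical counts for two groups of unequal size.
males = {"South": 30, "North": 45, "West": 75}    # n = 150
females = {"South": 10, "North": 15, "West": 25}  # n = 50

male_pct = to_percentages(males)
female_pct = to_percentages(females)

for region in males:
    print(region, male_pct[region], female_pct[region])
```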
FREQUENCY POLYGON
In the polygon a point is located above the midpoint of
each class interval to represent the frequency in that
class. These points are then joined by straight lines.
[Figure: frequency polygon. Vertical axis: frequency (0 to 15). Horizontal axis: score (5 to 40).]
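Frequency polygons are closed at both ends: one extra class interval with
zero frequency is added below the lowest class and one above the highest
class, and the line is brought down to the midpoints of those intervals.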
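The construction can be sketched as follows, assuming equal-width class intervals given as (lower limit, upper limit) pairs. The intervals and frequencies are hypothetical, and `polygon_points` is a name made up for this example.

```python
def polygon_points(class_intervals, frequencies):
    """Return the (midpoint, frequency) points of a closed frequency polygon."""
    # Distance between consecutive lower limits; assumes equal-width classes.
    width = class_intervals[1][0] - class_intervals[0][0]
    midpoints = [(lower + upper) / 2 for lower, upper in class_intervals]
    points = list(zip(midpoints, frequencies))
    # Close the polygon with a zero-frequency point one class width beyond each end.
    return [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

# Hypothetical classes 10-14, 15-19, 20-24 with their frequencies.
intervals = [(10, 14), (15, 19), (20, 24)]
points = polygon_points(intervals, [4, 9, 6])
print(points)
```

Joining these points in order with straight lines gives the polygon.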
Describing Distributions
The Y-axis represents frequency, and the X-axis represents the
numerical value of the observations.
Common shapes:
• Normal
• Positively skewed
• Negatively skewed
• Rectangular
• Bimodal
THE BAR GRAPH
A bar graph is used to present the frequencies of the
categories of a qualitative variable. A conventional bar graph
looks exactly like a histogram except for the wider spaces
between the bars.
A bar chart can be used to depict any of the levels of
measurement (nominal, ordinal, interval, or ratio).
Construct a bar chart for the number of persons with
AIDS per 100,000 population for selected metropolitan
areas as of July 1990.

City                   Number with AIDS per 100,000 Population
Atlanta, GA            922
Austin, TX             245
Dallas, TX             711
Houston, TX            1,245
New York, NY           6,565
San Francisco, CA      1,935
Washington, DC         1,059
West Palm Beach, FL    353

Source: Dept. of Health & Human Services.
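A rough text-only version of the requested bar chart can be sketched as below, using the figures from the table. The function `text_bar_chart` and the choice of a 50-character maximum bar are assumptions made for this example; a plotting library would normally be used for a real chart.

```python
# Values copied from the table above.
aids_rate = {
    "Atlanta, GA": 922,
    "Austin, TX": 245,
    "Dallas, TX": 711,
    "Houston, TX": 1245,
    "New York, NY": 6565,
    "San Francisco, CA": 1935,
    "Washington, DC": 1059,
    "West Palm Beach, FL": 353,
}

def text_bar_chart(data, max_width=50):
    """Draw one horizontal bar per category, scaled so the largest is max_width long."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * max_width)
        lines.append(f"{label:<20} {bar} {value:,}")
    return lines

chart = text_bar_chart(aids_rate)
print("\n".join(chart))
```

Note that the bars are separated from one another, as befits categories of a qualitative variable (city).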
BAR CHART FOR THE AIDS DATA
[Figure: bar chart with one bar per city: Atlanta, Austin, Dallas, Houston, New York, San Francisco, Washington, D.C., West Palm Beach.]
Cumulative Percentage Curve
Frequency and percentage polygons can be readily
converted into a cumulative percentage curve. The cumulative
percentage curve, or ogive, is the most common type of
cumulative distribution.
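[Figure: cumulative percent plotted against IQ score.]
Step 1: Proportion in class 110-119 = 363/2200 = 0.165
Step 2: 0.165 × 100 = 16.50%
Step 3: Cumulative percent through 110-119 = 73.77% + 16.50% = 90.27%,
where 73.77% is the cumulative percent for the classes below 110-119.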
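The three steps can be checked with a short sketch. The numbers are those given in the steps; the function name `cumulative_percent` is an illustrative choice, and the result is rounded to two decimals to match the slide.

```python
def cumulative_percent(cumulative_below, class_frequency, total):
    """Add a class's percent to the cumulative percent of the classes below it."""
    percent = class_frequency / total * 100  # Steps 1 and 2
    return round(cumulative_below + percent, 2)  # Step 3

# Class 110-119: 363 of 2200 cases, with 73.77% of cases in the classes below.
class_percent = round(363 / 2200 * 100, 2)
cumulative = cumulative_percent(73.77, 363, 2200)
print(class_percent, cumulative)
```

Repeating this for each class in turn, from lowest to highest, gives the points of the cumulative percentage curve.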
Percentile and percentile score
[Figure: cumulative percentage (Y-axis) plotted against IQ score (X-axis), with P45 = 100 marked.]
Percentiles are points in a distribution at or below which a given percent of the cases lie.
P45 corresponds to an IQ score of approximately 100; therefore about 55% of the IQ
scores exceed 100.
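The definition of a percentile can be sketched directly from raw scores. The ten scores below are invented for illustration, and both function names are hypothetical. This version uses the simplest "at or below" rule with no interpolation, so it will not always agree with values read from a smoothed cumulative curve.

```python
import math

def percentile_rank(scores, value):
    """Percent of cases that lie at or below the given value."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

def percentile_point(scores, p):
    """Smallest score at or below which at least p percent of the cases lie."""
    ordered = sorted(scores)
    needed = math.ceil(p * len(ordered) / 100)  # number of cases required
    return ordered[max(needed, 1) - 1]

# Hypothetical scores, invented for illustration only.
scores = [85, 92, 96, 100, 100, 104, 108, 112, 118, 125]

print(percentile_rank(scores, 100))
print(percentile_point(scores, 45))
```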
THE LINE GRAPH
A line graph is used to show a picture of the relationship
between two variables.
A point on a line graph represents the value on the Y
variable that goes with the corresponding value on the X
variable.
[Figure: line graph of count (0 to 200) against educational level in years (8 to 21).]
Stem-and-Leaf Plots
• When summarizing the data by a grouped
frequency distribution, some information is lost,
since we have only the classes and the
frequency counts for the classes. We do not
know the actual values in the classes.
• A stem-and-leaf display offsets this loss of
information.
• The stem is the leading digit or digits.
• The leaf is the trailing digit (units digit).
• The stem is placed to the left of a vertical line and the
leaf to the right.
The Dean of the College of Education
reports the following number of students in
the 15 sections of basic statistics offered
this semester. Construct a stem-and-leaf
chart for the data.
• 27, 36, 29, 21, 24, 26, 32, 30, 36, 30, 28, 23, 17, 41, 19.
STEM | LEAF
1    | 79
2    | 1346789
3    | 00266
4    | 1
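The display for the section sizes can be reproduced with a short sketch. It assumes two-digit whole numbers, so the stem is the tens digit and the leaf is the units digit; the function name `stem_and_leaf` is chosen here for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits) as a string."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return {
        stem: "".join(str(leaf) for leaf in leaves)
        for stem, leaves in sorted(stems.items())
    }

# Number of students in the 15 sections of basic statistics.
sections = [27, 36, 29, 21, 24, 26, 32, 30, 36, 30, 28, 23, 17, 41, 19]
display = stem_and_leaf(sections)

for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Unlike a grouped frequency table, every original value can be read back from the display (for example, stem 2 with leaf 7 is 27).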
Another advantage
of a stem-and-leaf
display is that it is
easily reproduced
with a line printer.
PIE CHART
• A pie chart is especially useful in
displaying a relative frequency
(percentage) distribution.
• A circle is divided proportionally to the
relative frequency (percentage), and
portions of the circle are allocated to
the different groups.
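Dividing the circle proportionally means giving each group a share of the 360 degrees equal to its relative frequency. A minimal sketch follows; the group names and counts are hypothetical (they are not the survey results), and `pie_slices` is a name made up for this example.

```python
def pie_slices(counts):
    """Return (percentage, angle in degrees) for each group's slice of the circle."""
    total = sum(counts.values())
    return {
        group: (100 * count / total, 360 * count / total)
        for group, count in counts.items()
    }

# Hypothetical counts for three groups (total = 200).
counts = {"Group A": 100, "Group B": 60, "Group C": 40}
slices = pie_slices(counts)

for group, (percent, angle) in slices.items():
    print(group, percent, angle)
```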
EXAMPLE
A sample of 200 college students was asked to
indicate their favorite soft drink. The results of the
survey are given on the next slide. Draw a pie chart
for this information.
PIE CHART FOR THE TASTE TEST
[Figure: pie chart with slices for Coca-Cola, Pepsi, Dr. Pepper, Seven-Up, and Others.]