Pictorial and Tabular Methods

Stem-and-Leaf Displays
1. Select one or more leading digits for the stem values. The trailing
digits become the leaves.
2. List possible stem values in a vertical column.
3. Record the leaf for every observation beside the corresponding
stem value.
4. Indicate the units for stems and leaves someplace in the display.
Liang Zhang (UofU)
Applied Statistics I
June 9, 2008
1 / 15
Example (Example 1.2, p. 5): The article “Effects of Aggregates
and Microfillers on the Flexural Properties of
Concrete” reported on a study of strength properties of high
performance concrete obtained by using superplasticizers and certain
binders. The accompanying data on flexural strength (in MPa)
appeared in the article cited:
5.9 7.2 7.3 6.3 8.1  6.8  7.0  7.6  6.8
6.5 7.0 6.3 7.9 9.0  8.2  8.7  7.8  9.7
7.4 7.7 9.7 7.8 7.7 11.6 11.3 11.8 10.7
We are interested in the average value of flexural strength for all
beams that could be made in this way.
The decimal point is at the |
 5 | 9
 6 | 33588
 7 | 00234677889
 8 | 127
 9 | 077
10 | 7
11 | 368
A stem-and-leaf display conveys information about:
• identification of a typical value
• presence of any gaps in the data
• extent of symmetry in the distribution of values
• number and location of peaks
• presence of any outlying values
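The four construction steps can be sketched in Python. This is a minimal illustration, not the method used to produce the slides: the function names `stem_and_leaf` and `render` are hypothetical, and the sketch assumes non-negative data recorded to one decimal place, so that the stem is the integer part and the leaf is the tenths digit.

```python
# Flexural strength data (MPa) from the example.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

def stem_and_leaf(data):
    """Map each stem (integer part) to its sorted leaves (tenths digit)."""
    # Work in integer tenths so that floating-point error cannot shift a leaf.
    tenths = sorted(round(x * 10) for x in data)
    lo, hi = tenths[0] // 10, tenths[-1] // 10
    # Step 2: list every possible stem, including stems with no leaves.
    display = {stem: [] for stem in range(lo, hi + 1)}
    # Step 3: record the leaf of every observation beside its stem.
    for t in tenths:
        display[t // 10].append(t % 10)
    return display

def render(display):
    """Format the display as text, one stem per line."""
    width = len(str(max(display)))
    return "\n".join(
        f"{stem:>{width}} | {''.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in display.items()
    )

# Step 4: state the units for stems and leaves.
print("The decimal point is at the |")
print(render(stem_and_leaf(strength)))
```

Sorting before recording the leaves gives the ordered form of the display; dropping the `sorted` call would record leaves in data order instead.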
Remark:
1. Each observation in the data set must consist of at least two digits.
e.g. the stem-and-leaf display is not suitable for the data set
1, 2, 1, 4, 1, 5, 2, 6, 1, 3, 2, 3
2. Ordering the leaves from smallest to largest is not necessary.
Leaves in the order recorded:
The decimal point is at the |
 5 | 9
 6 | 38853
 7 | 23060984787
 8 | 127
 9 | 077
10 | 7
11 | 638

Leaves ordered from smallest to largest:
The decimal point is at the |
 5 | 9
 6 | 33588
 7 | 00234677889
 8 | 127
 9 | 077
10 | 7
11 | 368
Dotplots:
In a dotplot, each observation is represented by a dot above the
corresponding location on a horizontal measurement scale. When a
value occurs more than once, there is a dot for each occurrence, and
these dots are stacked vertically.
e.g. [Figure: dotplot of the flexural strength data from the previous example]
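A rough text rendering shows how the stacking works. This is only a sketch: `text_dotplot` is a hypothetical helper, the scale is assumed to have one character column per 0.1 MPa, and `*` stands in for a dot.

```python
from collections import Counter

# Flexural strength data (MPa) from the example.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

def text_dotplot(data):
    """Draw one '*' per occurrence, stacked above a horizontal scale."""
    # Integer tenths give one column per 0.1 and avoid float comparisons.
    tenths = [round(x * 10) for x in data]
    counts = Counter(tenths)
    lo, hi = min(tenths), max(tenths)
    width = hi - lo + 1
    rows = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("*" if counts[t] >= level else " "
                      for t in range(lo, hi + 1))
        rows.append(row.rstrip())
    rows.append("-" * width)
    # Label only the two ends of the scale.
    left, right = f"{lo / 10:.1f}", f"{hi / 10:.1f}"
    rows.append(left.ljust(width - len(right)) + right)
    return "\n".join(rows)

print(text_dotplot(strength))
```

For these data no value occurs more than twice, so the plot is two rows of dots high.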
Histograms
e.g. [Figure: histogram of the flexural strength data from the previous example]
Discrete & Continuous Variables:
A numerical variable is discrete if its set of possible values is either
finite or can be listed in an infinite sequence.
e.g. x = number of students in this classroom who drove to school
today
Usually arising from counting
A numerical variable is continuous if its possible values consist of an
entire interval on the number line.
e.g. y = maximum number of hours a GE lamp can last
Usually arising from measuring
Frequency: the frequency of any particular data value is the number
of times that value occurs in the data set.
Relative Frequency: the relative frequency of a value is the fraction
or proportion of times the value occurs:
relative frequency = (number of times the value occurs) / (number of observations in the data set)
e.g.
frequency of the value 6.8: 2
relative frequency of the value 6.8: 2/27 ≈ 0.074
Frequency Distribution: a tabulation of the frequencies and/or
relative frequencies.
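These definitions translate directly into a few lines of Python. The sketch below uses `collections.Counter` from the standard library; the variable names are illustrative only.

```python
from collections import Counter

# Flexural strength data (MPa) from the example.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

n = len(strength)                      # number of observations: 27
freq = Counter(strength)               # frequency of each distinct value
rel_freq = {value: count / n for value, count in freq.items()}

print(freq[6.8])                       # 2
print(round(rel_freq[6.8], 3))         # 0.074

# A frequency distribution tabulates both quantities for every value.
for value in sorted(freq):
    print(f"{value:5.1f}  {freq[value]:2d}  {rel_freq[value]:.3f}")
```

The relative frequencies always sum to 1, which is a convenient check on a hand-built table.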
Constructing a Histogram for a Data Set:
1. Divide the data set into a suitable number of class intervals, or classes;
2. Determine the frequency and relative frequency for each class;
3. Mark the class boundaries on a horizontal measurement axis;
4. Above each class interval, draw a rectangle whose height is the
corresponding relative frequency (or frequency).
Determine frequency and relative frequency for each class:

classes          frequency   relative frequency
 5.00 -  5.99        1            0.037
 6.00 -  6.99        5            0.185
 7.00 -  7.99       11            0.407
 8.00 -  8.99        3            0.111
 9.00 -  9.99        3            0.111
10.00 - 10.99        1            0.037
11.00 - 11.99        3            0.111
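The class table can be reproduced with a short Python sketch. The helper `class_frequencies` is hypothetical, and it assumes each class is the half-open interval [lower, upper), which matches the classes 5.00 - 5.99, 6.00 - 6.99, and so on for data recorded to one decimal place.

```python
# Flexural strength data (MPa) from the example.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

def class_frequencies(data, edges):
    """Count the observations falling in each class [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts

edges = [5, 6, 7, 8, 9, 10, 11, 12]    # class boundaries (step 1)
counts = class_frequencies(strength, edges)
n = len(strength)

print(counts)                           # [1, 5, 11, 3, 3, 1, 3]
for lower, upper, c in zip(edges, edges[1:], counts):
    # Step 2: frequency and relative frequency for each class.
    print(f"{lower:5.2f} - {upper - 0.01:5.2f}  {c:3d}  {c / n:.3f}")
```

Plotting these relative frequencies as rectangle heights over the class boundaries gives the histogram described in steps 3 and 4.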
Remark:
1. For discrete data, we usually don’t have to determine the class
intervals.
2. There are no hard-and-fast rules for the choice of class intervals. A
reasonable rule of thumb is
number of classes ≈ √(number of observations)
3. Equal-width classes may not be a sensible choice if a data set
“stretches out” to one side or the other.
Use a few wider intervals near extreme observations and narrower
intervals in the region of high concentration.
rectangle height = (relative frequency of the class) / (class width)
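The height formula can be checked numerically. In the sketch below the unequal class boundaries are an assumption chosen for illustration (they do not come from the slides): unit-width classes where the flexural strength data are dense and width-2 classes in the upper tail. The helper `class_frequencies` is likewise hypothetical.

```python
# Flexural strength data (MPa) from the earlier example.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

def class_frequencies(data, edges):
    """Count the observations falling in each class [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts

# Assumed boundaries: narrow classes in the dense region, wide ones in the tail.
edges = [5, 6, 7, 8, 10, 12]
counts = class_frequencies(strength, edges)
n = len(strength)

widths = [upper - lower for lower, upper in zip(edges, edges[1:])]
rel_freqs = [c / n for c in counts]
# rectangle height = relative frequency of the class / class width
heights = [rf / w for rf, w in zip(rel_freqs, widths)]

for lower, upper, rf, h in zip(edges, edges[1:], rel_freqs, heights):
    print(f"[{lower:2d}, {upper:2d})  rel. freq. {rf:.3f}  height {h:.3f}")

# With this scaling the area of each rectangle is its relative frequency,
# so the total area is 1.
total_area = sum(h * w for h, w in zip(heights, widths))
print(round(total_area, 6))
```

Dividing by the class width keeps a wide class from looking more prominent than its share of the data warrants, because area rather than height then represents relative frequency.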
Shapes of Histograms: