Week 11-13-06 1

2
Plot the average heights of normal densities placed at each data
value, e.g. {10, 14}. It is like smearing each sample value, as
if it were a drop of paint, according to the thickness of a
normal density. Each normal integrates to one, as does their
average, the “Sample Density Estimate” shown in dark.
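
The averaging described above can be sketched in a few lines of Python. This is only an illustration: the bandwidth of 2.0 is an assumed value (the slide does not state the sd used), and the function names are made up for this sketch.

```python
import math

def normal_pdf(x, mean, sd):
    """Height at x of a normal density with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def density_estimate(x, data, bandwidth):
    """Average height at x of normal densities centred at each data value."""
    return sum(normal_pdf(x, d, bandwidth) for d in data) / len(data)

data = [10, 14]
bandwidth = 2.0  # assumed sd of each normal; not given on the slide

# Numerical check that the average of the normals still integrates to one:
# a Riemann sum over a grid from -20 to 40 in steps of 0.01.
step = 0.01
grid = [i * step for i in range(-2000, 4001)]
area = sum(density_estimate(x, data, bandwidth) for x in grid) * step
print(round(area, 4))
```

By symmetry, the estimate at x = 12 (midway between the two data values) equals the height of a single normal at a distance of 2 from its centre.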
3
4
Making the densities narrower
isolates different parts of the data
and reveals more detail.
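
As a rough illustration of how narrowing the densities separates parts of the data, the sketch below counts the peaks of the estimate for the data {10, 14} at two bandwidths. The bandwidths 3.0 and 1.0 and the helper `count_peaks` are assumptions chosen for the example, not values from the slides.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def density_estimate(x, data, bandwidth):
    """Average height at x of normal densities centred at each data value."""
    return sum(normal_pdf(x, d, bandwidth) for d in data) / len(data)

def count_peaks(data, bandwidth, grid):
    """Count grid points that are strictly higher than both neighbours."""
    heights = [density_estimate(x, data, bandwidth) for x in grid]
    return sum(
        1
        for i in range(1, len(heights) - 1)
        if heights[i - 1] < heights[i] > heights[i + 1]
    )

data = [10, 14]
grid = [i / 20 for i in range(0, 481)]  # 0 to 24 in steps of 0.05

wide = count_peaks(data, 3.0, grid)    # wide normals blur the two values together
narrow = count_peaks(data, 1.0, grid)  # narrow normals show them separately
print(wide, narrow)  # prints: 1 2
```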
5
6
Histograms lump data into
categories (the black boxes), which is
not as good for continuous data.
7
Form of each rectangle comprising a Probability Histogram.
Example: A sample of n = 40 finds three data values which
are at least 30 but less than 35 (interval [30, 35)).
bin-width w = 35 - 30 = 5
height = 3 / (40 × 5)
area = w × height = 3 / 40
[Figure: a rectangle over the interval from 30 to 35, with the three data values marked by asterisks.]
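
The arithmetic of this example can be checked with exact fractions. The function name `rectangle_height` is a label invented for this sketch.

```python
from fractions import Fraction

def rectangle_height(count, n, width):
    """Height of a probability-histogram rectangle: count / (n * width)."""
    return Fraction(count, n * width)

n = 40          # sample size
count = 3       # data values falling in [30, 35)
w = 35 - 30     # bin width

height = rectangle_height(count, n, w)
area = w * height  # the area equals the fraction of the sample in the bin

print(height)  # prints: 3/200
print(area)    # prints: 3/40
```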
Histograms may radically change
their shape in response to minor
changes of bin locations or widths.
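
A small made-up data set shows the effect. The six values below are hypothetical, chosen to sit close to bin edges, and `bin_counts` is a helper written for this sketch. It assumes every data value is at least as large as the bin origin.

```python
import math

def bin_counts(data, origin, width):
    """Counts for bins [origin + k*width, origin + (k+1)*width), k = 0, 1, ...

    Assumes every data value is >= origin.
    """
    indices = [math.floor((x - origin) / width) for x in data]
    counts = [0] * (max(indices) + 1)
    for k in indices:
        counts[k] += 1
    return counts

data = [4.9, 5.1, 9.8, 10.2, 14.9, 15.1]  # hypothetical values near bin edges

humped = bin_counts(data, origin=0.0, width=5.0)  # bins start at 0, 5, 10, 15
flat = bin_counts(data, origin=2.5, width=5.0)    # bins start at 2.5, 7.5, 12.5

print(humped)  # prints: [1, 2, 2, 1]
print(flat)    # prints: [2, 2, 2]
```

The bin width is the same in both cases. Only the starting point moves, yet the outline changes from humped to flat.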
8
Plot of average heights of 5 tents
placed at data {12, 21, 42, 8, 9}.
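
The same averaging works with tents (triangular densities) in place of normals. In the sketch below each tent has area one. The half-width of 5 is an assumed value, since the slide does not give the tent width, and the function names are illustrative.

```python
def tent(x, centre, half_width):
    """Triangular density of area one, peaking at centre."""
    distance = abs(x - centre)
    if distance >= half_width:
        return 0.0
    return (1.0 - distance / half_width) / half_width

def tent_estimate(x, data, half_width):
    """Average height at x of the tents placed at each data value."""
    return sum(tent(x, d, half_width) for d in data) / len(data)

data = [12, 21, 42, 8, 9]
half_width = 5.0  # assumed; not stated on the slide

# At x = 9 only the tents at 8, 9 and 12 contribute.
estimate_at_9 = tent_estimate(9, data, half_width)

# Riemann sum over 0..50 to check that the average still has area one.
grid = [i / 100 for i in range(0, 5001)]
area = sum(tent_estimate(x, data, half_width) for x in grid) / 100

print(round(estimate_at_9, 3), round(area, 3))
```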
9
Narrower tents operate at higher
resolution but they may bring out
features that are illusory.
10
Population of N = 500 compared
with two samples of n = 30 each.
11
Population of N = 500 compared
with two samples of n = 30 each.
12
The same two samples of n = 30
each from the population of 500.
13
The same two samples of n = 30
each from the population of 500.
14
The same two samples of n = 30
each from the population of 500.
15
The same two samples of n = 30
each from the population of 500.
16
A sample of only n = 600 from a
population of N = 500 million.
(medium resolution)
17
A sample of only n = 600 from a
population of N = 500 million.
(MEDIUM resolution)
18
A sample of only n = 600 from a
population of N = 500 million.
(FINE resolution)
19
1. A density is controlled by the sd, referred to as
bandwidth, of the normal densities used to make it.
1a. You have to be content with the information revealed
by the population density at your chosen bandwidth.
1b. Small samples zero in on coarse densities, i.e. those
made at large bandwidth, fairly well.
1c. Samples in the hundreds may perform remarkably well,
even at fine resolution, i.e. small bandwidth.
2. Histograms are notorious for being unstable for some
data. Yet, they remain popular. Learn to make them by
hand.
3. Learn to make a density for 2 to 4 data values by hand.
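
Points 1b and 1c can be explored with a small simulation. Everything specific here is an assumption made for illustration: the population is 500 draws from a normal with mean 50 and sd 10, the bandwidths are 10 and 2, and `max_gap` is a helper invented for this sketch. It reports the largest vertical gap between the sample density and the population density over a grid.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def density_estimate(x, data, bandwidth):
    """Average height at x of normal densities centred at each data value."""
    return sum(normal_pdf(x, d, bandwidth) for d in data) / len(data)

def max_gap(sample, population, bandwidth, grid):
    """Largest gap between sample and population densities over the grid."""
    return max(
        abs(density_estimate(x, sample, bandwidth)
            - density_estimate(x, population, bandwidth))
        for x in grid
    )

rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(500)]  # hypothetical population
sample = rng.sample(population, 30)                   # one sample of n = 30
grid = list(range(0, 101, 2))

for bandwidth in (10.0, 2.0):  # coarse, then fine (assumed values)
    print(bandwidth, max_gap(sample, population, bandwidth, grid))
```

One would typically expect the gap to be smaller at the larger bandwidth, although any single random sample can behave differently. Raising the sample size to several hundred is a natural follow-up experiment.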
20
21