Week 11-13-06

Plot average heights of normal densities placed at each data value, e.g. {10, 14}. It is like smearing each sample value, as if it were a drop of paint, according to the thickness of a normal density. Each normal integrates to one, as does their average, the “Sample Density Estimate” shown in dark.

Making the densities narrower isolates different parts of the data and reveals more detail.

Histograms lump data into categories (the black boxes), which is not as good for continuous data.

Form of each rectangle comprising a Probability Histogram. Example: a sample of n = 40 finds three data values which are at least 30 but less than 35 (interval [30, 35)).
bin-width w = 35 - 30 = 5
area = w × height = 3/40
height = 3/(40 × 5)
Histograms may radically change their shape in response to minor changes of bin locations or widths.

Plot of average heights of 5 tents placed at data {12, 21, 42, 8, 9}.

Narrower tents operate at higher resolution but they may bring out features that are illusory.

Population of N = 500 compared with two samples of n = 30 each.

The same two samples of n = 30 each from the population of 500.

A sample of only n = 600 from a population of N = 500 million. (medium resolution)

A sample of only n = 600 from a population of N = 500 million. (FINE resolution)

1. A density is controlled by the sd, referred to as bandwidth, of the normal densities used to make it.
1a. You have to be content with the information revealed by the population density at your chosen bandwidth.
1b. Small samples zero in on coarse densities, i.e. those made at large bandwidth, fairly well.
1c. Samples in the hundreds may perform remarkably well, even at fine resolution, i.e. small bandwidth.
2. Histograms are notorious for being unstable for some data. Yet they remain popular. Learn to make them by hand.
3. Learn to make a density for 2 to 4 data values by hand.
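
The "density by hand" recipe for the data {10, 14} can be checked with a few lines of Python. This is a minimal sketch, not the course's own code: the function names and the two bandwidths (2.0 and 0.5) are assumptions chosen for illustration, and the area is approximated with a simple midpoint rule.

```python
import math

def normal_pdf(x, mean, sd):
    """Height at x of a normal density with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def density_estimate(x, data, bandwidth):
    """Average height at x of normal densities centred at each data value.

    The bandwidth is the sd shared by all of the normal densities.
    """
    return sum(normal_pdf(x, d, bandwidth) for d in data) / len(data)

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the area under f between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

data = [10, 14]  # the example data from the slides

# Bandwidths 2.0 and 0.5 are illustrative assumptions, not values from the slides.
for bandwidth in (2.0, 0.5):
    heights = {x: density_estimate(x, data, bandwidth) for x in (10, 12, 14)}
    area = integrate(lambda x: density_estimate(x, data, bandwidth), -10, 34)
    print(f"bandwidth={bandwidth}: heights={heights}, area={area:.4f}")
```

At the wider bandwidth the estimate is higher at the midpoint 12 than at the data values themselves, so the two observations blur into one hump. At the narrower bandwidth the estimate nearly vanishes at 12 and the two data values appear as separate peaks. In both cases the area stays at one.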
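
The probability histogram rule (bar area equals the fraction of data in the bin) and the sensitivity to bin location can be sketched the same way. The helper names below are hypothetical. The bin width of 10 and the origins 0 and 5 are assumptions chosen for illustration. The data {12, 21, 42, 8, 9} are reused from the tent example only because they are small enough to follow by hand.

```python
import math
from collections import Counter

def bar_height(count, n, width):
    """Height of a probability-histogram bar: area = count / n = width * height."""
    return count / (n * width)

def probability_histogram(data, origin, width):
    """Map each non-empty bin [left, right) to its bar height.

    Bins are half-open intervals of equal width starting at `origin`.
    Empty bins are omitted from the result.
    """
    n = len(data)
    counts = Counter(math.floor((x - origin) / width) for x in data)
    return {
        (origin + k * width, origin + (k + 1) * width): bar_height(c, n, width)
        for k, c in sorted(counts.items())
    }

# The slide's example: 3 of n = 40 values fall in [30, 35), so w = 5.
print(bar_height(3, 40, 5))  # 3 / (40 * 5)

# Same data, same bin width, bins shifted by 5 (origins are assumed values).
data = [12, 21, 42, 8, 9]
for origin in (0, 5):
    print(origin, probability_histogram(data, origin, 10))
```

Shifting the bins by 5 moves the value 12 into the same bin as 8 and 9, so the tallest bar grows from two of the five values to three even though the data have not changed. The total bar area is one in both cases.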