DENSITY CURVES and NORMAL DISTRIBUTIONS

The histogram displays the grade-equivalent vocabulary scores for 7th graders on the Iowa Test of Basic Skills. The scores of students on this national test have a regular distribution.
- This histogram is mostly symmetric.
- Both tails fall off smoothly from the center peak.
- There are no obvious gaps or outliers.

THE SMOOTH CURVE IS A GOOD DESCRIPTION OF THE OVERALL PATTERN

To change from a histogram to a smooth curve:
- Use PROPORTIONS of the observations that fall in each range rather than actual counts of observations. Each bar's area then represents the proportion of observations in that class.
- For the curve, the area under the curve represents the proportion of the observations.
- Adjust the scale of the graph so that the total area under the curve is equal to 1.

DENSITY CURVES
- Describe the overall shape of distributions
- Are idealized mathematical models for distributions
- Show patterns that are accurate enough for practical purposes
- Are always on or above the horizontal axis
- Have a total area under the curve of exactly 1
- Have areas under the curve that represent relative frequencies of observations

The MEDIAN (M) is the point with half the observations on either side; it is the POINT OF EQUAL AREAS. The QUARTILES divide the area under the curve into quarters.
The MEAN (or arithmetic average) is the point at which the curve would balance if it were made of a solid material.

DENSITY CURVES can be symmetrical or skewed. Remember that for symmetrical distributions, the median and mean are equal. For a skewed curve, the mean is pulled away from the median toward the long tail.

Because a density curve is an idealized description of the distribution of data, we must distinguish between:
- The mean (x̄) and standard deviation (s), computed from the actual observations
- The mean (μ) and standard deviation (σ) of the idealized distribution

EXAMPLE: Consider the unusual density curve. Find the percent of the data in each of the following intervals:
0 < X < 0.6 ?
0.2 < X < 0.4 ?
0 < X < 0.8 ?
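The density-curve ideas above (area as proportion, the median as the point of equal areas, the mean as the balance point) can be tried out numerically. The sketch below is only an illustration: it assumes a hypothetical density f(x) = 2x on the interval 0 to 1, which is not the curve from the example above, and the function names are made up for this sketch.

```python
import math

def density(x):
    """Hypothetical density curve (an assumption): f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area(a, b, n=10_000):
    """Approximate the area under the density between a and b (midpoint rule)."""
    width = (b - a) / n
    return sum(density(a + (i + 0.5) * width) for i in range(n)) * width

def median_point(tol=1e-6):
    """Bisection search for the point with half the area to its left."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if area(0.0, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def balance_point(n=100_000):
    """Mean of the density: the sum of x * f(x) * width over [0, 1]."""
    width = 1.0 / n
    return sum((i + 0.5) * width * density((i + 0.5) * width) for i in range(n)) * width

if __name__ == "__main__":
    print("total area:", round(area(0.0, 1.0), 4))               # should be 1
    print("proportion in 0.2 < X < 0.6:", round(area(0.2, 0.6), 4))
    print("median:", round(median_point(), 4))
    print("mean:", round(balance_point(), 4))
```

For this left-skewed curve the mean (2/3) falls below the median (about 0.707), which matches the idea that the mean is pulled toward the long tail.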
THE NORMAL DISTRIBUTION

Gauss used the normal distribution to analyze astronomical data in 1809. The normal curve is often called the Gaussian distribution or, more commonly, "the bell-shaped curve."

The normal curve is the most used statistical distribution because:
1. Normality arises naturally in many biological, physical, and social measurement situations.
2. It is a good approximation to many distributions of chance outcomes (e.g., the binomial).
3. Normality is important in statistical inference.

The normal distribution is characterized by two parameters:
- The mean (μ): a measure of center or location. The mean can be any value, positive or negative. The mean is in the same location as the median.
- The standard deviation (σ): a measure of spread. The standard deviation must be a positive number.

Together, the mean and the standard deviation define a specific normal distribution: the mean fixes where the curve is centered, and the standard deviation fixes how spread out it is.

INFLECTION POINTS ON THE NORMAL CURVE

The standard deviation can be located visually by finding the INFLECTION POINTS on either side of the mean. The INFLECTION POINTS of the curve are the places where the CONCAVITY changes; they lie one standard deviation from the mean, at μ − σ and μ + σ.

In the normal distribution with mean μ and standard deviation σ:
- Approximately 68% of all observations lie within one standard deviation of the mean.
- Approximately 95% of all observations lie within two standard deviations of the mean.
- Approximately 99.7% of all observations lie within three standard deviations of the mean.

Because normal distributions are used so frequently, a short notation is often used to describe the parameters of mean and standard deviation: N(μ, σ). For example, N(64.5, 2.5) indicates a normal distribution with a mean of 64.5 and a standard deviation of 2.5.

Percentile is a familiar term because it is so frequently used in the reporting of standardized test scores.
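The 68%, 95%, and 99.7% figures for the normal distribution can be checked numerically. The following is a minimal sketch using the N(64.5, 2.5) distribution from the notation example and `NormalDist` from Python's standard `statistics` module; the helper name `proportion_within` is invented for this illustration.

```python
from statistics import NormalDist

# N(64.5, 2.5): mean 64.5, standard deviation 2.5 (from the notation example)
dist = NormalDist(mu=64.5, sigma=2.5)

def proportion_within(d, k):
    """Proportion of a normal distribution d within k standard deviations of its mean."""
    lower = d.mean - k * d.stdev
    upper = d.mean + k * d.stdev
    # Area between two points = area to the left of upper minus area to the left of lower
    return d.cdf(upper) - d.cdf(lower)

if __name__ == "__main__":
    for k in (1, 2, 3):
        lower = dist.mean - k * dist.stdev
        upper = dist.mean + k * dist.stdev
        print(f"within {k} sd ({lower} to {upper}): {proportion_within(dist, k):.4f}")
    # prints 0.6827, 0.9545, 0.9973 for k = 1, 2, 3
```

The same proportions come out for any choice of mean and standard deviation, which is why the rule can be stated once for all normal distributions.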
Percentiles are used when we are interested in seeing where an individual observation stands in relation to the other observations in the distribution. An observation's PERCENTILE is the percent of the distribution that lies to the LEFT of the observation.

HOMEWORK
Chapter 2: 2.6 – 2.9, 2.11 – 2.18
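As an illustration of the percentile definition (the percent of the distribution to the left of an observation), here is a short sketch. It assumes the N(64.5, 2.5) distribution from the notation example for the idealized case and a small made-up list of observations for the data case; both function names are hypothetical, and "to the left" is taken to mean strictly less than the observation.

```python
from statistics import NormalDist

def normal_percentile(x, dist):
    """Percent of the normal distribution dist lying to the left of x."""
    return 100 * dist.cdf(x)

def data_percentile(x, observations):
    """Percent of the actual observations that are strictly less than x."""
    below = sum(1 for value in observations if value < x)
    return 100 * below / len(observations)

if __name__ == "__main__":
    dist = NormalDist(mu=64.5, sigma=2.5)
    # 67 is one standard deviation above the mean: 50% + 68%/2, or about the 84th percentile
    print(round(normal_percentile(67, dist), 2))
    # Made-up data: two of the five observations lie to the left of 3
    print(data_percentile(3, [1, 2, 3, 4, 5]))
```

The first result agrees with the 68% rule: half the distribution lies below the mean, and about 34% more lies between the mean and one standard deviation above it.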