14.3 - § 14.4

§ 14.3 Numerical Summaries of Data

Numerical Summaries of a Data Set

In the last section we looked at ways to represent a data set graphically. Today we will look at numerical ways to summarize similar information.

There are two major types of numerical summary:
1. Measures of location (for example, the average/mean).
2. Measures of spread (for example, the range).

The Average / Mean

The average or mean of a data set of size N is found by adding the numbers and dividing by N.

Or more formally, if the data set is {x_1, x_2, x_3, ..., x_N}, then the mean is given by:

\[ \frac{x_1 + x_2 + x_3 + \cdots + x_N}{N} \]

The Average / Mean

What about when we are given a frequency table? Let's look at the test scores from yesterday:

Score      4  24  28  32  36  40  44  48  56  60  64  72  76  96
Frequency  1   1   2   6  10  16  13   9   1   2   1   8   4   1

The Average / Mean From a Frequency Table

Data       x_1  x_2  ...  x_k
Frequency  f_1  f_2  ...  f_k

Step 1: Calculate the total of the data.
        Total = x_1 * f_1 + x_2 * f_2 + x_3 * f_3 + ... + x_k * f_k
Step 2: Calculate N.
        N = f_1 + f_2 + f_3 + ... + f_k
Step 3: Calculate the average.
        Average = Total / N

Entering Data and Finding the Mean on the TI-83:
1. Hit [Stat].
2. Select "1: Edit…".
3. Enter data into L1. If you are working from a frequency table, enter the corresponding frequencies into L2.
4. Go to the "List" menu ([2nd], [Stat]).
5. Select "3: mean(".
6. You should now be on the 'main' screen. Proceed as follows:
   (a) If you are working from just a list of data, type "L1" ([2nd], [1]), close the parentheses and hit [Enter].
   (b) If you are working from a freq. table, type "L1" followed by "," and "L2" ([2nd], [2]). Then close the parentheses and hit [Enter].

Example: Average Salary

The average salary at a local computer manufacturer with 50 employees is $42,000.
The owner draws a yearly salary of $800,000. What is the average salary of the other 49 employees?

Example: 105 Exam Scores

Suppose you have averaged a 132 out of 150 on the first 3 exams in Math 105. What score would you need on the fourth exam to have an average of 135?

Percentiles

The p th percentile of a data set is the value such that p percent of the numbers fall at or below the value. The rest of the data falls at or above the value.

We will call the p th percent of N the locator, and write it as L.

Example: Height

Sorting Data on the TI-83:
1. Enter data into L1 as before.
2. Hit [Stat].
3. Select "2: SortA(".
4. You should now be on the 'main' screen. Type "L1" ([2nd], [1]).
5. Close the parentheses and hit [Enter].

Finding the p th Percentile

Step 1: Sort the original data set by size. (Suppose {d_1, d_2, d_3, ..., d_N} is the sorted set.)
Step 2: Compute the value of the locator.
        L = (p / 100)(N)
Step 3: The p th percentile is:
        (a) The average of d_L and d_(L+1) if L is a whole number.
        (b) d_(L+) if L is not a whole number, where L+ is L rounded up.

Percentiles: The Median and Quartiles

The 50th percentile, called the median, is the percentile that is most commonly used. The median will be written M.

The other two commonly used percentiles are the quartiles:
The first quartile, written as Q1, is the 25th percentile.
The third quartile, denoted Q3, is the 75th percentile.

Example: Let's examine the test scores again.

Score      4  24  28  32  36  40  44  48  56  60  64  72  76  96
Frequency  1   1   2   6  10  16  13   9   1   2   1   8   4   1

Find the quartiles and the median.

The Five-Number Summary

One way to give a nice profile of a data set is the "five-number summary," which consists of:
1. The lowest value, called the Min.
2. The first quartile, Q1.
3. The median, M.
4. The third quartile, Q3.
5. The highest value, called the Max.
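As a rough check on the locator rule above, here is a minimal Python sketch that expands the test-score frequency table into a full list and applies Steps 1 to 3 to get the five-number summary. The function names percentile and five_number_summary are illustrative choices, not part of the course material, and the sketch assumes p is a whole number strictly between 0 and 100.

```python
from fractions import Fraction
import math


def percentile(data, p):
    """Return the p-th percentile using the locator rule L = (p/100) * N.

    Assumes p is an integer with 0 < p < 100 (an assumption of this sketch).
    """
    d = sorted(data)                      # Step 1: sort the data
    locator = Fraction(p, 100) * len(d)   # Step 2: exact value of L
    if locator.denominator == 1:          # L is a whole number
        L = int(locator)
        return (d[L - 1] + d[L]) / 2      # average of d_L and d_(L+1)
    return d[math.ceil(locator) - 1]      # d_(L+): L rounded up


def five_number_summary(data):
    """Return (Min, Q1, M, Q3, Max)."""
    return (min(data), percentile(data, 25), percentile(data, 50),
            percentile(data, 75), max(data))


# Test-score frequency table from this section: score -> frequency.
freq = {4: 1, 24: 1, 28: 2, 32: 6, 36: 10, 40: 16, 44: 13,
        48: 9, 56: 1, 60: 2, 64: 1, 72: 8, 76: 4, 96: 1}
scores = [x for x, f in freq.items() for _ in range(f)]

print(len(scores))                  # 75
print(five_number_summary(scores))  # (4, 36, 44, 48, 96)
```

Using Fraction for the locator keeps the "is L a whole number?" test exact, which a floating-point product such as 0.25 * N does not guarantee in general.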
Example: The Five-Number Summary for our test score example would look like this:

Score      4  24  28  32  36  40  44  48  56  60  64  72  76  96
Frequency  1   1   2   6  10  16  13   9   1   2   1   8   4   1

The Five-Number Summary: Box Plots

We can also represent the Five-Number Summary graphically in what is called a box plot or a box-and-whiskers plot.

[Box plot figure, labeled left to right: Min, Q1, M, Q3, Max]

Example: Here is the box plot for our test score example:

Min = 4, Q1 = 36, M = 44, Q3 = 48, Max = 96

§ 14.4 Measures of Spread

Example: Find the average and median of the following data sets:
• Set 1 = {45, 46, 47, 48, 49, 51, 52, 53, 54, 55}
• Set 2 = {1, 12, 20, 31, 41, 59, 70, 78, 89, 99}

The Range

One way to measure the spread of data is to examine the range, given by

R = Max - Min

The problem with using the range is that outliers can severely affect it.

Example: Looking again at our 'test score' example, we see that the range with the outliers (4 and 96) would be R = 96 - 4 = 92. However, without those pieces of data we would have R = 76 - 24 = 52.

The Interquartile Range

In order to eliminate the problems caused by outliers, we could make use of the interquartile range, the difference between the third and first quartiles:

IQR = Q3 - Q1

This measure tells us how spread out the middle 50% of the data is.

Example: Your instructor didn't feel like making a different example. The IQR for the test score data is:

IQR = Q3 - Q1 = 48 - 36 = 12

The Standard Deviation

The idea: Measure how spread out your data set is by examining how far each piece of information is from some fixed reference point.

The reference point we will use is the mean (average).
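Before turning to the standard deviation, a short Python sketch can confirm what the two opening data sets of this section illustrate: the same mean and median, but very different spreads. The helper names percentile and spread_report are hypothetical, and the quartiles are computed with the locator rule from § 14.3 (assuming a whole-number p strictly between 0 and 100).

```python
from fractions import Fraction
import math


def percentile(data, p):
    """p-th percentile by the locator rule (assumes integer p, 0 < p < 100)."""
    d = sorted(data)
    locator = Fraction(p, 100) * len(d)
    if locator.denominator == 1:          # L is a whole number
        L = int(locator)
        return (d[L - 1] + d[L]) / 2      # average of d_L and d_(L+1)
    return d[math.ceil(locator) - 1]      # L rounded up


def spread_report(data):
    """Return mean, median, range R = Max - Min, and IQR = Q3 - Q1."""
    return {
        "mean": sum(data) / len(data),
        "median": percentile(data, 50),
        "range": max(data) - min(data),
        "iqr": percentile(data, 75) - percentile(data, 25),
    }


set1 = [45, 46, 47, 48, 49, 51, 52, 53, 54, 55]
set2 = [1, 12, 20, 31, 41, 59, 70, 78, 89, 99]

print(spread_report(set1))  # {'mean': 50.0, 'median': 50.0, 'range': 10, 'iqr': 6}
print(spread_report(set2))  # {'mean': 50.0, 'median': 50.0, 'range': 98, 'iqr': 58}
```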
The Standard Deviation

We could try to average the Deviations from the Mean: (Data value - Mean)

Example: Once again, the test score data.

Score (x)   Deviation (x - 46.61)   Frequency
    4            -42.61                 1
   24            -22.61                 1
   28            -18.61                 2
   32            -14.61                 6
   36            -10.61                10
   40             -6.61                16
   44             -2.61                13
   48              1.39                 9
   56              9.39                 1
   60             13.39                 2
   64             17.39                 1
   72             25.39                 8
   76             29.39                 4
   96             49.39                 1

However, negative deviations and positive deviations will cancel each other out. In fact (assuming we don't round off any of our figures) the average of the deviations from the mean will always be 0!

The Standard Deviation

What would happen if we squared the deviations from the mean?

The squared deviations are always non-negative, so there would be no canceling.

The average of these squared deviations is called the variance, V.

The Standard Deviation

Unfortunately, there is a problem with using the variance as well: the units of measure. For instance, if we were studying people's height in inches (in), the variance would be in units of in^2.

The solution to our dilemma is simple: we will just take the square root of the variance to get what is called the standard deviation, σ.

Finding The Standard Deviation

Step 1: Find the average/mean of the data set. Call it A.
Step 2: For each number x in the data set, find the deviation from the mean, x - A.
Step 3: Square each of the deviations found in Step 2.
Step 4: Find the average of the squared deviations found in Step 3. This is the variance, V.
Step 5: Take the square root of the variance. This is the standard deviation, σ.
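Steps 1 through 5 can be sketched in Python as follows. This is a minimal illustration with hypothetical function names that works directly from a frequency table; it divides by N in Step 4, as described above, rather than by N - 1.

```python
import math


def mean_from_table(freq):
    """Step 1: mean A = (sum of x * f) / (sum of f)."""
    n = sum(freq.values())
    total = sum(x * f for x, f in freq.items())
    return total / n


def variance_from_table(freq):
    """Steps 2-4: average of the squared deviations from the mean."""
    a = mean_from_table(freq)
    n = sum(freq.values())
    return sum(f * (x - a) ** 2 for x, f in freq.items()) / n


def std_dev_from_table(freq):
    """Step 5: square root of the variance."""
    return math.sqrt(variance_from_table(freq))


# Test-score frequency table from this section: score -> frequency.
freq = {4: 1, 24: 1, 28: 2, 32: 6, 36: 10, 40: 16, 44: 13,
        48: 9, 56: 1, 60: 2, 64: 1, 72: 8, 76: 4, 96: 1}

mean = mean_from_table(freq)
print(round(mean, 2))  # 46.61

# The plain (unsquared) deviations average out to zero, up to round-off.
avg_dev = sum(f * (x - mean) for x, f in freq.items()) / sum(freq.values())
print(abs(avg_dev) < 1e-9)  # True

print(std_dev_from_table(freq))
```

A plain list of data can be passed in the same way by giving every value a frequency of 1.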
Finding The Standard Deviation

Another way to find the Standard Deviation by hand is to use the following formula:

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - A)^2}{N}} \]

Finding The Standard Deviation

Finding all of the information from 14.2-14.3 on the TI-83:
1. Enter data as shown previously. Quit to the main screen.
2. Hit [Stat].
3. Move right to the "CALC" menu.
4. Select "1-Var Stats".
5. Now on the main screen, type "L1". (If you are using data from a frequency table, also type "," and "L2".) Hit [Enter].
6. Interpret the information as follows:
   x̄ is the mean/average, A;
   σx is the Standard Deviation;
   n is the size of your data set;
   If you arrow down the Min, Max, Median and

Example: Find the standard deviation for the following data set.

{1, 6, 14, 19}
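One way to work this example by hand, following Steps 1 through 5 from earlier in the section (the final value is rounded to two decimal places):

```latex
\begin{align*}
A &= \frac{1 + 6 + 14 + 19}{4} = \frac{40}{4} = 10 \\
\text{Deviations: } & 1 - 10 = -9, \quad 6 - 10 = -4, \quad 14 - 10 = 4, \quad 19 - 10 = 9 \\
\text{Squared deviations: } & 81, \quad 16, \quad 16, \quad 81 \\
V &= \frac{81 + 16 + 16 + 81}{4} = \frac{194}{4} = 48.5 \\
\sigma &= \sqrt{48.5} \approx 6.96
\end{align*}
```

Note that the deviations -9, -4, 4, 9 add up to 0, which is why they are squared before averaging.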
